Describe the bug
If two training jobs run at the same time from the same working directory, they will eventually try to evaluate results at the same moment. This causes a crash because CocoEvaluator writes its predictions to a hard-coded temporary file, ./temp.json, so both jobs read and write the same path.
Writing ./temp.json would also fail if the user ran the training script from a read-only filesystem.
To Reproduce
Steps to reproduce the behavior:
Start two training jobs on the same machine in the same directory.
Wait until both jobs reach evaluation at the same time.
The crash occurs.
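The collision behind these steps can be simulated deterministically in a single process. This is a minimal sketch and not the library's code: `simulate_collision` is a hypothetical helper, the prediction record is made up, and the shared file is placed in a scratch directory instead of the real working directory. It assumes the unlucky interleaving where the second job opens the file for writing (which truncates it) just before the first job reads it back.

```python
import json
import os
import tempfile


def simulate_collision(shared_path):
    """Interleave two 'jobs' on one results file; return the reader's error, or None."""
    # Job A dumps its predictions and closes the file.
    with open(shared_path, "w") as f:
        json.dump([{"image_id": 1, "score": 0.9}], f)

    # Job B reaches evaluation and opens the same path.
    # Mode "w" truncates the file immediately, before B has written anything.
    job_b = open(shared_path, "w")
    try:
        # Job A now reads back what it believes are its own predictions.
        with open(shared_path) as f:
            json.load(f)
    except json.JSONDecodeError as exc:
        return exc
    finally:
        job_b.close()
    return None


# Use a scratch directory so the demo does not touch the current directory.
with tempfile.TemporaryDirectory() as workdir:
    error = simulate_collision(os.path.join(workdir, "temp.json"))

print(type(error).__name__)
```

In a real run the interleaving depends on timing, so the failure may look different (for example, one job silently scoring the other job's predictions), but the root cause is the same shared path.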
Expected behavior
No crash.
A simple fix is to use a unique temporary file so that concurrent jobs cannot conflict. Here is a patch:
From 6ff05c5028657a84b89a86e548258bc9a94bbf74 Mon Sep 17 00:00:00 2001
From: Andrew Lavin <andrew@subdivision.ai>
Date: Sat, 23 Oct 2021 10:34:03 -0700
Subject: [PATCH] Modified CocoEvaluator to dump coco predictions to a unique
temporary file.
---
effdet/evaluator.py | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/effdet/evaluator.py b/effdet/evaluator.py
index b923655..366b4e4 100644
--- a/effdet/evaluator.py
+++ b/effdet/evaluator.py
@@ -8,6 +8,8 @@ import numpy as np
 
 from .distributed import synchronize, is_main_process, all_gather_container
 from pycocotools.cocoeval import COCOeval
+from tempfile import NamedTemporaryFile
+import os
 
 # FIXME experimenting with speedups for OpenImages eval, it's slow
 #import pyximport; py_importer, pyx_importer = pyximport.install(pyimport=True)
@@ -100,8 +102,10 @@ class CocoEvaluator(Evaluator):
         if not self.distributed or dist.get_rank() == 0:
             assert len(self.predictions)
             coco_predictions, coco_ids = self._coco_predictions()
-            json.dump(coco_predictions, open('./temp.json', 'w'), indent=4)
-            results = self.coco_api.loadRes('./temp.json')
+            with NamedTemporaryFile(prefix='coco_', suffix='.json', delete=False, mode='w') as tmpfile:
+                json.dump(coco_predictions, tmpfile, indent=4)
+            results = self.coco_api.loadRes(tmpfile.name)
+            os.unlink(tmpfile.name)
             coco_eval = COCOeval(self.coco_api, results, 'bbox')
             coco_eval.params.imgIds = coco_ids  # score only ids we've used
             coco_eval.evaluate()
--
2.17.1
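Outside the library, the same pattern can be sketched as a standalone function. This is an illustration under assumptions, not the patched code itself: `round_trip_via_unique_file` and `load_json` are hypothetical names, and `load_json` merely stands in for `self.coco_api.loadRes`. Unlike the patch, the sketch wraps the load in `try`/`finally` so the file is removed even if loading raises. Because `NamedTemporaryFile` creates the file in the system temporary directory by default, this approach should also avoid writing to the current directory, which matters for the read-only filesystem case.

```python
import json
import os
from tempfile import NamedTemporaryFile

seen_paths = []  # records the temp paths handed to the loader, for inspection


def load_json(path):
    """Hypothetical stand-in for a loader that takes a file name, like loadRes."""
    seen_paths.append(path)
    with open(path) as f:
        return json.load(f)


def round_trip_via_unique_file(predictions, load_fn):
    """Dump predictions to a uniquely named temp file, load them, then delete the file."""
    # delete=False keeps the file on disk after the with-block closes it,
    # so the loader can reopen it by name on any platform.
    with NamedTemporaryFile(prefix="coco_", suffix=".json", mode="w", delete=False) as tmpfile:
        json.dump(predictions, tmpfile, indent=4)
    try:
        # The file is closed (and therefore flushed) before it is read back.
        return load_fn(tmpfile.name)
    finally:
        # Clean up even if load_fn raises.
        os.unlink(tmpfile.name)


preds = [{"image_id": 1, "category_id": 3, "bbox": [0, 0, 10, 10], "score": 0.9}]
loaded = round_trip_via_unique_file(preds, load_json)
```

Each call gets its own file name, so two jobs evaluating at the same moment never touch the same path.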