concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/linkhome/rech/genevb01/ugo97nm/.conda/envs/apeus_vid_2/lib/python3.7/concurrent/futures/process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/linkhome/rech/genevb01/ugo97nm/.conda/envs/apeus_vid_2/lib/python3.7/concurrent/futures/process.py", line 198, in _process_chunk
    return [fn(*args) for args in chunk]
  File "/linkhome/rech/genevb01/ugo97nm/.conda/envs/apeus_vid_2/lib/python3.7/concurrent/futures/process.py", line 198, in <listcomp>
    return [fn(*args) for args in chunk]
  File "/linkhome/rech/genevb01/ugo97nm/.conda/envs/apeus_vid_2/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['srun', '-p', 'gpu_p2', '--gres=gpu:4', '--ntasks=2', '--ntasks-per-node=4', '--cpus-per-task=2', '--kill-on-bad-exit=1', '-A', 'dsd@v100', '--time=14:00:00', '--job-name=detr_apeus_search_optimizer.lr_0.0001_data.samples_per_gpu_8_model.bbox_head.add_norm_prior_True_model.bbox_head.num_query_2_model.backbone.frozen_stages_0_train', 'python', '-u', '/gpfsdswork/projects/rech/dsd/ugo97nm/.conda/envs/apeus_vid_2/lib/python3.7/site-packages/mmdet/.mim/tools/train.py', '/gpfswork/rech/dsd/ugo97nm/log/work_dirs/APEUS/prior_3_step/detr_prior_search_optimizer.lr_0.0001_data.samples_per_gpu_8_model.bbox_head.add_norm_prior_True_model.bbox_head.num_query_2_model.backbone.frozen_stages_0/detr_apeus_search_optimizer.lr_0.0001_data.samples_per_gpu_8_model.bbox_head.add_norm_prior_True_model.bbox_head.num_query_2_model.backbone.frozen_stages_0.py', '--launcher', 'slurm', '--work-dir', '/gpfswork/rech/dsd/ugo97nm/log/work_dirs/APEUS/prior_3_step/detr_prior_search_optimizer.lr_0.0001_data.samples_per_gpu_8_model.bbox_head.add_norm_prior_True_model.bbox_head.num_query_2_model.backbone.frozen_stages_0']' returned non-zero exit status 1.
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "./slurm/gridsearch_prior.py", line 32, in <module>
    other_args=random_args,
  File "/linkhome/rech/genevb01/ugo97nm/.conda/envs/apeus_vid_2/lib/python3.7/site-packages/mim/commands/gridsearch.py", line 412, in gridsearch
    executor.map(subprocess.check_call, cmds)):
  File "/linkhome/rech/genevb01/ugo97nm/.conda/envs/apeus_vid_2/lib/python3.7/concurrent/futures/process.py", line 483, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/linkhome/rech/genevb01/ugo97nm/.conda/envs/apeus_vid_2/lib/python3.7/concurrent/futures/_base.py", line 598, in result_iterator
    yield fs.pop().result()
  File "/linkhome/rech/genevb01/ugo97nm/.conda/envs/apeus_vid_2/lib/python3.7/concurrent/futures/_base.py", line 435, in result
    return self.__get_result()
  File "/linkhome/rech/genevb01/ugo97nm/.conda/envs/apeus_vid_2/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
subprocess.CalledProcessError: Command '['srun', '-p', 'gpu_p2', '--gres=gpu:4', '--ntasks=2', '--ntasks-per-node=4', '--cpus-per-task=2', '--kill-on-bad-exit=1', '-A', 'dsd@v100', '--time=14:00:00', '--job-name=detr_apeus_search_optimizer.lr_0.0001_data.samples_per_gpu_8_model.bbox_head.add_norm_prior_True_model.bbox_head.num_query_2_model.backbone.frozen_stages_0_train', 'python', '-u', '/gpfsdswork/projects/rech/dsd/ugo97nm/.conda/envs/apeus_vid_2/lib/python3.7/site-packages/mmdet/.mim/tools/train.py', '/gpfswork/rech/dsd/ugo97nm/log/work_dirs/APEUS/prior_3_step/detr_prior_search_optimizer.lr_0.0001_data.samples_per_gpu_8_model.bbox_head.add_norm_prior_True_model.bbox_head.num_query_2_model.backbone.frozen_stages_0/detr_apeus_search_optimizer.lr_0.0001_data.samples_per_gpu_8_model.bbox_head.add_norm_prior_True_model.bbox_head.num_query_2_model.backbone.frozen_stages_0.py', '--launcher', 'slurm', '--work-dir', '/gpfswork/rech/dsd/ugo97nm/log/work_dirs/APEUS/prior_3_step/detr_prior_search_optimizer.lr_0.0001_data.samples_per_gpu_8_model.bbox_head.add_norm_prior_True_model.bbox_head.num_query_2_model.backbone.frozen_stages_0']' returned non-zero exit status 1.
Hello, I'm running a grid search on Slurm with this script (./slurm/gridsearch_prior.py, which calls mim's gridsearch).
It works fine at first, but after a while I get the error above.
I'm not sure whether the issue is related to how I call the API.
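In case it helps narrow this down: as far as I can tell from the traceback, the commands are run with `executor.map(subprocess.check_call, cmds)`, and `subprocess.check_call` raises `CalledProcessError` as soon as one command exits with a non-zero status. So the error seems to mean that one of the `srun` training jobs returned exit status 1, and the real cause is probably in that job's own log. Below is a minimal sketch (not mim's code) of running a list of commands while recording each exit code instead of stopping at the first failure. The `run_all` helper and the stand-in commands are hypothetical, and I use a thread pool here rather than the process pool shown in the traceback.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all(cmds, max_workers=2):
    """Run every command and return a list of (cmd, returncode) pairs.

    Unlike executor.map(subprocess.check_call, cmds), a failing command
    does not raise, so the outcome of every command is still reported.
    """
    def run_one(cmd):
        # subprocess.run without check=True does not raise on non-zero exit
        return cmd, subprocess.run(cmd).returncode

    # Threads are enough here: the actual work happens in the child processes
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(run_one, cmds))


if __name__ == "__main__":
    # Hypothetical stand-ins for the real srun commands
    cmds = [
        [sys.executable, "-c", "raise SystemExit(0)"],
        [sys.executable, "-c", "raise SystemExit(1)"],
    ]
    for cmd, code in run_all(cmds):
        status = "ok" if code == 0 else f"failed ({code})"
        print(status, cmd)
```

With something like this, the failing configuration can be identified from its command line and then rerun on its own to see the underlying training error.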