KeyError: 'LAD is already registered in models' #9

Closed
Rogersiy opened this issue Oct 9, 2022 · 3 comments
Comments

@Rogersiy

Rogersiy commented Oct 9, 2022

I am trying to run this model on my server (Ubuntu 20.04, Python 3.7, CUDA 10.1). I created a bash script with all the sh commands you mentioned, but I hit the error below and have no idea how to solve it.

  File "/home/ouc/CodeFiles/CvFiles/CoLAD/ccdet/models/detectors/__init__.py", line 3, in <module>
    from .lad import LAD
  File "/home/ouc/CodeFiles/CvFiles/CoLAD/ccdet/models/detectors/lad.py", line 8, in <module>
    class LAD(SingleStageDetector):
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/site-packages/mmcv/utils/registry.py", line 312, in _register
    module_class=cls, module_name=name, force=force)
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/site-packages/mmcv/utils/registry.py", line 246, in _register_module
    raise KeyError(f'{name} is already registered '
KeyError: 'LAD is already registered in models'

@chuong98
Contributor

Hi, I believe this error occurs because you installed a newer version of MMDet than the one we used, and that version already implements LAD.
Therefore, LAD is already registered within MMDet. If that is the case, there are two approaches:

  • Install an older version of MMDet, from before LAD was supported.
  • Clone our code and rename LAD to something else.
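
To illustrate why the second approach works, here is a minimal sketch of a name-based registry. The `Registry` class below is a simplified stand-in written for this example, not mmcv's actual implementation, and the name `CoLAD` is only a hypothetical replacement. It shows the collision from the traceback and how registering under a different name avoids it.

```python
class Registry:
    """Tiny stand-in for a name -> class registry (not mmcv's real code)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self, name=None, force=False):
        # `force` mirrors the argument visible in the traceback above.
        def _register(cls):
            key = name or cls.__name__
            if key in self._modules and not force:
                raise KeyError(f'{key} is already registered in {self.name}')
            self._modules[key] = cls
            return cls
        return _register

    def get(self, key):
        return self._modules.get(key)


MODELS = Registry('models')


@MODELS.register_module()
class LAD:
    """Plays the role of the LAD class that a newer MMDet already ships."""


def register_repo_detector(name):
    """Register a second detector class, as importing the repo's lad.py would."""
    @MODELS.register_module(name=name)
    class RepoLAD:
        pass
    return RepoLAD


# Registering under the same name reproduces the error.
try:
    register_repo_detector('LAD')
    collision = None
except KeyError as err:
    collision = err.args[0]

# Registering under a different (hypothetical) name succeeds.
renamed = register_repo_detector('CoLAD')

print(collision)
print(MODELS.get('CoLAD') is renamed)
```

If the class is renamed this way, any config that refers to the detector by its registered name (the `type` field) would presumably need the same new name.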

@Rogersiy
Author

Thank you very much for solving my problem.
I am sorry to bother you again. When I train the model, I encounter another error: Input contains NaN. Could you help me find the cause of this problem?

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
Killing subprocess 8595
Traceback (most recent call last):
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/site-packages/torch/distributed/launch.py", line 340, in <module>
    main()
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/site-packages/torch/distributed/launch.py", line 326, in main
    sigkill_handler(signal.SIGTERM, None) # not coming back
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
    raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/ouc/anaconda3/envs/lad/bin/python', '-u', '/home/ouc/CodeFiles/CvFiles/mmdetection/mmdet/.mim/tools/train.py', '--local_rank=0', 'configs/lad/paa_lad_r50_r101p1x_1x_coco.py', '--launcher', 'pytorch', '--work-dir', 'checkpoints/lad/paa_lad_r50_r101p1x_1x_coco', '--seed', '0', '--deterministic']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/home/ouc/anaconda3/envs/lad/bin/mim", line 8, in <module>
    sys.exit(cli())
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/site-packages/mim/commands/train.py", line 108, in cli
    other_args=other_args)
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/site-packages/mim/commands/train.py", line 259, in train
    cmd, env=dict(os.environ, MASTER_PORT=str(port)))
  File "/home/ouc/anaconda3/envs/lad/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)

@chuong98
Contributor

Hi, without the code it is really hard to debug. I would suggest using import pdb; pdb.set_trace() to inspect where the NaN first appears, which could be in the loss value or in the dataset.
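
As a rough sketch of that suggestion, the helper below scans a set of named values and reports the first one containing NaN or infinity, which is the point where a breakpoint would be useful. The function `first_non_finite` and the keys in `step` are hypothetical and use plain Python floats for illustration. In a real training loop the values would be tensors from the data batch and the loss dictionary.

```python
import math


def first_non_finite(named_values):
    """Return the label of the first entry holding NaN or infinity, else None.

    named_values maps a label (e.g. 'loss_cls') to a list of floats.
    """
    for name, values in named_values.items():
        if any(not math.isfinite(v) for v in values):
            return name
    return None


# Hypothetical values standing in for one training step:
# inputs from the dataset first, then the computed losses.
step = {
    'img': [0.1, 0.5, 0.9],
    'gt_bboxes': [10.0, 20.0, 50.0, 80.0],
    'loss_cls': [0.7],
    'loss_bbox': [float('nan')],
}

culprit = first_non_finite(step)
if culprit is not None:
    print(f'non-finite value found in {culprit}')
    # import pdb; pdb.set_trace()  # uncomment to inspect interactively
```

Checking the dataset entries before the losses helps separate the two cases mentioned above: if the inputs are already non-finite the data is at fault, otherwise the problem arises during the loss computation.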
