Look here, what's your config file? Did you pass the |
-
Help foreward
Hi to anyone reading this! I ran into a problem while training on my own dataset. The log shows:
2022-11-04 17:33:12,441 - mmcls - INFO - Set random seed to 666820090, deterministic: False
2022-11-04 17:33:12,566 - mmcls - INFO - initialize ResNet with init_cfg [{'type': 'Kaiming', 'layer': ['Conv2d']}, {'type': 'Constant', 'val': 1, 'layer': ['_BatchNorm', 'GroupNorm']}]
2022-11-04 17:33:12,666 - mmcls - INFO - initialize LinearClsHead with init_cfg {'type': 'Normal', 'layer': 'Linear', 'std': 0.01}
Traceback (most recent call last):
File "/environment/miniconda3/lib/python3.7/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
return obj_cls(**args)
File "/home/featurize/mmclassification/mmcls/datasets/base_dataset.py", line 51, in __init__
self.data_infos = self.load_annotations()
File "/home/featurize/mmclassification/mmcls/datasets/my_filelist.py", line 12, in load_annotations
assert isinstance(self.ann_file, str)
AssertionError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "tools/train.py", line 205, in <module>
main()
File "tools/train.py", line 178, in main
datasets = [build_dataset(cfg.data.train)]
File "/home/featurize/mmclassification/mmcls/datasets/builder.py", line 55, in build_dataset
dataset = build_from_cfg(cfg, DATASETS, default_args)
File "/environment/miniconda3/lib/python3.7/site-packages/mmcv/utils/registry.py", line 72, in build_from_cfg
raise type(e)(f'{obj_cls.__name__}: {e}')
AssertionError: CarbonateNet:
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 20530) of binary: /environment/miniconda3/bin/python
Traceback (most recent call last):
File "/environment/miniconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/environment/miniconda3/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/environment/miniconda3/lib/python3.7/site-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/environment/miniconda3/lib/python3.7/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/environment/miniconda3/lib/python3.7/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/environment/miniconda3/lib/python3.7/site-packages/torch/distributed/run.py", line 713, in run
)(*cmd_args)
File "/environment/miniconda3/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/environment/miniconda3/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
tools/train.py FAILED
Failures:
<NO_OTHER_FAILURES>
Root Cause (first observed failure):
[0]:
time : 2022-11-04_17:33:18
host : sc.10086.cn
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 20530)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
I'm sure I defined my own dataset class under "mmcls/datasets" and added the information for my own dataset. I'm confused about why this error happens.
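For reference, the assertion that fails in the first traceback is `assert isinstance(self.ann_file, str)`, so `ann_file` was not a string when `CarbonateNet` was built. One likely cause (an assumption, since the config is not shown here) is that the training config never sets `ann_file`, leaving it at a non-string default such as `None`. Below is a minimal sketch of that check. It assumes the config uses the `data = dict(train=dict(...))` layout implied by `cfg.data.train` in the traceback; the paths and the helper `ann_file_is_missing` are hypothetical and not part of mmcls.

```python
# Hypothetical config fragment. Only the key names `train`, `type`, and
# `ann_file` come from the traceback; every path below is a placeholder.
data = dict(
    train=dict(
        type='CarbonateNet',
        data_prefix='data/carbonate/train',        # hypothetical image root
        ann_file='data/carbonate/meta/train.txt',  # hypothetical; must be a str
        pipeline=[],                               # stand-in for the real pipeline
    ),
)


def ann_file_is_missing(dataset_cfg):
    """Return True if `assert isinstance(ann_file, str)` would fail for this config.

    Hypothetical helper for illustration only; it mirrors the assertion on
    line 12 of my_filelist.py shown in the traceback.
    """
    return not isinstance(dataset_cfg.get('ann_file'), str)


# A config that sets ann_file to a string passes the check.
print(ann_file_is_missing(data['train']))

# A config that omits ann_file would trigger the AssertionError.
broken_train_cfg = dict(type='CarbonateNet', data_prefix='data/carbonate/train')
print(ann_file_is_missing(broken_train_cfg))
```

If the second case matches your config, adding a string `ann_file` entry to `data.train` (and to `val`/`test` if they use the same dataset class) is the first thing to try.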