'No module named 'config'' #15

qijindao · 2021-03-30T14:36:33Z

Hi! I have a question.
When I try to run train_net.py, the line 'from config import config' fails with 'No module named 'config''. I tried 'pip install config', but the error persists. I have searched for other fixes, but none of them worked. Can you help me?

chensnathan · 2021-03-30T15:47:46Z

Hi,
You can use pods_train --num-gpus 8 instead of directly running with train_net.py.

BTW, could you provide more details about how you installed YOLOF and how you are training with it?

qijindao · 2021-03-31T01:34:46Z

Thank you for your reply! I appreciate it. I found pods_train, but it is not a .py file, so I don't know how to use it.
My environment is torch 1.6 and Python 3.8. When I tried to run train_net.py, I kept installing missing modules as the error messages asked for them. I also ran into a problem with cvpods, which I installed with 'python setup.py develop' as the instructions say.

qijindao · 2021-03-31T02:20:48Z

Sorry, I did not express myself clearly. What I meant to ask is: do you mean that I don't need to worry about train_net.py even though it raises errors, and that I only need to run 'pods_train --num-gpus 1'?

chensnathan · 2021-03-31T04:50:13Z

pods_train is a shell script. You can run it directly with pods_train --num-gpus 8 from inside an experiment directory (e.g., YOLOF/playground/detection/coco/yolof/yolof.res50.C5.1x).

BTW, you can find the pods_train file in YOLOF/tools/.

tangjiuqi097 · 2021-03-31T06:22:27Z

@qijindao
Hi, you can try:

cd YOLOF/playground/detection/coco/yolof/yolof.res50.C5.1x
python YOLOF/tools/train_net.py --num-gpus 8

or

cd YOLOF/playground/detection/coco/yolof/yolof.res50.C5.1x
pods_train --num-gpus 8
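
Both commands start with a cd into the experiment directory, presumably because that is where config.py lives. The sketch below only illustrates the general Python behaviour behind the original error: a module named config is found in a directory only if that directory contains config.py. It assumes train_net.py resolves 'from config import config' against the current working directory, which is an assumption about the script and is not confirmed in this thread. The directory names and the locate_module helper are hypothetical.

```python
import importlib.machinery
import os
import tempfile


def locate_module(name, search_dir):
    """Return the file that would be imported as `name` when only
    `search_dir` is searched, or None if no such module exists there."""
    spec = importlib.machinery.PathFinder.find_spec(name, [search_dir])
    return spec.origin if spec is not None else None


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: one directory holding config.py, one without it.
    experiment_dir = os.path.join(root, "experiment")
    tools_dir = os.path.join(root, "tools")
    os.makedirs(experiment_dir)
    os.makedirs(tools_dir)

    with open(os.path.join(experiment_dir, "config.py"), "w") as handle:
        handle.write("config = {'name': 'demo'}\n")

    # Found only where config.py actually exists.
    found_in_experiment = locate_module("config", experiment_dir) is not None
    found_in_tools = locate_module("config", tools_dir) is not None

print("experiment dir:", found_in_experiment)
print("tools dir:", found_in_tools)
```

Under that assumption, 'pip install config' cannot help, since it installs an unrelated package rather than the experiment's own config.py.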

qijindao · 2021-03-31T07:03:47Z

Thank you for your reply.

qijindao · 2021-03-31T07:06:16Z

Thank you for your reply. Have you trained the code successfully? I may have some more questions.

qijindao · 2021-03-31T07:40:03Z

I ran out of GPU memory. In my experience the fix is to change the batch size, but I have not found any batch-size setting in this folder. Maybe I overlooked it.

chensnathan · 2021-03-31T07:47:25Z

Could you provide more details about how you train with YOLOF?

qijindao · 2021-03-31T07:54:07Z

Following the directory layout, I put the COCO 2017 dataset in the datasets folder. Then I started training with these commands:
cd YOLOF/playground/detection/coco/yolof/yolof.res50.C5.1x
python YOLOF/tools/train_net.py --num-gpus 1

chensnathan · 2021-03-31T08:20:59Z

YOLOF_res50_C5 needs about 5.2 to 5.3 GB of GPU memory to train. If your GPU has less memory than that, you should reduce IMS_PER_DEVICE in the config.py file.

qijindao · 2021-03-31T09:30:23Z

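OK, thank you very much. My computer has only one GPU. When I set the devices value in the config to 1, the program ran. But after running for a while, a new error appeared: AssertionError: Box regression deltas become infinite or NaN!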

tangjiuqi097 · 2021-03-31T10:22:30Z

@qijindao Can you provide your log file? It is at YOLOF/playground/detection/coco/yolof/yolof.res50.C5.1x/log/log.txt.
BTW, I think it is because you modified the batch size but did not modify the learning rate or the warmup iterations.

qijindao · 2021-03-31T10:47:19Z

log.txt

tangjiuqi097 · 2021-03-31T11:21:15Z

Hi, cvpods automatically adjusts the learning rate and the number of iterations when you use a different number of GPUs.
However, the default setting is 8 images per GPU. If you use 1 image per GPU, you need to divide the base learning rate by 8 and multiply the number of iterations (including the warmup iterations) by 8.
You should also replace BatchNorm with GroupNorm.
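
The adjustment described above can be sketched as a small helper. This is only an illustration of the arithmetic: the function name, the dictionary keys, and the starting values are hypothetical placeholders, not YOLOF's or cvpods' actual defaults or API.

```python
def rescale_schedule(base_lr, max_iter, warmup_iters,
                     old_ims_per_device, new_ims_per_device):
    """Scale a schedule when the per-GPU batch shrinks.

    The learning rate is divided by the reduction factor, while the total
    and warmup iteration counts are multiplied by the same factor.
    """
    if new_ims_per_device < 1 or old_ims_per_device % new_ims_per_device != 0:
        raise ValueError("old_ims_per_device must be a positive multiple "
                         "of new_ims_per_device")
    factor = old_ims_per_device // new_ims_per_device
    return {
        "base_lr": base_lr / factor,
        "max_iter": max_iter * factor,
        "warmup_iters": warmup_iters * factor,
    }


# Placeholder numbers chosen only to make the arithmetic easy to follow.
scaled = rescale_schedule(base_lr=0.08, max_iter=1000, warmup_iters=100,
                          old_ims_per_device=8, new_ims_per_device=1)
print(scaled)
```

Going from 8 images per GPU to 1 gives a factor of 8, matching the numbers in the comment above. Substitute the real values from your own config.py before relying on the result.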

qijindao · 2021-03-31T12:45:08Z

OK, thank you for your detailed reply. I roughly understand your instructions, but I am still unsure about a few things. First, in the command 'pods_train --num-gpus 8', is "8" the id of a GPU in the computer, or the number of GPUs? Second, does IMS_PER_DEVICE=8 mean 8 images per GPU? Third, do the values of IMS_PER_BATCH and IMS_PER_DEVICE have to be proportional? In my experiments, training only seems to run when the ratio between them is 8, and I don't know why.

tangjiuqi097 · 2021-04-01T03:08:48Z

@qijindao

In 'pods_train --num-gpus 8', "8" is the total number of GPUs used.
Yes, IMS_PER_DEVICE=8 means 8 images per GPU.
IMS_PER_BATCH = IMS_PER_DEVICE * num-gpus
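
The relation above can be checked with a few lines of Python. This is a plain arithmetic sketch; the function names are hypothetical and nothing here calls into cvpods.

```python
def total_batch_size(ims_per_device, num_gpus):
    """IMS_PER_BATCH as the product of images per GPU and GPU count."""
    if ims_per_device < 1 or num_gpus < 1:
        raise ValueError("both values must be positive integers")
    return ims_per_device * num_gpus


def per_device_batch_size(ims_per_batch, num_gpus):
    """Invert the relation, refusing totals that do not divide evenly."""
    if num_gpus < 1 or ims_per_batch % num_gpus != 0:
        raise ValueError("ims_per_batch must be a multiple of num_gpus")
    return ims_per_batch // num_gpus


# 8 GPUs with 8 images each, then a single GPU with the same per-GPU setting.
eight_gpu_total = total_batch_size(ims_per_device=8, num_gpus=8)
one_gpu_total = total_batch_size(ims_per_device=8, num_gpus=1)
print(eight_gpu_total, one_gpu_total)
```

When moving to a single GPU, both values have to change together so that the product still holds.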

qijindao · 2021-04-01T03:12:16Z

Thank you very much! I get it!

chensnathan closed this as completed Apr 1, 2021

zcl912 mentioned this issue Apr 8, 2021

/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda [](int)->auto::operator()(int)->auto: block: [0,0,0], thread: [121,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed. #20

Open
