"No module named 'config'" #15

Closed
qijindao opened this issue Mar 30, 2021 · 18 comments

Comments

@qijindao

Hi! Sorry to bother you with a question.
When I try to run train_net.py, the line 'from config import config' fails with "No module named 'config'". I tried 'pip install config', but the error persists. I have searched for solutions, but none of them work. Can you help me?

@chensnathan
Collaborator

Hi,
You can use pods_train --num-gpus 8 instead of directly running with train_net.py.

BTW, could you provide more details about how you install YOLOF and how you train with YOLOF?

@qijindao
Author

Thank you for your reply! I appreciate it. I found pods_train, but it is not a .py file, so I don't know how to use it.
My environment is torch 1.6 and Python 3.8. When I tried to run train_net.py, I kept installing whatever modules the error messages asked for. I also ran into a problem with cvpods; I just ran 'python setup.py develop' as the instructions say.

@qijindao
Author

Sorry, I didn't express myself clearly. What I meant to ask is: do you mean I don't need to worry about train_net.py even though it raises errors, and all I need to do is run 'pods_train --num-gpus 1'?

@chensnathan
Collaborator

pods_train is a shell script. You can run it directly with pods_train --num-gpus 8 from the experiment directory (e.g., YOLOF/playground/detection/coco/yolof/yolof.res50.C5.1x).

BTW, you can find the pods_train file in YOLOF/tools/.
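
A minimal sketch of why the working directory matters, assuming (the thread does not confirm this) that the training script imports config.py from the directory it is launched in rather than from an installed package. The helper load_local_config and the stand-in config.py below are hypothetical. They only illustrate Python's import lookup and are not part of YOLOF or cvpods.

```python
import importlib
import os
import sys
import tempfile


def load_local_config(experiment_dir):
    """Import config.py from experiment_dir, as a script launched there could."""
    previous_cwd = os.getcwd()
    os.chdir(experiment_dir)
    sys.path.insert(0, ".")  # make the current directory importable
    try:
        importlib.invalidate_caches()    # notice the freshly written file
        sys.modules.pop("config", None)  # drop any cached module of that name
        return importlib.import_module("config")
    finally:
        sys.path.remove(".")
        os.chdir(previous_cwd)


# Hypothetical experiment directory holding a stand-in config.py.
with tempfile.TemporaryDirectory() as experiment_dir:
    with open(os.path.join(experiment_dir, "config.py"), "w") as f:
        f.write("config = {'name': 'demo'}\n")
    module = load_local_config(experiment_dir)
    print(module.config)
```

Under that assumption, 'pip install config' would not help, because it installs an unrelated package instead of the experiment's own config.py.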

@tangjiuqi097

@qijindao
Hi, you can try:

cd YOLOF/playground/detection/coco/yolof/yolof.res50.C5.1x
python YOLOF/tools/train_net.py --num-gpus 8

or

cd YOLOF/playground/detection/coco/yolof/yolof.res50.C5.1x
pods_train --num-gpus 8

@qijindao
Author

Thank you for your reply.

@qijindao
Author

Thank you for your reply. Have you trained the code successfully? I may have some more questions.

@qijindao
Author

I ran out of GPU memory. In my experience the fix is to reduce the batch size, but I couldn't find any batch size setting in this folder. Maybe I overlooked it?

@chensnathan
Collaborator

Could you provide more details about how you train with YOLOF?

@qijindao
Author

Following the directory layout, I put the COCO 2017 dataset in the datasets folder. I then started training with:
cd YOLOF/playground/detection/coco/yolof/yolof.res50.C5.1x
python YOLOF/tools/train_net.py --num-gpus 1

@chensnathan
Collaborator

YOLOF_res50_C5 needs about 5.2 to 5.3 GB of GPU memory to train. If your GPU has less memory than that, you should reduce IMS_PER_DEVICE in the config.py file.

@qijindao
Author

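OK, thank you very much. My computer has only one GPU. After I changed the devices setting in the config to 1, the program ran. But after running for a while, a new error appeared: AssertionError: Box regression deltas become infinite or NaN!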

@tangjiuqi097

@qijindao Can you provide your log file? It is at YOLOF/playground/detection/coco/yolof/yolof.res50.C5.1x/log/log.txt.
BTW, I think this happens because you modified the batch size but did not modify the learning rate or the warmup iterations.

@qijindao
Author

log.txt

@tangjiuqi097

Hi, cvpods can automatically adjust the learning rate and iterations if you use a different number of GPUs.
However, the default setting is 8 images per GPU. If you use 1 image per GPU, you need to decrease the base learning rate by a factor of 8 and increase the number of iterations (as well as the warmup iterations) by a factor of 8.
You should also replace BatchNorm with GroupNorm.
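
The rescaling described above can be sketched as plain arithmetic. This is only an illustration under stated assumptions: rescale_schedule is a hypothetical helper, not a cvpods function, and the numbers are made up rather than taken from the YOLOF config.

```python
def rescale_schedule(base_lr, max_iter, warmup_iters,
                     old_images_per_gpu, new_images_per_gpu):
    """Scale the LR down and the schedule up by the same batch-size factor."""
    factor = old_images_per_gpu / new_images_per_gpu
    return {
        "base_lr": base_lr / factor,
        "max_iter": int(round(max_iter * factor)),
        "warmup_iters": int(round(warmup_iters * factor)),
    }


# Illustrative values only: going from 8 images per GPU to 1 gives a factor of 8.
schedule = rescale_schedule(
    base_lr=0.08, max_iter=10000, warmup_iters=500,
    old_images_per_gpu=8, new_images_per_gpu=1,
)
print(schedule)
```

The learning rate shrinks and the iteration counts grow by the same factor, so the total number of images seen during training stays roughly the same.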

@qijindao
Author

OK, thank you for your detailed reply. I roughly understand your instructions, but I am still uncertain about some settings. First, in the command 'pods_train --num-gpus 8', is the '8' the ID of a GPU or the number of GPUs in the computer? Second, does IMS_PER_DEVICE=8 mean 8 images per GPU? Third, do the values of IMS_PER_BATCH and IMS_PER_DEVICE have to be proportional? After many experiments, it seems training only runs when their ratio is 8, and I don't know why.

@tangjiuqi097

tangjiuqi097 commented Apr 1, 2021

@qijindao

  1. In 'pods_train --num-gpus 8', "8" means that it uses a total of 8 GPUs.
  2. Yes, IMS_PER_DEVICE=8 means 8 images per GPU.
  3. IMS_PER_BATCH = IMS_PER_DEVICE * num-gpus
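
The relation in item 3 can be checked with a small sketch. The function names here are hypothetical and only mirror the formula above; they are not cvpods APIs.

```python
def total_batch_size(images_per_gpu, num_gpus):
    """Total images per iteration = images per GPU * number of GPUs."""
    return images_per_gpu * num_gpus


def check_batch_settings(images_per_batch, images_per_gpu, num_gpus):
    """Raise ValueError if the three settings do not satisfy the relation."""
    expected = total_batch_size(images_per_gpu, num_gpus)
    if images_per_batch != expected:
        raise ValueError(
            f"expected a total batch of {expected}, got {images_per_batch}"
        )


print(total_batch_size(images_per_gpu=8, num_gpus=8))
print(total_batch_size(images_per_gpu=8, num_gpus=1))
check_batch_settings(images_per_batch=64, images_per_gpu=8, num_gpus=8)
```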

@qijindao
Author

qijindao commented Apr 1, 2021

Thank you very much! I get it!
