Each validation, training gets stuck. #22

Closed
jianlong-yuan opened this issue Feb 15, 2019 · 8 comments
Comments

@jianlong-yuan

INFO:main: Val epoch: 28 [500/654] Mean IoU: 0.329
INFO:main: Val epoch: 28 [510/654] Mean IoU: 0.333
INFO:main: Val epoch: 28 [520/654] Mean IoU: 0.334
INFO:main: Val epoch: 28 [530/654] Mean IoU: 0.334
INFO:main: Val epoch: 28 [540/654] Mean IoU: 0.334
INFO:main: Val epoch: 28 [550/654] Mean IoU: 0.336
INFO:main: Val epoch: 28 [560/654] Mean IoU: 0.337
INFO:main: Val epoch: 28 [570/654] Mean IoU: 0.337
INFO:main: Val epoch: 28 [580/654] Mean IoU: 0.337
INFO:main: Val epoch: 28 [590/654] Mean IoU: 0.337
INFO:main: Val epoch: 28 [600/654] Mean IoU: 0.338
INFO:main: Val epoch: 28 [610/654] Mean IoU: 0.338
INFO:main: Val epoch: 28 [620/654] Mean IoU: 0.339
INFO:main: Val epoch: 28 [630/654] Mean IoU: 0.340
INFO:main: Val epoch: 28 [640/654] Mean IoU: 0.341
INFO:main: Val epoch: 28 [650/654] Mean IoU: 0.340
INFO:main: IoUs: [0.71555132 0.79084598 0.38018856 0.58270573 0.49350641 0.53325564
0.33868351 0.24003405 0.35049824 0.38725722 0.53552238 0.43335014
0.52524775 0.13147147 0.06738203 0.45353059 0.10944007 0.35642416
0.13853911 0.24521033 0.2119736 0.52864194 0.25071423 0.31138197
0.40138153 0.23264673 0.31995535 0.16649904 0.06017227 0.3065639
0.61248457 0.18471847 0.68977884 0.36473194 0.31989985 0.28686686
0.0092722 0.17552748 0.12222387 0.28322877]
INFO:main: Val epoch: 28 Mean IoU: 0.341
saving
INFO:main: New best value 0.3412, was 0.3211
saving done
starting *********

@DrSleep
Owner

DrSleep commented Feb 18, 2019

Can you provide more information?

@jianlong-yuan
Author

I don't have more information. If I set the DataLoader's num_workers to 0, training does not get stuck, but if I set num_workers to a value greater than 0, training gets stuck.
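
The two configurations described above can be compared with a small helper. The sketch below is only an illustration: `loader_kwargs`, `build_val_loader`, and the 120-second timeout are hypothetical names and values, not code from this repository. It relies on the `num_workers` and `timeout` arguments of `torch.utils.data.DataLoader`; with a positive `timeout`, a worker that never delivers a batch should raise an error instead of blocking silently.

```python
def loader_kwargs(num_workers, batch_size=1, timeout_s=120):
    """Build keyword arguments for torch.utils.data.DataLoader.

    num_workers=0 loads batches in the main process (the setting reported
    above as not hanging). For num_workers > 0 a timeout is added so that
    a stalled worker surfaces as an error rather than an endless wait.
    """
    if num_workers < 0:
        raise ValueError("num_workers must be >= 0")
    kwargs = {
        "batch_size": batch_size,
        "shuffle": False,
        "num_workers": num_workers,
    }
    if num_workers > 0:
        # The timeout only applies to worker processes, so it is left at
        # its default (0) for single-process loading.
        kwargs["timeout"] = timeout_s
    return kwargs


def build_val_loader(dataset, num_workers):
    """Create a validation DataLoader (hypothetical helper)."""
    # Imported lazily so loader_kwargs can be used without PyTorch installed.
    from torch.utils.data import DataLoader

    return DataLoader(dataset, **loader_kwargs(num_workers))
```

Calling `build_val_loader(dataset, 0)` reproduces the setting reported as working, while `build_val_loader(dataset, 4)` exercises worker processes with the timeout armed.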

@DrSleep
Owner

DrSleep commented Feb 18, 2019

Without more information I cannot help. Try to find similar issues in other PyTorch-based projects.
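
For anyone trying to collect the kind of information asked for here, one option is Python's standard `faulthandler` module, which can print the stack of every thread when a process stops making progress. The sketch below is an assumption about how it could be wired in, not part of this project; `arm_hang_watchdog`, `disarm_hang_watchdog`, and the 600-second default are hypothetical. It only reports on the process it is armed in (typically the main training process), which is usually enough to see whether that process is blocked waiting on data loading.

```python
import faulthandler
import sys


def arm_hang_watchdog(seconds=600, stream=None):
    """Dump all thread stacks to `stream` every `seconds` until disarmed.

    `stream` must be a real file object with a file descriptor;
    it defaults to sys.stderr.
    """
    target = stream if stream is not None else sys.stderr
    faulthandler.dump_traceback_later(seconds, repeat=True, file=target)


def disarm_hang_watchdog():
    """Cancel the pending dump once the guarded section has finished."""
    faulthandler.cancel_dump_traceback_later()
```

A possible use is to call `arm_hang_watchdog()` right before the step that stalls (here, the point after "saving done") and `disarm_hang_watchdog()` once it completes; the resulting traceback can then be attached to the issue.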

@DrSleep DrSleep closed this as completed Feb 18, 2019
@jianlong-yuan
Author

Thank you. This is a common problem in PyTorch: https://github.com/pytorch/pytorch/issues/1355 The same issue shows up in other PyTorch-based projects.

@zhouyuan888888

@jianlong-yuan I came across this problem too. Have you solved it?

@st2yang

st2yang commented Apr 18, 2019

@zhouyuan888888
For anyone who hits a similar issue: my program also got stuck when I ran it with the provided shell file. When I switched to running it from PyCharm, it ran smoothly. I am also confused about the reason...

@jianlong-yuan
Author

Set the DataLoader's num_workers to 1; this will work.

@wangq95

wangq95 commented May 26, 2019

@jianlong-yuan

> set dataloader num_workers = 1, this will work

Hi, I tried setting num_workers in src/config.py to 1, but the training procedure was still blocked.
