
Using --mode=visualize after training the model, but I got the wrong pred_x.png #35

Closed
ChinaYi opened this issue May 16, 2017 · 11 comments
@ChinaYi

ChinaYi commented May 16, 2017

The output file pred_x.png has no color except black.

@ChinaYi
Author

ChinaYi commented May 16, 2017

OK, I've seen the same problem in previous issues. I'll try increasing the batch size to 16 and run again; I hope it works.

@yeshenlin

@ChinaYi Maybe you can use "logs/images/Image_Cmaped.ipynb" to visualize your output.
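A likely cause of the all-black image: the PNG stores raw class ids (0, 1, 2, ...), which are nearly invisible against the 0-255 grayscale range. A minimal sketch (not the repo's code, and assuming a SceneParsing-style label count) that stretches the ids so they become visible:

```python
import numpy as np

NUM_CLASSES = 151  # assumption: MIT SceneParsing label count; adjust for your dataset

def stretch_ids(pred):
    # Map class ids 0..NUM_CLASSES-1 onto the full 0..255 grayscale range.
    return (pred.astype(np.float32) / (NUM_CLASSES - 1) * 255).astype(np.uint8)

# Toy prediction: raw ids that would all look black in a viewer.
pred = np.array([[0, 1], [75, 150]], dtype=np.uint8)
print(stretch_ids(pred))
```

The same idea underlies the colormapping notebook: class ids must be remapped to visible intensities (or colors) before viewing.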

@ChinaYi
Author

ChinaYi commented May 16, 2017

@yeshenlin Thanks, I've tried this, but it did not work; it turns the all-black PNG into an all-blue one.

@shekkizh
Owner

@ChinaYi Is it possible that the model hasn't been trained for enough epochs?

@ChinaYi
Author

ChinaYi commented May 21, 2017

@shekkizh Thanks for your help. My iteration count is set to 1e+5, and the best valid_loss is 0.566078. The results are similar to the ground truth, but not as good as those in your report. Are there any possible reasons?


@ChinaYi ChinaYi closed this as completed May 24, 2017
@placentian

Hi @ChinaYi: I am experiencing the same problem you previously had. After training for 100K (default) iterations, the prediction image is just black. Did you do anything special to fix this issue?

Step: 99800, Train_loss:3.27614
Step: 99810, Train_loss:4.61153
Step: 99820, Train_loss:3.18354
Step: 99830, Train_loss:2.48784
Step: 99840, Train_loss:2.74431
Step: 99850, Train_loss:3.29662
Step: 99860, Train_loss:3.19142
Step: 99870, Train_loss:4.58435
Step: 99880, Train_loss:2.86515
Step: 99890, Train_loss:3.80537
Step: 99900, Train_loss:2.84969
Step: 99910, Train_loss:3.47618
Step: 99920, Train_loss:3.40244
Step: 99930, Train_loss:3.72526
Step: 99940, Train_loss:3.00145
Step: 99950, Train_loss:3.51405
Step: 99960, Train_loss:2.82784
Step: 99970, Train_loss:2.75893
Step: 99980, Train_loss:2.42421
Step: 99990, Train_loss:3.21162
Step: 100000, Train_loss:2.66081

@ChinaYi
Author

ChinaYi commented May 26, 2017

@placentian It seems that your loss is still fluctuating across iterations and hasn't bottomed out, which is not a good sign for the training process. You'd better enlarge the batch size and try reducing the learning rate. I suffered from the same issue as you: you get a black picture because your loss is still large. Cheer up!
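For reference, the batch size and learning rate can be passed on the command line; a hypothetical invocation (the flag names here are assumptions, so check the flags defined at the top of FCN.py in your checkout):

```shell
# Assumed flag names: --batch_size and --learning_rate may differ in your version.
python FCN.py --mode=train --batch_size=16 --learning_rate=1e-5
```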

@placentian

Thanks a lot for your encouragement, @ChinaYi. Do I also need to change the iteration count to some other number (like 200K or 500K)?

What kind of reasonable learning rate and batch size would you recommend by any chance?

@placentian

I got better results after reducing the learning rate, but they are not as good as @shekkizh's results. I don't know what else we should try.

@ChinaYi
Author

ChinaYi commented Jun 1, 2017

@placentian Can you plot your loss curve, or paste the Train_loss for each iteration?
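A quick way to get that curve is to parse the "Step: N, Train_loss:X" console lines pasted earlier in this thread; a hypothetical helper, assuming that exact log format:

```python
import re

# A few lines copied from the log above; in practice, read the full console output.
log = """Step: 99980, Train_loss:2.42421
Step: 99990, Train_loss:3.21162
Step: 100000, Train_loss:2.66081"""

steps, losses = [], []
for m in re.finditer(r"Step: (\d+), Train_loss:([\d.]+)", log):
    steps.append(int(m.group(1)))
    losses.append(float(m.group(2)))

print(steps, losses)
# With matplotlib installed: plt.plot(steps, losses); plt.savefig("loss.png")
```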

@CLegendBuptSun

Maybe I can explain why we got black outputs.
When I ran the code on 1 Titan Xp GPU (12 GB memory), the original code's performance was good (batch_size=2, lr=1e-4, itr=30000).
When I ran the code on 2 Titan Xp GPUs (24 GB memory), the original code's performance was terrible with the same settings (batch_size=2, lr=1e-4, itr=30000). But after I changed batch_size to 8 and used lr=1e-4 for itr<20000 and lr=1e-5 for itr>20000, the performance was better than on 1 GPU.
So your machine configuration may affect your results.
Try changing these basic arguments, and you should get a satisfying output.
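The step schedule described above can be sketched as a plain Python function (a sketch only; wiring it into the repo's optimizer call is left to the reader):

```python
def learning_rate(itr):
    """Step decay as described above: 1e-4 for the first 20000 iterations, then 1e-5."""
    return 1e-4 if itr < 20000 else 1e-5

for itr in (0, 19999, 20000, 30000):
    print(itr, learning_rate(itr))
```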
