train loss #5
Comments
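Hey, what did your loss converge to at the end of training? When I trained my own model, the loss converged to 0.17, which is probably not normal, right? |
@seeyouagain111 Hey, how many iterations can you get through in 12 hours of training? I only manage a little over 10,000. How did the author train 41,000 iterations in 13 hours? I am also using two 1080 Ti cards. |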
|
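@seeyouagain111 Could you please take a look at this for me? Thanks! |
Maybe you set the number of dataloader workers too low. Typical values are 4, 8, or 16. |
I didn't change anything; I pulled the code and ran it directly. 41,000 iterations take about 26 hours with two 1080 Ti cards, batch_size_per_gpu=4, input_size=(768,768). |
Hey, how did you train it? After I pull the code and submit the script to run, I keep getting the error RuntimeError: Ninja is required to load C++ extensions. |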
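The speeds quoted in this thread are easier to compare as seconds per iteration. A minimal sketch of that arithmetic, using only the figures reported here (41,000 iterations in 13 hours for the author, 41,000 in about 26 hours and roughly 10,000 in 12 hours for the commenters) and assuming 2 GPUs with batch_size_per_gpu=4; the helper name is made up for illustration:

```python
def seconds_per_iteration(iterations: int, hours: float) -> float:
    """Average wall-clock seconds spent on one training iteration."""
    return hours * 3600 / iterations


# Figures reported in the thread; the 10,000 is "a little over", so approximate.
runs = {
    "author (41000 it / 13 h)": (41000, 13),
    "commenter (41000 it / 26 h)": (41000, 26),
    "commenter (~10000 it / 12 h)": (10000, 12),
}

# Assumption: 2 GPUs x batch_size_per_gpu=4 images consumed per iteration.
images_per_iteration = 2 * 4

for name, (iterations, hours) in runs.items():
    sec = seconds_per_iteration(iterations, hours)
    print(f"{name}: {sec:.2f} s/it, {images_per_iteration / sec:.1f} images/s")
```

That works out to roughly 1.1, 2.3 and 4.3 seconds per iteration respectively, so the slowest run is about four times slower than the author's on nominally the same hardware, which is consistent with a data-loading bottleneck rather than a GPU limit.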
You have to build Ninja first. |
Could you please explain in a bit more detail? I am running the code on my school's server. I also tried to install ninja with pip install ninja, but it still does not work. I highly appreciate your help. |
As its introduction states, Ninja is a low-level build system (the documentation likens it to an assembler), so you can't simply use pip to "install" it. Instead, you can download the binary file from |
Do you mean to put this line "export PATH=$PATH:/path/to/ninja_folder" in your job submission script file? |
Many thanks to you. |
Yes, before you run the python command. That's what I do if running on a cluster. |
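Whether the export took effect can also be checked from inside Python before training starts. A minimal sketch, assuming the extension loader locates ninja through PATH (which the advice above implies); the helper names are hypothetical and "/path/to/ninja_folder" is the placeholder from this thread, not a real location:

```python
import os
import shutil


def prepend_to_path(directory: str) -> None:
    """Put `directory` at the front of PATH for this process and its children."""
    current = os.environ.get("PATH", "")
    os.environ["PATH"] = directory + os.pathsep + current if current else directory


def find_ninja():
    """Return the full path of the `ninja` executable, or None if it is not on PATH."""
    return shutil.which("ninja")


if __name__ == "__main__":
    # Placeholder directory taken from the thread; replace with the real folder.
    prepend_to_path("/path/to/ninja_folder")
    print("ninja found at:", find_ninja())
```

If this prints None, the binary is either not in that folder or not marked executable, and changing the job script alone will not help.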
I unzipped ninja-linux.zip in the folder containing the code, and then added the folder containing the ninja binary to PATH by putting the line "export PATH=$PATH:/path/to/ninja_folder" in the job submission script. But it still does not work. |
So instead of doing |
This error is caused by not having the numba library installed. |
Thank you so much for all your help. |
This is probably because your gcc version is old. Maybe update gcc and try again? |
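One way to see which gcc the job actually picks up is to query it from Python. A rough sketch; the function names are hypothetical, and the minimum major version of 5 is an assumption for illustration only, so check the requirement of the PyTorch version in use:

```python
import re
import subprocess


def parse_version(text: str):
    """Extract the first dotted version number from `text` as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def gcc_version(executable: str = "gcc"):
    """Ask the compiler for its version via `-dumpversion`."""
    output = subprocess.check_output([executable, "-dumpversion"], text=True)
    return parse_version(output)


if __name__ == "__main__":
    required_major = 5  # assumed threshold, not taken from the thread
    try:
        found = gcc_version()
    except (OSError, subprocess.CalledProcessError) as exc:
        print("could not run gcc:", exc)
    else:
        status = "ok" if found[0] >= required_major else "older than assumed minimum"
        print("gcc", ".".join(map(str, found)), status)
```

On a cluster, running this inside the submitted job matters more than on the login node, since the two can expose different compilers.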
Still not working. Thank you so much. By the way, what mean IoU can you achieve on the validation set? |
Also, could you please share your steps to get the model running? Thank you so much. |
I can achieve performance on par with what the author claims, which is ~80.5. Running the code is pretty straightforward once the environment is set up correctly (which has probably been the annoying part for you). |
Thank you so much for your time and help. Will let you know if I set the environment correctly and have the program running. Have a good day. |
I finally have the code running. First, I used the method described at https://www.jianshu.com/p/d118615c1943 to build ninja. Then I added the line "set path = ($path /somepath/DeepLabv3plus_cityscape/ninja)" before the python command. In a bash shell you can use export PATH=$PATH:/somepath/Deeplabv3plus_cityscape/, but that line does not work in a csh script (i.e., my job submission script). Lastly, thank you so much for all your time and help. |
@pgu-nd |
|
@pgu-nd Hello, I would like to ask why, when I run the command $ CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 evaluate.py,
What was your final training loss: 0.1xxx or 0.01xxxx?