Snapshot generated after training not generating same results as provided snapshot #8
Update: With PyTorch 1.0 and a recent NMR: as can be seen, tri_loss starts off the same in both cases, but in the PyTorch 1.0 run it does not decrease, whereas in the PyTorch 0.3 run it drops from 0.205 to 0.058 within 20 iterations. In the PyTorch 1.0 case it even seems to increase slightly. I trained further and observed that tri_loss rose to around 0.3 by the 3rd epoch.
Hi, if that is not feasible, you could look at the commits to rasterize.py in the NMR repository and selectively undo the changes in one or more of those commits to try to get behavior consistent with ours. I think there have only been 3 commits that changed rasterize.py since the version we used (see the history here), and undoing those changes might help.
I undid the 2 commits right after version 1.1.0; the most recent commit is for compatibility with Chainer 5.0. I will try to install version 1.1.0 of NMR. It did not work before, but let's see.
Update: I tried version 1.1.0 of NMR with CuPy 2.3 and Chainer 3.3.0. It did not work.
It works on a different GPU (K80). The exact versions required can't be installed on a Tesla V100.
@BornInWater Hello, could you tell me more details about how you trained this? Did you use it on cars?
Hi,
When I tested the model using the provided snapshot, I got the result below.
But when I trained the model following the provided instructions, the resulting snapshot is not able to generate similar output.
I also checked snapshots from later in training; they give output similar to the above.
The only thing I have changed is that I ported the code to PyTorch 1.0, which required one small change:
replacing expressions like self.total_loss.data[0] with self.total_loss.item(). I am sure this is not the reason, but I mention it for completeness.
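For reference, this is the kind of replacement I mean (a minimal sketch; total_loss below is just a placeholder scalar tensor, not the actual training loss from the repo):

```python
import torch

# Stand-in for a scalar loss produced during training.
total_loss = torch.tensor(0.205, requires_grad=True)

# PyTorch 0.3 style read the scalar value with .data[0].
# On PyTorch 1.0 indexing a 0-dim tensor raises an error, so .item() is used instead.
# old: value = total_loss.data[0]
value = total_loss.item()  # plain Python float, detached from the autograd graph

print(value)
```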
Could you suggest what I might be doing wrong?
EDIT:
The losses are much higher than those in the loss_log I got when I downloaded the snapshot.