Pytorch warning on lr_scheduler.step() and PIL Image error on Dataloader #3

Closed
ruili3 opened this issue Aug 16, 2021 · 4 comments

@ruili3

ruili3 commented Aug 16, 2021

Hello,

Thanks for the impressive work! I've cloned the code and set up the environment as required (PyTorch 1.7.1, torchvision 0.8.2). When running the code on KITTI (w/o depth hints), I encountered a few problems.

  1. There is a warning: 'UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule.' Considering that I use torch 1.7.1, do I need to change the order in run_epoch.py as the warning suggests?

  2. An error occurs at line 231 of mono_dataset.py: inputs[("depth_gt", scale)] = self.resizescale
    'TypeError: img should be PIL Image. Got <class 'numpy.ndarray'>'
    I converted 'depth_gt' to PIL Image format and the problem was resolved. I wonder whether this error is specific to my setup, or how you handle it.

  3. There is another warning, from grid_sample(): 'UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.' Do I need to change the 'align_corners' parameter, or just leave it unchanged?
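For reference, the ordering that warning 1 asks for looks like this (a minimal sketch, not the repo's actual run_epoch.py; the model and optimizer setup here are illustrative):

```python
import torch

# Toy setup standing in for the repo's trainer.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)

for epoch in range(3):
    # ... forward pass and loss.backward() would go here ...
    optimizer.step()   # update weights first (required since PyTorch 1.1.0)
    scheduler.step()   # then advance the learning-rate schedule

print(scheduler.get_last_lr())  # lr halves once per epoch
```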

I wonder how you handled these issues in your implementation. Thanks a lot!
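For reference, the workaround for item 2 amounts to roughly the following sketch (the shapes and resize mode are illustrative, not the exact mono_dataset.py code):

```python
import numpy as np
from PIL import Image

# Toy float depth map standing in for the KITTI ground truth.
depth_gt = np.random.rand(375, 1242).astype(np.float32)

# torchvision 0.8's transforms expect a PIL Image, so convert first.
depth_img = Image.fromarray(depth_gt, mode="F")       # 32-bit float image
resized = depth_img.resize((640, 192), Image.NEAREST)  # (width, height)
depth_resized = np.asarray(resized)                    # back to numpy
```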

@MichaelRamamonjisoa
Collaborator

Hello,
Thanks for your interest :)
Regarding the two warnings (1 and 3), you can ignore them, as training worked fine without changing those.
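If you'd rather silence warning 3 explicitly, passing align_corners=True reproduces the pre-1.3 default behavior. A minimal sketch (not the repo's exact warping code; the identity transform here is just for illustration):

```python
import torch
import torch.nn.functional as F

img = torch.rand(1, 3, 8, 8)

# Identity transform: with align_corners=True the grid samples exactly
# at the input pixel centres, so the output equals the input.
theta = torch.tensor([[[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]]])
grid = F.affine_grid(theta, list(img.shape), align_corners=True)
out = F.grid_sample(img, grid, align_corners=True)
```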
Regarding the dataloading error, I cannot reproduce it on my machine. Could you please send me the full command line you use to train waveletmonodepth?
I suspect this has to do with the fact that I do not have the "velodyne_points/data/{:010d}.bin" in my KITTI folder:

"velodyne_points/data/{:010d}.bin".format(int(frame_index)))

A quick fix would be to make check_depth return False. Let me know if that works for you; if so, I will update the code.
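The quick fix amounts to something like this (a sketch; the class name and surrounding details follow the monodepth2-style dataloader this repo builds on, so treat them as assumptions):

```python
class KITTIDataset:
    """Stand-in for the repo's KITTI dataset class."""

    def check_depth(self):
        # Originally this checks whether velodyne_points/data/*.bin files
        # exist and enables depth_gt loading; returning False
        # unconditionally skips the LiDAR ground-truth branch entirely.
        return False
```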

@ruili3
Author

ruili3 commented Aug 17, 2021

Hi Michael,

The full command line is:
python train.py --data_path <data_path> --log_dir <log_path> --encoder_type resnet --num_layers 50 --width 640 --height 192 --model_name wavelet_S_HR_DH --frame_ids 0 --use_stereo --split eigen_full --use_wavelets

Yes, I do have "velodyne_points/data/{:010d}.bin" in my folder. When I force check_depth() to return False, the pipeline works fine, but I don't know whether this setting will affect other parts of training. Thanks for your reply :D

@MichaelRamamonjisoa
Collaborator

Hi,

Thanks for the command line!

Yes, the bug comes from there. I removed parts of the original code base that used ground-truth depth for train-time depth evaluation. In the original code, it is only used to monitor depth prediction performance while the network is trained with the reprojection loss (+ regularization). In the end, monitoring that loss was enough for me to check that training was going well.

I will update the dataloader accordingly. I'll leave this issue open until I do, but you should be good to go with the quick fix we discussed.

Thanks!

@MichaelRamamonjisoa
Collaborator

Fixed with commit 052c09e.
