FutureWarning and UserWarning #73
Hello!
This means the model is probably unstable and some values are becoming Inf. Normally gradient clipping will clamp those values to the defined maximum, but it looks like this behavior is changing. You can try lowering the learning rate a bit to see if the error goes away; otherwise, there might be some other configuration causing these values to become infinite.
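For reference, a minimal sketch of what that advice looks like in plain PyTorch: a lowered learning rate and a clipping call with `error_if_nonfinite=False` passed explicitly, which is the clamp-and-continue behavior the warning refers to. The model, optimizer, and `max_norm` value below are placeholders, not traiNNer's actual `grad_clip` code:

```python
import torch

model = torch.nn.Linear(10, 10)                              # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)    # try a lower lr if values blow up

loss = model(torch.randn(4, 10)).sum()
loss.backward()

# Clamp the total gradient norm; error_if_nonfinite=False keeps the
# "warn and continue" behavior instead of raising on an inf/NaN norm.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0, error_if_nonfinite=False)
optimizer.step()
```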
This is normal when using autocasting, no problem.
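To illustrate why this is expected with autocasting: in a mixed-precision loop the GradScaler can skip `optimizer.step()` on iterations where it detects inf/NaN gradients, so the scheduler warning can appear even when the calls are in the correct order. A minimal sketch of a standard AMP loop with placeholder model, data, and scheduler (not this repo's training code):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"

model = torch.nn.Linear(10, 10).to(device)                   # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1000)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

# Dummy batches standing in for a real dataloader.
loader = [(torch.randn(4, 10, device=device), torch.randn(4, 10, device=device)) for _ in range(3)]

for data, target in loader:
    optimizer.zero_grad()
    with torch.cuda.amp.autocast(enabled=use_amp):
        loss = torch.nn.functional.mse_loss(model(data), target)
    scaler.scale(loss).backward()
    scaler.step(optimizer)   # skips optimizer.step() when inf/NaN gradients are found
    scaler.update()
    scheduler.step()         # so the warning can still fire despite the correct order
```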
How long is an epoch taking? Regarding the GPU utilization in Windows task manager, please check: #72
An epoch takes 250-400 seconds (I don't know whether that is long, but it feels long to me). Could you tell me whether the video card is being used normally? The uneven utilization graph is surprising (the results from GPU-Z and from the Windows Task Manager are very different). UPD: My bad, I hadn't checked issue #72. How do I enable CUDA monitoring in the Windows Task Manager?
How many images are you using for training? The time to complete a full epoch depends on how many images you have. More concretely, one epoch means that all the images have been used once, and the dataloader then has to load all of them again for the next epoch.
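In other words, epoch time scales directly with dataset size; a rough sketch of the relationship, using a hypothetical dataset rather than the poster's actual data:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1000, 3, 64, 64))   # 1000 hypothetical training images
loader = DataLoader(dataset, batch_size=16, shuffle=True)

for epoch in range(2):
    for (batch,) in loader:   # one epoch = one full pass over all 1000 images
        pass                  # training step would go here
```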
```
D:\traiNNer\codes\models\base_model.py:921: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
  self.grad_clip(

C:\Python39\lib\site-packages\torch\optim\lr_scheduler.py:129: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  warnings.warn("Detected call of `lr_scheduler.step()` before `optimizer.step()`. "
```

How do I fix this?