Gradients blowing up when F.dropout is used with Conv1D #18169
Labels: awaiting response (deprecated), module: nn
🐛 Bug
The gradients of the Conv1D layer blow up when dropout is applied after the convolution. I noticed this after migrating my code. The problem occurs on PyTorch 1.0.0 and 1.0.1.
The problem does not occur if:
To Reproduce
Steps to reproduce the behavior:
Conv class:
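The original snippet was not preserved in this export, so the following is a minimal sketch of what the convolution block likely looked like. All names and hyperparameters here (ConvBlock, the kernel size, p=0.5) are assumptions for illustration; the key detail from the report is F.dropout applied directly after nn.Conv1d.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvBlock(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3, dropout=False):
        super(ConvBlock, self).__init__()
        # "same" padding so the sequence length is preserved
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size,
                              padding=kernel_size // 2)
        self.dropout = dropout

    def forward(self, x):
        x = F.relu(self.conv(x))
        if self.dropout:
            # F.dropout after the convolution -- the combination that
            # triggers the reported gradient blow-up.
            x = F.dropout(x, p=0.5, training=self.training)
        return x
```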
Model class:
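Again a hedged reconstruction: a small model stacking the ConvBlocks and finishing with fully connected layers (the "fcX" layers mentioned under Expected behavior). The attribute name x mirrors the "CNN.x" referenced below; all layer sizes are illustrative assumptions.

```python
class CNN(nn.Module):
    def __init__(self, dropout=False):
        super(CNN, self).__init__()
        # self.x is the block whose gradients are inspected below ("CNN.x")
        self.x = ConvBlock(1, 16, dropout=dropout)
        self.conv2 = ConvBlock(16, 32, dropout=dropout)
        self.fc1 = nn.Linear(32 * 100, 64)  # assumes input length 100
        self.fc2 = nn.Linear(64, 10)

    def forward(self, inp):
        out = self.conv2(self.x(inp))
        out = out.view(out.size(0), -1)
        out = F.relu(self.fc1(out))
        return self.fc2(out)
```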
Printing out grad norms:
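A sketch of the reproduction loop, assuming the classes above: run one backward pass per configuration and print per-parameter gradient norms to compare dropout=False against dropout=True. The batch size, input shape, and loss are assumptions.

```python
for use_dropout in (False, True):
    torch.manual_seed(0)
    model = CNN(dropout=use_dropout)
    inp = torch.randn(8, 1, 100)
    target = torch.randint(0, 10, (8,))
    loss = F.cross_entropy(model(inp), target)
    loss.backward()
    print('dropout=%s' % use_dropout)
    for name, p in model.named_parameters():
        print('  %s grad norm: %.4f' % (name, p.grad.norm().item()))
```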
Expected behavior
With dropout=False for the ConvBlocks, the gradients for CNN.x look okay.
With dropout=True for the ConvBlocks, the gradients for CNN.x look a lot bigger. (Note: I tried setting dropout=True for the "fcX" (fully connected) layers, but the gradients did not blow up in the same way.)
Environment
PyTorch version: 1.0.1.post2
Is debug build: No
CUDA used to build PyTorch: 9.0.176
OS: Ubuntu 16.04.5 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
CMake version: version 3.5.1
Python version: 2.7
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: TITAN Xp
GPU 1: Quadro P400
Nvidia driver version: 384.130
cuDNN version: Could not collect
Versions of relevant libraries:
[pip] numpy==1.15.4
[pip] torch==1.0.1.post2
[pip] torchvision==0.2.2
[conda] blas 1.0 mkl
[conda] mkl 2019.1 144
[conda] mkl_fft 1.0.6 py27hd81dba3_0
[conda] mkl_random 1.0.2 py27hd81dba3_0
[conda] pytorch 1.0.1 py2.7_cuda9.0.176_cudnn7.4.2_2 pytorch
[conda] torchvision 0.2.2 py_3 pytorch
Additional context