Only CUDA tensors are supported for cudnn.BatchNormalization! #219
Comments
You can have nn modules running on the GPU, and you can mix nn and cudnn modules, but in that case they all need to be on the GPU.
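(A minimal sketch of that point; the layers and sizes here are illustrative, not from this issue: mixing cudnn and plain nn modules is fine as long as the whole network is moved to the GPU.)

```lua
require 'cunn'
require 'cudnn'

local net = nn.Sequential()
net:add(cudnn.SpatialConvolution(3, 16, 3, 3, 1, 1, 1, 1)) -- cudnn module
net:add(nn.ReLU())                                          -- plain nn module
net:cuda()                                                  -- both kinds must live on the GPU

local y = net:forward(torch.rand(2, 3, 8, 8):cuda())
```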
@fmassa Yes, I did this. When I use cudnn.convert(model, cudnn, function(module) return torch.type(module):find('SpatialBatchNormalization') end) to exclude batch normalization from the conversion, it works.
Could you write a small working example that illustrates the issue?
Maybe nn.Identity()() caused the problem.

```lua
require 'nngraph'
require 'cutorch'
require 'cunn'
require 'cudnn'

local input = nn.Identity()()

local features = nn.Sequential()
features:add(nn.SpatialConvolution(1,8,5,5,1,1,2,2))
features:add(nn.Abs())
features:add(nn.SpatialBatchNormalization(8,nil,nil,false))
features:add(nn.Tanh())
features:add(nn.SpatialAveragePooling(5,5,2,2,2,2))
features:add(nn.SpatialConvolution(8,1,5,5,1,1,2,2))
features:add(nn.Tanh())
features:add(nn.SpatialAveragePooling(5,5,2,2,2,2))

local classifier = nn.Sequential()
classifier:add(nn.View(64*1*1))
classifier:add(nn.Linear(64, 2))
classifier:add(nn.LogSoftMax())

local model = nn.gModule({input}, {classifier(features(input))})
local x = torch.rand(3, 1, 32, 32):type('torch.CudaTensor')
model:cuda()
cudnn.convert(model, cudnn)
--cudnn.convert(model, cudnn, function(module) return torch.type(module):find('SpatialBatchNormalization') end)
local y = model:forward(x)
print(y)
```
Hi all, the same network was working before I updated to cuDNN v5.
@iN1k1 that is quite strange. During training, did you have a batch size of 1? |
Nope, but I discovered that it is something related to threads. I started my coding on top of your imagenet-multiGPU.torch example. If I run the procedure with a single thread, then everything works. But if I increase the number of threads, then I'm not only getting…
@byronwwang The problem seems to come from converting an nn batch-norm layer to the cudnn type. Because the nn layer here was initialized without weight and bias (the fourth constructor argument, affine, is false), the assertion that checks the weight/bias type fails after conversion to the cudnn layer.
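(If that diagnosis is right, two workarounds follow. This is a sketch reusing the names from the repro above, not a verified fix:)

```lua
-- (a) Create the layer with affine weight/bias (the default), so the
--     converted cudnn layer has the tensors its type check asserts on:
features:add(nn.SpatialBatchNormalization(8))  -- instead of (8, nil, nil, false)

-- (b) Or leave batch norm as an nn module and convert everything else,
--     i.e. the commented-out cudnn.convert call in the repro:
cudnn.convert(model, cudnn, function(module)
   return torch.type(module):find('SpatialBatchNormalization')
end)
```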
The layer preceding SpatialBatchNormalization is nn.Abs, which cannot be converted to cudnn.Abs(). Does this cause the problem?