New error (sadly) #981

holgafreak · 2016-09-27T03:54:49Z

Got further, but now this one:

In 1 module of nn.Sequential:
/home/xxx/torch/install/share/lua/5.1/nn/THNN.lua:110: bad argument #3 to 'v' (cannot convert 'struct THCudaTensor *' to 'struct THFloatTensor *')
stack traceback:
[C]: in function 'v'
/home/xxx/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'SpatialConvolutionMM_updateOutput'
...in/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:96: in function <...in/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:92>

yesterday didn't have this error either

-m

soumith · 2016-09-27T03:56:09Z

have you made sure that your model is typecasted to CUDA (or that your input is Float?)

holgafreak · 2016-09-27T04:05:01Z

yes it is

supakjk · 2016-09-27T21:30:54Z

Same problem for ClassNLLCriterion after the update. (It worked fine until yesterday.)
luajit: /xxxxx/torch/install/share/lua/5.1/nn/THNN.lua:110: bad argument #3 to 'v' (cannot convert 'struct THCudaTensor *' to 'struct THCudaLongTensor *')
stack traceback:
[C]: in function 'v'
/xxxxx/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'ClassNLLCriterion_updateOutput'
.../torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:41: in function 'updateOutput'
...torch/install/share/lua/5.1/nn/CrossEntropyCriterion.lua:13: in function 'forward'
xxxxx.lua:99: in main chunk
[C]: at 0x00405bb0

soumith · 2016-09-27T21:34:42Z

@supakjk update nn and cunn both.

soumith · 2016-09-27T21:34:54Z

@holgafreak can you give a small test case for this?

holgafreak · 2016-09-27T21:54:35Z

@soumith after updating nn and cunn everything goes ok with my code.
But for some reason the labels go wild, and I'm getting an assertion like cur_target > 0 and cur_target <= n_classes in the v function. (I'm quoting that from memory; the exact error is on the Linux machine.) The labels in my code should just be 1 and 2, and they are not changed anywhere. When I print them, the values are ints, but mostly >10000.
I tried on OS X, and here's one from the demos' train-on-cifar:
qlua: /Users/mjkoskin/torch-cl/install/share/lua/5.1/nn/THNN.lua:807: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at /Users/mjkoskin/torch-cl/extra/nn/lib/THNN/generic/ClassNLLCriterion.c:31

supakjk · 2016-09-27T23:28:52Z

Even after updating nn and cunn, mine still doesn't work.

require 'nn'
torch.setdefaulttensortype('torch.FloatTensor')
crit = nn.CrossEntropyCriterion()
input = torch.Tensor():rand(5)
label = 3
crit:forward(input, label)

The CPU version works without problem but the following code (CUDA version) produces the error message I mentioned above.

require 'cunn'
torch.setdefaulttensortype('torch.FloatTensor')
crit = nn.CrossEntropyCriterion():cuda()
input = torch.CudaTensor():randn(5)
label = 3
crit:forward(input, label)

1byxero · 2016-09-30T07:33:35Z

I was following this Torch tutorial and trying to run the CIFAR-10 CNN given on the page, and I encountered a similar error. Please help.
`/home/cuda/torch/install/share/lua/5.1/nn/THNN.lua:110: bad argument #3 to 'v' (cannot convert 'struct THCudaTensor *' to 'struct THCudaLongTensor *')

stack traceback:
[C]: in function 'v'
/home/cuda/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'ClassNLLCriterion_updateOutput'
...uda/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:41: in function 'forward'
...da/torch/install/share/lua/5.1/nn/StochasticGradient.lua:35: in function 'train'
[string "_RESULT={trainer:train(trainset)}"]:1: in main chunk
[C]: in function 'xpcall'
/home/cuda/torch/install/share/lua/5.1/trepl/init.lua:652: in function 'repl'
...cuda/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
[C]: at 0x00406670 `

supakjk · 2016-09-30T07:40:41Z

The problem seems to be that the target variable is not correctly set to be a CudaLongTensor.
As a temporary solution, I did something like the following. I hope there will be an update or some comments regarding this issue.
local theCrit = nn.CrossEntropyCriterion():cuda()
theCrit.nll.target = torch.CudaLongTensor{theCrit.nll.target[1]}

GenTxt · 2016-09-30T16:00:12Z

Hi supakjk:

I have the same error and torch/lua is new to me. Could you explain where I add the code for your temporary solution?

Is it in the module(s) listed in the error or the lua script I'm trying to run, or both?

An example would be appreciated.

Thanks

Error:

/home/gentxt/torch/install/share/lua/5.1/nn/THNN.lua:110: bad argument #3 to 'v' (cannot convert 'struct THCudaTensor *' to 'struct THCudaLongTensor *')
stack traceback:
[C]: in function 'v'
/home/gentxt/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'ClassNLLCriterion_updateOutput'
...txt/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:41: in function 'forward'
sample.lua:182: in function 'sample'
sample.lua:236: in main chunk
[C]: in function 'dofile'
...usr/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405e40

Torch/lua works fine with torch-rnn, https://github.com/karpathy/char-rnn etc. but doesn't work with the current code I'm trying to run. I've updated everything and the error continues.

supakjk · 2016-09-30T19:18:22Z

I mean, when you get an instance of the CUDA version of ClassNLLCriterion, change the target field of that instance to be a CudaLongTensor (it is probably a CudaTensor initially).
I think it should be automatically done but the current source code missed that part.

soumith · 2016-09-30T19:29:10Z

it is automatically done in the source code: https://github.com/torch/nn/blob/master/ClassNLLCriterion.lua#L36

soumith · 2016-09-30T19:29:55Z

actually, i realized that i missed the non-batch case. I'm fixing it now.

soumith · 2016-10-01T22:25:30Z

this should be fixed now in master, and reinstalling the "nn" package will make it go away.

5ade793

luarocks install nn

dkbemisIII · 2016-10-02T06:04:26Z

This still breaks for me, after updating. The code posted by @supakjk above gives the same error. torch.cudaLong is nil for me. Do you mean:

self.target = target.cudaLong and self.target:cudaLong() or self.target:cuda()

supakjk · 2016-10-02T06:05:48Z

It should be torch.CudaLong (not torch.cudaLong.)

dkbemisIII · 2016-10-02T06:07:07Z

Don't think so. target.cudaLong checks to make sure the conversion function is present for the tensor. Possibly 'torch.CudaLongTensor' could serve the same purpose, but indirectly. torch.CudaLong is still nil.

supakjk · 2016-10-02T06:08:59Z

My mistake. I mean torch.CudaLongTensor.
After updating all the related ones (nn,cunn,torch,cutorch,etc), my test code above worked fine.

dkbemisIII · 2016-10-02T16:41:37Z

That's a little surprising. There was a first fix that should have worked, but broke compatibility when there was no CudaLong. The next update seems to have rebroken the initial issue (at least for me), because of the typo.

If you update nn, it works for you?

soumith · 2016-10-02T16:43:04Z

there was my patch which was the attempted fix. then @mys007 sent a fix for back-compat, but it was broken. i then pushed another fix on top of his fix that keeps back-compat and works for master too.

soumith · 2016-10-02T16:43:27Z

if you now update nn, it should (fingers-crossed) work for you

dkbemisIII · 2016-10-02T17:06:49Z

Seems good now. Thanks.

uahsan3 · 2016-10-04T02:06:44Z

When I rebuild cunn, I get the following:

[ 14%] Building NVCC (Device) object lib/THCUNN/CMakeFiles/THCUNN.dir/THCUNN_generated_SpatialDilatedConvolution.cu.o
Building NVCC (Device) object lib/THCUNN/CMakeFiles/THCUNN.dir/THCUNN_generated_AbsCriterion.cu.o
/tmp/luarocks_cunn-scm-1-258/cunn/lib/THCUNN/RReLU.cu(68): error: identifier "THCRandom_generatorStates" is undefined

1 error detected in the compilation of "/tmp/tmpxft_00003e0f_00000000-7_RReLU.cpp1.ii".
CMake Error at THCUNN_generated_RReLU.cu.o.cmake:267 (message):
Error generating file
/tmp/luarocks_cunn-scm-1-258/cunn/build/lib/THCUNN/CMakeFiles/THCUNN.dir//./THCUNN_generated_RReLU.cu.o

make[2]: *** [lib/THCUNN/CMakeFiles/THCUNN.dir/THCUNN_generated_RReLU.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [lib/THCUNN/CMakeFiles/THCUNN.dir/all] Error 2
make: *** [all] Error 2

Error: Build error: Failed building.

Any suggestions?

soumith · 2016-10-04T03:31:02Z

@uahsan3 this comes because of an outdated cutorch version.
luarocks install cutorch
luarocks install cunn
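
The order matters because cunn builds against cutorch. A minimal shell sketch of that update sequence, assuming luarocks is on the PATH; the function name update_torch_rocks and the LUAROCKS override variable are hypothetical helpers for illustration, not part of luarocks:

```shell
#!/bin/sh
# Reinstall the given rocks in the order listed, stopping at the first failure.
# LUAROCKS is a hypothetical override for the command name
# (for example LUAROCKS=echo gives a dry run that only prints the steps).
update_torch_rocks() {
    for rock in "$@"; do
        "${LUAROCKS:-luarocks}" install "$rock" || return 1
    done
}

# Usage (cutorch first, then cunn, as suggested above):
#   update_torch_rocks cutorch cunn
```

Stopping at the first failure avoids building cunn against a cutorch that did not install cleanly.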

yuvalpinter · 2016-10-17T21:02:20Z

@soumith thanks!

eriche2016 · 2016-10-20T12:03:03Z

Hi, I use the nn.LookupTable module, but came across a similar error to the one above. Can anyone fix this?

eriche2016 · 2016-10-20T12:10:52Z

My error message is below:
ome/xxx/torch/install/bin/luajit: /home/xxx/.luarocks/share/lua/5.1/nn/THNN.lua:109: bad argument #2 to 'v' (cannot convert 'struct THCudaTensor *' to 'struct THCudaLongTensor *')
stack traceback:
[C]: in function 'v'
/home/xxx/.luarocks/share/lua/5.1/nn/THNN.lua:109: in function 'LookupTable_accGradParameters'
/home/xxx/.luarocks/share/lua/5.1/nn/LookupTable.lua:85: in function 'accGradParameters'
/home/xxx/.luarocks/share/lua/5.1/nn/Module.lua:32: in function 'backward'
./misc_saver2_reg_atten_ws/LanguageModel.lua:738: in function 'updateGradInput'
/home/xxx/.luarocks/share/lua/5.1/nn/Module.lua:31: in function 'backward'
train_reg_on_att.lua:496: in function 'lossFun'
train_reg_on_att.lua:574: in main chunk
[C]: in function 'dofile'
.../hxw/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670

eriche2016 · 2016-10-20T12:14:37Z

And I have updated my torch, nn, cutorch, and cunn to the latest versions. Any idea?

fmassa · 2016-10-20T12:20:45Z

@eriche2016 pass a CudaLongTensor instead of a CudaTensor, and you should be fine.

eriche2016 · 2016-10-20T12:43:51Z

@fmassa I still get the same error when doing the backward pass. The test code is below; please check it.

th> model = nn.LookupTable(4, 3)
                                                                      [0.0001s]
th> model:cuda()
nn.LookupTable
                                                                      [0.0031s]
th> model:forward(torch.CudaLongTensor({1}))
 0.3487  1.1548 -0.7722
[torch.CudaTensor of size 1x3]

                                                                      [0.0016s]
th> model:backward(torch.CudaLongTensor{1}, torch.CudaTensor(1, 3))
/home/xxx/.luarocks/share/lua/5.1/nn/THNN.lua:109: bad argument #2 to 'v' (cannot convert 'struct THCudaTensor *' to 'struct THCudaLongTensor *')
stack traceback:
        [C]: in function 'v'
        /home/xxx/.luarocks/share/lua/5.1/nn/THNN.lua:109: in function 'LookupTable_accGradParameters'
        /home/xxx/.luarocks/share/lua/5.1/nn/LookupTable.lua:85: in function 'accGradParameters'
        /home/xxx/.luarocks/share/lua/5.1/nn/Module.lua:32: in function 'backward'
        [string "_RESULT={model:backward(torch.CudaLongTensor{..."]:1: in main chunk
        [C]: in function 'xpcall'
        /home/xxx/torch/install/share/lua/5.1/trepl/init.lua:650: in function 'repl'
        .../xxx/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
        [C]: at 0x00406670

fmassa · 2016-10-20T12:48:22Z

@eriche2016 It seems that you don't have the latest nn. Line 85 in LookupTable.lua does not match yours.

eriche2016 · 2016-10-20T12:53:23Z

@fmassa I updated my nn with the command below:
luarocks install nn
I suppose it is the latest version of nn.

fmassa · 2016-10-20T12:58:35Z

@eriche2016 Then I don't understand your error message. The distro package points latest nn to this line, which doesn't correspond to the error you are facing.
I'd check if the installation was successful, or if you had errors during compilation.

eriche2016 · 2016-10-20T12:58:45Z

I opened my LookupTable.lua file, and it is the latest. See below.

got any idea to solve this problem?

fmassa · 2016-10-20T13:01:05Z

@eriche2016 there seems to be something wrong with your setup. The error message that you show corresponds to a line of comment.

eriche2016 · 2016-10-20T13:10:11Z

Oh, I got it: the error comes from the file at
/home/xxx/.luarocks/share/lua/5.1/nn/LookupTable.lua:85
and not from the /home/xxx/torch/ folder, which contains the latest nn. I will have to fix this.
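
A stale copy like this can be spotted by listing every copy of the module file across the install trees. A minimal shell sketch, assuming the two trees are the ones quoted in this comment; the function name find_module_copies is hypothetical:

```shell
#!/bin/sh
# Print every copy of a Lua module file found under the given install roots.
# Arguments: <relative/module/path.lua> <root>...
find_module_copies() {
    rel="$1"
    shift
    for root in "$@"; do
        # Skip roots that do not exist on this machine.
        [ -d "$root" ] || continue
        find "$root" -type f -path "*/$rel" 2>/dev/null
    done
}

# Usage (roots assumed from the paths in this thread):
#   find_module_copies nn/LookupTable.lua "$HOME/.luarocks" "$HOME/torch/install"
```

If more than one path is printed, the copy named in the stack trace is the one actually being loaded, and it may be older than the one that was just updated.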

eriche2016 · 2016-10-20T13:36:57Z

@fmassa Thank you very much for your patience. Problem solved.

GenTxt mentioned this issue Sep 30, 2016

Sampling code yoonkim/lstm-char-cnn#11

Open

soumith closed this as completed Oct 1, 2016

harshithabk mentioned this issue Aug 8, 2018

Neuron trained with Human face data fails to execute in this code leehomyc/Faster-High-Res-Neural-Inpainting#39

Open
