GPU ISSUE: OneHot.lua:17: attempt to call method 'long' (a nil value) #2

mschonwe · 2015-05-21T21:47:33Z

Training proceeds fine on CPU ("-gpuid -1"), but errors as follows on GPU ("-gpuid 0"):

th train.lua -data_dir data/tinyshakespeare

using CUDA on GPU 0...
loading data files...
cutting off end of data so that the batches/sequences divide evenly
reshaping tensor...
data load done. Number of batches in train: 211, val: 11, test: 1
vocab size: 65
creating an LSTM with 2 layers
number of parameters in the model: 154165
cloning criterion
cloning softmax
cloning embed
cloning rnn
/home/username/torch/install/bin/luajit: ./util/OneHot.lua:17: attempt to call method 'long' (a nil value)
stack traceback:
   ./util/OneHot.lua:17: in function 'forward'
   train.lua:172: in function 'opfunc'
   /home/username/torch/install/share/lua/5.1/optim/rmsprop.lua:36: in function 'rmsprop'
   train.lua:226: in main chunk
   [C]: in function 'dofile'
   .../username/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
   [C]: at 0x00406640

karpathy · 2015-05-21T21:50:43Z

hmmm, I'm not exactly sure what could be causing this. What GPU do you have? Is this a recent install of Torch and its dependencies? Is CUDA up to date?

mschonwe · 2015-05-21T22:09:06Z

GPU is a GTX 980
CUDA 7.0

I ran a git pull in torch and 'update.sh' before posting the issue. I'm not sure about 'its dependencies', but I'll start tracking that down.

Most recently I've been working on the neon code / NervanaGPU, which I got running on the GPU, but maybe this introduced a conflict.

I should also say, thanks so much for posting this project!

karpathy · 2015-05-21T22:11:07Z

Can you try removing the call to long()? Or replacing it with int() or short()?

hsheil · 2015-05-21T22:15:23Z

I also got an Optim / rmsprop error (train.lua:226: attempt to call field 'rmsprop' (a nil value)) when training.

Pulling and rebuilding the latest Torch fixed this for me. I guess I was running a local build of Torch older than Mar 23rd, when rmsprop was exported via init.lua. I can confirm that the code runs fine on both GPU and CPU after I upgraded to the latest Torch.

karpathy · 2015-05-21T22:17:02Z

@hsheil awesome thank you!

mschonwe · 2015-05-21T22:38:16Z

Working now :)
Running torch/install.sh did the trick. (Looks like I also had some file permissions set as root, so I fixed those too -- perhaps this is why torch/update.sh didn't get it going.)
Thanks again.
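
For anyone hitting the same thing, the permissions check mentioned above could be sketched roughly as follows. This is only an illustration: `list_foreign_owned` is a hypothetical helper name, and `$HOME/torch` is an assumed install location based on the path in the traceback, so adjust both to your setup.

```shell
#!/bin/sh
# Hypothetical helper (not part of Torch): print every entry under a
# directory that is NOT owned by the user running this script.
list_foreign_owned() {
    dir="$1"
    # `! -user ID` matches entries whose owner differs from ID;
    # a numeric ID from `id -u` is accepted by find's -user test.
    find "$dir" ! -user "$(id -u)" -print
}

# Assumed install location; override by exporting TORCH_DIR.
TORCH_DIR="${TORCH_DIR:-$HOME/torch}"
if [ -d "$TORCH_DIR" ]; then
    list_foreign_owned "$TORCH_DIR"
fi
```

If this prints anything (for example files left behind by an earlier run as root), fixing their ownership with `chown` before re-running the install script is probably the first thing to try.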

mschonwe closed this as completed May 21, 2015

mschonwe reopened this May 21, 2015

mschonwe closed this as completed May 21, 2015

karpathy mentioned this issue May 24, 2015

Crash on shakespeare sample #8

Closed

YafahEdelman mentioned this issue Jun 3, 2015

Word Level Encodings #16

Open

enicon mentioned this issue Jun 5, 2015

error in sample.lua: bad argument #2 to '?' (invalid multinomial distribution (sum of probabilities <= 0) at /root/torch/pkg/torch/lib/TH/generic/THTensorRandom.c:109) #28

Open

akarshinfrrd mentioned this issue Aug 29, 2017

attempt to call method 'double' (a nil value) paucarre/tiefvision#64

Open
