You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi I tried to reproduce the training from this research paper. I was able to successfully train the encoder. However, I get the following error when training the decoder. Any help is appreciated.
==> Training: epoch # 1 [batchSize = 2] /home/ubuntu/torch/install/bin/luajit: bad argument #2 to '?' (sizes do not match at /home/ubuntu/torch/extra/cutorch/lib/THC/generic/THCTensorCopy.c:64) stack traceback: [C]: at 0x7ff8eee7d610 [C]: in function '__newindex' ./train.lua:97: in function 'train' run.lua:77: in main chunk [C]: in function 'dofile' ...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk [C]: at 0x00406670
Another command used to start training the decoder and its resulting error:
==> Training: epoch # 1 [batchSize = 2] /home/ubuntu/torch/install/bin/luajit: /home/ubuntu/torch/install/share/lua/5.1/nn/Container.lua:67: Step: 0ms In 2 module of nn.Sequential: /home/ubuntu/torch/install/share/lua/5.1/nn/JoinTable.lua:39: bad argument #1 to 'copy' (sizes do not match at /home/ubuntu/torch/extra/cutorch/lib/THC/generic/THCTensorCopy.cu:10) stack traceback: [C]: in function 'copy' /home/ubuntu/torch/install/share/lua/5.1/nn/JoinTable.lua:39: in function </home/ubuntu/torch/install/share/lua/5.1/nn/JoinTable.lua:21> [C]: in function 'xpcall' /home/ubuntu/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors' /home/ubuntu/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward' ./train.lua:108: in function 'opfunc' /home/ubuntu/torch/install/share/lua/5.1/optim/adam.lua:33: in function 'adam' ./train.lua:123: in function 'train' run.lua:77: in main chunk [C]: in function 'dofile' ...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk [C]: at 0x00406670
The text was updated successfully, but these errors were encountered:
You are using same cachepath for encoder and decoder in the first approach. Try using a different location of datacache. Currently its using the dataset which it saved while training encoder.
Image and label resolution needs to be the same for decoder, so second approach is incorrect.
Hi I tried to reproduce the training from this research paper. I was able to successfully train the encoder. However, I get the following error when training the decoder. Any help is appreciated.
Command used to train the encoder
th run.lua --dataset cv --datapath /home/ubuntu/SegNet-Tutorial/CamVid/ --model models/encoder.lua --save /home/ubuntu/ENet-training/train/trained_model/ --imHeight 360 --imWidth 480 --labelHeight 45 --labelWidth 60 --cachepath /home/ubuntu/ENet-training/train/dataset_cache/
Command used to start training the decoder and its resulting error:
th run.lua --dataset cv --datapath /home/ubuntu/SegNet-Tutorial/CamVid/ --model models/decoder.lua --save /home/ubuntu/ENet-training/train/trained_decoder/ --imHeight 360 --imWidth 480 --labelHeight 360 --labelWidth 480 --cachepath /home/ubuntu/ENet-training/train/dataset_cache/ --CNNEncoder /home/ubuntu/ENet-training/train/trained_model/model-299.net
Error:
==> Training: epoch # 1 [batchSize = 2] /home/ubuntu/torch/install/bin/luajit: bad argument #2 to '?' (sizes do not match at /home/ubuntu/torch/extra/cutorch/lib/THC/generic/THCTensorCopy.c:64) stack traceback: [C]: at 0x7ff8eee7d610 [C]: in function '__newindex' ./train.lua:97: in function 'train' run.lua:77: in main chunk [C]: in function 'dofile' ...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk [C]: at 0x00406670
Another command used to start training the decoder and its resulting error:
th run.lua --dataset cv --datapath /home/ubuntu/SegNet-Tutorial/CamVid/ --model models/decoder.lua --save /home/ubuntu/ENet-training/train/trained_decoder/ --imHeight 45 --imWidth 60 --labelHeight 360 --labelWidth 480 --cachepath /home/ubuntu/ENet-training/train/dataset_cache/ --CNNEncoder /home/ubuntu/ENet-training/train/trained_model/model-299.net
Error:
==> Training: epoch # 1 [batchSize = 2] /home/ubuntu/torch/install/bin/luajit: /home/ubuntu/torch/install/share/lua/5.1/nn/Container.lua:67: Step: 0ms In 2 module of nn.Sequential: /home/ubuntu/torch/install/share/lua/5.1/nn/JoinTable.lua:39: bad argument #1 to 'copy' (sizes do not match at /home/ubuntu/torch/extra/cutorch/lib/THC/generic/THCTensorCopy.cu:10) stack traceback: [C]: in function 'copy' /home/ubuntu/torch/install/share/lua/5.1/nn/JoinTable.lua:39: in function </home/ubuntu/torch/install/share/lua/5.1/nn/JoinTable.lua:21> [C]: in function 'xpcall' /home/ubuntu/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors' /home/ubuntu/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward' ./train.lua:108: in function 'opfunc' /home/ubuntu/torch/install/share/lua/5.1/optim/adam.lua:33: in function 'adam' ./train.lua:123: in function 'train' run.lua:77: in main chunk [C]: in function 'dofile' ...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk [C]: at 0x00406670
The text was updated successfully, but these errors were encountered: