Sizes do not match error #2

syed-ahmed · 2016-06-29T15:29:12Z

Hi, I tried to reproduce the training from this research paper. I was able to train the encoder successfully, but I get the following errors when training the decoder. Any help is appreciated.

Command used to train the encoder:

th run.lua --dataset cv --datapath /home/ubuntu/SegNet-Tutorial/CamVid/ --model models/encoder.lua --save /home/ubuntu/ENet-training/train/trained_model/ --imHeight 360 --imWidth 480 --labelHeight 45 --labelWidth 60 --cachepath /home/ubuntu/ENet-training/train/dataset_cache/

Command used to start training the decoder and its resulting error:

th run.lua --dataset cv --datapath /home/ubuntu/SegNet-Tutorial/CamVid/ --model models/decoder.lua --save /home/ubuntu/ENet-training/train/trained_decoder/ --imHeight 360 --imWidth 480 --labelHeight 360 --labelWidth 480 --cachepath /home/ubuntu/ENet-training/train/dataset_cache/ --CNNEncoder /home/ubuntu/ENet-training/train/trained_model/model-299.net

Error:

==> Training: epoch # 1 [batchSize = 2]
/home/ubuntu/torch/install/bin/luajit: bad argument #2 to '?' (sizes do not match at /home/ubuntu/torch/extra/cutorch/lib/THC/generic/THCTensorCopy.c:64)
stack traceback:
    [C]: at 0x7ff8eee7d610
    [C]: in function '__newindex'
    ./train.lua:97: in function 'train'
    run.lua:77: in main chunk
    [C]: in function 'dofile'
    ...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

Another command used to start training the decoder and its resulting error:

th run.lua --dataset cv --datapath /home/ubuntu/SegNet-Tutorial/CamVid/ --model models/decoder.lua --save /home/ubuntu/ENet-training/train/trained_decoder/ --imHeight 45 --imWidth 60 --labelHeight 360 --labelWidth 480 --cachepath /home/ubuntu/ENet-training/train/dataset_cache/ --CNNEncoder /home/ubuntu/ENet-training/train/trained_model/model-299.net

Error:

==> Training: epoch # 1 [batchSize = 2]
/home/ubuntu/torch/install/bin/luajit: /home/ubuntu/torch/install/share/lua/5.1/nn/Container.lua:67: Step: 0ms
In 2 module of nn.Sequential:
/home/ubuntu/torch/install/share/lua/5.1/nn/JoinTable.lua:39: bad argument #1 to 'copy' (sizes do not match at /home/ubuntu/torch/extra/cutorch/lib/THC/generic/THCTensorCopy.cu:10)
stack traceback:
    [C]: in function 'copy'
    /home/ubuntu/torch/install/share/lua/5.1/nn/JoinTable.lua:39: in function </home/ubuntu/torch/install/share/lua/5.1/nn/JoinTable.lua:21>
    [C]: in function 'xpcall'
    /home/ubuntu/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
    /home/ubuntu/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    ./train.lua:108: in function 'opfunc'
    /home/ubuntu/torch/install/share/lua/5.1/optim/adam.lua:33: in function 'adam'
    ./train.lua:123: in function 'train'
    run.lua:77: in main chunk
    [C]: in function 'dofile'
    ...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670


codeAC29 · 2016-06-29T16:13:05Z

In the first approach you are using the same cachepath for the encoder and the decoder. Try using a different location for the data cache. Currently it's using the dataset that was saved while training the encoder.

Image and label resolution need to be the same for the decoder, so the second approach is incorrect.
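The two points in the reply above can be sketched as a small pre-flight check around the decoder command. This is only an illustration built from the flags and paths already shown in this thread: the directory name `dataset_cache_decoder` and the helper `check_decoder_args` are hypothetical, and the note about 45x60 labels is an inference from the encoder command, not something the repository confirms.

```shell
#!/bin/sh
# Sketch only: "dataset_cache_decoder" and check_decoder_args are hypothetical names.
IM_H=360;  IM_W=480      # decoder input size
LBL_H=360; LBL_W=480     # decoder label size (should equal the input size)

# Cache used for the encoder run (from the original command), which was
# presumably written with 45x60 labels, and a separate cache for the decoder.
ENC_CACHE=/home/ubuntu/ENet-training/train/dataset_cache/
DEC_CACHE=/home/ubuntu/ENet-training/train/dataset_cache_decoder/

# Returns 0 only when image and label sizes match and the caches differ.
check_decoder_args() {
  [ "$IM_H" -eq "$LBL_H" ] && [ "$IM_W" -eq "$LBL_W" ] || {
    echo "image and label resolution differ" >&2; return 1; }
  [ "$DEC_CACHE" != "$ENC_CACHE" ] || {
    echo "decoder cache path equals encoder cache path" >&2; return 1; }
}

# Prints the command instead of running it; remove "echo" to launch training.
check_decoder_args && echo th run.lua --dataset cv \
  --datapath /home/ubuntu/SegNet-Tutorial/CamVid/ \
  --model models/decoder.lua \
  --save /home/ubuntu/ENet-training/train/trained_decoder/ \
  --imHeight "$IM_H" --imWidth "$IM_W" \
  --labelHeight "$LBL_H" --labelWidth "$LBL_W" \
  --cachepath "$DEC_CACHE" \
  --CNNEncoder /home/ubuntu/ENet-training/train/trained_model/model-299.net
```

Deleting the old cache directory before the decoder run would likely have the same effect as pointing `--cachepath` somewhere new, but keeping two directories lets the encoder cache be reused later.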

syed-ahmed · 2016-06-29T16:29:52Z

Thanks! That worked :)

syed-ahmed closed this as completed Jun 29, 2016

ramonss mentioned this issue Oct 6, 2016

Problem when training decoder #18

Closed

ramonss mentioned this issue Oct 25, 2016

Training with other datasets with different image size #19

Closed

MikepX99 mentioned this issue Jan 6, 2017

Error Training Decoder #41

Closed
