I am trying to run the LSTM model with the command CUDA_VISIBLE_DEVICES=0 python train.py data-bin/iwslt14.tokenized.de-en --optim adam --lr 0.0003125 --clip-norm 0.1 --dropout 0.2 --max-tokens 4000 --save-dir checkpoints/lstm/ --arch lstm_wiseman_iwslt_de_en, and I get this error:
| [de] dictionary: 20111 types
| [en] dictionary: 14619 types
| data-bin/iwslt14.tokenized.de-en train 160215 examples
| data-bin/iwslt14.tokenized.de-en valid 7282 examples
| model lstm_wiseman_iwslt_de_en, criterion CrossEntropyCriterion
| num. model params: 14159387
| training on 1 GPUs
| max tokens per GPU = 4000 and max sentences per GPU = None
| epoch 001: 0%| | 0/996 [00:00<?, ?it/s]Traceback (most recent call last):
File "train.py", line 29, in <module>
main(args)
File "train.py", line 23, in main
singleprocess_main(args)
File "/data/kaitao/workplaces/fairseq-py/singleprocess_train.py", line 79, in main
train(args, trainer, dataset, epoch, batch_offset)
File "/data/kaitao/workplaces/fairseq-py/singleprocess_train.py", line 138, in train
log_output = trainer.train_step(sample)
File "/data/kaitao/workplaces/fairseq-py/fairseq/trainer.py", line 94, in train_step
loss, sample_sizes, logging_outputs, ooms_fwd = self._forward(sample)
File "/data/kaitao/workplaces/fairseq-py/fairseq/trainer.py", line 152, in _forward
raise e
File "/data/kaitao/workplaces/fairseq-py/fairseq/trainer.py", line 142, in _forward
loss, sample_size, logging_output = self.criterion(self.model, sample)
File "/data/kaitao/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
result = self.forward(*input, **kwargs)
File "/data/kaitao/workplaces/fairseq-py/fairseq/criterions/cross_entropy.py", line 28, in forward
net_output = model(**sample['net_input'])
File "/data/kaitao/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
result = self.forward(*input, **kwargs)
File "/data/kaitao/workplaces/fairseq-py/fairseq/models/fairseq_model.py", line 43, in forward
encoder_out = self.encoder(src_tokens, src_lengths)
File "/data/kaitao/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
result = self.forward(*input, **kwargs)
File "/data/kaitao/workplaces/fairseq-py/fairseq/models/lstm.py", line 103, in forward
left_to_right=True,
File "/data/kaitao/workplaces/fairseq-py/fairseq/utils.py", line 294, in convert_padding_direction
if pad_mask.max() == 0:
File "/data/kaitao/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 125, in __bool__
torch.typename(self.data) + " is ambiguous")
RuntimeError: bool value of Variable objects containing non-empty torch.cuda.ByteTensor is ambiguous
My PyTorch version is 0.3.0, and I modified this part of the code as shown. Could you suggest a better fix?
The latest version of fairseq requires PyTorch >= 0.4.0, which requires building PyTorch from source. Please follow the instructions here: https://github.com/pytorch/pytorch#from-source.
That error is because the semantics of max() changed in PyTorch at some point: it now returns a zero-dimensional Tensor instead of a Python number. You can add .item() after the .max() to get the old behavior:
>>> x = torch.rand(4, 4)
>>> x.max()
0.9556
[torch.FloatTensor of size ()]
>>> x.max().item()
0.9556044340133667
>>>
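For later readers, here is a minimal sketch of how that fix would look applied to the failing check in convert_padding_direction. The pad_idx value and the toy batch below are made up for illustration; they are not taken from the fairseq code:

```python
import torch

# Toy batch of two sentences, the second one right-padded.
# pad_idx and the token values are illustrative, not from fairseq.
pad_idx = 1
src_tokens = torch.tensor([[5, 6, 7],
                           [5, 6, pad_idx]])
pad_mask = src_tokens.eq(pad_idx)

# The old check `if pad_mask.max() == 0:` breaks once max() starts
# returning a Tensor, because the truth value of the comparison is
# a Tensor too. Adding .item() converts the zero-dim result back to
# a plain Python number, restoring the old scalar semantics:
if pad_mask.max().item() == 0:
    print("no padding in this batch, nothing to convert")
else:
    print("batch contains padding")
```

The same one-line change (appending .item() to the .max() call) is all that the check in fairseq/utils.py needs.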