Error when running LSTM #137

Closed
StillKeepTry opened this issue Apr 3, 2018 · 1 comment

Comments

@StillKeepTry

I am trying to run the LSTM model with the command CUDA_VISIBLE_DEVICES=0 python train.py data-bin/iwslt14.tokenized.de-en --optim adam --lr 0.0003125 --clip-norm 0.1 --dropout 0.2 --max-tokens 4000 --save-dir checkpoints/lstm/ --arch lstm_wiseman_iwslt_de_en, and I get the following error:

| [de] dictionary: 20111 types
| [en] dictionary: 14619 types
| data-bin/iwslt14.tokenized.de-en train 160215 examples
| data-bin/iwslt14.tokenized.de-en valid 7282 examples
| model lstm_wiseman_iwslt_de_en, criterion CrossEntropyCriterion
| num. model params: 14159387
| training on 1 GPUs
| max tokens per GPU = 4000 and max sentences per GPU = None
| epoch 001:   0%| | 0/996 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 29, in <module>
    main(args)
  File "train.py", line 23, in main
    singleprocess_main(args)
  File "/data/kaitao/workplaces/fairseq-py/singleprocess_train.py", line 79, in main
    train(args, trainer, dataset, epoch, batch_offset)
  File "/data/kaitao/workplaces/fairseq-py/singleprocess_train.py", line 138, in train
    log_output = trainer.train_step(sample)
  File "/data/kaitao/workplaces/fairseq-py/fairseq/trainer.py", line 94, in train_step
    loss, sample_sizes, logging_outputs, ooms_fwd = self._forward(sample)
  File "/data/kaitao/workplaces/fairseq-py/fairseq/trainer.py", line 152, in _forward
    raise e
  File "/data/kaitao/workplaces/fairseq-py/fairseq/trainer.py", line 142, in _forward
    loss, sample_size, logging_output = self.criterion(self.model, sample)
  File "/data/kaitao/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/kaitao/workplaces/fairseq-py/fairseq/criterions/cross_entropy.py", line 28, in forward
    net_output = model(**sample['net_input'])
  File "/data/kaitao/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/kaitao/workplaces/fairseq-py/fairseq/models/fairseq_model.py", line 43, in forward
    encoder_out = self.encoder(src_tokens, src_lengths)
  File "/data/kaitao/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/kaitao/workplaces/fairseq-py/fairseq/models/lstm.py", line 103, in forward
    left_to_right=True,
  File "/data/kaitao/workplaces/fairseq-py/fairseq/utils.py", line 294, in convert_padding_direction
    if pad_mask.max() == 0:
  File "/data/kaitao/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 125, in __bool__
    torch.typename(self.data) + " is ambiguous")
RuntimeError: bool value of Variable objects containing non-empty torch.cuda.ByteTensor is ambiguous

My PyTorch version is 0.3.0, and I have modified this part of the code as a workaround. Could you suggest a better fix?

@myleott
Contributor

myleott commented Apr 9, 2018

The latest version of fairseq requires PyTorch >= 0.4.0, which requires building PyTorch from source. Please follow the instructions here: https://github.com/pytorch/pytorch#from-source.

That error is because the semantics of max() changed in PyTorch at some point so that it returns a Tensor instead of a Python number. You can add .item() after .max() to get the old behavior:

>>> x = torch.rand(4, 4)
>>> x.max()

 0.9556
[torch.FloatTensor of size ()]

>>> x.max().item()
0.9556044340133667
>>>
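
For concreteness, here is a minimal, self-contained sketch of how that fix applies to the kind of check in the traceback (if pad_mask.max() == 0: in fairseq/utils.py). The pad_mask below is a made-up placeholder, not fairseq's actual tensor, and the snippet assumes PyTorch >= 0.4:

import torch

# Stand-in for the pad mask built in convert_padding_direction:
# a ByteTensor marking padded positions in a batch of token IDs.
pad_mask = torch.ByteTensor([[0, 0, 1], [0, 0, 0]])

# .max() returns a 0-dim tensor, so compare its Python value (via .item())
# rather than the tensor itself to keep the condition unambiguous.
if pad_mask.max().item() == 0:
    print("no padding in this batch, nothing to convert")
else:
    print("batch contains padding")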

@myleott myleott closed this as completed Apr 9, 2018