
Creating pull request of hacks I needed to run sample.py on cuda #178

Closed
wants to merge 1 commit into from

Conversation

DavidLKing

When I run even the basic sample.py script under examples on a CUDA-enabled machine, I still get errors that not all the tensors are CUDA tensors (some are still on the CPU). You can ignore my edits in sample.py, but I did annotate each place where I needed to add vector = vector.cuda() to make sample.py run. This occurs even when torch.device('cuda') is called, which should not happen in PyTorch 0.4.0+.

Thank you, and feel free to follow up with any questions.
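[Editor's note: the pattern behind these errors can be sketched in a few lines. This is not the repo's actual code; the `embedding`/`indices` names are illustrative. Creating `torch.device('cuda')` does not move anything by itself; every module and every per-batch tensor must be moved explicitly with `.to(device)` or `.cuda()`.]

```python
import torch
import torch.nn as nn

# torch.device only names a device; it does not relocate tensors.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

embedding = nn.Embedding(num_embeddings=10, embedding_dim=4).to(device)
indices = torch.tensor([1, 2, 3])  # created on CPU by default

# If device is CUDA and indices stay on the CPU, the forward pass raises:
#   RuntimeError: Expected object of type torch.cuda.LongTensor
#   but found type torch.LongTensor for argument #3 'index'
indices = indices.to(device)  # the fix: move per-batch inputs too
out = embedding(indices)
print(out.shape)  # torch.Size([3, 4])
```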

@pskrunner14

Hi @DavidLKing, did you try the develop branch? The necessary changes for moving tensors to CUDA have already been made there; the master branch has yet to be updated.

@DavidLKing
Author

Hi @pskrunner14! I did, but it still fails when I try to run sample.py with the same error:

(pytorch-seq2seq) david@Arjuna:~/bin/git/pytorch-seq2seq$ python examples/sample.py --train_path data/toy_reverse/train/data.txt --dev_path data/toy_reverse/dev/data.txt 
/home/david/miniconda2/envs/pytorch-seq2seq/lib/python3.6/site-packages/torch/nn/functional.py:52: UserWarning: size_average and reduce args will be deprecated, please use reduction='elementwise_mean' instead.
  warnings.warn(warning.format(ret))
2018-11-11 15:23:53,307 root         INFO     Namespace(dev_path='data/toy_reverse/dev/data.txt', expt_dir='./experiment', load_checkpoint=None, log_level='info', resume=False, train_path='data/toy_reverse/train/data.txt')
/home/david/miniconda2/envs/pytorch-seq2seq/lib/python3.6/site-packages/torch/nn/functional.py:52: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
  warnings.warn(warning.format(ret))
/home/david/miniconda2/envs/pytorch-seq2seq/lib/python3.6/site-packages/torch/nn/modules/rnn.py:38: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.2 and num_layers=1
  "num_layers={}".format(dropout, num_layers))
2018-11-11 15:23:56,082 seq2seq.trainer.supervised_trainer INFO     Optimizer: Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.001
    weight_decay: 0
), Scheduler: None
Traceback (most recent call last):
  File "examples/sample.py", line 129, in <module>
    resume=opt.resume)
  File "/home/david/miniconda2/envs/pytorch-seq2seq/lib/python3.6/site-packages/seq2seq-0.1.6-py3.6.egg/seq2seq/trainer/supervised_trainer.py", line 186, in train
  File "/home/david/miniconda2/envs/pytorch-seq2seq/lib/python3.6/site-packages/seq2seq-0.1.6-py3.6.egg/seq2seq/trainer/supervised_trainer.py", line 103, in _train_epoches
  File "/home/david/miniconda2/envs/pytorch-seq2seq/lib/python3.6/site-packages/seq2seq-0.1.6-py3.6.egg/seq2seq/trainer/supervised_trainer.py", line 55, in _train_batch
  File "/home/david/miniconda2/envs/pytorch-seq2seq/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/david/miniconda2/envs/pytorch-seq2seq/lib/python3.6/site-packages/seq2seq-0.1.6-py3.6.egg/seq2seq/models/seq2seq.py", line 48, in forward
  File "/home/david/miniconda2/envs/pytorch-seq2seq/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/david/miniconda2/envs/pytorch-seq2seq/lib/python3.6/site-packages/seq2seq-0.1.6-py3.6.egg/seq2seq/models/EncoderRNN.py", line 68, in forward
  File "/home/david/miniconda2/envs/pytorch-seq2seq/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/david/miniconda2/envs/pytorch-seq2seq/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 110, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/home/david/miniconda2/envs/pytorch-seq2seq/lib/python3.6/site-packages/torch/nn/functional.py", line 1110, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected object of type torch.cuda.LongTensor but found type torch.LongTensor for argument #3 'index'

I'm fine with this not being a full merge. Would an issue make more sense?
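[Editor's note: when chasing a "Expected torch.cuda.LongTensor but found torch.LongTensor" error like the one above, a quick check is to print the device of the model's parameters and of the batch. The `model`/`batch` names here are illustrative, not from the repo.]

```python
import torch
import torch.nn as nn

model = nn.Embedding(10, 4)
batch = torch.tensor([1, 2, 3])

# Both must report the same device before model(batch) will work.
print(next(model.parameters()).device)  # e.g. cpu or cuda:0
print(batch.device)
```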

@pskrunner14

Hey @DavidLKing are you sure you have the latest changes? It seems to be working fine on my side:

psk@ubuntu:~/projects/pytorch-seq2seq$ python examples/sample.py $TRAIN_SRC $TRAIN_TGT $DEV_SRC $DEV_TGT
2018-11-12 02:19:04,671:root:INFO: train_source: data/toy_reverse/train/src.txt
2018-11-12 02:19:04,671:root:INFO: train_target: data/toy_reverse/train/tgt.txt
2018-11-12 02:19:04,671:root:INFO: dev_source: data/toy_reverse/dev/src.txt
2018-11-12 02:19:04,671:root:INFO: dev_target: data/toy_reverse/dev/tgt.txt
2018-11-12 02:19:04,671:root:INFO: experiment_directory: ./experiment
2018-11-12 02:19:06,860:seq2seq.trainer.supervised_trainer:INFO: Optimizer: Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    initial_lr: 0.001
    lr: 0.001
    weight_decay: 0
), Scheduler: <torch.optim.lr_scheduler.StepLR object at 0x7fd4f1c03c88>
Train Perplexity: 8.6506: 100%|██████████████████████████████████████| 157/157 [00:03<00:00, 45.63it/s]
2018-11-12 02:19:10,454:seq2seq.trainer.supervised_trainer:INFO: Finished epoch 1: Train Perplexity: 6.4669, Dev Perplexity: 4.7727, Accuracy: 0.6091
Train Perplexity: 1.1459: 100%|██████████████████████████████████████| 157/157 [00:03<00:00, 47.59it/s]
2018-11-12 02:19:13,900:seq2seq.trainer.supervised_trainer:INFO: Finished epoch 2: Train Perplexity: 2.4010, Dev Perplexity: 4.4814, Accuracy: 0.7656
Train Perplexity: 1.0013: 100%|██████████████████████████████████████| 157/157 [00:03<00:00, 46.45it/s]
2018-11-12 02:19:17,427:seq2seq.trainer.supervised_trainer:INFO: Finished epoch 3: Train Perplexity: 1.2013, Dev Perplexity: 1.0010, Accuracy: 1.0000
['3', '2', '1', '<eos>']

@DavidLKing
Author

Okay, it seems there have been more substantial changes than I realized; I was using an older version of sample.py. The newer version runs fine with the develop branch. I'll need to update my own project, but I think we're good to go. Thanks for the time! When's the next release scheduled?

@pskrunner14

Hi @DavidLKing, yes, there have been quite a few changes in the core API. Hopefully it will be ready by next month; there are still a few bugs in the new features.
