
Bug in reinforce_model causing Index Error: Size mismatch #381

Closed · sidhantls opened this issue Nov 19, 2020 · 1 comment · Fixed by #389

Labels: bug (Something isn't working), help wanted (Extra attention is needed)

@sidhantls (Contributor) commented Nov 19, 2020

🐛 Bug

Running Reinforce from reinforce_model.py as documented leads to a size-mismatch IndexError in the last batch of the first epoch.

To Reproduce

Code sample

Ran as suggested in the source:

from pl_bolts.models.rl.reinforce_model import Reinforce
from pytorch_lightning import Trainer
model = Reinforce("CartPole-v0")
trainer = Trainer()
trainer.fit(model)

Error

The error occurs in the last batch of the first epoch.

~\AppData\Local\Programs\Miniconda3\envs\torch\lib\site-packages\pl_bolts\models\rl\reinforce_model.py in loss(self, states, actions, scaled_rewards)
    207         # policy loss
    208         log_prob = log_softmax(logits, dim=1)
--> 209         log_prob_actions = scaled_rewards * log_prob[range(self.batch_size), actions]
    210         loss = -log_prob_actions.mean()
    211 

IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [8], [4]
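
For reference, the indexing failure can be reproduced in isolation with hypothetical shapes (configured batch size 8, but only 4 samples left in the final batch):

import torch

# Hypothetical shapes: the configured batch size is 8, but the final batch holds only 4 samples.
log_prob = torch.randn(4, 2)          # log-softmax output, shape [actual_batch, n_actions]
actions = torch.tensor([0, 1, 1, 0])  # one action per sample in the final batch

try:
    log_prob[range(8), actions]       # range(self.batch_size) -> 8 indices vs 4 actions
except IndexError as err:
    print(err)                        # shape mismatch: indexing tensors could not be broadcast ...

print(log_prob[range(len(actions)), actions])  # indexing with the actual batch length works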

Environment

  • PyTorch Version (e.g., 1.0): 1.6.0
  • OS (e.g., Linux): Windows
  • How you installed PyTorch (conda, pip, source): conda
  • Python version: 3.7.8

Additional Info

I can submit a PR to fix this. The error occurs because the loss method (line 209) indexes log_prob with range(self.batch_size), but the last batch can contain fewer samples than self.batch_size. Indexing with the actual length of the batch instead fixes this (sketched below).
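
A minimal sketch of the kind of change I have in mind (the logits line and the log_softmax import are assumed from the surrounding loss method, so this may not match the exact patch that lands in #389):

# assuming the module-level import: from torch.nn.functional import log_softmax
def loss(self, states, actions, scaled_rewards):
    logits = self.net(states)  # assumption: logits come from the policy network earlier in the method

    # policy loss
    log_prob = log_softmax(logits, dim=1)
    batch_len = len(actions)   # may be smaller than self.batch_size on the last batch
    log_prob_actions = scaled_rewards * log_prob[range(batch_len), actions]
    return -log_prob_actions.mean()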

@sidhantls sidhantls added the help wanted Extra attention is needed label Nov 19, 2020
@Borda Borda added the fix fixing issues... label Nov 19, 2020
@Borda Borda added this to To do in Reinforcement Learning Nov 20, 2020
@akihironitta akihironitta self-assigned this Nov 22, 2020
@akihironitta (Contributor) commented

@sid-sundrani Thank you for reporting the issue!

Let me leave the full stderr/stdout output here just for the record:

/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/wandb/util.py:35: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
  from collections import namedtuple, Mapping, Sequence
/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/wandb/vendor/graphql-core-1.1/graphql/type/directives.py:55: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
  assert isinstance(locations, collections.Iterable), 'Must provide locations for directive.'
GPU available: False, used: False
TPU available: False, using: 0 TPU cores

  | Name | Type | Params
------------------------------
0 | net  | MLP  | 898   
/home/nitta/work/pytorch-lightning/pytorch_lightning/utilities/distributed.py:45: UserWarning: The dataloader, train dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 8 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  warnings.warn(*args, **kwargs)
Epoch 0: : 0it [00:00, ?it/s]/home/nitta/work/pytorch-lightning-bolts/pl_bolts/models/rl/common/agents.py:134: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  probabilities = F.softmax(self.net(states)).squeeze(dim=-1)
/home/nitta/work/pytorch-lightning/pytorch_lightning/utilities/distributed.py:45: UserWarning: The {log:dict keyword} was deprecated in 0.9.1 and will be removed in 1.0.0
Please use self.log(...) inside the lightningModule instead.

# log on a step or aggregate epoch metric to the logger and/or progress bar
# (inside LightningModule)
self.log('train_loss', loss, on_step=True, on_epoch=True, prog_bar=True)
  warnings.warn(*args, **kwargs)
/home/nitta/work/pytorch-lightning/pytorch_lightning/utilities/distributed.py:45: UserWarning: The {progress_bar:dict keyword} was deprecated in 0.9.1 and will be removed in 1.0.0
Please use self.log(...) inside the lightningModule instead.

# log on a step or aggregate epoch metric to the logger and/or progress bar
# (inside LightningModule)
self.log('train_loss', loss, on_step=True, on_epoch=True, prog_bar=True)
  warnings.warn(*args, **kwargs)
Epoch 0: : 14221it [00:23, 613.08it/s, loss=21.935, v_num=182, episodes=135, reward=24, avg_reward=67.5] Traceback (most recent call last):
  File "/tmp/kwa.py", line 5, in <module>
    trainer.fit(model)
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/trainer.py", line 469, in fit
    results = self.accelerator_backend.train()
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/accelerators/cpu_accelerator.py", line 59, in train
    results = self.train_or_test()
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/accelerators/accelerator.py", line 66, in train_or_test
    results = self.trainer.train()
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/trainer.py", line 521, in train
    self.train_loop.run_training_epoch()
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/training_loop.py", line 539, in run_training_epoch
    batch_output = self.run_training_batch(batch, batch_idx, dataloader_idx)
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/training_loop.py", line 691, in run_training_batch
    self.optimizer_step(optimizer, opt_idx, batch_idx, train_step_and_backward_closure)
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/training_loop.py", line 477, in optimizer_step
    self.trainer.accelerator_backend.optimizer_step(
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/accelerators/accelerator.py", line 114, in optimizer_step
    model_ref.optimizer_step(
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/core/lightning.py", line 1409, in optimizer_step
    optimizer.step(closure=optimizer_closure, *args, **kwargs)
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/torch/optim/adam.py", line 62, in step
    loss = closure()
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/training_loop.py", line 681, in train_step_and_backward_closure
    result = self.training_step_and_backward(
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/training_loop.py", line 770, in training_step_and_backward
    result = self.training_step(split_batch, batch_idx, opt_idx, hiddens)
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/training_loop.py", line 324, in training_step
    training_step_output = self.trainer.accelerator_backend.training_step(args)
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/accelerators/cpu_accelerator.py", line 67, in training_step
    output = self.trainer.model.training_step(*args)
  File "/home/nitta/work/pytorch-lightning-bolts/pl_bolts/models/rl/reinforce_model.py", line 237, in training_step
    loss = self.loss(states, actions, scaled_rewards)
  File "/home/nitta/work/pytorch-lightning-bolts/pl_bolts/models/rl/reinforce_model.py", line 218, in loss
    log_prob_actions = scaled_rewards * log_prob[range(self.batch_size), actions]
IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [8], [5]
Exception ignored in: <function tqdm.__del__ at 0x7ff6bc6cd310>
Traceback (most recent call last):
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/tqdm/std.py", line 1128, in __del__
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/tqdm/std.py", line 1341, in close
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/tqdm/std.py", line 1520, in display
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/tqdm/std.py", line 1131, in __repr__
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/tqdm/std.py", line 1481, in format_dict
TypeError: cannot unpack non-iterable NoneType object

@akihironitta akihironitta moved this from To do to In progress in Reinforcement Learning Nov 22, 2020
Reinforcement Learning automation moved this from In progress to Done Nov 22, 2020
@Borda Borda added this to the v0.3 milestone Jan 18, 2021
@Borda Borda added bug Something isn't working and removed fix fixing issues... labels Jun 20, 2023