
Bug in reinforce_model causing Index Error: Size mismatch #381

Closed · sidhantls opened this issue Nov 19, 2020 · 1 comment · Fixed by #389

Labels: bug (Something isn't working), help wanted (Extra attention is needed)

@sidhantls (Contributor) commented Nov 19, 2020

🐛 Bug

Running Reinforce from reinforce_model.py as documented leads to a size-mismatch IndexError in the last batch of the first epoch.

To Reproduce

Code sample

Ran as suggested in the source:

from pl_bolts.models.rl.reinforce_model import Reinforce
from pytorch_lightning import Trainer
model = Reinforce("CartPole-v0")
trainer = Trainer()
trainer.fit(model)

Error

The error occurs in the last batch of the first epoch.

~\AppData\Local\Programs\Miniconda3\envs\torch\lib\site-packages\pl_bolts\models\rl\reinforce_model.py in loss(self, states, actions, scaled_rewards)
    207         # policy loss
    208         log_prob = log_softmax(logits, dim=1)
--> 209         log_prob_actions = scaled_rewards * log_prob[range(self.batch_size), actions]
    210         loss = -log_prob_actions.mean()
    211 

IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [8], [4]
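
For reference, the indexing failure can be reproduced in isolation with hypothetical shapes (configured batch size 8, but only 4 samples left in the final batch):

import torch

# Hypothetical shapes: the configured batch size is 8, but the final batch holds only 4 samples.
log_prob = torch.randn(4, 2)          # log-softmax output, shape [actual_batch, n_actions]
actions = torch.tensor([0, 1, 1, 0])  # one action per sample in the final batch

try:
    log_prob[range(8), actions]       # range(self.batch_size) -> 8 indices vs 4 actions
except IndexError as err:
    print(err)                        # shape mismatch: indexing tensors could not be broadcast ...

print(log_prob[range(len(actions)), actions])  # indexing with the actual batch length works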

Environment

  • PyTorch Version (e.g., 1.0): 1.6.0
  • OS (e.g., Linux): Windows
  • How you installed PyTorch (conda, pip, source): conda
  • Python version: 3.7.8

Additional Info

I can submit a PR to fix this. The error occurs because the loss method (line 209) indexes log_prob with range(self.batch_size), but the last batch can contain fewer samples than self.batch_size. Indexing with the actual length of the batch instead fixes this (sketched below).
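
A minimal sketch of the kind of change I have in mind (the logits line and the log_softmax import are assumed from the surrounding loss method, so this may not match the exact patch that lands in #389):

# assuming the module-level import: from torch.nn.functional import log_softmax
def loss(self, states, actions, scaled_rewards):
    logits = self.net(states)  # assumption: logits come from the policy network earlier in the method

    # policy loss
    log_prob = log_softmax(logits, dim=1)
    batch_len = len(actions)   # may be smaller than self.batch_size on the last batch
    log_prob_actions = scaled_rewards * log_prob[range(batch_len), actions]
    return -log_prob_actions.mean()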

@sidhantls sidhantls added the help wanted Extra attention is needed label Nov 19, 2020
@Borda Borda added the fix fixing issues... label Nov 19, 2020
@Borda Borda added this to To do in Reinforcement Learning Nov 20, 2020
@akihironitta akihironitta self-assigned this Nov 22, 2020
@akihironitta (Contributor) commented

@sid-sundrani Thank you for reporting the issue!

Let me leave the full stderr/stdout output here just for the record:

/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/wandb/util.py:35: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
  from collections import namedtuple, Mapping, Sequence
/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/wandb/vendor/graphql-core-1.1/graphql/type/directives.py:55: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
  assert isinstance(locations, collections.Iterable), 'Must provide locations for directive.'
GPU available: False, used: False
TPU available: False, using: 0 TPU cores

  | Name | Type | Params
------------------------------
0 | net  | MLP  | 898   
/home/nitta/work/pytorch-lightning/pytorch_lightning/utilities/distributed.py:45: UserWarning: The dataloader, train dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 8 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  warnings.warn(*args, **kwargs)
Epoch 0: : 0it [00:00, ?it/s]/home/nitta/work/pytorch-lightning-bolts/pl_bolts/models/rl/common/agents.py:134: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
  probabilities = F.softmax(self.net(states)).squeeze(dim=-1)
/home/nitta/work/pytorch-lightning/pytorch_lightning/utilities/distributed.py:45: UserWarning: The {log:dict keyword} was deprecated in 0.9.1 and will be removed in 1.0.0
Please use self.log(...) inside the lightningModule instead.

# log on a step or aggregate epoch metric to the logger and/or progress bar
# (inside LightningModule)
self.log('train_loss', loss, on_step=True, on_epoch=True, prog_bar=True)
  warnings.warn(*args, **kwargs)
/home/nitta/work/pytorch-lightning/pytorch_lightning/utilities/distributed.py:45: UserWarning: The {progress_bar:dict keyword} was deprecated in 0.9.1 and will be removed in 1.0.0
Please use self.log(...) inside the lightningModule instead.

# log on a step or aggregate epoch metric to the logger and/or progress bar
# (inside LightningModule)
self.log('train_loss', loss, on_step=True, on_epoch=True, prog_bar=True)
  warnings.warn(*args, **kwargs)
Epoch 0: : 14221it [00:23, 613.08it/s, loss=21.935, v_num=182, episodes=135, reward=24, avg_reward=67.5] Traceback (most recent call last):
  File "/tmp/kwa.py", line 5, in <module>
    trainer.fit(model)
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/trainer.py", line 469, in fit
    results = self.accelerator_backend.train()
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/accelerators/cpu_accelerator.py", line 59, in train
    results = self.train_or_test()
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/accelerators/accelerator.py", line 66, in train_or_test
    results = self.trainer.train()
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/trainer.py", line 521, in train
    self.train_loop.run_training_epoch()
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/training_loop.py", line 539, in run_training_epoch
    batch_output = self.run_training_batch(batch, batch_idx, dataloader_idx)
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/training_loop.py", line 691, in run_training_batch
    self.optimizer_step(optimizer, opt_idx, batch_idx, train_step_and_backward_closure)
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/training_loop.py", line 477, in optimizer_step
    self.trainer.accelerator_backend.optimizer_step(
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/accelerators/accelerator.py", line 114, in optimizer_step
    model_ref.optimizer_step(
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/core/lightning.py", line 1409, in optimizer_step
    optimizer.step(closure=optimizer_closure, *args, **kwargs)
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/torch/optim/adam.py", line 62, in step
    loss = closure()
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/training_loop.py", line 681, in train_step_and_backward_closure
    result = self.training_step_and_backward(
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/training_loop.py", line 770, in training_step_and_backward
    result = self.training_step(split_batch, batch_idx, opt_idx, hiddens)
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/trainer/training_loop.py", line 324, in training_step
    training_step_output = self.trainer.accelerator_backend.training_step(args)
  File "/home/nitta/work/pytorch-lightning/pytorch_lightning/accelerators/cpu_accelerator.py", line 67, in training_step
    output = self.trainer.model.training_step(*args)
  File "/home/nitta/work/pytorch-lightning-bolts/pl_bolts/models/rl/reinforce_model.py", line 237, in training_step
    loss = self.loss(states, actions, scaled_rewards)
  File "/home/nitta/work/pytorch-lightning-bolts/pl_bolts/models/rl/reinforce_model.py", line 218, in loss
    log_prob_actions = scaled_rewards * log_prob[range(self.batch_size), actions]
IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [8], [5]
Exception ignored in: <function tqdm.__del__ at 0x7ff6bc6cd310>
Traceback (most recent call last):
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/tqdm/std.py", line 1128, in __del__
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/tqdm/std.py", line 1341, in close
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/tqdm/std.py", line 1520, in display
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/tqdm/std.py", line 1131, in __repr__
  File "/home/nitta/.pyenv/versions/miniconda3-latest/envs/pl/lib/python3.8/site-packages/tqdm/std.py", line 1481, in format_dict
TypeError: cannot unpack non-iterable NoneType object

@akihironitta akihironitta moved this from To do to In progress in Reinforcement Learning Nov 22, 2020
Reinforcement Learning automation moved this from In progress to Done Nov 22, 2020
@Borda Borda added this to the v0.3 milestone Jan 18, 2021
@Borda Borda added bug Something isn't working and removed fix fixing issues... labels Jun 20, 2023