num_beams error in GPT2DoubleHead model #6319

Closed
vibhavagarwal5 opened this issue Aug 7, 2020 · 2 comments · Fixed by #6735
Comments

vibhavagarwal5 commented Aug 7, 2020

Environment info

  • transformers version: 2.9.1
  • Platform: Linux
  • Python version: 3.6
  • PyTorch version (GPU?): 1.5
  • Tensorflow version (GPU?):
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: Yes

Who can help

@LysandreJik @patil-suraj

Information

I am trying to use model.generate() with GPT2DoubleHeadsModel, but beam search raises an error.
Setting num_beams > 1 results in the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/hdd1/vibhav/anaconda3/envs/vesnli/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/home/hdd1/vibhav/anaconda3/envs/vesnli/lib/python3.7/site-packages/transformers/modeling_utils.py", line 1125, in generate
    model_specific_kwargs=model_specific_kwargs,
  File "/home/hdd1/vibhav/anaconda3/envs/vesnli/lib/python3.7/site-packages/transformers/modeling_utils.py", line 1481, in _generate_beam_search
    past = self._reorder_cache(past, beam_idx)
  File "/home/hdd1/vibhav/anaconda3/envs/vesnli/lib/python3.7/site-packages/transformers/modeling_utils.py", line 1551, in _reorder_cache
    return tuple(layer_past.index_select(1, beam_idx) for layer_past in past)
  File "/home/hdd1/vibhav/anaconda3/envs/vesnli/lib/python3.7/site-packages/transformers/modeling_utils.py", line 1551, in <genexpr>
    return tuple(layer_past.index_select(1, beam_idx) for layer_past in past)
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

However, everything works fine with num_beams=1, and with GPT2LMHeadModel (both beam search and non-beam search).
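
A minimal sketch that should reproduce this (the checkpoint, prompt, and generation arguments below are assumed for illustration; they are not from the original report):

```python
import torch
from transformers import GPT2Tokenizer, GPT2DoubleHeadsModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2DoubleHeadsModel.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer.encode("Hello, my dog is cute", return_tensors="pt")

with torch.no_grad():
    # Works: greedy/sampling path with num_beams=1.
    out_ok = model.generate(input_ids, max_length=20, num_beams=1)

    # Fails on transformers 2.9.1: the beam search path raises
    # "IndexError: Dimension out of range" inside _reorder_cache.
    out_bad = model.generate(input_ids, max_length=20, num_beams=3)
```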

@adamlin120

I encountered the same issue.

@patil-suraj
Contributor

I think @patrickvonplaten might have some ideas.
