GPT Models Scoring error #1

Closed
VP007-py opened this issue Sep 18, 2020 · 2 comments

@VP007-py

I tried scoring sentences with the models mentioned here. Every model works fine except for gpt2-117m-en-cased and gpt2-345m-en-cased, which fail with the following error:

Traceback (most recent call last):
  File "sample.py", line 16, in <module>
    print(scorer.score_sentences(["Hello world!"]))
  File "/home/pandramish.vinay/mlm-scoring/src/mlm/scorers.py", line 148, in score_sentences
    return self.score(corpus, **kwargs)[0]
  File "/home/pandramish.vinay/mlm-scoring/src/mlm/scorers.py", line 396, in score
    dataset = self.corpus_to_dataset(corpus)
  File "/home/pandramish.vinay/mlm-scoring/src/mlm/scorers.py", line 364, in corpus_to_dataset
    ids_masked = self._ids_to_masked(ids_original)
  File "/home/pandramish.vinay/mlm-scoring/src/mlm/scorers.py", line 329, in _ids_to_masked
    mask_token_id = self._vocab.token_to_idx[self._vocab.mask_token]
AttributeError: 'Vocab' object has no attribute 'mask_token'

Any fixes?

@JulianSlzr
Contributor

Thanks for filing the first issue! Sorry for the delayed response (didn't have e-mail notifications on 😓).

The issue is that GPT-2 is an autoregressive LM: it gives true log-likelihood scores and its vocabulary has no mask token, which is why MLMScorer fails on `mask_token`. You need to use LMScorer, not MLMScorer. The README was unclear; my mistake.
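
A minimal sketch of that fix, assuming the README's MXNet interface (`get_pretrained(ctxs, name)` and scorer classes constructed as `Scorer(model, vocab, tokenizer, ctxs)`). The helper names `scorer_name_for` and `score`, and the "gpt2-" prefix rule, are hypothetical and only illustrate the routing; they are not part of the library.

```python
def scorer_name_for(model_name: str) -> str:
    """Pick a scorer class name for an MXNet model name.

    Assumption: the "gpt2-" prefix marks the autoregressive models, as with
    the two names in this thread; everything else is treated as a masked LM.
    """
    return "LMScorer" if model_name.startswith("gpt2-") else "MLMScorer"


def score(model_name, sentences):
    """Score sentences with the scorer class that matches the model type."""
    # Imported lazily so scorer_name_for() works without MXNet installed.
    import mxnet as mx
    from mlm import scorers
    from mlm.models import get_pretrained

    ctxs = [mx.cpu()]  # assumption: CPU only; pass GPU contexts if available
    model, vocab, tokenizer = get_pretrained(ctxs, model_name)
    scorer_cls = getattr(scorers, scorer_name_for(model_name))
    return scorer_cls(model, vocab, tokenizer, ctxs).score_sentences(sentences)
```

With this, `score("gpt2-117m-en-cased", ["Hello world!"])` would go through LMScorer instead of reaching the `mask_token` lookup shown in the traceback.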

I've updated the README and added pre-emptive error messages to MLMScorer, LMScorer, etc.; hope this helps.
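
For readers curious what a pre-emptive check of this kind might look like, here is a rough sketch. It is not the code that was added to the repository: `ToyVocab` and `require_mask_token` are hypothetical stand-ins, and the error wording is invented for illustration.

```python
class ToyVocab:
    """Stand-in vocabulary (hypothetical, not the library's Vocab class)."""

    def __init__(self, tokens, mask_token=None):
        self.token_to_idx = {tok: i for i, tok in enumerate(tokens)}
        # Autoregressive vocabularies simply never define mask_token.
        if mask_token is not None:
            self.mask_token = mask_token


def require_mask_token(vocab):
    """Return the mask token id, or fail early with an actionable message."""
    mask = getattr(vocab, "mask_token", None)
    if mask is None:
        raise ValueError(
            "This vocabulary has no mask token, so the model is probably "
            "autoregressive; use LMScorer instead of MLMScorer."
        )
    return vocab.token_to_idx[mask]
```

Running a check like this in the scorer's constructor, rather than deep inside masking, would surface the problem before any scoring starts.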

@VP007-py
Author

@JulianSlzr
Thanks for the fix, and kudos on your awesome work!
