Unable to reproduce example #56

Closed
slvcsl opened this issue Aug 13, 2021 · 5 comments

Comments


slvcsl commented Aug 13, 2021

Hi!
I am trying to use the library for some experiments on extractive summarization.

I installed it as suggested in the guide and downloaded the roberta-base-ext-sum model.
However, when I try to load the checkpoint with

from extractive import ExtractiveSummarizer
model = ExtractiveSummarizer.load_from_checkpoint("path/epoch=3.ckpt")

I get the following exception:

  File "C:\Users\silvia\Desktop\transformersum\prova.py", line 4, in <module>
    model = ExtractiveSummarizer.load_from_checkpoint("C:/Users/silvia/Desktop/transformersum/models/epoch=3.ckpt")
  File "C:\Users\silvia\Anaconda3\envs\transformersum\lib\site-packages\pytorch_lightning\core\saving.py", line 153, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
  File "C:\Users\silvia\Anaconda3\envs\transformersum\lib\site-packages\pytorch_lightning\core\saving.py", line 201, in _load_model_state
    keys = model.load_state_dict(checkpoint["state_dict"], strict=strict)
  File "C:\Users\silvia\Anaconda3\envs\transformersum\lib\site-packages\torch\nn\modules\module.py", line 1406, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ExtractiveSummarizer:
    Missing key(s) in state_dict: "word_embedding_model.embeddings.position_ids".

How should I proceed?

Owner

HHousen commented Aug 13, 2021

Hi. It seems my latest changes to the documentation did not build correctly, so the information about this issue was not visible. To solve it, please see this page, which will be on the ReadTheDocs documentation very soon. Essentially, set strict=False, like so: model = ExtractiveSummarizer.load_from_checkpoint("distilroberta-base-ext-sum.ckpt", strict=False). That should solve it.
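
To illustrate why strict=False gets past the error, the sketch below mimics strict and non-strict key matching in plain Python, so it runs without PyTorch installed. It is a simplified stand-in, not the PyTorch or PyTorch Lightning implementation: load_state and the classifier.weight key are hypothetical names made up for this example, and only the position_ids key is taken from the traceback above.

```python
def load_state(model_state, checkpoint_state, strict=True):
    """Copy checkpoint values into model_state, with a strict or lenient key check.

    Hypothetical helper for illustration only; it is not a PyTorch API.
    """
    # Keys the model expects but the checkpoint does not provide.
    missing = [key for key in model_state if key not in checkpoint_state]
    # Keys the checkpoint provides but the model does not know about.
    unexpected = [key for key in checkpoint_state if key not in model_state]

    if strict and (missing or unexpected):
        raise RuntimeError(
            f"Missing key(s): {missing}; unexpected key(s): {unexpected}"
        )

    # Load every key that both sides agree on; leave the rest untouched.
    for key, value in checkpoint_state.items():
        if key in model_state:
            model_state[key] = value
    return missing, unexpected


# The model expects position_ids, but the (older) checkpoint lacks it.
model_state = {
    "word_embedding_model.embeddings.position_ids": [0, 1, 2],
    "classifier.weight": [0.0],  # hypothetical key, for illustration
}
checkpoint_state = {"classifier.weight": [0.5]}

try:
    load_state(dict(model_state), checkpoint_state, strict=True)
except RuntimeError as err:
    print("strict=True fails:", err)

missing, unexpected = load_state(model_state, checkpoint_state, strict=False)
print("strict=False loads; missing keys reported:", missing)
```

In non-strict mode the missing key is reported rather than raised, and the model keeps its own initial value for it. That matches the UserWarning about position_ids quoted later in this thread.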

Owner

HHousen commented Aug 13, 2021

The fix for this problem is now in the documentation.

Author

slvcsl commented Aug 16, 2021

Hi @HHousen! Thanks for your quick reply.
Now the model loads fine.

However, when I try to summarize a string, using

from extractive import ExtractiveSummarizer
model = ExtractiveSummarizer.load_from_checkpoint("my/path", strict=False)  # ok
source = "This is just a try. Let's see if it works"
summary = model.predict(source)

I get:

C:\Users\silvia\Anaconda3\envs\TransformerSum\lib\site-packages\pytorch_lightning\core\saving.py:205: UserWarning: Found keys that are in the model state dict but not in the checkpoint: ['word_embedding_model.embeddings.position_ids']
  rank_zero_warn(
Traceback (most recent call last):
  File "C:\Users\silvia\Desktop\transformersum\prova.py", line 7, in <module>
    summary = model.predict(source)
  File "C:\Users\silvia\Desktop\transformersum.\src\extractive.py", line 1177, in predict
    nlp.add_pipe(sentencizer)
  File "C:\Users\silvia\Anaconda3\envs\TransformerSum\lib\site-packages\spacy\language.py", line 758, in add_pipe
    raise ValueError(err)
ValueError: [E966] nlp.add_pipe now takes the string name of the registered component factory, not a callable component. Expected string, but got <spacy.pipeline.sentencizer.Sentencizer object at 0x000001B26E43A240> (name: 'None').

  • If you created your component with nlp.create_pipe('name'): remove nlp.create_pipe and call nlp.add_pipe('name') instead.

  • If you passed in a component like TextCategorizer(): call nlp.add_pipe with the string name instead, e.g. nlp.add_pipe('textcat').

  • If you're using a custom component: Add the decorator @Language.component (for function components) or @Language.factory (for class components / factories) to your custom component and assign it a name, e.g. @Language.component('your_name'). You can then run nlp.add_pipe('your_name') to add it to the pipeline.

Process finished with exit code 1

Owner

HHousen commented Aug 23, 2021

Try opening the extractive.py file and changing line 1177 from nlp.add_pipe(sentencizer) to nlp.add_pipe("sentencizer"). Then, delete the previous line (line 1176). If this works I will merge this change to the master branch.
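
The change suggested above follows the first hint in the E966 message: spaCy 3.x wants the factory name as a string, while spaCy 2.x accepted a component object. As a rough sketch, the helper below picks the call style from a version string. add_sentencizer is a hypothetical name, not part of TransformerSum or spaCy, and it assumes the deleted line 1176 built the component with nlp.create_pipe("sentencizer"), which the thread does not show.

```python
def add_sentencizer(nlp, spacy_version):
    """Add spaCy's built-in sentencizer using the call style of the given version.

    Hypothetical helper for illustration; `nlp` is a spaCy Language object and
    `spacy_version` is a string such as spacy.__version__.
    """
    major = int(spacy_version.split(".")[0])
    if major >= 3:
        # spaCy 3.x: pass the registered factory name as a string.
        return nlp.add_pipe("sentencizer")
    # spaCy 2.x: create the component first, then add the object itself.
    sentencizer = nlp.create_pipe("sentencizer")
    nlp.add_pipe(sentencizer)
    return sentencizer
```

With spaCy installed, this could be called as add_sentencizer(spacy.blank("en"), spacy.__version__). If only spaCy 3.x needs to be supported, the single call nlp.add_pipe("sentencizer") from the comment above is enough.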

Author

slvcsl commented Aug 24, 2021

Thanks, it does work perfectly!
