Unable to reproduce example #56

slvcsl · 2021-08-13T16:28:22Z

Hi!
I am trying to use the library for some experiments on extractive summarization.

I installed it as suggested in the guide and downloaded the roberta-base-ext-sum model.
However, when I try to run:

from extractive import ExtractiveSummarizer
model = ExtractiveSummarizer.load_from_checkpoint("path/epoch=3.ckpt")

I get the following exception:

  File "C:\Users\silvia\Desktop\transformersum\prova.py", line 4, in <module>
    model = ExtractiveSummarizer.load_from_checkpoint("C:/Users/silvia/Desktop/transformersum/models/epoch=3.ckpt")
  File "C:\Users\silvia\Anaconda3\envs\transformersum\lib\site-packages\pytorch_lightning\core\saving.py", line 153, in load_from_checkpoint
    model = cls._load_model_state(checkpoint, strict=strict, **kwargs)
  File "C:\Users\silvia\Anaconda3\envs\transformersum\lib\site-packages\pytorch_lightning\core\saving.py", line 201, in _load_model_state
    keys = model.load_state_dict(checkpoint["state_dict"], strict=strict)
  File "C:\Users\silvia\Anaconda3\envs\transformersum\lib\site-packages\torch\nn\modules\module.py", line 1406, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ExtractiveSummarizer:
    Missing key(s) in state_dict: "word_embedding_model.embeddings.position_ids".

How should I proceed?


HHousen · 2021-08-13T17:08:22Z

Hi. It seems my latest changes to the documentation did not build correctly, so the information about this issue was not visible. To solve this issue, please see this page, which will be on the ReadTheDocs documentation very soon. Essentially, set strict=False like so: model = ExtractiveSummarizer.load_from_checkpoint("distilroberta-base-ext-sum.ckpt", strict=False) and that should solve it.
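Since strict=False silently skips any key that does not match, it may be worth checking which keys actually differ before relying on it. The following is a minimal sketch, not part of TransformerSum: diff_state_dict_keys is a hypothetical helper, and the second key in the example is only illustrative. It compares the key names the model expects against those stored in the checkpoint.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key names, both sorted.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not use

    Hypothetical helper for illustration; not part of TransformerSum.
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Illustrative key names: the first is taken from the error above,
# the second is only an assumed example of a key present on both sides.
model_keys = [
    "word_embedding_model.embeddings.position_ids",
    "word_embedding_model.embeddings.word_embeddings.weight",
]
checkpoint_keys = [
    "word_embedding_model.embeddings.word_embeddings.weight",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

In practice the checkpoint keys would presumably come from the "state_dict" entry of the loaded .ckpt file (the traceback shows that entry being read), and the model keys from the state_dict() of an instantiated model. If the only difference is the position_ids key reported in the error, strict=False skips just that one entry.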

HHousen · 2021-08-13T17:32:50Z

The fix for this problem is now in the documentation.

slvcsl · 2021-08-16T07:56:51Z

Hi @HHousen! Thanks for your quick reply.
Now the model loads fine.

However, when I try to summarize a string, using

from extractive import ExtractiveSummarizer
model = ExtractiveSummarizer.load_from_checkpoint("my/path", strict=False)  # ok
source = "This is just a try. Let's see if it works"
summary = model.predict(source)

I get:

C:\Users\silvia\Anaconda3\envs\TransformerSum\lib\site-packages\pytorch_lightning\core\saving.py:205: UserWarning: Found keys that are in the model state dict but not in the checkpoint: ['word_embedding_model.embeddings.position_ids']
  rank_zero_warn(
Traceback (most recent call last):
  File "C:\Users\silvia\Desktop\transformersum\prova.py", line 7, in <module>
    summary = model.predict(source)
  File "C:\Users\silvia\Desktop\transformersum.\src\extractive.py", line 1177, in predict
    nlp.add_pipe(sentencizer)
  File "C:\Users\silvia\Anaconda3\envs\TransformerSum\lib\site-packages\spacy\language.py", line 758, in add_pipe
    raise ValueError(err)
ValueError: [E966] nlp.add_pipe now takes the string name of the registered component factory, not a callable component. Expected string, but got <spacy.pipeline.sentencizer.Sentencizer object at 0x000001B26E43A240> (name: 'None').

If you created your component with nlp.create_pipe('name'): remove nlp.create_pipe and call nlp.add_pipe('name') instead.
If you passed in a component like TextCategorizer(): call nlp.add_pipe with the string name instead, e.g. nlp.add_pipe('textcat').
If you're using a custom component: Add the decorator @Language.component (for function components) or @Language.factory (for class components / factories) to your custom component and assign it a name, e.g. @Language.component('your_name'). You can then run nlp.add_pipe('your_name') to add it to the pipeline.

Process finished with exit code 1

HHousen · 2021-08-23T20:04:23Z

Try opening the extractive.py file and changing line 1177 from nlp.add_pipe(sentencizer) to nlp.add_pipe("sentencizer"). Then, delete the previous line (line 1176). If this works I will merge this change to the master branch.
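For anyone who needs the same code to run under both older and newer spaCy releases, the change above could be wrapped in a small version check. This is a rough sketch under assumptions, not the code that was merged: add_sentencizer is a hypothetical helper, and it assumes the installed version string (for example spacy.__version__) starts with the major version number.

```python
def add_sentencizer(nlp, spacy_version):
    """Add spaCy's built-in sentencizer to `nlp` for v2 or v3 style APIs.

    Hypothetical helper for illustration; not part of TransformerSum.
    `spacy_version` is a version string such as "3.1.1".
    """
    major = int(spacy_version.split(".")[0])
    if major >= 3:
        # spaCy v3: add_pipe takes the registered factory name as a string.
        nlp.add_pipe("sentencizer")
    else:
        # spaCy v2: create the component first, then pass the object itself.
        nlp.add_pipe(nlp.create_pipe("sentencizer"))
    return nlp
```

With spaCy installed, this would be called as add_sentencizer(spacy.blank("en"), spacy.__version__). The one-line change suggested above is simpler if only spaCy v3 needs to be supported.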

slvcsl · 2021-08-24T08:20:43Z

Thanks, it does work perfectly!

slvcsl closed this as completed Aug 24, 2021

HHousen added a commit that referenced this issue Aug 24, 2021

Use spacy new 'add_pipe' syntax (Fixes #56)

ece40df
