Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quick fix SummarizationPipeline error messages #14618

Merged
merged 2 commits into from
Dec 7, 2021
Merged

quick fix SummarizationPipeline error messages #14618

merged 2 commits into from
Dec 7, 2021

Conversation

NouamaneTazi
Copy link
Member

Fix error messages to avoid spam errors, and errors of type:
Your max_length is set to 50, but you input_length is only 46. You might consider decreasing max_length manually, e.g. summarizer('...', max_length=50)

Fix error messages to avoid spam errors, and errors of type:
`Your max_length is set to 50, but you input_length is only 46. You might consider decreasing max_length manually, e.g. summarizer('...', max_length=50)`
@LysandreJik
Copy link
Member

Pinging @Narsil and @patrickvonplaten

@LysandreJik
Copy link
Member

Thanks for your contribution @NouamaneTazi :)

You have an error in your code quality; do you mind running the following commands at the root of your clone?

pip install -e ".[quality"]
make fixup

This should fix most quality issues and let you know if there are some the scripts cannot resolve.

@Narsil
Copy link
Contributor

Narsil commented Dec 6, 2021

@NouamaneTazi .

I am not sure you PR achieves your desired goal:

You are changing another error than the error you're posting in the description.
You report max_length is set to .... and are modifying another warning. Actually the error you're modifying is different and linked to using min_length and having a length very well smaller than min_length.

FWIW. Summarization usually ingests large texts, and tries to generate something between min_length and max_length of size as a summary.

If input_length < min_length or input_length < max_length then the summary is likely to be not very summarized (since the original string already fits the requirements, or worse, the summary is supposed to expand on it. (Since original text is smaller than the supposedly min_length.

Do that clarify why the warnings are originally here ? If yes, then maybe we could clarify the warning instead to make them self sufficient ?

The warnings originated here 9c683ef (tried to get a better rationale for the // 2 but doesn't seem to be explained)

@NouamaneTazi
Copy link
Member Author

Hello @Narsil , thanks for your clarification.
Indeed, since I tried to make the quick fix using Github web IDE, it didn't save all my changes. Here are my intended changes:

  • assert max_length > min_length
  • assert input_length > max_length (if not suggest max_length = input_length // 2)
    Do you think there is more checks to do?

@Narsil
Copy link
Contributor

Narsil commented Dec 6, 2021

I think it should be fine to check both, but would that avoid the spam you were referring to ?

@NouamaneTazi
Copy link
Member Author

Yes, because before the solution proposed wasn't clear:
Your max_length is set to 50, but you input_length is only 46. You might consider decreasing max_length manually, e.g. summarizer('...', max_length=50)
Whereas now, we do propose a max_length = input_length // 2 which should stop showing the error if fixed.

@Narsil
Copy link
Contributor

Narsil commented Dec 6, 2021

Makes sense.

Copy link
Contributor

@Narsil Narsil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@patrickvonplaten patrickvonplaten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@patrickvonplaten patrickvonplaten merged commit 2171695 into huggingface:master Dec 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants