increase default max_tokens for older non-chat OpenAI models so NER/spancat works #236
Conversation
This seems reasonable overall. Some notes:
- You mention setting max_tokens to 100, but the actual values are >2k. Why the divergence between the description and the code?
- Have you run the external tests with this change?
- The docs on spacy.io should be updated as well.
@honnibal mentioned we should increase the default a little higher if possible. The OpenAI docs on managing tokens (https://platform.openai.com/docs/guides/gpt/managing-tokens) say that the prompt + response tokens cannot exceed the model's context width, so I scaled the default max_tokens down for each model by its context window. This should allow the NER/SpanCat tasks to work in most cases.
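The scaling idea described above can be sketched as follows. This is an illustrative reconstruction, not the PR's actual code: the context-window values come from OpenAI's model documentation for the legacy completion models, and the fraction reserved for the prompt is an assumption.

```python
# Hypothetical sketch: derive a default max_tokens for each legacy
# completion model from its context window, since prompt + response
# tokens must fit inside that window.

# Context window sizes for some legacy completion models (as listed
# in OpenAI's model docs; illustrative, may change over time).
CONTEXT_WINDOWS = {
    "text-davinci-003": 4097,
    "text-curie-001": 2049,
    "text-babbage-001": 2049,
    "text-ada-001": 2049,
}

def default_max_tokens(model: str, prompt_fraction: float = 0.5) -> int:
    """Reserve a fraction of the context window for the prompt and use
    the remainder as the default completion budget (sketch only)."""
    window = CONTEXT_WINDOWS.get(model, 2049)
    return int(window * (1 - prompt_fraction))

print(default_max_tokens("text-davinci-003"))  # 2048
```

With half the window reserved for the prompt, davinci-class models end up with a default above 2k, which matches the ">2k" values noted in the review.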
The external tests all run correctly here, but yes, the docs still need to be updated. I'll add a link to a PR shortly.
When we change the default settings like this for these models, don't we have to bump all the versions?
Hm... I'd say we don't have to, but we should mention it in the release notes.
Hmm. Then at the very least we should have this on
Fair, changed to
Added a docs PR. I think we can merge both the docs PR and this one?
It looks like the external tests are failing with an error about writing to a frozen dict...
The only remaining failing external test should be fixed with the new
New docs PR: explosion/spaCy#12961
Description
The default max_tokens for the LLM response of the old completions endpoint for OpenAI is 16 tokens. This often produces output that is too short for the NER/SpanCat tasks (and of course for longer tasks like summarization). Setting a higher default value at the API level is probably a good bet, but we should also update the docs here.
This is only required for the legacy models; the chat completion models set this to infinity by default.
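The legacy-versus-chat distinction above can be sketched like this. The model names and the 2048 fallback are illustrative assumptions, not values taken from this PR:

```python
# Illustrative sketch: only legacy completion models need an explicit
# max_tokens; chat models can omit it, which the API treats as
# "use up to the remaining context window".
from typing import Optional

LEGACY_MODELS = {"text-davinci-003", "text-curie-001", "text-babbage-001", "text-ada-001"}

def request_params(model: str, max_tokens: Optional[int] = None) -> dict:
    params = {"model": model}
    if max_tokens is not None:
        params["max_tokens"] = max_tokens
    elif model in LEGACY_MODELS:
        # The API default of 16 tokens truncates NER/SpanCat output,
        # so pick a larger default for legacy models (value illustrative).
        params["max_tokens"] = 2048
    # Chat models: leave max_tokens unset entirely.
    return params
```

A user who hits truncation anyway can still pass an explicit max_tokens to override the default.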
Corresponding documentation PR
explosion/spaCy#12961
Types of change
enhancement
Checklist
- I ran the tests in tests and usage_examples/tests, and all new and existing tests passed. This includes:
  - pytest ran with --external
  - pytest ran with --gpu