
increase default max_tokens for older non-chat OpenAI models so NER/spancat works #236

Merged: 11 commits merged into develop on Sep 7, 2023

Conversation

@kabirkhan (Contributor) commented Jul 31, 2023

Description

The default max_tokens for the LLM response of the old completions endpoint for OpenAI is 16 tokens. This often produces outputs that are too short for the NER/SpanCat tasks (and of course for longer tasks like summarization).

Setting a higher default value at the API level is probably a good bet, but we should also update the docs here.

This is only required for the legacy models; the chat completion models set this to Infinity by default.
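To illustrate the truncation problem, here is a rough sketch. The entity lines and the token-counting heuristic are illustrative assumptions (a real count would use a tokenizer like tiktoken), but they show how quickly a structured NER completion overruns a 16-token budget:

```python
# Rough illustration of why the legacy default max_tokens=16 truncates
# NER output. Token counts here are a crude character-based heuristic,
# not real tokenizer output.

def approx_token_count(text: str) -> int:
    # Very rough rule of thumb: ~1 token per 4 characters of English text.
    return max(1, len(text) // 4)

# Hypothetical NER completion listing one entity per line.
ner_response = "\n".join([
    "1. Kabir Khan | PER",
    "2. Explosion | ORG",
    "3. Berlin | LOC",
    "4. OpenAI | ORG",
])

needed = approx_token_count(ner_response)
print(f"approx tokens needed: {needed}, legacy default: 16")
print("truncated!" if needed > 16 else "fits")
```

Even this tiny four-entity response already exceeds the 16-token default under the heuristic, so real documents with more entities are cut off almost immediately.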

Corresponding documentation PR

explosion/spaCy#12961

Types of change

enhancement

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran all tests in tests and usage_examples/tests, and all new and existing tests passed. This includes
    • all external tests (i.e. pytest run with --external)
    • all tests requiring a GPU (i.e. pytest run with --gpu)
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@rmitsch (Collaborator) left a comment

This seems reasonable overall. Some notes:

  • You mention setting max_tokens to 100, but the actual values are >2k. Why the divergence between description and code?
  • Have you run the external tests with this change?
  • The docs on spacy.io should be updated as well.

Review comment on spacy_llm/models/rest/openai/registry.py (outdated, resolved)
@kabirkhan kabirkhan changed the title default max_tokens to 100 for older non-chat OpenAI models so NER/spancat works increase default max_tokens for older non-chat OpenAI models so NER/spancat works Aug 1, 2023
@kabirkhan (Contributor, Author) commented

@honnibal mentioned we should increase the default a little higher if possible. The OpenAI docs on managing tokens (https://platform.openai.com/docs/guides/gpt/managing-tokens) say that the prompt + response tokens cannot exceed the model's context width, so I scaled the default max_tokens for each model by its context window. This should allow the NER/SpanCat tasks to work in most cases.
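The per-model scaling described above could be sketched as follows. The context-window sizes and the 50% prompt reserve are illustrative assumptions for this sketch, not the exact values committed in the PR:

```python
# Sketch of deriving a per-model default max_tokens from the context
# window, leaving room for the prompt (prompt + completion must fit
# within the window). Window sizes and the reserve fraction are
# illustrative assumptions.

CONTEXT_WINDOWS = {
    "text-davinci-003": 4097,
    "text-curie-001": 2049,
    "text-babbage-001": 2049,
    "text-ada-001": 2049,
}

def default_max_tokens(model: str, prompt_reserve: float = 0.5) -> int:
    """Reserve a fraction of the context window for the prompt."""
    window = CONTEXT_WINDOWS[model]
    return int(window * (1 - prompt_reserve))

for name in CONTEXT_WINDOWS:
    print(name, default_max_tokens(name))
```

With a 50% reserve, text-davinci-003 would get a default of 2048 completion tokens and the smaller legacy models around 1024, which matches the ">2k" ballpark mentioned in the review.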

Review comments on spacy_llm/models/rest/openai/registry.py (outdated, resolved)
@kabirkhan (Contributor, Author) commented

External tests all run correctly here, but yeah, the docs still need to be updated; will add a link to a PR shortly.

@svlandeg (Member) left a comment

When we change the default settings like this for these models, don't we have to bump all the versions?

@rmitsch (Collaborator) commented Aug 7, 2023

When we change the default settings like this for these models, don't we have to bump all the versions?

Hm... I'd say we don't have to, but we should mention it in the release notes.

@svlandeg (Member) commented Aug 9, 2023

When we change the default settings like this for these models, don't we have to bump all the versions?

Hm...I'd say we don't have to, but we mention it in the release notes.

Hmm. Then at the very least we should have this on develop and not main?

@rmitsch (Collaborator) commented Aug 11, 2023

Hmm. Then at the very least we should have this on develop and not main?

Fair, changed to develop.

@rmitsch rmitsch changed the base branch from main to develop August 11, 2023 11:29
@rmitsch rmitsch added the feat/model Feature: models label Aug 11, 2023
@rmitsch (Collaborator) commented Aug 11, 2023

Added a docs PR. I think we can merge both the docs PR and this one?

@svlandeg svlandeg added Test external Run external tests Test GPU Run GPU tests labels Aug 16, 2023
@svlandeg (Member) commented

It looks like the external tests are failing with an error about writing to a frozen dict.

@svlandeg (Member) commented Sep 6, 2023

The only remaining failing external test should be fixed with the new confection release.

@svlandeg (Member) commented Sep 6, 2023

new docs PR: explosion/spaCy#12961

@svlandeg svlandeg mentioned this pull request Sep 6, 2023
@rmitsch rmitsch merged commit 3be42e3 into develop Sep 7, 2023
11 checks passed
@svlandeg svlandeg deleted the kab/config-max-tokens branch September 7, 2023 07:39