
Conversation

@harrysalmon

Makes the reason for output truncation clearer as per #730

Also adds a comment to the predict file where the max tokens are modified before the call is made.
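Roughly, the clarification being proposed amounts to surfacing the truncation cause at generation time. A minimal sketch of the idea (not the actual diff; the `finish_reason` value and the warning text are assumptions):

```python
import warnings

def warn_if_truncated(finish_reason: str, max_tokens: int) -> None:
    # Hypothetical helper: explain *why* output was cut short instead of
    # leaving the user to guess from a half-finished completion.
    if finish_reason == "length":
        warnings.warn(
            f"Completion was truncated because it hit max_tokens={max_tokens}. "
            "Set a larger max_tokens on your LM to avoid this."
        )
```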

keys = list(kwargs.keys()) + list(dsp.settings.lm.kwargs.keys())
max_tokens_key = "max_tokens" if "max_tokens" in keys else "max_output_tokens"
new_kwargs = {
    **kwargs,
    # completed per the discussion below: retries halve the budget, floored at 75
    max_tokens_key: max(75, max_tokens // 2),
}
@harrysalmon (Author)
How the value set by the user affects the maximum tokens passed to and returned from the model is quite unclear here. Would anyone be able to clarify how 101:118 relate to the eventual values passed to different models? I'll have a crack at clearing this up. Might it be possible to do the 75 check upstream of here?

@okhat (Collaborator) commented Mar 30, 2024

This is mainly during retries etc. It's not very straightforward to change.

@okhat (Collaborator) commented Mar 30, 2024

Basically, the user's max_tokens is used at first, then max_tokens / 2 if we need more fields, but 75 is sort of like a min_max_tokens. Setting a large enough max_tokens could be all that's required here?
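For anyone following along, that cascade would look roughly like this (a sketch of the behaviour described above, not the actual dsp code; the function name is made up):

```python
def budget_for_attempt(user_max_tokens: int, attempt: int) -> int:
    """Use the user's max_tokens on the first attempt; when a retry is
    needed to fill in remaining fields, halve the budget, but never go
    below the floor of 75 (the de facto min_max_tokens)."""
    if attempt == 0:
        return user_max_tokens
    return max(75, user_max_tokens // 2)

# budget_for_attempt(1024, 0) == 1024
# budget_for_attempt(1024, 1) == 512
# budget_for_attempt(100, 1)  == 75
```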

@harrysalmon (Author)

Thanks for taking a look @okhat

Yeah the main thing for me is putting it into the init so it's clear to the user that they need to set it. What do you think of this part?

In terms of the retries, do you mean that if max_tokens doesn't leave the LLM enough budget to generate all the required output fields, it'll retry with max_tokens / 2 and then with 75?
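As for surfacing this in the init, the check being floated might look something like this (purely a sketch; the class, constant, and error message are all hypothetical):

```python
MIN_MAX_TOKENS = 75  # the floor discussed above

class SketchLM:
    def __init__(self, model: str, max_tokens: int = 150, **kwargs):
        # Hypothetical: fail loudly at construction time, so the user learns
        # about the minimum budget up front rather than from silent truncation.
        if max_tokens < MIN_MAX_TOKENS:
            raise ValueError(
                f"max_tokens={max_tokens} is below the floor of {MIN_MAX_TOKENS}; "
                "completions would be truncated."
            )
        self.model = model
        self.kwargs = {"max_tokens": max_tokens, **kwargs}
```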

@arnavsinghvi11 (Collaborator)

@harrysalmon just following up on this. I think we can merge the change but leave the existing retry behavior. lmk if it's ready to merge.

@arnavsinghvi11 (Collaborator)

Closing for now to avoid cache-breaking errors.
