Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Misc] Include matched stop string/token in responses #2976

Merged
merged 3 commits into from
Mar 26, 2024

Conversation

njhill
Copy link
Collaborator

@njhill njhill commented Feb 22, 2024

Currently a finish_reason of "stop" is returned if any of the following are encountered:

  • One of the provided stop strings
  • One of the provided stop tokens
  • The EOS token

It can be useful to know specifically which of these caused the sequence generation to stop, especially since by default the stop strings/tokens are omitted from the output text (and output token_ids?).

This PR adds a stop_reason field to the CompletionOutput class which will contain the matched stop string or integer token id. It will be None otherwise, including the EOS token case. This means in particular that EOS can be inferred iff finish_reason=="stop" and stop_reason=None.

I've also added to the openai server responses but not sure whether or not this should be included since it isn't part of the official API.

Thanks @sahilsuneja1 for adding a test

@njhill njhill force-pushed the return-stop-str branch 2 times, most recently from 0700062 to b9c7685 Compare February 22, 2024 15:51
@njhill njhill changed the title Include specific stop reason in responses Include matched stop string/token in responses Feb 22, 2024
@njhill njhill force-pushed the return-stop-str branch 2 times, most recently from 84c73da to bb6f831 Compare February 26, 2024 22:29
@njhill njhill marked this pull request as ready for review February 26, 2024 22:33
@njhill
Copy link
Collaborator Author

njhill commented Feb 26, 2024

@simon-mo would be good to get your thoughts on this simple addition. It would be useful for us even if it's not exposed in the openai API responses. Also not sure what the best field name is, perhaps stop_match instead of stop_reason?

@simon-mo
Copy link
Collaborator

I think this is very simple and useful! Naming it as stop_reason is fine, or maybe finish_reason_stop_object: {"kind": "eos"|"stop_string"|"stop_token_id":, "value": None| str | int}

@njhill
Copy link
Collaborator Author

njhill commented Feb 27, 2024

Thanks @simon-mo!

Naming it as stop_reason is fine, or maybe finish_reason_stop_object: {"kind": "eos"|"stop_string"|"stop_token_id":, "value": None| str | int}

Will defer to your judgment on this! I like the simplicity of a single field but agree that more explicit could be better.

joerunde pushed a commit to IBM/vllm that referenced this pull request Mar 11, 2024
Include matched stop string/token in responses

[Cherry-picked from open upstream PR vllm-project/vllm#2976]

Currently a finish_reason of "stop" is returned if any of the following are encountered:
- One of the provided stop strings
- One of the provided stop tokens
- The EOS token

It can be useful to know specifically which of these caused the sequence generation to stop, especially since by default the stop strings/tokens are omitted from the output text (and output token_ids?).

This PR adds a "stop_reason" field to the CompletionOutput class which will contain the matched stop string or integer token id. It will be None otherwise, including the EOS token case. This means in particular that EOS can be inferred by (finish_reason=="stop" and stop_reason=None).

I've also added to the openai server responses but not sure whether or not this should be included since it isn't part of the official API.

Signed-off-by: Joe Runde <joseph.runde@ibm.com>
joerunde pushed a commit to IBM/vllm that referenced this pull request Mar 11, 2024
Include matched stop string/token in responses

[Cherry-picked from open upstream PR vllm-project/vllm#2976]

Currently a finish_reason of "stop" is returned if any of the following are encountered:
- One of the provided stop strings
- One of the provided stop tokens
- The EOS token

It can be useful to know specifically which of these caused the sequence generation to stop, especially since by default the stop strings/tokens are omitted from the output text (and output token_ids?).

This PR adds a "stop_reason" field to the CompletionOutput class which will contain the matched stop string or integer token id. It will be None otherwise, including the EOS token case. This means in particular that EOS can be inferred by (finish_reason=="stop" and stop_reason=None).

I've also added to the openai server responses but not sure whether or not this should be included since it isn't part of the official API.

Signed-off-by: Joe Runde <joseph.runde@ibm.com>
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
joerunde pushed a commit to IBM/vllm that referenced this pull request Mar 11, 2024
joerunde pushed a commit to IBM/vllm that referenced this pull request Mar 11, 2024
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
joerunde pushed a commit to IBM/vllm that referenced this pull request Mar 12, 2024
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
@njhill njhill force-pushed the return-stop-str branch 2 times, most recently from 2e129bd to 560f090 Compare March 20, 2024 20:34
@njhill njhill changed the title Include matched stop string/token in responses [Misc] Include matched stop string/token in responses Mar 25, 2024
njhill and others added 2 commits March 25, 2024 10:38
Currently a finish_reason of "stop" is returned if any of the following are encountered:
- One of the provided stop strings
- One of the provided stop tokens
- The EOS token

It can be useful to know specifically which of these caused the sequence generation to stop, especially since by default the stop strings/tokens are omitted from the output text (and output token_ids?).

This PR adds a "stop_reason" field to the CompletionOutput class which will contain the matched stop string or integer token id. It will be None otherwise, including the EOS token case. This means in particular that EOS can be inferred by (finish_reason=="stop" and stop_reason=None).

I've also added to the openai server responses but not sure whether or not this should be included since it isn't part of the official API.
@simon-mo
Copy link
Collaborator

What you have is totally fine as long as it's well documented.

@njhill
Copy link
Collaborator Author

njhill commented Mar 25, 2024

Thanks @simon-mo, I just pushed a small update to add descriptions to the new stop_reason openai response fields in protocol.py, and this is already documented in the CompletionOutput docstring. Is there anywhere else you think we should add some doc for this?

@simon-mo simon-mo merged commit dfeb2ec into vllm-project:main Mar 26, 2024
32 checks passed
xjpang pushed a commit to xjpang/vllm that referenced this pull request Mar 31, 2024
)

Co-authored-by: Sahil Suneja <sahilsuneja@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants