
[Bugfix][Frontend] Cleanup "fix chat logprobs" #5026

Merged · 21 commits merged into vllm-project:main from openai-logprobs on Jun 11, 2024

Conversation

@DarkLight1337 (Collaborator) commented May 24, 2024

Follow-up to #5029 and #5031:

  • Fix overly strict handling of finish_reason (SequenceStatus.get_finished_reason can return 'abort', which is not in the OpenAI spec).
  • Check strict equality instead of upper-bounding the number of logprobs returned by the Chat Completions API (see the sketch below).
  • Remove the unnecessary logprobs checks in test_single_completion and test_single_chat_session, since they are already covered by the newly added tests.
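
For reference, here is a minimal sketch of the strict-equality check from the second bullet, written against the OpenAI Python client. This is not the exact test added in this PR; the server URL, API key, and model name are placeholder assumptions for a locally running OpenAI-compatible vLLM server.

```python
# Minimal sketch (not the exact test in this PR): the Chat Completions
# API should return exactly `top_logprobs` alternatives per token,
# rather than merely at most that many.
import openai

# Placeholder endpoint/model for a locally running vLLM server.
client = openai.OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="facebook/opt-125m",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=5,
    logprobs=True,
    top_logprobs=5,
)

choice = completion.choices[0]
assert choice.logprobs is not None
for token_logprob in choice.logprobs.content:
    # Strict equality instead of `<=`.
    assert len(token_logprob.top_logprobs) == 5
```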

Future work

As mentioned in #5031, the selected logprob should always be included in top_logprobs for the Completions API (this is not necessary for the Chat Completions API). However, this is not yet implemented in vLLM core. The sketch below illustrates the expected behavior.
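
A hedged illustration of that expectation, using the legacy Completions API response format (this is the behavior being asked for, not something vLLM currently guarantees; endpoint and model name are placeholders, as in the earlier sketch):

```python
# Illustrative sketch only: the sampled token should appear among the
# returned top_logprobs entries for the legacy Completions API.
import openai

client = openai.OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="facebook/opt-125m",  # placeholder model name
    prompt="Hello, my name is",
    max_tokens=5,
    logprobs=5,
)

logprobs = completion.choices[0].logprobs
for token, alternatives in zip(logprobs.tokens, logprobs.top_logprobs):
    # Each top_logprobs entry is a dict mapping token text to logprob;
    # the sampled token should be among its keys.
    assert token in alternatives
```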

Side note

I've noticed that the Pydantic models for the Embeddings API (#3734) do not inherit from the stricter OpenAIBaseModel, and their return values contain unnecessary information. The Embeddings API tests also do not conform to OpenAI's spec. Maybe we can address this in a separate PR.

@Etelis (Contributor) commented May 24, 2024

Hopefully my PR still makes sense after this (#5031)

@DarkLight1337 (Collaborator, Author) commented May 24, 2024

> Hopefully my PR still makes sense after this (#5031)

At the moment, this PR only addresses the second issue in #5031. Let's see if the newly added tests can catch the other two issues as well.

Edit: The tests managed to additionally catch the first issue. I have applied the fix to this PR accordingly.

@DarkLight1337 DarkLight1337 changed the title [Bugfix][Frontend] Fix format of returned logprobs for OpenAI Chat Completions API [Bugfix][Frontend] Fix some issues left behind by #5029 and #5031 May 30, 2024
@DarkLight1337 DarkLight1337 changed the title [Bugfix][Frontend] Fix some issues left behind by #5029 and #5031 [Bugfix][Frontend] Cleanup "fix chat logprobs" May 30, 2024
@DarkLight1337 (Collaborator, Author) commented May 30, 2024

@simon-mo This PR should be the final one to resolve the current chat logprobs issue.

@sroy745 (Contributor) left a comment

Thanks for the PR! Apologies in advance if my comments are off.

assert completion.choices[0].finish_reason == "length"

choice = completion.choices[0]
assert len(choice.text) >= 5
@sroy745 (Contributor) commented on the snippet above:

nit: since we are doing a stricter check on completion.usage (L174), I wonder if we can use a strict equality check for len(choice.text) as well, for consistency here and in other places?

@DarkLight1337 (Collaborator, Author) replied:

Sure!

@DarkLight1337 (Collaborator, Author) replied:

It seems that we can't check the length of the string using strict equality as each token may correspond to multiple characters. I'll keep the range check then.
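
A quick illustration of why strict equality on the text length doesn't work (the GPT-2 tokenizer from Hugging Face is used here purely as an example; it is not the model under test):

```python
# The number of generated tokens is bounded by max_tokens, but each
# token can decode to a different number of characters, so
# len(choice.text) cannot be pinned down exactly.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # example tokenizer only

for text in ["Hi", "Hello, world!", "internationalization"]:
    token_ids = tokenizer.encode(text)
    print(f"{text!r}: {len(token_ids)} tokens, {len(text)} characters")
```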

@sroy745 (Contributor) left a comment

Thanks for trying out my suggestion. LGTM from my understanding of the issue. Thanks!

@simon-mo (Collaborator) commented Jun 3, 2024

To confirm: for the OpenAI API, if you specify logprobs without top_logprobs, what happens? We are throwing an error here; if the OpenAI API does not throw an error, we should follow their behavior.

@DarkLight1337 (Collaborator, Author) commented Jun 4, 2024

> To confirm: for the OpenAI API, if you specify logprobs without top_logprobs, what happens? We are throwing an error here; if the OpenAI API does not throw an error, we should follow their behavior.

It returns an empty list of logprobs, as if top_logprobs=0. This is consistent with the current implementation in vLLM (which was updated by #5029 to allow top_logprobs=None).
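
A hedged sketch of that behavior as seen from the client side (endpoint and model name are placeholders, as in the earlier sketches):

```python
# Requesting logprobs without top_logprobs succeeds and yields an empty
# list of top alternatives for each token, as if top_logprobs=0.
import openai

client = openai.OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="facebook/opt-125m",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=5,
    logprobs=True,  # note: no top_logprobs supplied
)

for token_logprob in completion.choices[0].logprobs.content:
    # Each generated token still carries its own logprob, but the list
    # of top alternatives is empty.
    assert token_logprob.top_logprobs == []
```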

DarkLight1337 added a commit to DarkLight1337/vllm-rocm that referenced this pull request Jun 4, 2024
@simon-mo (Collaborator) commented Jun 4, 2024

Sorry, I am reading this:

            if request.logprobs and request.top_logprobs is not None:
                assert top_logprobs is not None, (
                    "top_logprobs must be provided when logprobs "
                    "is requested")

and it seems we require users to set top_logprobs to a number when logprobs is specified, and #5029 did not default the number to 0. I am missing how this is consistent with OpenAI behavior, where the request still succeeds when logprobs is specified and top_logprobs is missing.

@DarkLight1337 (Collaborator, Author) replied:

> Sorry, I am reading this:
>
>             if request.logprobs and request.top_logprobs is not None:
>                 assert top_logprobs is not None, (
>                     "top_logprobs must be provided when logprobs "
>                     "is requested")
>
> and it seems we require users to set top_logprobs to a number when logprobs is specified, and #5029 did not default the number to 0. I am missing how this is consistent with OpenAI behavior, where the request still succeeds when logprobs is specified and top_logprobs is missing.

The naming and error message are confusing. I have pushed a commit to clear up the confusion.
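
For context, the confusion appears to stem from two different things sharing the name top_logprobs: request.top_logprobs is the user-supplied integer, while the bare top_logprobs in the assertion holds the per-token logprob data coming back from the engine. Below is a sketch of how the check might read with the naming disambiguated (illustrative names only, not the actual commit):

```python
# Illustrative names only (not the actual commit): disambiguate the
# user-supplied request field from the engine's per-token logprob data.
def assert_engine_logprobs_present(request_logprobs: bool,
                                   request_top_logprobs: int | None,
                                   engine_top_logprobs: list | None) -> None:
    if request_logprobs and request_top_logprobs is not None:
        # The user asked for top logprobs, so the engine output must
        # contain the per-token logprob data needed to build them.
        assert engine_top_logprobs is not None, (
            "The engine output is missing per-token logprobs even "
            "though logprobs were requested.")
```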

@DarkLight1337 (Collaborator, Author) commented:

Originally, I had some fixes to #5135 in this PR, but those have been superseded by #5319. To avoid confusion, I reverted this PR to the state before merging #5135, and then merged the fixes from #5319.

The current state of this PR is effectively the result of merging the current main branch into 908cac4, so no further errors should arise. @simon-mo, feel free to merge this if my previous comment resolves your question.

@zhuohan123 zhuohan123 merged commit 640052b into vllm-project:main Jun 11, 2024
101 of 103 checks passed
@DarkLight1337 DarkLight1337 deleted the openai-logprobs branch June 11, 2024 05:41
tjohnson31415 added a commit to tjohnson31415/vllm that referenced this pull request Jun 11, 2024
* upstream/main: (126 commits)
  [Bugfix][Frontend] Cleanup "fix chat logprobs" (vllm-project#5026)
  [Bugfix] OpenAI entrypoint limits logprobs while ignoring server defined --max-logprobs (vllm-project#5312)
  [Misc] Various simplifications and typing fixes (vllm-project#5368)
  [ci] Fix Buildkite agent path (vllm-project#5392)
  [Doc] Add documentation for FP8 W8A8 (vllm-project#5388)
  Bump version to v0.5.0 (vllm-project#5384)
  [Docs] Alphabetically sort sponsors (vllm-project#5386)
  [Docs] Add Docs on Limitations of VLM Support (vllm-project#5383)
  [ci] Mount buildkite agent on Docker container to upload benchmark results (vllm-project#5330)
  [ci] Use small_cpu_queue for doc build (vllm-project#5331)
  [Bugfix] Fix LLaVA-NeXT (vllm-project#5380)
  [Feature][Frontend]:  Continued `stream_options` implementation also in CompletionRequest (vllm-project#5319)
  [Model] Initial support for LLaVA-NeXT (vllm-project#4199)
  [Misc] Improve error message when LoRA parsing fails (vllm-project#5194)
  [misc][typo] fix typo (vllm-project#5372)
  [Frontend][Misc] Enforce Pixel Values as Input Type for VLMs in API Server (vllm-project#5374)
  [Misc] Update to comply with the new `compressed-tensors` config (vllm-project#5350)
  [Bugfix] Fix KeyError: 1 When Using LoRA adapters (vllm-project#5164)
  [Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (vllm-project#5047)
  [mis][ci/test] fix flaky test in test_sharded_state_loader.py (vllm-project#5361)
  ...
joerunde pushed a commit to joerunde/vllm that referenced this pull request Jun 17, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jun 27, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 8, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 24, 2024