
[Bugfix] fix crash if max_tokens=None #2570

Merged on Jan 24, 2024 (4 commits)

Conversation

NikolaBorisov (Contributor)

Passing max_tokens=None is allowed in the API, but after recent changes it started crashing.
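
Judging from the traceback below, one minimal shape for the fix is a None guard in SamplingParams._verify_args; this is a sketch of that idea, not necessarily the patch as merged:

```python
# Sketch only; the merged patch may differ. In vllm/sampling_params.py,
# max_tokens is Optional, and None should mean "no explicit cap", so the
# range check has to skip None rather than evaluate None < 1.
def _verify_args(self) -> None:
    if self.max_tokens is not None and self.max_tokens < 1:
        raise ValueError(
            f"max_tokens must be at least 1, got {self.max_tokens}.")
```
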
@NikolaBorisov (Contributor, author)

Here is the exception without this patch:

  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 72, in app
    response = await func(request)
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 299, in app
    raise e
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 294, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/workspace/vllm/entrypoints/openai/api_server.py", line 143, in create_completion
    generator = await openai_serving_completion.create_completion(
  File "/workspace/vllm/entrypoints/openai/serving_completion.py", line 237, in create_completion
    sampling_params = request.to_sampling_params()
  File "/workspace/vllm/entrypoints/openai/protocol.py", line 134, in to_sampling_params
    return SamplingParams(
  File "/workspace/vllm/sampling_params.py", line 148, in __init__
    self._verify_args()
  File "/workspace/vllm/sampling_params.py", line 186, in _verify_args
    if self.max_tokens < 1:
TypeError: '<' not supported between instances of 'NoneType' and 'int'
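
Stripped of the FastAPI stack, the failure reduces to a single expression: Python 3 defines no ordering between None and int.

```python
>>> None < 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'NoneType' and 'int'
```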

NikolaBorisov changed the title from "fix crash if max_tokens=None" to "[Bugfix] fix crash if max_tokens=None" on Jan 24, 2024
@zhuohan123 (Collaborator)

If possible, can you add a small test in test_regression so that we can make sure this is correct in the future?

@NikolaBorisov (Contributor, author)

> If possible, can you add a small test in test_regression so that we can make sure this is correct in the future?

Just did. I also added a new test_sampler_parameters file, where I think this makes more sense. Let me know if you want to keep both or remove one. I don't know how to add this test to the buildkite...
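
For reference, a regression test along these lines would cover the previously crashing path (hypothetical test body and model choice; the committed tests may differ):

```python
# Hypothetical sketch; the tests committed in this PR may differ.
# Before the fix, SamplingParams(max_tokens=None) raised TypeError
# inside _verify_args, so merely constructing the params reproduced it.
from vllm import LLM, SamplingParams

def test_max_tokens_none():
    params = SamplingParams(temperature=0.0, max_tokens=None)  # crashed pre-fix
    llm = LLM(model="facebook/opt-125m")
    outputs = llm.generate(["Hello, my name is"], params)
    assert len(outputs) == 1
```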

@zhuohan123 (Collaborator) left a comment

LGTM! Thank you for your contribution!

zhuohan123 merged commit 3209b49 into vllm-project:main on Jan 24, 2024
16 checks passed
NikolaBorisov added a commit to deepinfra/vllm that referenced this pull request Jan 31, 2024
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024