When using vLLM with MPT-30b-chat on a single A100 80GB, the server eventually hits the error below, after which requests no longer complete and are instead aborted:
ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 436, in run_asgi
    result = await app( # type: ignore[func-returns-value]
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/fastapi/applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/starlette/middleware/cors.py", line 84, in __call__
    await self.app(scope, receive, send)
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/starlette/routing.py", line 69, in app
    await response(scope, receive, send)
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/starlette/responses.py", line 270, in __call__
    async with anyio.create_task_group() as task_group:
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 662, in __aexit__
    raise exceptions[0]
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/starlette/responses.py", line 273, in wrap
    await func()
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/starlette/responses.py", line 262, in stream_response
    async for chunk in self.body_iterator:
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 438, in completion_stream_generator
    async for res in result_generator:
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 148, in generate
    await self.engine_step(request_id)
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 74, in engine_step
    request_outputs = self.engine.step()
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 253, in step
    self._decode_sequences(seq_groups)
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 270, in _decode_sequences
    new_token, new_output_text = detokenize_incrementally(
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/vllm/transformers_utils/tokenizer.py", line 73, in detokenize_incrementally
    output_text = tokenizer.convert_tokens_to_string(output_tokens)
  File "/home/ubuntu/miniconda3/envs/h2ollm/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 533, in convert_tokens_to_string
    return self.backend_tokenizer.decoder.decode(tokens)
TypeError: argument 'tokens': 'NoneType' object cannot be converted to 'PyString'
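The final frame is the telling one: convert_tokens_to_string received a list containing None. With a fast (Rust-backed) tokenizer, convert_ids_to_tokens maps any id outside the vocabulary to None instead of raising, so if the engine ever samples an out-of-range token id (my assumption about the trigger here; the traceback itself does not show the offending id), the None passes through detokenize_incrementally untouched and only blows up inside the Rust decoder. A minimal sketch that reproduces the same TypeError outside vLLM, using gpt2 purely as a small stand-in tokenizer:

```python
from transformers import AutoTokenizer

# Any fast (Rust-backed) tokenizer behaves the same; gpt2 is just small.
tok = AutoTokenizer.from_pretrained("gpt2")

# convert_ids_to_tokens silently maps an out-of-vocab id to None...
bad_id = tok.vocab_size + 123  # no token has this id
tokens = tok.convert_ids_to_tokens([bad_id])
print(tokens)  # -> [None]

# ...and convert_tokens_to_string hands that None straight to the Rust
# decoder, which raises the exact error from the traceback above:
tok.convert_tokens_to_string(tokens)
# TypeError: argument 'tokens': 'NoneType' object cannot be converted to 'PyString'
```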
I got the same error when I execute: python benchmark_latency.py --input-len 32 --output-len 32 --batch-size 4 --n 1 --tensor-parallel-size 4 --model facebook/opt-66b --tokenizer facebook/opt-66b
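Hitting it with facebook/opt-66b at --tensor-parallel-size 4 as well suggests the trigger is not model-specific: any run that generates enough tokens and produces one unmappable id dies the same way. Until the detokenizer is hardened upstream, one stopgap is to drop None entries before decoding. The helper below is a sketch under that assumption; safe_convert_tokens_to_string is my hypothetical name, not part of vLLM or transformers:

```python
from typing import List, Optional

from transformers import PreTrainedTokenizerBase


def safe_convert_tokens_to_string(
    tokenizer: PreTrainedTokenizerBase,
    tokens: List[Optional[str]],
) -> str:
    """Stopgap sketch (hypothetical helper, not vLLM's actual fix):
    drop None entries (ids outside the vocab) so the Rust decoder
    only ever sees strings."""
    return tokenizer.convert_tokens_to_string(
        [t for t in tokens if t is not None]
    )


# Usage, e.g. in place of the failing call in detokenize_incrementally:
# output_text = safe_convert_tokens_to_string(tokenizer, output_tokens)
```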