
[Question]: Parsing a document that is within the model's preset maximum token limit still fails with an error that the input tokens exceed the model's limit. How should this be debugged and resolved? #6291

@biofer

Description


Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

Describe your problem

All services are started with Docker; image version: infiniflow/ragflow:v0.17.2.
The embedding model is SiliconFlow's BAAI/bge-m3, with the maximum token count set to 8192:
BAAI/bge-m3___OpenAI-API@OpenAI-API-Compatible
The document being parsed is within the maximum token limit preset for the model, yet an error is reported saying the input tokens exceed the model's limit. How should this issue be debugged and resolved?
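As a first debugging step, it can help to sanity-check roughly how many tokens the text sent to the model actually produces. The sketch below is a crude heuristic, not SiliconFlow's real tokenizer: CJK characters are counted at about one token each and other text at about four characters per token; `MODEL_MAX_TOKENS` mirrors the 8192 limit configured above.

```python
# Rough pre-flight check: estimate whether a text would exceed the
# model's context limit before sending it to the API. This is a crude
# heuristic, not the provider's actual tokenizer.

MODEL_MAX_TOKENS = 8192  # limit configured in the model settings


def estimate_tokens(text: str) -> int:
    """CJK characters count ~1 token each; other text ~4 chars per token."""
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    other = len(text) - cjk
    return cjk + (other + 3) // 4  # round up the non-CJK portion


def fits_model_limit(text: str, limit: int = MODEL_MAX_TOKENS) -> bool:
    return estimate_tokens(text) <= limit
```

If the estimate for a single prompt comfortably exceeds 8192, the failing requests are oversized prompts rather than a misconfigured limit.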
The front-end error is shown in the attached screenshot.

The log error is as follows:

2025-03-19 19:52:14,987 INFO     27 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2025-03-19T19:52:14.985+08:00", "boot_at": "2025-03-19T19:26:31.041+08:00", "pending": 1, "lag": 0, "done": 0, "failed": 0, "current": {"1aa9c6ce04b511f09f760242ac190006": {"id": "1aa9c6ce04b511f09f760242ac190006", "doc_id": "d8786200049211f0a6310242ac180006", "from_page": 100000000, "to_page": 100000000, "retry_count": 0, "kb_id": "c82c222e049211f0922e0242ac180006", "parser_id": "naive", "parser_config": {"auto_keywords": 5, "auto_questions": 0, "chunk_token_num": 512, "delimiter": "\\n!?;\u3002\uff1b\uff01\uff1f", "graphrag": {"entity_types": ["drug", "tumor", "gene", "mutation", "category"], "method": "light", "resolution": "true", "use_graphrag": "true"}, "layout_recognize": "DeepDOC"}, "name": "\u975e\u5c0f\u7ec6\u80de\u80ba\u764c.pdf", "type": "pdf", "location": "\u975e\u5c0f\u7ec6\u80de\u80ba\u764c.pdf", "size": 18991802, "tenant_id": "abaf5490007311f0b5530242ac180006", "language": "Chinese", "embd_id": "BAAI/bge-m3___OpenAI-API@OpenAI-API-Compatible", "pagerank": 0, "kb_parser_config": {"layout_recognize": "DeepDOC", "chunk_token_num": 512, "delimiter": "\\n!?;\u3002\uff1b\uff01\uff1f", "auto_keywords": 5, "auto_questions": 0, "html4excel": false, "raptor": {"use_raptor": false}, "graphrag": {"use_graphrag": "true", "entity_types": ["drug", "tumor", "gene", "mutation", "category"], "method": "light", "resolution": "true"}}, "img2txt_id": "", "asr_id": "", "llm_id": "deepseek-chat___OpenAI-API@OpenAI-API-Compatible", "update_time": 1742383633143, "task_type": "graphrag"}}}
2025-03-19 19:52:21,949 INFO     27 HTTP Request: POST http://192.168.1.56:9002/v1/chat/completions "HTTP/1.1 200 OK"
2025-03-19 19:52:22,342 INFO     27 set_progress(1aa9c6ce04b511f09f760242ac190006), progress: -1, progress_msg: 19:52:22 [ERROR][Exception]: Exceptions from Trio nursery (4 sub-exceptions) -- **ERROR**: Error code: 400 - {'error': {'message': '输入Token超过模型限制 (request id: 20250319195027616899205DVONcrl3)', 'type': 'upstream_error', 'param': '400', 'code': 'bad_response_status_code'}}
2025-03-19 19:52:22,342 ERROR    27 handle_task got exception for task {"id": "1aa9c6ce04b511f09f760242ac190006", "doc_id": "d8786200049211f0a6310242ac180006", "from_page": 100000000, "to_page": 100000000, "retry_count": 0, "kb_id": "c82c222e049211f0922e0242ac180006", "parser_id": "naive", "parser_config": {"auto_keywords": 5, "auto_questions": 0, "chunk_token_num": 512, "delimiter": "\\n!?;\u3002\uff1b\uff01\uff1f", "graphrag": {"entity_types": ["drug", "tumor", "gene", "mutation", "category"], "method": "light", "resolution": "true", "use_graphrag": "true"}, "layout_recognize": "DeepDOC"}, "name": "\u975e\u5c0f\u7ec6\u80de\u80ba\u764c.pdf", "type": "pdf", "location": "\u975e\u5c0f\u7ec6\u80de\u80ba\u764c.pdf", "size": 18991802, "tenant_id": "abaf5490007311f0b5530242ac180006", "language": "Chinese", "embd_id": "BAAI/bge-m3___OpenAI-API@OpenAI-API-Compatible", "pagerank": 0, "kb_parser_config": {"layout_recognize": "DeepDOC", "chunk_token_num": 512, "delimiter": "\\n!?;\u3002\uff1b\uff01\uff1f", "auto_keywords": 5, "auto_questions": 0, "html4excel": false, "raptor": {"use_raptor": false}, "graphrag": {"use_graphrag": "true", "entity_types": ["drug", "tumor", "gene", "mutation", "category"], "method": "light", "resolution": "true"}}, "img2txt_id": "", "asr_id": "", "llm_id": "deepseek-chat___OpenAI-API@OpenAI-API-Compatible", "update_time": 1742383633143, "task_type": "graphrag"}
  + Exception Group Traceback (most recent call last):
  |   File "/ragflow/rag/svr/task_executor.py", line 594, in handle_task
  |     await do_handle_task(task)
  |   File "/ragflow/rag/svr/task_executor.py", line 522, in do_handle_task
  |     await run_graphrag(task, task_language, with_resolution, with_community, chat_model, embedding_model, progress_callback)
  |   File "/ragflow/graphrag/general/index.py", line 94, in run_graphrag
  |     await resolve_entities(
  |   File "/ragflow/graphrag/general/index.py", line 231, in resolve_entities
  |     reso = await er(graph, callback=callback)
  |   File "/ragflow/graphrag/entity_resolution.py", line 101, in __call__
  |     async with trio.open_nursery() as nursery:
  |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_core/_run.py", line 1058, in __aexit__
  |     raise combined_error_from_nursery
  | exceptiongroup.ExceptionGroup: Exceptions from Trio nursery (4 sub-exceptions)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/ragflow/graphrag/entity_resolution.py", line 176, in _resolve_candidate
    |     response = await trio.to_thread.run_sync(lambda: self._chat(text, [{"role": "user", "content": "Output:"}], gen_conf))
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 447, in to_thread_run_sync
    |     return msg_from_thread.unwrap()
    |   File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    |     raise captured_error
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 373, in do_release_then_return_result
    |     return result.unwrap()
    |   File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    |     raise captured_error
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 392, in worker_fn
    |     ret = context.run(sync_fn, *args)
    |   File "/ragflow/graphrag/entity_resolution.py", line 176, in <lambda>
    |     response = await trio.to_thread.run_sync(lambda: self._chat(text, [{"role": "user", "content": "Output:"}], gen_conf))
    |   File "/ragflow/graphrag/general/extractor.py", line 66, in _chat
    |     raise Exception(response)
    | Exception: **ERROR**: Error code: 400 - {'error': {'message': '输入Token超过模型限制 (request id: 20250319195027616899205DVONcrl3)', 'type': 'upstream_error', 'param': '400', 'code': 'bad_response_status_code'}}
    +---------------- 2 ----------------
    | Traceback (most recent call last):
    |   File "/ragflow/graphrag/entity_resolution.py", line 176, in _resolve_candidate
    |     response = await trio.to_thread.run_sync(lambda: self._chat(text, [{"role": "user", "content": "Output:"}], gen_conf))
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 447, in to_thread_run_sync
    |     return msg_from_thread.unwrap()
    |   File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    |     raise captured_error
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 373, in do_release_then_return_result
    |     return result.unwrap()
    |   File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    |     raise captured_error
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 392, in worker_fn
    |     ret = context.run(sync_fn, *args)
    |   File "/ragflow/graphrag/entity_resolution.py", line 176, in <lambda>
    |     response = await trio.to_thread.run_sync(lambda: self._chat(text, [{"role": "user", "content": "Output:"}], gen_conf))
    |   File "/ragflow/graphrag/general/extractor.py", line 66, in _chat
    |     raise Exception(response)
    | Exception: **ERROR**: Error code: 400 - {'error': {'message': '输入Token超过模型限制 (request id: 20250319195027535236662bw7KpkuU)', 'type': 'upstream_error', 'param': '400', 'code': 'bad_response_status_code'}}
    +---------------- 3 ----------------
    | Traceback (most recent call last):
    |   File "/ragflow/graphrag/entity_resolution.py", line 176, in _resolve_candidate
    |     response = await trio.to_thread.run_sync(lambda: self._chat(text, [{"role": "user", "content": "Output:"}], gen_conf))
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 447, in to_thread_run_sync
    |     return msg_from_thread.unwrap()
    |   File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    |     raise captured_error
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 373, in do_release_then_return_result
    |     return result.unwrap()
    |   File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    |     raise captured_error
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 392, in worker_fn
    |     ret = context.run(sync_fn, *args)
    |   File "/ragflow/graphrag/entity_resolution.py", line 176, in <lambda>
    |     response = await trio.to_thread.run_sync(lambda: self._chat(text, [{"role": "user", "content": "Output:"}], gen_conf))
    |   File "/ragflow/graphrag/general/extractor.py", line 66, in _chat
    |     raise Exception(response)
    | Exception: **ERROR**: Error code: 400 - {'error': {'message': '输入Token超过模型限制 (request id: 20250319195027892940097YDLhAIM4)', 'type': 'upstream_error', 'param': '400', 'code': 'bad_response_status_code'}}
    +---------------- 4 ----------------
    | Traceback (most recent call last):
    |   File "/ragflow/graphrag/entity_resolution.py", line 176, in _resolve_candidate
    |     response = await trio.to_thread.run_sync(lambda: self._chat(text, [{"role": "user", "content": "Output:"}], gen_conf))
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 447, in to_thread_run_sync
    |     return msg_from_thread.unwrap()
    |   File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    |     raise captured_error
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 373, in do_release_then_return_result
    |     return result.unwrap()
    |   File "/ragflow/.venv/lib/python3.10/site-packages/outcome/_impl.py", line 213, in unwrap
    |     raise captured_error
    |   File "/ragflow/.venv/lib/python3.10/site-packages/trio/_threads.py", line 392, in worker_fn
    |     ret = context.run(sync_fn, *args)
    |   File "/ragflow/graphrag/entity_resolution.py", line 176, in <lambda>
    |     response = await trio.to_thread.run_sync(lambda: self._chat(text, [{"role": "user", "content": "Output:"}], gen_conf))
    |   File "/ragflow/graphrag/general/extractor.py", line 66, in _chat
    |     raise Exception(response)
    | Exception: **ERROR**: Error code: 400 - {'error': {'message': '输入Token超过模型限制 (request id: 20250319195028971046021zbPawvoY)', 'type': 'upstream_error', 'param': '400', 'code': 'bad_response_status_code'}}
    +------------------------------------
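The traceback suggests the 400 comes from the chat model (`deepseek-chat`) during GraphRAG entity resolution (`_resolve_candidate` → `_chat`), not from the bge-m3 embedding model, so the prompt built for entity resolution appears to be what exceeds the limit. As a hedged workaround sketch (this is not RAGFlow's actual code, and the character budget is an assumption), oversized text could be split at line boundaries into pieces that each stay under a budget and be processed in separate calls:

```python
def split_under_limit(text: str, max_chars: int) -> list[str]:
    """Split text at newline boundaries so each piece stays within
    max_chars; a single overlong line is emitted as its own piece."""
    pieces: list[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        # Start a new piece before this line would push us over budget.
        if current and len(current) + len(line) > max_chars:
            pieces.append(current)
            current = ""
        current += line
    if current:
        pieces.append(current)
    return pieces
```

A real fix would need to map the character budget back to the model's token limit (see the heuristic above) and merge the per-piece results, but a helper like this shows the general shape of keeping each upstream request under the limit.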

