
requested number of tokens exceed the max supported by the model #195

Closed
triptu opened this issue Jan 9, 2023 · 6 comments · Fixed by #199

Comments

@triptu

triptu commented Jan 9, 2023

Sample code

text = "...."    # > 40k characters

documents = [Document(text)]
llm = OpenAI(temperature=0.7, model_name="text-curie-001")
llm_predictor = LLMPredictor(llm)
prompt_helper = PromptHelper.from_llm_predictor(self.llm_predictor)

index = GPTListIndex(documents, llm_predictor=self.llm_predictor, prompt_helper=self.prompt_helper)
index.query(prompt, response_mode="tree_summarize")

Surprisingly, this seems to happen for me with all long texts. It doesn't happen when davinci is used, though, and went unnoticed at first due to #182. Is there any way I can help debug? Which function/file should I look into?

Stack Trace

File "/Users/tushar/PycharmProjects/transcription/transcription/gptindex.py", line 35, in _list_index_summarize
    return str(index.query(prompt, response_mode="tree_summarize"))
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/gpt_index/indices/base.py", line 322, in query
    return query_runner.query(query_str, self._index_struct)
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/gpt_index/indices/query/query_runner.py", line 106, in query
    return query_obj.query(query_str, verbose=self._verbose)
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/gpt_index/utils.py", line 113, in wrapped_llm_predict
    f_return_val = f(_self, *args, **kwargs)
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/gpt_index/indices/query/base.py", line 233, in query
    response = self._query(query_str, verbose=verbose)
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/gpt_index/indices/query/base.py", line 222, in _query
    response_str = self._give_response_for_nodes(
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/gpt_index/indices/query/base.py", line 183, in _give_response_for_nodes
    response = self.response_builder.get_response(
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/gpt_index/indices/response/builder.py", line 239, in get_response
    return self._get_response_tree_summarize(
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/gpt_index/indices/response/builder.py", line 210, in _get_response_tree_summarize
    root_nodes = index_builder.build_index_from_nodes(
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/gpt_index/indices/common/tree/base.py", line 103, in build_index_from_nodes
    new_summary, _ = self._llm_predictor.predict(
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/gpt_index/langchain_helpers/chain_wrapper.py", line 96, in predict
    llm_prediction = self._predict(prompt, **prompt_args)
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/gpt_index/langchain_helpers/chain_wrapper.py", line 82, in _predict
    llm_prediction = llm_chain.predict(**full_prompt_args)
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/langchain/chains/llm.py", line 103, in predict
    return self(kwargs)[self.output_key]
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/langchain/chains/base.py", line 146, in __call__
    raise e
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/langchain/chains/base.py", line 142, in __call__
    outputs = self._call(inputs)
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/langchain/chains/llm.py", line 87, in _call
    return self.apply([inputs])[0]
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/langchain/chains/llm.py", line 78, in apply
    response = self.generate(input_list)
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/langchain/chains/llm.py", line 73, in generate
    response = self.llm.generate(prompts, stop=stop)
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/langchain/llms/base.py", line 81, in generate
    raise e
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/langchain/llms/base.py", line 77, in generate
    output = self._generate(prompts, stop=stop)
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/langchain/llms/openai.py", line 155, in _generate
    response = self.client.create(prompt=_prompts, **params)
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/openai/api_resources/completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 115, in create
    response, _, api_key = requestor.request(
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/openai/api_requestor.py", line 181, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/openai/api_requestor.py", line 396, in _interpret_response
    self._interpret_response_line(
  File "/Users/tushar/PycharmProjects/transcription/venv/lib/python3.9/site-packages/openai/api_requestor.py", line 429, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 2049 tokens, however you requested 3159 tokens (2903 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.

@jerryjliu
Collaborator

Oh interesting, it might be due to the from_llm_predictor function, which doesn't infer the right max prompt size for curie. I'll have time to take a look tonight. Btw, you can manually set the max prompt size here: https://gpt-index.readthedocs.io/en/latest/how_to/custom_llms.html
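For example, something along these lines should work as a workaround. The numbers for text-curie-001 are approximate and the constructor arguments follow the custom-LLM docs above, so treat this as a sketch rather than the exact incantation:

from gpt_index import GPTListIndex, LLMPredictor, PromptHelper
from langchain.llms import OpenAI

# Cap the prompt below curie's ~2049-token context window.
max_input_size = 2048
num_output = 256          # tokens reserved for the completion
max_chunk_overlap = 20

prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.7, model_name="text-curie-001"))

# `documents` built as in the snippet above
index = GPTListIndex(documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper)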

@VivaLaPanda
Contributor

Ran into this issue with base davinci and had to set a custom PromptHelper with a lower max_input_size.

@jerryjliu
Collaborator

@triptu @VivaLaPanda do you happen to have an example notebook / data I can try out? Feel free to DM me as well (you can join the Discord here: https://discord.gg/58FeekwU, username is jerryjliu98).

@jerryjliu
Collaborator

I'm able to repro a somewhat similar issue and will take a look! Hopefully I'll have a fix in a bit.

@jerryjliu
Collaborator

OK, I think I figured it out! So sorry about that, there was a bug in the way I counted tokens when splitting. I should have a fix soon.
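To illustrate the general idea only (this is not the actual patch): chunk sizes have to be measured with the same tokenizer the API uses, otherwise a chunk that looks fine by character or word count can still overflow the model's token limit. A toy sketch, assuming the tiktoken package:

import tiktoken

enc = tiktoken.get_encoding("gpt2")  # GPT-3-era tokenizer

def split_by_tokens(text: str, chunk_size: int):
    """Yield chunks of at most chunk_size tokens, measured with the tokenizer itself."""
    tokens = enc.encode(text)
    for start in range(0, len(tokens), chunk_size):
        yield enc.decode(tokens[start:start + chunk_size])

# e.g. keep each chunk safely under curie's 2049-token window
chunks = list(split_by_tokens("...very long transcript...", chunk_size=1800))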

@triptu
Author

triptu commented Jan 10, 2023

Awesome! Joined the Discord, let me know if you still need a data sample.
