
Trying to create an llm via HuggingFaceHub() but get "coroutine 'AsyncRunManager.on_text' was never awaited" #113

Open · maspotts opened this issue May 17, 2023 · 8 comments

maspotts commented May 17, 2023

Hi: I'm trying to swap out:

from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature = temperature, model_name = 'gpt-3.5-turbo', max_tokens = max_output_size)

with:

from langchain.llms import HuggingFaceHub

llm = HuggingFaceHub(repo_id = 'mosaicml/mpt-7b-chat', model_kwargs = { "temperature": temperature, "max_length": max_output_size })

and then creating my index as usual via index = Docs(llm = llm). Calling index.query() on the index built with the ChatOpenAI() model works, but with the HuggingFaceHub() model I get this error:

/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/llm.py:110: RuntimeWarning: coroutine 'AsyncRunManager.on_text' was never awaited
  run_manager.on_text(_text, end="\n", verbose=self.verbose)

(although I'm not calling index.aquery()). Am I making an obvious mistake? I'm hoping there's a nice simple way to make this work, so I can continue my successful implementation using paperqa! Thanks in advance...

Mike

whitead (Owner) commented May 18, 2023

Try setting key_filter = False when you call query, e.g.:
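answer = index.query("your question", key_filter = False)

(Keep whatever other arguments you're already passing; key_filter = False is the only change.)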

maspotts (Author) commented May 18, 2023

Thanks: I tried it but got the same error. Is this (specifying a HuggingFaceHub() model) expected to work, or am I trying something that's not officially supported? I'm not yet familiar enough with langchain to know whether what I'm trying to do makes sense!

whitead (Owner) commented May 18, 2023

The "error" you're showing is just a warning. Are you sure there is not another exception?

maspotts (Author) commented

Ah, good point: I may have jumped the gun! I had assumed something had gone fatally wrong because the process hangs after emitting the warning. I've started it again and left it running; I'll see in the morning whether it's completed, errored out, or is still hanging...

maspotts (Author) commented May 18, 2023

OK, here's the final exception (actually two exceptions?):

NotImplementedError: Async generation not implemented for this LLM.
ValueError: Error raised by inference API: Model mosaicml/mpt-7b-chat time out

Full output:
/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/llm.py:110: RuntimeWarning: coroutine 'AsyncRunManager.on_text' was never awaited
  run_manager.on_text(_text, end="\n", verbose=self.verbose)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Traceback (most recent call last):
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/paperqa/qaprompts.py", line 91, in agenerate
    return await super().agenerate(input_list, run_manager=run_manager)
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/llm.py", line 90, in agenerate
    return await self.llm.agenerate_prompt(
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/llms/base.py", line 136, in agenerate_prompt
    return await self.agenerate(prompt_strings, stop=stop, callbacks=callbacks)
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/llms/base.py", line 266, in agenerate
    raise e
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/llms/base.py", line 258, in agenerate
    await self._agenerate(
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/llms/base.py", line 395, in _agenerate
    await self._acall(prompt, stop=stop, run_manager=run_manager)
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/llms/base.py", line 363, in _acall
    raise NotImplementedError("Async generation not implemented for this LLM.")
NotImplementedError: Async generation not implemented for this LLM.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/mike/src/chatbot/./chatbot", line 3187, in <module>
    answer = bot.query(args.query)
  File "/Users/mike/src/chatbot/./chatbot", line 1388, in query
    response = index.query(input_text, k = k, max_sources = max_sources)
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/paperqa/docs.py", line 378, in query
    return loop.run_until_complete(
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/paperqa/docs.py", line 412, in aquery
    answer = await self.aget_evidence(
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/paperqa/docs.py", line 311, in aget_evidence
    results = await asyncio.gather(*[process(doc) for doc in docs])
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/paperqa/docs.py", line 298, in process
    context=await summary_chain.arun(
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/base.py", line 262, in arun
    return (await self.acall(kwargs, callbacks=callbacks))[self.output_keys[0]]
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/base.py", line 180, in acall
    raise e
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/base.py", line 174, in acall
    await self._acall(inputs, run_manager=run_manager)
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/llm.py", line 195, in _acall
    response = await self.agenerate([inputs], run_manager=run_manager)
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/paperqa/qaprompts.py", line 93, in agenerate
    return self.generate(input_list, run_manager=run_manager)
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/chains/llm.py", line 79, in generate
    return self.llm.generate_prompt(
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/llms/base.py", line 127, in generate_prompt
    return self.generate(prompt_strings, stop=stop, callbacks=callbacks)
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/llms/base.py", line 199, in generate
    raise e
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/llms/base.py", line 193, in generate
    self._generate(missing_prompts, stop=stop, run_manager=run_manager)
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/llms/base.py", line 377, in _generate
    self._call(prompt, stop=stop, run_manager=run_manager)
  File "/Users/mike/.pyenv/versions/3.10.7/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/llms/huggingface_hub.py", line 111, in _call
    raise ValueError(f"Error raised by inference API: {response['error']}")
ValueError: Error raised by inference API: Model mosaicml/mpt-7b-chat time out

whitead (Owner) commented May 25, 2023

Looks like the model timed out @maspotts - the open-source models are a lot slower, and paper-qa makes a lot of calls. The NotImplementedError is incidental: HuggingFaceHub doesn't implement async generation, so paper-qa falls back to the synchronous call (the qaprompts.py fallback visible in your traceback), and it's that synchronous call that timed out. That's also why async machinery shows up even though you never call aquery() - query() just runs aquery() on an event loop internally. I would recommend Claude or HuggingFace with a GPU instance if you're looking to replace the OpenAI models. LLaMA can work too, but it's extremely slow.
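For example, a minimal sketch of the Claude route (this assumes langchain's ChatAnthropic and an ANTHROPIC_API_KEY in your environment; the default model name may vary by account):

from langchain.chat_models import ChatAnthropic

llm = ChatAnthropic(temperature = temperature)
index = Docs(llm = llm)

For a HuggingFace GPU instance, langchain's HuggingFaceEndpoint can point at your own Inference Endpoint (the URL below is a placeholder):

from langchain.llms import HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    endpoint_url = "https://your-endpoint.endpoints.huggingface.cloud",  # placeholder URL
    task = "text-generation",
    model_kwargs = { "temperature": temperature, "max_length": max_output_size },
)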

maspotts (Author) commented

Thanks: (off-topic but) do I need to upgrade to a paid huggingface tier to get sufficient performance?

whitead (Owner) commented Jun 10, 2023

Not sure - I have a paid HuggingFace account, so I'm not sure what's available on the free tier.
