HuggingFaceTGIGenerator gets stuck when model is not supported #6816

Closed
1 task done
bilgeyucel opened this issue Jan 23, 2024 · 1 comment · Fixed by #6915
Labels
2.x (Related to Haystack v2.0) · P1 (High priority, add to the next sprint) · type:bug (Something isn't working)


bilgeyucel (Contributor) commented Jan 23, 2024

Describe the bug
When a model that is not supported by TGI is used with HuggingFaceTGIGenerator, warm_up works fine, but when the component runs, it gets stuck in a retry loop until the provided API key reaches the free usage limit.

Error message
Here's a preview of the repeating error:

/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py:1491: UserWarning: API endpoint/model for text-generation is not served via TGI. Ignoring parameters ['watermark', 'stop', 'details', 'decoder_input_details'].
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py:1497: UserWarning: API endpoint/model for text-generation is not served via TGI. Parameter `details=True` will be ignored meaning only the generated text will be returned.
  warnings.warn(
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
    285     try:
--> 286         response.raise_for_status()
    287     except HTTPError as e:

1796 frames
/usr/local/lib/python3.10/dist-packages/requests/models.py in raise_for_status(self)
   1020         if http_error_msg:
-> 1021             raise HTTPError(http_error_msg, response=self)
   1022 

HTTPError: 400 Client Error: Bad Request for url: https://api-inference.huggingface.co/models/google/flan-t5-base

The above exception was the direct cause of the following exception:

BadRequestError                           Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py in text_generation(self, prompt, details, stream, model, do_sample, max_new_tokens, best_of, repetition_penalty, return_full_text, seed, stop_sequences, temperature, top_k, top_p, truncate, typical_p, watermark, decoder_input_details)
   1510         try:
-> 1511             bytes_output = self.post(json=payload, model=model, task="text-generation", stream=stream)  # type: ignore
   1512         except HTTPError as e:

/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py in post(self, json, data, model, task, stream)
    239             try:
--> 240                 hf_raise_for_status(response)
    241                 return response.iter_lines() if stream else response.content

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
    328             )
--> 329             raise BadRequestError(message, response=response) from e
    330 

BadRequestError:  (Request ID: xy6Melm2TRYU-xFxPPnk3)

Bad request:
The following `model_kwargs` are not used by the model: ['return_full_text', 'details', 'watermark', 'decoder_input_details', 'stop'] (note: typos in the generate arguments will also show up in this list)

During handling of the above exception, another exception occurred:

HTTPError                                 Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
    285     try:
--> 286         response.raise_for_status()
    287     except HTTPError as e:

/usr/local/lib/python3.10/dist-packages/requests/models.py in raise_for_status(self)
   1020         if http_error_msg:
-> 1021             raise HTTPError(http_error_msg, response=self)
   1022 

HTTPError: 400 Client Error: Bad Request for url: https://api-inference.huggingface.co/models/google/flan-t5-base

The above exception was the direct cause of the following exception:
...

The full error message: https://colab.research.google.com/drive/1Fy3DM1yn2-2f7zbsfzKsvfDe0PXQkFhS?usp=sharing

Expected behavior
HuggingFaceTGIGenerator should raise an error saying that this model is not supported by TGI and stop executing. It'd be even better to get a warning in the warm_up phase, but I'm not sure if this is doable.

Additional context
N/A

To Reproduce
Run this colab: https://colab.research.google.com/drive/1Fy3DM1yn2-2f7zbsfzKsvfDe0PXQkFhS?usp=sharing
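
Equivalently, here is a minimal local reproduction. This is a sketch against the 2.0-beta API; the token handling shown (passing a plain string) matches beta.5 but may differ in other releases:

import os
from haystack.components.generators import HuggingFaceTGIGenerator

# google/flan-t5-base is hosted on the free Inference API,
# but it is not served via TGI
generator = HuggingFaceTGIGenerator(
    model="google/flan-t5-base",
    token=os.environ["HF_API_TOKEN"],  # free-tier Hugging Face token
)
generator.warm_up()  # completes without complaint

# Instead of failing fast, this keeps retrying the 400 Bad Request
# until the token's free usage quota is exhausted
result = generator.run(prompt="What is the capital of France?")
print(result["replies"])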

FAQ Check

System:

  • OS: Colab
  • GPU/CPU: GPU
  • Haystack version (commit or version number): Haystack 2.0-beta.5
  • DocumentStore: -
  • Reader: -
  • Retriever: -
bilgeyucel added the type:bug and 2.x labels on Jan 23, 2024
masci added the P1 label on Jan 24, 2024
vblagoje (Member) commented Feb 5, 2024

@bilgeyucel I investigated this issue, and the proper solution seems to be the Inference API usage described at https://github.com/huggingface/text-generation-inference/tree/main/clients/python#inference-api-usage, with the following approach:

from text_generation.inference_api import deployed_models

# list the models currently served via TGI on the free Inference API
print(deployed_models())

to see which models are deployed in the free tier. We can do this in the warm_up method, only once. If the model specified by the user is not on the list, we'll raise an exception explaining exactly that. Of course, we'll do this only when the free tier is used, i.e. when the url param is None. cc @anakin87
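
For illustration, here's a rough sketch of what that check in warm_up could look like. It assumes deployed_models() returns entries with a model_id attribute, and the exact exception type and wording are placeholders, not the final implementation:

from text_generation.inference_api import deployed_models

def warm_up(self):
    # The check only makes sense for the free tier; a custom TGI
    # endpoint (url is set) serves whatever model it was started with.
    if self.url is None:
        supported = {deployed.model_id for deployed in deployed_models()}
        if self.model not in supported:
            raise ValueError(
                f"The model {self.model} is not supported by the free tier of "
                f"the TGI Inference API. Supported models are: {sorted(supported)}"
            )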
