HuggingFaceTGIGenerator gets stuck when model is not supported #6816

Closed
1 task done
bilgeyucel opened this issue Jan 23, 2024 · 1 comment · Fixed by #6915
Labels
2.x (Related to Haystack v2.0) · P1 (High priority, add to the next sprint) · type:bug (Something isn't working)


bilgeyucel (Contributor) commented Jan 23, 2024

Describe the bug
When a model that is not supported by TGI is used with HuggingFaceTGIGenerator, warm_up works fine, but when the component runs, it gets stuck in a retry loop until the provided API key reaches the free usage limit.

Error message
Here's a preview of the repeating error:

/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py:1491: UserWarning: API endpoint/model for text-generation is not served via TGI. Ignoring parameters ['watermark', 'stop', 'details', 'decoder_input_details'].
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py:1497: UserWarning: API endpoint/model for text-generation is not served via TGI. Parameter `details=True` will be ignored meaning only the generated text will be returned.
  warnings.warn(
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
    285     try:
--> 286         response.raise_for_status()
    287     except HTTPError as e:

1796 frames
/usr/local/lib/python3.10/dist-packages/requests/models.py in raise_for_status(self)
   1020         if http_error_msg:
-> 1021             raise HTTPError(http_error_msg, response=self)
   1022 

HTTPError: 400 Client Error: Bad Request for url: https://api-inference.huggingface.co/models/google/flan-t5-base

The above exception was the direct cause of the following exception:

BadRequestError                           Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py in text_generation(self, prompt, details, stream, model, do_sample, max_new_tokens, best_of, repetition_penalty, return_full_text, seed, stop_sequences, temperature, top_k, top_p, truncate, typical_p, watermark, decoder_input_details)
   1510         try:
-> 1511             bytes_output = self.post(json=payload, model=model, task="text-generation", stream=stream)  # type: ignore
   1512         except HTTPError as e:

/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py in post(self, json, data, model, task, stream)
    239             try:
--> 240                 hf_raise_for_status(response)
    241                 return response.iter_lines() if stream else response.content

/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
    328             )
--> 329             raise BadRequestError(message, response=response) from e
    330 

BadRequestError:  (Request ID: xy6Melm2TRYU-xFxPPnk3)

Bad request:
The following `model_kwargs` are not used by the model: ['return_full_text', 'details', 'watermark', 'decoder_input_details', 'stop'] (note: typos in the generate arguments will also show up in this list)

During handling of the above exception, another exception occurred:

HTTPError                                 Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
    285     try:
--> 286         response.raise_for_status()
    287     except HTTPError as e:

/usr/local/lib/python3.10/dist-packages/requests/models.py in raise_for_status(self)
   1020         if http_error_msg:
-> 1021             raise HTTPError(http_error_msg, response=self)
   1022 

HTTPError: 400 Client Error: Bad Request for url: https://api-inference.huggingface.co/models/google/flan-t5-base

The above exception was the direct cause of the following exception:
...

The full error message: https://colab.research.google.com/drive/1Fy3DM1yn2-2f7zbsfzKsvfDe0PXQkFhS?usp=sharing

Expected behavior
HuggingFaceTGIGenerator should raise an error saying that this model is not supported by TGI and stop executing. It'd be even better to get a warning in the warm_up phase, but I'm not sure if this is doable.

Additional context
N/A

To Reproduce
Run this colab: https://colab.research.google.com/drive/1Fy3DM1yn2-2f7zbsfzKsvfDe0PXQkFhS?usp=sharing
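
Equivalently, here is a minimal local reproduction. This is a sketch against the 2.0-beta API; the token handling shown (passing a plain string) matches beta.5 but may differ in other releases:

import os
from haystack.components.generators import HuggingFaceTGIGenerator

# google/flan-t5-base is hosted on the free Inference API,
# but it is not served via TGI
generator = HuggingFaceTGIGenerator(
    model="google/flan-t5-base",
    token=os.environ["HF_API_TOKEN"],  # free-tier Hugging Face token
)
generator.warm_up()  # completes without complaint

# Instead of failing fast, this keeps retrying the 400 Bad Request
# until the token's free usage quota is exhausted
result = generator.run(prompt="What is the capital of France?")
print(result["replies"])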

FAQ Check

System:

  • OS: Colab
  • GPU/CPU: GPU
  • Haystack version (commit or version number): Haystack 2.0-beta.5
  • DocumentStore: -
  • Reader: -
  • Retriever: -
bilgeyucel added the type:bug and 2.x labels on Jan 23, 2024
masci added the P1 label on Jan 24, 2024
vblagoje (Member) commented Feb 5, 2024

@bilgeyucel I investigated this issue, and the proper solution seems to be the Inference API usage described at https://github.com/huggingface/text-generation-inference/tree/main/clients/python#inference-api-usage, with the following approach:

from text_generation.inference_api import deployed_models

# list the models currently served via TGI on the free Inference API
print(deployed_models())

to see which models are deployed in the free tier. We can do this in the warm_up method, only once. If the model specified by the user is not on the list, we'll raise an exception explaining exactly that. Of course, we'll do this only when the free tier is used, i.e. when the url param is None. cc @anakin87
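
For illustration, here's a rough sketch of what that check in warm_up could look like. It assumes deployed_models() returns entries with a model_id attribute, and the exact exception type and wording are placeholders, not the final implementation:

from text_generation.inference_api import deployed_models

def warm_up(self):
    # The check only makes sense for the free tier; a custom TGI
    # endpoint (url is set) serves whatever model it was started with.
    if self.url is None:
        supported = {deployed.model_id for deployed in deployed_models()}
        if self.model not in supported:
            raise ValueError(
                f"The model {self.model} is not supported by the free tier of "
                f"the TGI Inference API. Supported models are: {sorted(supported)}"
            )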
