Describe the bug
When a model unsupported by TGI is used with HuggingFaceTGIGenerator, warm_up works fine, but when the component runs it gets stuck in a retry loop until the provided API key reaches the free usage limit.
Error message
Here's a preview of the repeating error:
/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py:1491: UserWarning: API endpoint/model for text-generation is not served via TGI. Ignoring parameters ['watermark', 'stop', 'details', 'decoder_input_details'].
warnings.warn(
/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py:1497: UserWarning: API endpoint/model for text-generation is not served via TGI. Parameter `details=True` will be ignored meaning only the generated text will be returned.
warnings.warn(
---------------------------------------------------------------------------
HTTPError Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py](https://localhost:8080/#) in hf_raise_for_status(response, endpoint_name)
285 try:
--> 286 response.raise_for_status()
287 except HTTPError as e:
1796 frames
[/usr/local/lib/python3.10/dist-packages/requests/models.py](https://localhost:8080/#) in raise_for_status(self)
1020 if http_error_msg:
-> 1021 raise HTTPError(http_error_msg, response=self)
1022
HTTPError: 400 Client Error: Bad Request for url: https://api-inference.huggingface.co/models/google/flan-t5-base
The above exception was the direct cause of the following exception:
BadRequestError Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py](https://localhost:8080/#) in text_generation(self, prompt, details, stream, model, do_sample, max_new_tokens, best_of, repetition_penalty, return_full_text, seed, stop_sequences, temperature, top_k, top_p, truncate, typical_p, watermark, decoder_input_details)
1510 try:
-> 1511 bytes_output = self.post(json=payload, model=model, task="text-generation", stream=stream) # type: ignore
1512 except HTTPError as e:
[/usr/local/lib/python3.10/dist-packages/huggingface_hub/inference/_client.py](https://localhost:8080/#) in post(self, json, data, model, task, stream)
239 try:
--> 240 hf_raise_for_status(response)
241 return response.iter_lines() if stream else response.content
[/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py](https://localhost:8080/#) in hf_raise_for_status(response, endpoint_name)
328 )
--> 329 raise BadRequestError(message, response=response) from e
330
BadRequestError: (Request ID: xy6Melm2TRYU-xFxPPnk3)
Bad request:
The following `model_kwargs` are not used by the model: ['return_full_text', 'details', 'watermark', 'decoder_input_details', 'stop'] (note: typos in the generate arguments will also show up in this list)
During handling of the above exception, another exception occurred:
HTTPError Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py](https://localhost:8080/#) in hf_raise_for_status(response, endpoint_name)
285 try:
--> 286 response.raise_for_status()
287 except HTTPError as e:
[/usr/local/lib/python3.10/dist-packages/requests/models.py](https://localhost:8080/#) in raise_for_status(self)
1020 if http_error_msg:
-> 1021 raise HTTPError(http_error_msg, response=self)
1022
HTTPError: 400 Client Error: Bad Request for url: https://api-inference.huggingface.co/models/google/flan-t5-base
The above exception was the direct cause of the following exception:
...
Expected behavior
HuggingFaceTGIGenerator should raise an error saying that this model is not supported by TGI and stop executing. It'd be even better to get a warning in the warm_up phase, but I'm not sure if this is doable.
One option: we can call
from text_generation.inference_api import deployed_models
print(deployed_models())
to see which models are deployed in the free tier. We can do this once, in the warm_up method. If the model specified by the user is not on the list, we'll raise an exception explaining exactly that. Of course, we'll only do this when the free tier is used, i.e. when the url param is None. cc @anakin87
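A minimal sketch of that check, assuming the deployed list is fetched once in warm_up via deployed_models() as above. The helper name check_model_deployed is illustrative, not part of the actual component:

```python
from typing import List, Optional


def check_model_deployed(model: str, deployed: List[str], url: Optional[str] = None) -> None:
    """Raise if `model` is not served via TGI on the free Inference API.

    Skips the check when a custom `url` is given, since we cannot know
    what a self-hosted endpoint serves.
    """
    if url is not None:
        return  # custom endpoint: the user is responsible for it serving TGI
    if model not in deployed:
        raise ValueError(
            f"The model '{model}' is not deployed on the free tier of the "
            f"Hugging Face Inference API via TGI. Deployed models: {deployed}"
        )
```

Called once from warm_up, this would fail fast instead of looping on 400 responses until the API quota is exhausted.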
The full error message: https://colab.research.google.com/drive/1Fy3DM1yn2-2f7zbsfzKsvfDe0PXQkFhS?usp=sharing
Additional context
N/A
To Reproduce
Run this colab: https://colab.research.google.com/drive/1Fy3DM1yn2-2f7zbsfzKsvfDe0PXQkFhS?usp=sharing
FAQ Check
System: