# 1. Install the Hugging Face hub library

This notebook uses models hosted on the Hugging Face Hub.

https://huggingface.co/docs/huggingface_hub/index

### If using a local machine - run the following cell

In [1]:
# !pip install huggingface_hub

### On Google Colab - run the following cell

In [2]:
# !pip install transformers torch huggingface_hub -q

# 2. Create the Inference Client

The client uses a model hosted on the Hugging Face Hub.

**Class**

https://huggingface.co/docs/huggingface_hub/v0.20.2/en/package_reference/inference_client#huggingface_hub.InferenceClient

**Supported tasks**

https://huggingface.co/docs/huggingface_hub/guides/inference#supported-tasks

**NOTE:**

API calls sometimes fail when the model is under heavy load on Hugging Face. If you get an invocation error, try again.
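Retrying by hand can be automated. The following is a minimal sketch of a retry wrapper with exponential backoff. The helper name `call_with_retry` and its parameters are hypothetical and not part of `huggingface_hub`. It catches every exception for simplicity, which you may want to narrow.

```python
import time


def call_with_retry(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure wait and retry, doubling the delay each time.

    Hypothetical helper for illustration, not a huggingface_hub API.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            # Re-raise once the final attempt has failed
            if attempt == attempts - 1:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)


# Example use once a client exists (client is created later in this notebook):
# result = call_with_retry(lambda: client.text_classification("I loved the restaurant"))
```

The `sleep` parameter exists only so the waiting behaviour can be replaced, for example in a quick test.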

In [14]:
from huggingface_hub import InferenceClient
import getpass

# You will be prompted for the Hugging Face token
print("Copy/paste HuggingFace token and hit <enter>")
HUGGINGFACEHUB_API_TOKEN = getpass.getpass()

Copy/paste HuggingFace token and hit <enter>


 ········


In [15]:
# Change the model name if you would like to try out a different model
model_name = "distilbert-base-uncased-finetuned-sst-2-english"

# Create the client
client = InferenceClient(model=model_name, token=HUGGINGFACEHUB_API_TOKEN)


# 3. List deployed models

Returns a subset of models for the specified framework

https://huggingface.co/docs/huggingface_hub/package_reference/inference_client#huggingface_hub.InferenceClient.list_deployed_models

### Note: Error "Bad Request"
The HF backend is throwing an error on the list_deployed_models call as of 03/01/2025. Please check the status in the following thread.
You can continue with the course: this API is good to know but not critical.

https://discuss.huggingface.co/t/huggingface-hub-client-giving-error-on-list-deployed-models/143678

**Note:**

An invalid framework throws an HTTP error.

In [None]:
# For a specific framework
framework = "text-generation-inference"  # "text-to-speech", 
deployed_models = client.list_deployed_models([framework])
print(deployed_models)

## Get all the deployed models
# client = InferenceClient(token=HUGGINGFACEHUB_API_TOKEN)
# deployed_models = client.list_deployed_models(frameworks="all")
# print(deployed_models)
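Because this call can fail (an invalid framework, or the "Bad Request" backend error noted above), it may help to wrap it so the rest of the notebook keeps running. This is a sketch only. The helper name `list_models_or_none` is hypothetical, and the broad `except Exception` is an assumption made to avoid depending on a specific error class.

```python
def list_models_or_none(client, framework):
    """Return the deployed models for `framework`, or None if the call fails.

    Hypothetical helper: `client` is expected to expose
    list_deployed_models(), as the InferenceClient used above does.
    """
    try:
        return client.list_deployed_models([framework])
    except Exception as err:
        # Broad catch is an assumption of this sketch; narrow it to the
        # HTTP error class raised by your installed huggingface_hub version.
        print(f"list_deployed_models failed for {framework!r}: {err}")
        return None


# Example use with the client created earlier:
# deployed_models = list_models_or_none(client, "text-generation-inference")
```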


# 4. Check if a specific model is available as an endpoint

In [None]:
model_id = "distilbert-base-uncased-finetuned-sst-2-english"

client.get_model_status(model_id)
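The returned status can be turned into a short readable message. The sketch below assumes the status object exposes `loaded` and `state` attributes, as the `ModelStatus` class in the documentation version linked above appears to. Check this against your installed version. The helper name `describe_status` and the sample values are hypothetical.

```python
from types import SimpleNamespace


def describe_status(status):
    """Summarize a status object exposing `loaded` and `state` attributes."""
    if status.loaded:
        return "loaded and ready"
    return f"not loaded (state: {status.state})"


# Stand-in objects with made-up values, used here instead of a live API call
ready = SimpleNamespace(loaded=True, state="Loaded")
cold = SimpleNamespace(loaded=False, state="Loadable")

print(describe_status(ready))
print(describe_status(cold))

# With the real client:
# print(describe_status(client.get_model_status(model_id)))
```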

# 5. Inference

In [18]:
%%time

text = "I loved the restaurant"

client.text_classification(text)

CPU times: total: 15.6 ms
Wall time: 218 ms


[TextClassificationOutputElement(label='POSITIVE', score=0.9998492002487183),
 TextClassificationOutputElement(label='NEGATIVE', score=0.00015075344708748162)]

In [19]:
%%time

text = "i hated it"

client.text_classification(text)

CPU times: total: 0 ns
Wall time: 107 ms


[TextClassificationOutputElement(label='NEGATIVE', score=0.9996846914291382),
 TextClassificationOutputElement(label='POSITIVE', score=0.00031535723246634007)]
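Each call returns a list of elements carrying a `label` and a `score`. A small helper can pick the highest-scoring one. This is a sketch: the name `top_prediction` is hypothetical, and the stand-in objects below only mimic the `label` and `score` attributes seen in the output above.

```python
from types import SimpleNamespace


def top_prediction(results):
    """Return (label, score) for the highest-scoring element."""
    best = max(results, key=lambda r: r.score)
    return best.label, best.score


# Stand-ins reusing the values printed for "I loved the restaurant"
sample = [
    SimpleNamespace(label="POSITIVE", score=0.9998492002487183),
    SimpleNamespace(label="NEGATIVE", score=0.00015075344708748162),
]

label, score = top_prediction(sample)
print(f"{label} ({score:.4f})")

# With the real client:
# label, score = top_prediction(client.text_classification(text))
```

Using `max` rather than taking the first element avoids relying on the order in which the API returns the labels.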