# Deploy ML Models on Hub using Gradio

In [1]:
!pip install transformers
!pip install gradio
!pip install gradio_client


**NOTE:** If you face any issues when making an API call to your own space, you can try to upgrade your version of gradio_client:
`pip install -U gradio_client`

In [2]:
from transformers.utils import logging
logging.set_verbosity_error()

import warnings
warnings.filterwarnings("ignore", message="Using the model-agnostic default `max_length`")

## 🤗 Spaces
- You can create a Hugging Face account [here](https://huggingface.co).

## APP 1: Image Captioning
Load the model and create an app interface using Gradio to perform Image Captioning.

In [3]:
import os
import gradio as gr
from transformers import pipeline

In [4]:
pipe = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")




In [5]:
def launch(input):
    out = pipe(input)
    return out[0]['generated_text']

The **launch** function takes the input image, calls the pipeline on it, and returns the generated text from the first result.
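To make the indexing in `out[0]['generated_text']` concrete, here is a minimal sketch that runs without downloading the model. `fake_pipe` and `launch_with` are hypothetical names introduced only for this illustration; `fake_pipe` imitates the structure that the indexing above implies (a list of dictionaries with a `generated_text` key), and the caption string is made up.

```python
def fake_pipe(image):
    # Hypothetical stand-in for the real pipeline: same return structure,
    # but it always returns one fixed, made-up caption.
    return [{"generated_text": "a dog sitting on a couch"}]


def launch_with(pipe, image):
    out = pipe(image)                # list with one dict per generated caption
    return out[0]["generated_text"]  # keep only the caption string of the first result


caption = launch_with(fake_pipe, image=None)
print(caption)
```

Passing the pipeline in as an argument is only a convenience for this sketch; the notebook's `launch` uses the global `pipe` instead.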

In [6]:
iface = gr.Interface(launch,
                     inputs=gr.Image(type='pil'),
                     outputs="text")

This creates the Gradio interface, with **`gradio.Image()`** as the input and **`text`** as the output. Next, we launch the interface.

In [8]:
iface.launch(share=True)

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://e4ef47342cfd4e9a5c.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




Here, **`share=True`** generates a public link to our application that we can share with other users.

In [9]:
iface.close()

Closing server running on port: 7860


What if we want to deploy a larger model, or we don't want to serve the app from our own computer because we need the machine for something else? **Hugging Face Spaces** makes this possible.<br>
Once we have confirmed that the app works locally, we can move it to Hugging Face Spaces by creating our own Space.<br>

## How to create a Space on Hugging Face?
* First, create an account on the Hugging Face website and make sure you are logged in.
* Then open the menu at the top right of the window and choose **New Space** to create a Space.
* You can also go to [https://huggingface.co/spaces](https://huggingface.co/spaces)

<img src="Images/create_new_space_00.png" width="800" height="300"><br>

* Click the "Create new Space" button.
* Give your Space a name, for example **"blip-image-captioning-api"**.

<img src="Images/create_new_space_01.jpeg" width="600" height="500"><br>

* Then choose a license, or use the default **apache-2.0** license.
* Then select **Gradio** as the **Space SDK**. Other options include Streamlit, Docker and Static.

<img src="Images/create_new_space_02.png" width="600" height="200"><br>

* Then select the basic hardware, set the visibility to public so that everyone can use it, and click **Create Space**.

After creating the Space, we need to create two files: **`requirements.txt`**, which lists the packages needed to run the Space, and the main file, which must be named **`app.py`**.
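A minimal `requirements.txt` for this app might look like the following. This is a sketch under two assumptions: Spaces created with the Gradio SDK already provide `gradio`, so it is not listed, and `torch` is added as the backend that the BLIP pipeline runs on. Version pins are left out; add them if you need reproducible builds.

```
transformers
torch
```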

## Code to be uploaded to the app.py file
```Python
import gradio as gr
from transformers import pipeline

pipe = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

def launch(input):
    out = pipe(input)
    return out[0]['generated_text']

iface = gr.Interface(launch,
                     inputs=gr.Image(type='pil'),
                     outputs="text")

iface.launch()
```
- Notice that `iface.launch()` does not have `share=True`. The Space itself hosts the app, so no temporary share link is needed.

- After the files are uploaded, the app shows as "Building" for a few minutes.
- You can click on the "App" menu to the left of the "Files" menu to see the console while the Space is being built.

<img src="Images/app_tab.png" width="800" height="150"><br>

The UI of the app will look like this:

<img src="Images/app_tab_02.png" width="600" height="400"><br>

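## To use the APP via API
The app we have built also exposes an API. Its documentation is available from the "Use via API" link at the bottom of the app page, and a call from Python looks like this: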

```Python
from gradio_client import Client

client = Client("user_name/app_name")
result = client.predict(
    "path",  # filepath in 'input' Image component
    api_name="/predict"
)
print(result)
```

- Run the above code in any Python environment to use the app.
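The call above can be wrapped in a small helper that fails early when the image path is wrong. This is only a sketch: `caption_image` is a hypothetical name, and the use of `handle_file` rests on the assumption that the installed `gradio_client` release provides it for file inputs. When it cannot be imported, the helper falls back to passing the plain path, as in the call above.

```python
from pathlib import Path


def caption_image(space_id: str, image_path: str) -> str:
    """Send a local image to the Space's /predict endpoint and return the caption."""
    path = Path(image_path)
    if not path.is_file():
        # Fail early with a clear message instead of a less obvious remote error.
        raise FileNotFoundError(f"No image found at {path}")

    # Imported here so the path check above runs even if the client is not installed.
    from gradio_client import Client

    try:
        # Assumption: this gradio_client release exposes handle_file for file inputs.
        from gradio_client import handle_file
        payload = handle_file(str(path))
    except ImportError:
        # Otherwise pass the plain file path, as in the original call.
        payload = str(path)

    client = Client(space_id)
    return client.predict(payload, api_name="/predict")
```

A call would then look like `caption_image("user_name/app_name", "path")`, with the same placeholders as before.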

We can also inspect the API's endpoints by calling **`client.view_api()`**. The output may look like this:

```
Client.predict() Usage Info
---------------------------
Named API endpoints: 1

 - predict(input, api_name="/predict") -> output
    Parameters:
     - [Image] input: filepath
    Returns:
     - [Textbox] output: str

```

## Host Private Spaces
If we want to host a private model, we can create a private Space. To access a private Space via the API, we need an access **token**.

### Get an access token
- To get an access token, go to your profile (click on your profile icon).
- On your profile page, click the "Settings" button on the left.
- In your profile settings, on the left side menu, click "Access Tokens".
- Click "New token".
- In the pop-up, give a description of what the token is for.
- You can leave it as "read" (the other option is "write").
- Click "create new token".

### Modify the API call to include your access token

```Python
from gradio_client import Client

# hf_access_token holds your access token (loading it securely is covered below)
client = Client("user_name/app_name",
                hf_token=hf_access_token)
result = client.predict(
    "image_path",
    api_name="/predict"
)
print(result)
```

### Saving your access token securely
- It's recommended that you do not hard-code the access token.

```Python
HF_TOKEN="abc1234" # not recommended
```

- Instead, you can save your access token in a file named ".env":

```
HF_ACCESS_TOKEN="abc123"
```

Then load that environment variable with the `dotenv` library:

```Python
# !pip install python-dotenv # install library
from dotenv import load_dotenv, find_dotenv
import os
_ = load_dotenv(find_dotenv())
hf_access_token = os.getenv("HF_ACCESS_TOKEN")
```
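`os.getenv` returns `None` when the variable is missing, which would only surface later as an authentication failure. A minimal sketch of a stricter lookup follows; `get_hf_token` is a hypothetical helper name, and it assumes the variable is called `HF_ACCESS_TOKEN` as in the `.env` file above.

```python
import os


def get_hf_token(var_name: str = "HF_ACCESS_TOKEN") -> str:
    """Return the access token from the environment, or raise if it is not set."""
    token = os.getenv(var_name)
    if not token:
        # Stop here with a clear message rather than sending an empty token.
        raise RuntimeError(
            f"Environment variable {var_name} is not set; "
            "add it to your .env file or export it before running."
        )
    return token
```

Call `get_hf_token()` after `load_dotenv(find_dotenv())` and pass the result as `hf_token`. It is also a good idea to keep the `.env` file out of version control.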

### ZeroGPU Spaces
We have been deploying the app on a CPU instance on HF Spaces. What if we want to deploy a much larger model that cannot fit on a CPU instance, or the CPU instance is too slow?<br>
HF Spaces has a feature called **ZeroGPU** that provides free GPUs on demand for our Spaces. [ZeroGPU Explorers](https://huggingface.co/zero-gpu-explorers)

<img src="Images/gpu_01.png" width="600" height="300"><br>

Search for ZeroGPU Explorers on HF and request to join the organization. Once the request is accepted, we have access to the ZeroGPU feature.<br>
When creating a new Space, a ZeroGPU hardware option then appears, listed as free.