# Deploy ML models on 🤗 Hub using Gradio

## Requirements

In [None]:
!pip install -q gradio transformers gradio_client


Welcome to our last lesson: the ML deployment lab using the 🤗 Hub and Gradio.

By the end of this lab, you will know how to host an ML API on the 🤗 Hub and call that API through a simple HTTPS `POST` request.

For this lesson, we will deploy the BLIP model that you covered in the multimodal lab session.

![image](https://miro.medium.com/v2/resize:fit:4800/format:webp/1*WDOadc-n8f5-cK5JF0AfJw.gif)

The model has been fine-tuned on several multimodal tasks, making it possible to perform three different tasks:

- Image captioning
- Visual question answering
- Image-text retrieval

## 1- Create a Gradio app on 🤗 Spaces

The first step is to make sure that you have an account at [hf.co](https://huggingface.co/). Once you are logged in, go to the Hugging Face main page, click on your profile (top right of the page), and select "New Space".


<img src="https://huggingface.co/datasets/ybelkada/documentation-images/resolve/main/spaces-1.png" width="50%" />



Once this is done, you will be redirected to this page:

<img src="https://huggingface.co/datasets/ybelkada/documentation-images/resolve/main/spaces-2.png" width="50%" />

Make sure to select "Gradio" as the SDK for the Space; we'll start with a free CPU instance. After choosing a name for your Space, click "Create Space".

You now have an empty Space. The next step is to build the first blocks of your demo. Create two files, `requirements.txt` and `app.py`. You can leave `app.py` empty for now, but add `transformers` and `torch` to `requirements.txt`.
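If you prefer to prepare the files locally before pushing them with git, the skeleton described above can be created with a few lines of Python. This is only a minimal sketch: the `scaffold_space` helper and the `my-space` folder name are hypothetical, and the example writes into a throwaway temporary directory.

```python
import tempfile
from pathlib import Path

# Packages the Space should install at build time (from the text above).
REQUIREMENTS = ["transformers", "torch"]


def scaffold_space(folder):
    """Create requirements.txt and an empty app.py inside `folder`."""
    root = Path(folder)
    root.mkdir(parents=True, exist_ok=True)
    # One package name per line, as pip expects.
    (root / "requirements.txt").write_text("\n".join(REQUIREMENTS) + "\n")
    # app.py stays empty for now; it is filled in during the next step.
    (root / "app.py").touch()
    return root


# Example: build the skeleton in a throwaway directory.
demo_root = scaffold_space(Path(tempfile.mkdtemp()) / "my-space")
print(sorted(p.name for p in demo_root.iterdir()))
# prints ['app.py', 'requirements.txt']
```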

You can either use git commands to push files into your Space, or use the web interface to create and modify files directly.

## 2- Build the `app.py` file

`app.py` can be relatively simple. For this lesson, we will imagine a simple case where users send image web links to the app, and the app is responsible for loading the image from the internet and getting inference results from the model.

As seen in the previous lessons, you can use `pipeline`, which automatically takes care of everything, including loading images from the web.

In the cell below, use Gradio to design a simple app that takes raw text (an image URL) as input. You can use `pipeline`, more precisely the [`image-to-text` pipeline](https://huggingface.co/docs/transformers/v4.36.0/en/main_classes/pipelines#transformers.ImageToTextPipeline), with [`Salesforce/blip-image-captioning-base`](https://huggingface.co/Salesforce/blip-image-captioning-base) as the model.

### Solution

In [None]:
# Lab will be narrow, and should fit at most 70 characters in one line
#123456789#123456789#123456789#123456789#123456789#123456789#123456789

import gradio as gr
from transformers import pipeline

pipe = pipeline("image-to-text",
                model="Salesforce/blip-image-captioning-base")

def launch(input):
    out = pipe(input)
    return out[0]['generated_text']

iface = gr.Interface(launch,
                     inputs=gr.Image(type='pil'),
                     outputs="text")
iface.launch()

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://ea14120e41cf5d7e4b.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
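The solution above accepts an uploaded image (`gr.Image`). A variant that instead takes the image URL as plain text, as described at the start of this section, might look like the sketch below. It relies on the `image-to-text` pipeline accepting an HTTP(S) link directly; the `caption_from_url` and `build_demo` helpers are hypothetical names, and the model is only loaded when `build_demo()` is called.

```python
def caption_from_url(url, pipe):
    """Return the caption for the image at `url`.

    `pipe` is any callable with the `image-to-text` pipeline's
    interface: it takes an image (here a URL string) and returns
    a list of dicts with a 'generated_text' key.
    """
    url = url.strip()
    if not url.startswith(("http://", "https://")):
        raise ValueError("expected an http(s) image URL")
    return pipe(url)[0]["generated_text"]


def build_demo():
    """Build the Gradio app. Imports are deferred so the helper
    above can be exercised without loading the model."""
    import gradio as gr
    from transformers import pipeline

    pipe = pipeline("image-to-text",
                    model="Salesforce/blip-image-captioning-base")
    return gr.Interface(fn=lambda url: caption_from_url(url, pipe),
                        inputs="text", outputs="text")


# In app.py you would finish with: build_demo().launch()
```

Passing `pipe` as an argument keeps the URL handling separate from the model, so it can be checked with a stub before the 990 MB checkpoint is downloaded.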




### Your code:

Once you have confirmed that everything works, copy and paste that snippet into the `app.py` file in your Space and wait until the Space has been built successfully.

## 3- Call your API using `gradio_client`

Now that your demo works, you can call it using the `Client` API from `gradio_client`.

We made a simple demo at [`ybelkada/blip-dlai-api`](https://huggingface.co/spaces/ybelkada/blip-dlai-api) that you can use to familiarize yourself with `gradio_client`.

If the Space is down, you can duplicate it and run it on your own account.

At the bottom of the app, there is a "Use via API" link that you can click to see how to use the Gradio app as an API. Use the URL shown there so that the client connects to the right endpoint.

Also inspect the API with `client.view_api()` to see which endpoints and parameters are available.



### Your code:

### Solution

In [None]:
# Lab will be narrow, and should fit at most 70 characters in one line
#123456789#123456789#123456789#123456789#123456789#123456789#123456789
from gradio_client import Client

client = Client(
    "https://ybelkada-blip-dlai-api.hf.space/--replicas/leu24/"
)
result = client.predict(
    "https://raw.githubusercontent.com/gradio-app/gradio/main/test/test_files/bus.png",
    api_name="/predict"
)

Loaded as API: https://ybelkada-blip-dlai-api.hf.space/--replicas/leu24/ ✔


In [None]:
client.view_api()

Client.predict() Usage Info
---------------------------
Named API endpoints: 1

 - predict(input, api_name="/predict") -> output
    Parameters:
     - [Image] input: filepath 
    Returns:
     - [Textbox] output: str 



In [None]:
result

'a red bus with a blue stripe on the side'

Amazing! We were able to call the gradio Space as an API. This means the whole compute requirement is completely offloaded to the Space, i.e. you don't need to load the model locally.
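Since the Space does all the compute, captioning several images is just a loop over `client.predict`. The sketch below assumes the same `/predict` endpoint as above; the `caption_many` helper is a hypothetical name, and newer `gradio_client` releases may require wrapping file inputs with `handle_file` instead of passing a bare URL.

```python
def caption_many(client, image_urls, api_name="/predict"):
    """Caption each URL through the remote Space, one call at a time.

    `client` is a `gradio_client.Client` (or anything exposing the
    same `predict` method). Failures are recorded, not raised, so
    one bad URL does not abort the whole batch.
    """
    captions = {}
    for url in image_urls:
        try:
            captions[url] = client.predict(url, api_name=api_name)
        except Exception as err:  # e.g. Space asleep, bad image URL
            captions[url] = f"error: {err}"
    return captions


# Usage (requires network access and a running Space):
# from gradio_client import Client
# client = Client("ybelkada/blip-dlai-api")
# caption_many(client, ["https://.../bus.png"])
```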

You can speed this up further by switching the Space to a (paid) GPU instance, which also lets you load larger, more powerful models.

If you use a private Space, you can pass your Hugging Face token through the `hf_token` argument of the `Client` constructor:

```python
from gradio_client import Client

YOUR_TOKEN = "hf_..."  # placeholder: replace with your own token

# Connect to a Hugging Face Space
client = Client("abidlabs/whisper-large-v2", hf_token=YOUR_TOKEN)
# Returns the result of the remote API call,
# e.g. "What a nice recording!"
client.predict("test.mp4", api_name="/predict")

# Connect to a temporary Gradio share URL
client = Client("https://bec81a83-5b5c-471e.gradio.live")
# Run the prediction in a background thread
job = client.submit("hello", api_name="/predict")
job.result()  # blocks until the result is available
```
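The difference between `predict` and `submit` is the usual blocking versus non-blocking pattern: `submit` hands back a job handle right away, and `result()` waits for the value. The standard-library sketch below illustrates the same pattern with `concurrent.futures`; it does not call Gradio at all, and `fake_predict` is a hypothetical stand-in for a slow remote call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_predict(text):
    """Hypothetical stand-in for a slow remote API call."""
    time.sleep(0.2)
    return text.upper()


with ThreadPoolExecutor(max_workers=1) as pool:
    # Like client.submit(...): returns immediately with a handle.
    job = pool.submit(fake_predict, "hello")
    # ... other work could happen here while the call runs ...
    # Like job.result(): blocks until the value is ready.
    answer = job.result(timeout=5)

print(answer)  # prints HELLO
```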

Read more about the Client API in this [documentation section](https://www.gradio.app/guides/getting-started-with-the-python-client). For those who are interested, Gradio also offers a similar JS client!

You can also run the Space in a GPU environment for faster inference using ZeroGPU!

Read more about this feature [here](https://huggingface.co/zero-gpu-explorers). The idea is that you can use free GPU instances through a simple API in your Gradio Space:
```py
import gradio as gr
import spaces
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(...)
pipe.to('cuda')

@spaces.GPU
def generate(prompt):
    return pipe(prompt).images

gr.Interface(
    fn=generate,
    inputs=gr.Text(),
    outputs=gr.Gallery(),
).launch()
```
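When developing locally, the `spaces` package may not be installed, so it can be convenient to fall back to a do-nothing decorator. The sketch below shows one way to do that for the bare `@spaces.GPU` form used above; the `gpu` alias and the placeholder body of `generate` are assumptions for illustration, not part of the ZeroGPU API.

```python
try:
    import spaces  # available on ZeroGPU Spaces
    gpu = spaces.GPU
except ImportError:
    def gpu(fn):
        """No-op fallback so the same file runs without `spaces`."""
        return fn


@gpu
def generate(prompt):
    # Placeholder body; a real app would call the model here.
    return f"generated for: {prompt}"


print(generate("a cat"))
```

With this pattern the same `app.py` can be run on a laptop and on a ZeroGPU Space without edits.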

We made a simple ZeroGPU Space using the LLaVA model [here](https://huggingface.co/spaces/ybelkada/llava-1.5-dlai). Try it out yourself in the code block below!

In [9]:
# Lab will be narrow, and should fit at most 70 characters in one line
#123456789#123456789#123456789#123456789#123456789#123456789#123456789
from gradio_client import Client

client = Client(
    "https://ybelkada-llava-1-5-dlai.hf.space/--replicas/5ppw4/"
)
result = client.predict(
    # str in 'parameter_0' Textbox component
    "Can you please describe this image for me?",
    # filepath in 'parameter_1' Image component
    "https://cms.eichertrucksandbuses.com/uploads/truck/sub-category/a933e5958e4a354cfb8d22665bd244fd.png",
    api_name="/predict"
)
print(result)

Loaded as API: https://ybelkada-llava-1-5-dlai.hf.space/--replicas/5ppw4/ ✔
 The image features a large yellow bus driving down a street. The bus is prominently displayed in the scene, occupying a significant portion of the image. The bus appears to be a public transit vehicle, possibly a tour bus, as it is driving down the road.

There are several people visible in the scene, with some standing near the bus and others further away. They seem to be going about their daily activities, possibly waiting for the bus or walking along the street.


## 4- Pushing this further

We used the smallest version of the BLIP model family! Now it is your turn to deploy your own demo on 🤗 Spaces and use the Gradio Client API. Of all the other lessons in this course, pick the one you liked the most, turn it into a Gradio demo, deploy it, and use it as an API.