<a href="https://colab.research.google.com/github/suphawadeeth/Building-Generative-AI-Applications-with-Gradio/blob/main/02_NLP_image_captioning_app.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Image Captioning App with Gradio** 🖼️📝
Load your HF API key and relevant Python libraries

In [None]:
from PIL import Image as PILImage  # aliased so it does not shadow IPython's Image
from IPython.display import Image, display, HTML
import base64
import io
from google.colab import userdata
hf_api_key = userdata.get('HF_API_KEY')
import gradio as gr

## **Building an Image Captioning App with Gradio**

Here we'll be using an Inference Endpoint for `Salesforce/blip-image-captioning-base`, a 14M parameter captioning model.

In [None]:
# Helper functions
import requests, json

# Image-to-text endpoint
def get_completion(inputs, parameters=None, ENDPOINT_URL=userdata.get('HF_API_ITT_BASE')):
    """Send an image (URL or base64 string) to the endpoint and return the parsed JSON response."""
    headers = {
      "Authorization": f"Bearer {hf_api_key}",
      "Content-Type": "application/json"
    }
    data = { "inputs": inputs }
    if parameters is not None:
        data.update({"parameters": parameters})
    response = requests.request("POST",
                                ENDPOINT_URL,
                                headers=headers,
                                data=json.dumps(data))
    return json.loads(response.content.decode("utf-8"))
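To make the request body concrete, here is a minimal sketch of the JSON that `get_completion` sends. `build_payload` is a hypothetical helper (not part of the notebook), the example URL is a placeholder, and `max_new_tokens` is only an assumed example of a generation parameter; check what your endpoint actually accepts.

```python
import json

def build_payload(inputs, parameters=None):
    """Hypothetical helper: mirror the request body built inside get_completion."""
    data = {"inputs": inputs}
    if parameters is not None:
        # Optional generation settings go under the "parameters" key.
        data["parameters"] = parameters
    return json.dumps(data)

# Without parameters, only the image (a URL or a base64 string) is sent.
print(build_payload("https://example.com/dog.jpg"))

# With parameters; "max_new_tokens" is an assumed example, not confirmed by the notebook.
print(build_payload("https://example.com/dog.jpg", {"max_new_tokens": 20}))
```

Printing the payload like this can help when debugging an endpoint that rejects a request, since it shows exactly what is posted.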

The `get_completion` function above sends an image to the endpoint and returns a list containing the generated caption. The next cell tries it on a sample image URL; after that, we wrap the same call in a Gradio interface to make it interactive.

In [None]:
image_url = "https://free-images.com/sm/9596/dog_animal_greyhound_983023.jpg"
print(get_completion(image_url))
Image(url=image_url)

[{'generated_text': 'a dog wearing a santa hat and a red scarf'}]
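The response is a list with one dictionary per caption, so the text sits at `result[0]['generated_text']`. An endpoint may return something else when a request fails. The sketch below assumes a failure arrives as a dictionary with an `error` key, which is an assumption about the endpoint rather than something shown in this notebook. `extract_caption` is a hypothetical helper.

```python
def extract_caption(result):
    """Hypothetical helper: pull the caption text out of an endpoint response."""
    # Assumed failure shape: a dict such as {"error": "..."}.
    if isinstance(result, dict) and "error" in result:
        raise RuntimeError(f"Endpoint returned an error: {result['error']}")
    # Expected success shape, as in the output above: [{'generated_text': '...'}]
    if (isinstance(result, list) and result
            and isinstance(result[0], dict)
            and "generated_text" in result[0]):
        return result[0]["generated_text"]
    raise ValueError(f"Unexpected response format: {result!r}")

sample = [{"generated_text": "a dog wearing a santa hat and a red scarf"}]
print(extract_caption(sample))
```

A check like this turns an opaque `KeyError` or `TypeError` into a message that says what the endpoint actually sent back.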


**Building a Captioning App with a Gradio Interface**

In [None]:
def image_to_base64_str(pil_image):
    """Encode a PIL image as a base64 PNG string, the format the API expects."""
    byte_arr = io.BytesIO()
    pil_image.save(byte_arr, format='PNG')
    byte_arr = byte_arr.getvalue()
    return base64.b64encode(byte_arr).decode('utf-8')

def captioner(image):
    """Take a PIL image and return the caption generated by the endpoint."""
    base64_image = image_to_base64_str(image)
    result = get_completion(base64_image)
    return result[0]['generated_text']

demo = gr.Interface(fn=captioner,
                    inputs=[gr.Image(label="Upload image", type="pil")],
                    outputs=[gr.Textbox(label="Caption")],
                    title="Image Captioning App",
                    description="Caption any image using the BLIP model",
                    allow_flagging="never")

demo.launch(share=True)

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://b220df1a58a47ffcc1.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
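The base64 step in `image_to_base64_str` is needed because a JSON body cannot carry raw bytes. Here is a minimal sketch of that step in isolation, using placeholder bytes instead of a real PNG (an assumption that keeps the example free of image files). `bytes_to_base64_str` and `base64_str_to_bytes` are hypothetical names.

```python
import base64
import json

def bytes_to_base64_str(raw):
    """Encode raw bytes as an ASCII base64 string that is safe to embed in JSON."""
    return base64.b64encode(raw).decode("utf-8")

def base64_str_to_bytes(text):
    """Invert bytes_to_base64_str."""
    return base64.b64decode(text)

# Placeholder bytes: the 8-byte PNG signature plus filler, not a full image.
raw = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
encoded = bytes_to_base64_str(raw)

# The encoded string survives a JSON round trip and decodes to the same bytes.
payload = json.dumps({"inputs": encoded})
assert base64_str_to_bytes(json.loads(payload)["inputs"]) == raw
print(encoded)
```

In the app, the bytes come from saving the uploaded PIL image into an `io.BytesIO` buffer as PNG; the encoding step afterwards is the same.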




In [None]:
gr.close_all()