<a href="https://colab.research.google.com/github/suphawadeeth/Building-Generative-AI-Applications-with-Gradio/blob/main/describe_and_generate_app.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Describe and Generate Game App**
This app combines 'image-to-text' and 'text-to-image' models in a single app.

The idea is to build a game app in two simple steps:
- Generate a caption from the input image
- Generate a new image from that caption

---

Load your HF API key and the relevant Python libraries.

In [None]:
import os
import io
from IPython.display import display, HTML
from PIL import Image  # PIL's Image is the one used below (Image.open); importing IPython's Image too would be shadowed
import base64
from google.colab import userdata
hf_api_key = userdata.get('HF_API_KEY')

In [None]:
#### Helper function
import requests, json

# This helper is shared by both endpoints (image-to-text and text-to-image)
def get_completion(inputs, parameters=None, ENDPOINT_URL=""):
    headers = {
      "Authorization": f"Bearer {hf_api_key}",
      "Content-Type": "application/json"
    }
    data = { "inputs": inputs }
    if parameters is not None:
        data.update({"parameters": parameters})
    response = requests.request("POST",
                                ENDPOINT_URL,
                                headers=headers,
                                data=json.dumps(data))
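    # ITT_ENDPOINT is defined in the next cell; run that cell before calling this function.
    # The image-to-text response is parsed as JSON; any other response is returned as raw bytes.
    if ENDPOINT_URL == ITT_ENDPOINT:
        return json.loads(response.content.decode("utf-8"))
    else:
        return response.content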
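
To make the request body concrete, here is a minimal sketch of the JSON payload that `get_completion` assembles before posting. The `build_payload` helper and the `example_option` parameter are hypothetical names used only for illustration; they are not part of the notebook or of any endpoint's API.

```python
import json

def build_payload(inputs, parameters=None):
    """Mirror the request body built inside get_completion (hypothetical helper)."""
    data = {"inputs": inputs}
    if parameters is not None:
        # Optional settings are nested under the "parameters" key.
        data["parameters"] = parameters
    return json.dumps(data)

# Without parameters, only the "inputs" key is sent.
print(build_payload("a dog in a park"))

# "example_option" is a placeholder name, not a real endpoint parameter.
print(build_payload("a dog in a park", {"example_option": 1}))
```

Printing the payload this way can help when an endpoint rejects a request, since it shows exactly which keys were sent.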

In [None]:
# Text-to-image API endpoint
TTI_ENDPOINT = userdata.get('HF_API_TTI_BASE')

# Image-to-text API endpoint
ITT_ENDPOINT = userdata.get('HF_API_ITT_BASE')
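
`google.colab.userdata` is only available inside Colab. If you run the notebook elsewhere, one option is to read the same names from environment variables. The sketch below assumes that approach; `get_setting` is a hypothetical helper, and the URL in the example is a placeholder.

```python
import os

def get_setting(name, source=os.environ):
    """Look up a required setting and fail early with a clear message (hypothetical helper)."""
    value = source.get(name)
    if not value:
        raise KeyError(f"Missing required setting: {name}")
    return value

# A plain dict stands in for the real environment in this example.
example_env = {"HF_API_TTI_BASE": "https://example.invalid/tti"}
print(get_setting("HF_API_TTI_BASE", example_env))
```

Failing at load time makes a missing endpoint easier to spot than an error raised later inside a button callback.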

# **Building your game with gr.Blocks()**

In [None]:
# Reuse the helper functions from the earlier text-to-image and image-to-text apps
def image_to_base64_str(pil_image):
    byte_arr = io.BytesIO()
    pil_image.save(byte_arr, format='PNG')
    byte_arr = byte_arr.getvalue()
    return str(base64.b64encode(byte_arr).decode('utf-8'))

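def base64_to_pil(img_base64):
    # Despite the name, the argument here is raw image bytes: get_completion returns
    # response.content for the text-to-image endpoint, so no base64 decoding is applied.
    byte_stream = io.BytesIO(img_base64)
    pil_image = Image.open(byte_stream)
    return pil_image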

def captioner(image):
    base64_image = image_to_base64_str(image)
    result = get_completion(base64_image, None, ITT_ENDPOINT)
    return result[0]['generated_text']

def generate(prompt):
    output = get_completion(prompt, None, TTI_ENDPOINT)
    result_image = base64_to_pil(output)
    return result_image
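
`captioner` indexes straight into the response with `result[0]['generated_text']`, which raises an unhelpful `KeyError` or `TypeError` if the endpoint returns anything else. The sketch below shows one defensive way to parse the response. It assumes a failed call returns a dict with an `error` key; that shape is an assumption to check against your endpoint, and `extract_caption` is a hypothetical helper.

```python
def extract_caption(result):
    """Return the caption from an image-to-text response, or raise a clear error."""
    # Assumed error shape: a dict carrying an "error" message.
    if isinstance(result, dict) and "error" in result:
        raise RuntimeError(f"Endpoint error: {result['error']}")
    # The notebook expects a non-empty list of dicts.
    if not isinstance(result, list) or not result:
        raise ValueError(f"Unexpected response shape: {result!r}")
    return result[0]["generated_text"]

print(extract_caption([{"generated_text": "a dog in a park"}]))
```

Inside `captioner`, the last line could then become `return extract_caption(result)`.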

## **First, build a simple captioning app**

In [None]:
import gradio as gr

with gr.Blocks() as demo:
    gr.Markdown("# Describe-and-Generate game 🖍️")
    image_upload = gr.Image(label="Your first image",type="pil")
    btn_caption = gr.Button("Generate caption")
    caption = gr.Textbox(label="Generated caption")

    btn_caption.click(fn=captioner, inputs=[image_upload], outputs=[caption])

gr.close_all()
demo.launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://d7cfa57a1cc45a4648.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




## **Let's add an image generation function**

In [None]:
with gr.Blocks() as demo:
    gr.Markdown("# Describe-and-Generate game 🖍️")
    image_upload = gr.Image(label="Your first image",type="pil")
    btn_caption = gr.Button("Generate caption")
    caption = gr.Textbox(label="Generated caption")
    btn_image = gr.Button("Generate image")
    image_output = gr.Image(label="Generated Image")

    # btn_caption calls the captioner function: it takes the uploaded image and returns the generated caption
    btn_caption.click(fn=captioner, inputs=[image_upload], outputs=[caption])

    # btn_image takes the generated caption and returns a newly generated image
    btn_image.click(fn=generate, inputs=[caption], outputs=[image_output])

gr.close_all()
demo.close()
demo.launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
Running on public URL: https://31bd77269bf369b782.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


Keyboard interruption in main thread... closing server.
Killing tunnel 127.0.0.1:7863 <> https://31bd77269bf369b782.gradio.live




## **Streamlined Version - One Click Does it All!**
Create a streamlined version in which a single button click runs both steps at once.



In [None]:
def caption_and_generate(image):
    caption = captioner(image)
    image = generate(caption)
    return [caption, image]

with gr.Blocks() as demo:
    gr.Markdown("# Describe-and-Generate game 🖍️")
    image_upload = gr.Image(label="Your first image",type="pil")
    btn_all = gr.Button("Caption and generate")
    caption = gr.Textbox(label="Generated caption")
    image_output = gr.Image(label="Generated Image")

    btn_all.click(fn=caption_and_generate, inputs=[image_upload], outputs=[caption, image_output])

gr.close_all()
demo.launch()

Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://7487d34f9b2dc3a649.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
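
The chaining logic in `caption_and_generate` can be checked offline before wiring it to a button. The sketch below injects the two steps as arguments so that stand-in functions can replace the real endpoint calls. `caption_and_generate_with`, `fake_captioner`, and `fake_generate` are hypothetical names used only for this illustration.

```python
def caption_and_generate_with(captioner_fn, generate_fn, image):
    """Run the caption step, then feed its output into the generation step."""
    caption = captioner_fn(image)
    new_image = generate_fn(caption)
    # Order matches the Gradio outputs list: [caption, image_output].
    return [caption, new_image]

# Stand-ins for the real endpoint calls (assumptions for offline testing).
def fake_captioner(image):
    return f"caption for {image}"

def fake_generate(prompt):
    return f"image from '{prompt}'"

print(caption_and_generate_with(fake_captioner, fake_generate, "photo.png"))
```

The returned list must follow the same order as the `outputs` argument of `btn_all.click`, since Gradio assigns return values to output components by position.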




In [None]:
gr.close_all()

Closing server running on port: 7863
