#**Image Understanding with GPT-4 model**
- Ask a question about an image.
- The model interprets the image and generates a response.

###**Install Dependencies**

In [22]:
!pip install openai gradio pillow



###**Import Statements**

In [23]:
import gradio as gr
from openai import OpenAI
import base64
from io import BytesIO
from PIL import Image

###**Retrieve API Key from Secrets and Set It as an Environment Variable**

In [11]:
# Retrieve the API key from Colab's secrets
from google.colab import userdata
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
imgbb_api_key = userdata.get('imgbb_api_key')

In [12]:
# Expose OPENAI_API_KEY as an environment variable so the OpenAI client can read it
import os
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
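
The cells above only work inside Colab, because `google.colab` is not importable elsewhere. As a minimal sketch, the lookup could fall back to an ordinary environment variable when the notebook runs outside Colab. The helper name `get_secret` is hypothetical and is not part of the original notebook.

```python
import os


def get_secret(name: str) -> str:
    """Return a secret from Colab's store, falling back to an environment variable.

    Hypothetical helper for illustration; not part of any library.
    """
    try:
        # google.colab is only importable inside a Colab runtime
        from google.colab import userdata
    except ImportError:
        userdata = None

    value = None
    if userdata is not None:
        try:
            value = userdata.get(name)
        except Exception:
            # Secret missing or notebook access not granted; try the environment next
            value = None

    if not value:
        value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Secret {name!r} was not found in Colab secrets or the environment"
        )
    return value
```

With this in place, `os.environ['OPENAI_API_KEY'] = get_secret('OPENAI_API_KEY')` would behave the same in Colab and raise a clear error anywhere the key is missing.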

###**Create OpenAI Client**

In [13]:
from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment
client = OpenAI()

##**Optional: Response Generation**

In [24]:


response = client.responses.create(
    model="gpt-4.1-mini",
    input=[{
        "role": "user",
        "content": [
            {"type": "input_text", "text": "what's in this image?"},
            {
                "type": "input_image",
                "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
            },
        ],
    }],
)

print(response.output_text)

This image shows a wooden boardwalk path running through a lush green grassy field. The grass on either side of the path is tall and vibrant. In the background, there are some bushes and trees scattered along the horizon. Above is a blue sky with wispy clouds spread across it, suggesting a clear and calm day. The scene appears to be a natural or park-like setting, evoking a peaceful and serene atmosphere.
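
The request above pairs one `input_text` part with one `input_image` part inside a single user message. A small sketch of a builder for that message shape follows; the function name `build_image_message` is an assumption made for illustration, and the dictionary layout simply mirrors the cell above.

```python
def build_image_message(question: str, image_url: str) -> dict:
    """Build one user message holding a text part and an image part.

    Hypothetical helper; the layout copies the request shown in the cell above.
    """
    if not question.strip():
        raise ValueError("question must not be empty")
    return {
        "role": "user",
        "content": [
            {"type": "input_text", "text": question},
            {"type": "input_image", "image_url": image_url},
        ],
    }


# Intended use (requires the `client` created earlier and a valid API key):
# response = client.responses.create(
#     model="gpt-4.1-mini",
#     input=[build_image_message("what's in this image?", "https://example.com/photo.jpg")],
# )
# print(response.output_text)
```

Keeping the message construction separate makes it easy to reuse the same shape for any question and image URL.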


##**Example 1: Gradio Image Q&A App without a Third-Party API**

In [30]:
def analyze_image(image, question):
    try:
        # Encode the PIL image as PNG bytes, then as a base64 string
        buffered = BytesIO()
        image.save(buffered, format="PNG")
        img_b64 = base64.b64encode(buffered.getvalue()).decode()

        # Chat Completions format: a text part plus an image part sent as a data URL
        response = client.chat.completions.create(
            model="gpt-4.1",
            messages=[
                {
                    "role": "user",
                    "content": [
                        {"type": "text", "text": question},
                        {
                            "type": "image_url",
                            "image_url": {
                                "url": f"data:image/png;base64,{img_b64}"
                            }
                        }
                    ],
                }
            ],
            max_tokens=500,
        )

        return response.choices[0].message.content

    except Exception as e:
        return f"❌ Error: {str(e)}"

# Gradio UI
iface = gr.Interface(
    fn=analyze_image,
    inputs=[
        gr.Image(type="pil", label="Upload an image"),
        gr.Textbox(label="Your question about the image")
    ],
    outputs="text",
    title="🖼️ Ask Anything About Your Image",
)

iface.launch()

It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://fe1a1539a5acaf8dad.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
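
Example 1 avoids any upload service by embedding the image in the request as a base64 data URL. The sketch below isolates that step using only the standard library, so it works on raw bytes rather than a PIL image. The names `to_data_url` and `base64_size` are hypothetical. Base64 output is about one third larger than the input, which is worth keeping in mind for large images.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Wrap raw image bytes in a data URL of the form used in Example 1."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def base64_size(num_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of num_bytes bytes."""
    # Every group of up to 3 input bytes becomes 4 output characters
    return 4 * ((num_bytes + 2) // 3)


# Intended use with a PIL image, as in analyze_image:
# buffered = BytesIO()
# image.save(buffered, format="PNG")
# url = to_data_url(buffered.getvalue())
```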




##**Example 2: Create an Image URL with a Third-Party API for Response Generation**

Get a free API key from ImgBB to upload the image and obtain a URL: https://api.imgbb.com/

In [29]:
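import os
import tempfile
import requests

def ask_image_question(image, question):
    # Reserve a temporary PNG path; the file is closed before Pillow writes to it
    with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
        tmp_path = tmp.name

    try:
        image.save(tmp_path)
        # Upload to ImgBB
        with open(tmp_path, "rb") as f:
            response = requests.post(
                "https://api.imgbb.com/1/upload",
                params={"key": imgbb_api_key},
                files={"image": f},
                timeout=30,
            )
    except requests.RequestException as e:
        return f"ImgBB upload failed: {e}"
    finally:
        # Remove the temporary file whether or not the upload succeeded
        os.remove(tmp_path)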

    if response.status_code != 200:
        return f"ImgBB upload failed: {response.text}"

    try:
        image_url = response.json()["data"]["url"]
    except Exception as e:
        return f"Failed to parse ImgBB response: {str(e)}"

    print("Uploaded image URL:", image_url)

    # Prepare message
    user_msg = {
        "role": "user",
        "content": [
            {"type": "input_text", "text": question},
            {"type": "input_image", "image_url": image_url}
        ]
    }

    # Call Vision model
    try:
        response = client.responses.create(
            model="gpt-4.1-mini",
            input=[user_msg]
        )
        return response.output_text
    except Exception as e:
        return f"OpenAI API error: {str(e)}"
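
The function above reads the hosted URL with `response.json()["data"]["url"]` and catches any exception. A slightly stricter sketch is shown here: it validates the parsed payload and reports what is missing. The helper name `extract_image_url` is hypothetical, and the only assumption about the upload response is the `data.url` field already used above.

```python
def extract_image_url(payload: dict) -> str:
    """Return the hosted image URL from a parsed upload response.

    Hypothetical helper; raises ValueError with a specific message when the
    expected `data.url` field is absent or is not an HTTP(S) URL.
    """
    data = payload.get("data")
    if not isinstance(data, dict):
        raise ValueError("response has no 'data' object")
    url = data.get("url")
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        raise ValueError("response has no usable 'data.url'")
    return url


# Intended use inside ask_image_question:
# try:
#     image_url = extract_image_url(response.json())
# except ValueError as e:
#     return f"Failed to parse ImgBB response: {e}"
```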

###**Gradio Interface**

In [31]:
# Gradio Interface
iface = gr.Interface(
    fn=ask_image_question,
    inputs=[
        gr.Image(type="pil", label="Upload an image"),
        gr.Textbox(label="Your question about the image")
    ],
    outputs=gr.Textbox(label="GPT-4 Vision answer"),
    title="🖼️ Ask Anything About Your Image",
    description="Upload an image and ask a question about it.",
)

if __name__ == "__main__":
    iface.launch()

It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://a82a623ef84ab65968.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
