# Our first Generative AI application

For this experiment, we will use the GPT-4-Vision multimodal model. We do not have direct access to the parameters of the model, so instead, we interact with it through an API.

This is a deep-learning-based model that takes two inputs: an image and some text describing a task or question. The model processes both and then outputs a text response.

For the moment, we won't go into the details (that is the subject of the whole course!), but we can start playing with it right away.

![GPT-4-Vision](https://media.licdn.com/dms/image/D5622AQFG8oaKzJdh-A/feedshare-shrink_800/0/1697201062338?e=2147483647&v=beta&t=KvgfdmBhdAsyxLi3tp_qnS99Xh3Hr8WbqXaMf0iD_ZI)

In [None]:
# You need to install the OpenAI library first. In a notebook, uncomment and run:
# !pip install --upgrade openai
from openai import OpenAI
import os

# This API key is provided by the professor. Please do not share it with anyone!
os.environ["OPENAI_API_KEY"] = "sk-..."  # paste the key here; never publish a real key

client = OpenAI()

Let's define a function to interact with the GPT-4-Vision API.

It takes two inputs:
* `image_url`: the URL of the image
* `prompt`: the text prompt with our query

In [None]:
def compute_response(image_url, prompt):
    """Send an image URL and a text prompt to GPT-4-Vision and return the text of its answer."""
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        # The API expects the image as an object with a "url" field
                        "image_url": {"url": image_url},
                    },
                ],
            }
        ],
        max_tokens=500,
    )
    return response.choices[0].message.content
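
The function above expects an image that is reachable through a URL. For a photo stored on your own computer, one option is to encode the file as a base64 data URL and pass that string in place of the web URL. The following is a minimal sketch, not part of the original notebook: the helper name `to_data_url` is hypothetical, and it assumes the API accepts data URLs in the `image_url` field, as OpenAI's vision guide describes.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path):
    """Encode a local image file as a base64 data URL (hypothetical helper)."""
    path = Path(path)
    # Guess the MIME type from the file extension.
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None:
        # Assumption: fall back to JPEG when the extension is unknown.
        mime_type = "image/jpeg"
    # Base64-encode the raw bytes and wrap them in the data URL scheme.
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

With this helper, a call could look like `compute_response(to_data_url("my_photo.jpg"), prompt)`, where `my_photo.jpg` is a placeholder file name.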

In [None]:
from IPython.display import Image, display
display(Image(url='https://pbs.twimg.com/media/FrSGyS9WIAMLGbe?format=jpg&name=small', width=500, unconfined=True))

In [None]:
image_url = "https://pbs.twimg.com/media/FrSGyS9WIAMLGbe?format=jpg&name=small"
prompt = "Describe what is in this photo"

response = compute_response(image_url, prompt)

In [None]:
print(response)

## Creating a photo2recipe app

We just have to change the prompt!

In [None]:
recipe_prompt = "Based on the ingredients you can see in the photo, please write a recipe for some dish."

recipe_answer = compute_response(image_url, recipe_prompt)

In [None]:
print(recipe_answer)
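
Since each "app" here is just a fixed prompt plus the same API call, the idea can be sketched as a small factory function. This is only an illustration, not part of the original notebook: `make_app` and `echo_ask` are hypothetical names, and `echo_ask` is an offline stand-in for `compute_response` so that the sketch runs without an API key.

```python
def make_app(task_prompt, ask):
    """Return a function that maps an image URL to the model's answer for one fixed task."""
    def app(image_url):
        # Every call reuses the same prompt; only the image changes.
        return ask(image_url, task_prompt)
    return app


def echo_ask(image_url, prompt):
    # Offline stand-in for compute_response: it only echoes its inputs.
    return f"[{prompt}] <- {image_url}"


# A hypothetical captioning app built from a different prompt.
photo2caption = make_app("Write a one-sentence caption for this photo.", echo_ask)
print(photo2caption("https://example.com/photo.jpg"))
```

In the notebook itself, `photo2recipe = make_app(recipe_prompt, compute_response)` would give a function that turns any image URL into a recipe.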

Let's try with another photo:

In [None]:
from IPython.display import Image, display
display(Image(url='https://images.pexels.com/photos/4443433/pexels-photo-4443433.jpeg?cs=srgb&dl=pexels-polina-tankilevitch-4443433.jpg&fm=jpg', width=500, unconfined=True))

In [None]:
image_url = "https://images.pexels.com/photos/4443433/pexels-photo-4443433.jpeg?cs=srgb&dl=pexels-polina-tankilevitch-4443433.jpg&fm=jpg"

recipe_answer = compute_response(image_url, recipe_prompt)

In [None]:
print(recipe_answer)

## Can you think of other applications?

For more details, you can:
* Read the GPT-4V(ision) system card: https://cdn.openai.com/papers/GPTV_System_Card.pdf
