# OpenAI Vision with GPT-4o

Copyright 2024 Denis Rothman

**June 26, 2024 update:** `gpt-4-vision-preview` upgraded to `gpt-4o`

[OpenAI Vision](https://platform.openai.com/docs/guides/vision)

Example 1: A standard image and text

Example 2: Divergent Semantic Association, moderate divergence

Example 3: Divergent Semantic Association, high divergence

In [None]:
#model="gpt-4-vision-preview"
vmodel="gpt-4o"

In [None]:
# API key
# Store your key in a file and read it. (You can type it directly in the notebook, but it will be visible to anyone next to you.)
from google.colab import drive
drive.mount('/content/drive')
# Read the first line and strip the trailing newline, which would otherwise become part of the key
with open("drive/MyDrive/files/api_key.txt", "r") as f:
  API_KEY = f.readline().strip()

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
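
Outside Colab, the same key could come from an environment variable or a local file. The following is a minimal sketch, not part of the original notebook: the helper name `load_api_key` and its parameters are assumptions made for illustration. It prefers the `OPENAI_API_KEY` environment variable and falls back to the first line of a file, stripping surrounding whitespace in both cases.

```python
import os

def load_api_key(path, env_var="OPENAI_API_KEY"):
    """Return the key from the environment if set, else from the first line of a file.

    Hypothetical helper: the name and signature are illustrative only.
    """
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    with open(path, "r", encoding="utf-8") as f:
        # readline() keeps the trailing newline, so strip it before use
        return f.readline().strip()
```

In the notebook, `API_KEY = load_api_key("drive/MyDrive/files/api_key.txt")` would take the place of the manual file read.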


In [None]:
# Importing openai (installing it first if it is missing)
try:
  import openai
except ImportError:
  !pip install openai
  import openai

Collecting openai
  Downloading openai-1.35.4-py3-none-any.whl (327 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.5-py3-none-any.whl (77 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->openai)
  Downloading h11-0.14.0-py3-none-any.whl (58 kB)
Installing collected packages: h11, httpcore, httpx, openai
Successfully installed h11-0.14.0 httpcore-1.0.5 ht

In [None]:
# The OpenAI key
import os
os.environ['OPENAI_API_KEY'] = API_KEY  # OpenAI() reads the key from this environment variable
openai.api_key = os.getenv("OPENAI_API_KEY")

# Example 1: A standard image and text


In [None]:
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
  model=vmodel,
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
          },
        },
      ],
    }
  ],
  max_tokens=300,
)

In [None]:
import textwrap

choice = response.choices[0]
response_content = choice.message.content  # Access the content attribute

# Use textwrap to format the text into a paragraph
formatted_response = textwrap.fill(response_content, width=80)

# Printing the formatted response
print(formatted_response)

This image depicts a scenic landscape with a wooden boardwalk path running
through a grassy field. The sky above is blue with scattered clouds, and there
are trees and shrubs in the background. The setting appears to be a natural or
park area, offering a peaceful and serene view.
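
Because the request sets `max_tokens=300`, a long description can be cut off. The Chat Completions API reports this through `finish_reason`, which is `"length"` when the token limit ended the reply. The sketch below shows one way to surface that while formatting; the helper name `format_choice` and the truncation note are assumptions, and the `demo` object is only a stand-in shaped like `response.choices[0]`.

```python
import textwrap
from types import SimpleNamespace

def format_choice(choice, width=80):
    """Wrap the reply text and append a note if max_tokens cut it short.

    Hypothetical helper: the name and the note text are illustrative only.
    """
    text = choice.message.content or ""
    wrapped = textwrap.fill(text, width=width)
    if choice.finish_reason == "length":
        wrapped += "\n[truncated: max_tokens reached]"
    return wrapped

# Stand-in for response.choices[0], used here so the sketch runs without an API call
demo = SimpleNamespace(
    message=SimpleNamespace(content="A boardwalk through a grassy field."),
    finish_reason="length",
)
print(format_choice(demo))
```

With a real response, `print(format_choice(response.choices[0]))` would replace the formatting cell.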


In [None]:
from IPython.display import Image, display

# URL of the image
image_url =  "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"

# Display the image
display(Image(url=image_url))
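
The `image_url` object in the request also accepts an optional `detail` field (`"low"`, `"high"`, or `"auto"`) that controls how much image resolution the model processes. As a hedged sketch, the payload could be built by a small function; `build_vision_messages` is a hypothetical name introduced here, not something the notebook defines.

```python
def build_vision_messages(prompt, image_url, detail="auto"):
    """Build the messages list for a single text + image request.

    Hypothetical helper; `detail` must be one of "low", "high", or "auto".
    """
    if detail not in ("low", "high", "auto"):
        raise ValueError("detail must be 'low', 'high', or 'auto'")
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "image_url",
                    "image_url": {"url": image_url, "detail": detail},
                },
            ],
        }
    ]
```

The result can be passed directly as `messages=` to `client.chat.completions.create`.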

# Example 2: Divergent Semantic Association, moderate divergence



In [None]:
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
  model=vmodel,
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://raw.githubusercontent.com/Denis2054/Transformers_3rd_Edition/master/Chapter16/dog.png",
          },
        },
      ],
    }
  ],
  max_tokens=300,
)

In [None]:
import textwrap

choice = response.choices[0]
response_content = choice.message.content  # Access the content attribute

# Use textwrap to format the text into a paragraph
formatted_response = textwrap.fill(response_content, width=80)

# Printing the formatted response
print(formatted_response)

The image shows a highly stylized and abstract depiction of an animal, likely a
fox or a dog. The artwork is composed of swirling, vibrant colors and intricate
patterns, creating a dynamic and expressive visual effect. The background is
dark, which helps to highlight the colorful and flowing elements of the animal's
form, making it stand out. The use of spirals and fluid lines gives the image a
sense of movement and energy.


In [None]:
from IPython.display import Image, display

# URL of the image
image_url = "https://raw.githubusercontent.com/Denis2054/Transformers_3rd_Edition/master/Chapter16/dog.png"

# Display the image
display(Image(url=image_url))

# Example 3: Divergent Semantic Association, high divergence


In [None]:
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
  model=vmodel,
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://raw.githubusercontent.com/Denis2054/Transformers_3rd_Edition/master/Chapter16/D4.png",
          },
        },
      ],
    }
  ],
  max_tokens=300,
)

In [None]:
import textwrap

choice = response.choices[0]
response_content = choice.message.content  # Access the content attribute

# Use textwrap to format the text into a paragraph
formatted_response = textwrap.fill(response_content, width=80)

# Printing the formatted response
print(formatted_response)

The image features an abstract and colorful depiction of a dog's head. The
composition is created using swirling patterns and vibrant colors that flow
together to form the recognizable shape of the dog's face and features. The
design elements include a mix of spirals, curves, and loops that give it an
artistic and dynamic feel.


In [None]:
from IPython.display import Image, display

# URL of the image
image_url = "https://raw.githubusercontent.com/Denis2054/Transformers_3rd_Edition/master/Chapter16/D4.png"

# Display the image
display(Image(url=image_url))
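
The three examples send the same request and change only the image URL. One possible way to avoid repeating the cell is a small wrapper; this is a sketch under assumptions, and `describe_image` is a hypothetical name rather than part of the OpenAI library or the original notebook.

```python
def describe_image(client, model, image_url,
                   prompt="What’s in this image?", max_tokens=300):
    """Send one text + image request and return the reply text.

    Hypothetical wrapper around client.chat.completions.create.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        max_tokens=max_tokens,
    )
    return response.choices[0].message.content

# Usage with the notebook's own names (requires a valid API key):
# print(describe_image(client, vmodel, image_url))
```

Passing the client and model as arguments keeps the function independent of notebook globals, so the same call works for each of the three images.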