# Prompting Experiment

## Setup

In [2]:
%conda install openai

Channels:
 - defaults
Platform: osx-arm64
Collecting package metadata (repodata.json): done
Solving environment: done

# All requested packages already installed.


Note: you may need to restart the kernel to use updated packages.


In [8]:
import base64
import json
import os
import numpy as np
from dotenv import load_dotenv
from openai import OpenAI
from PIL import Image

In [36]:
# Load environment variables
load_dotenv()

# Create OpenAI client
client = OpenAI()

# Helper function to get completion
def get_completion(messages, model="gpt-4o", temperature=0.7, max_tokens=100):
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
        max_tokens=max_tokens
    )
    return response.choices[0].message.content

## Examples

In [10]:
# Get image count
image_count = len(os.listdir("images"))

# Load examples
with open('examples.json') as f:
    examples = json.load(f)

# Get example count
example_count = len(examples)

# Get average word count
word_counts = [len(example["desc"].split()) for example in examples.values()]
avg_word_count = np.mean(word_counts)

# Print stats
print(f"Image count: {image_count}")
print(f"Example count: {example_count}")
print(f"Average word count: {avg_word_count}")

Image count: 20
Example count: 20
Average word count: 21.25
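
The average alone does not show how much the reference descriptions vary in length, which matters when choosing a word limit for the prompt. A minimal sketch of a fuller summary follows; `summarize_lengths` and the sample strings are hypothetical and not part of the original notebook.

```python
import statistics

def summarize_lengths(descriptions):
    """Return min, median, mean and max word counts for a list of descriptions."""
    counts = [len(text.split()) for text in descriptions]
    return {
        "min": min(counts),
        "median": statistics.median(counts),
        "mean": statistics.mean(counts),
        "max": max(counts),
    }

# Hypothetical sample data, standing in for the "desc" values in examples.json
sample = [
    "Of cylindrical form, carved with a figural scene.",
    "A dark wood table.",
]
print(summarize_lengths(sample))
```

With the real data, the call would be `summarize_lengths([e["desc"] for e in examples.values()])`.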


## Prompting

### Prepare images

In [11]:
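def resize_image(image_path, size=(512, 512)):
    """Shrink the image to fit within `size`, keeping its aspect ratio.

    Note: the resized image overwrites the original file.
    """
    with Image.open(image_path) as img:
        img.thumbnail(size, Image.Resampling.LANCZOS)
        img.save(image_path)

def get_image_format(image_path):
    """Return the lowercase format name detected by Pillow, e.g. "jpeg" or "png"."""
    with Image.open(image_path) as img:
        return img.format.lower()

def to_base64(image_path):
    """Read the file and return its contents as a base64-encoded string."""
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

def preprocess_image(image_path):
    """Resize the image on disk, then return its format and base64 payload."""
    resize_image(image_path)
    return {"format": get_image_format(image_path), "b64": to_base64(image_path)}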

In [12]:
processed_images = []

for example in examples.values():
    processed_image = preprocess_image(example["path"])
    processed_image["desc"] = example["desc"]
    processed_images.append(processed_image)

### Prepare messages

In [25]:
def create_prompt(image_format: str, image_b64: str, max_words: int=30):
    return {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": f"Write a description for this image. The description should be within {max_words} words."
            }, {
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/{image_format};base64,{image_b64}"
                }
            }
        ]
    }

def create_messages(references: list, image_item: dict, max_words: int=30):
    messages = []
    messages.append({
        "role": "system",
        "content": "You are an assistant writing image descriptions. Your task is to write descriptions for provided images in a consistent style. Your response should be concise and informative. Your tone should be neutral and professional."
    })
    for ref in references:
        messages.append(create_prompt(ref["format"], ref["b64"], max_words=max_words))
        messages.append({"role": "assistant", "content": ref["desc"]})
    messages.append(create_prompt(image_item["format"], image_item["b64"], max_words=max_words))
    return messages
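
`create_messages` takes a list of reference items (used as few-shot examples) and one target item. The references must not include the target, otherwise the model sees the answer it is asked to produce. A minimal sketch of picking them automatically follows; `pick_references`, its parameters, and the placeholder items are hypothetical.

```python
import random

def pick_references(items, target_index, k=3, seed=0):
    """Pick k reference items at random, never including the target itself."""
    candidates = [item for i, item in enumerate(items) if i != target_index]
    if k > len(candidates):
        raise ValueError("not enough items to pick k references")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return rng.sample(candidates, k), items[target_index]

# Placeholder items standing in for the entries of processed_images
items = [{"desc": f"description {i}"} for i in range(10)]
references, image_item = pick_references(items, target_index=5, k=3)
print(len(references), image_item["desc"])
```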

### Get completion

In [49]:
references = processed_images[:5]
image_item = processed_images[5]
messages = create_messages(references, image_item)
response = get_completion(messages)

print(response)
print("---")
print(image_item["desc"])

A triangular black lacquer table featuring intricate carvings with geometric patterns and openwork spandrels, supported by three legs connected by stretchers.
---
The desktop of a trapezoid shape with a red marble panel inlaid to the center, then with a drawer opening at the front center, overall of dark wood color.
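
The prompt asks for a description within `max_words` words, while `max_tokens` only caps the number of tokens, so the word limit itself is not enforced by the API. A minimal sketch of checking it after the fact follows; `within_word_limit` is a hypothetical helper, and splitting on whitespace is an assumption about how words are counted.

```python
def within_word_limit(text, max_words=30):
    """Return the whitespace-separated word count and whether it respects the limit."""
    word_count = len(text.split())
    return word_count, word_count <= max_words

# The response printed above
response_text = (
    "A triangular black lacquer table featuring intricate carvings with "
    "geometric patterns and openwork spandrels, supported by three legs "
    "connected by stretchers."
)
print(within_word_limit(response_text))
```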


In [41]:
references = [processed_images[9], processed_images[12], processed_images[13]]
image_item = processed_images[14]
messages = create_messages(references, image_item, max_words=100)
response = get_completion(messages, temperature=1, max_tokens=300)

print(response)
print("---")
print(image_item["desc"])

Carved as a plaque with an openwork design of dragons and ruyi clouds, worked from a pale green jade stone, surmounted by a red bead and adorned with a white jade bead necklace.
---
Overall likely heart shape, carved two Chi dragons on top and bottom gazing at each other around the center aperture, the stone of overall grayish-white color.


In [45]:
references = [processed_images[17], processed_images[18]]
image_item = processed_images[19]
messages = create_messages(references, image_item)
response = get_completion(messages, temperature=0.5)

print(response)
print("---")
print(image_item["desc"])

Of cylindrical form, crafted from bamboo, featuring intricate carvings of a landscape scene with trees and mountains, with an inscription near the top.
---
Of slightly compressed cylindrical form, carved to the exterior with a figural scene and poetic phrases to the reverse.
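
The outputs above are compared with the reference descriptions by eye. As a rough numeric complement, word overlap between the two texts can be measured. The sketch below uses Jaccard similarity over lowercase word sets; this metric is an assumption chosen for simplicity, not the evaluation method of the original experiment, and `tokenize` and `jaccard_overlap` are hypothetical names.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard_overlap(generated, reference):
    """Return |A & B| / |A | B| for the word sets of the two texts (0.0 if both are empty)."""
    a, b = tokenize(generated), tokenize(reference)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

generated = (
    "Of cylindrical form, crafted from bamboo, featuring intricate carvings "
    "of a landscape scene with trees and mountains, with an inscription near the top."
)
reference = (
    "Of slightly compressed cylindrical form, carved to the exterior with a "
    "figural scene and poetic phrases to the reverse."
)
print(round(jaccard_overlap(generated, reference), 2))
```

A score like this only captures shared vocabulary, so it says little about whether the description is factually right about the object.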
