## Generate text descriptions of each Twitter post

First, create a conda environment:

```shell
conda create -n twit_img python=3.9 -y
conda activate twit_img
# nvcc --version  # Check the CUDA version
# conda install pytorch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 pytorch-cuda={cuda_version} -c pytorch -c nvidia
# Pillow is needed for the `from PIL import Image` import used below
conda install pandas pillow -y
conda install -n twit_img ipykernel --update-deps --force-reinstall -y
pip install --upgrade openai
# Windows only; on macOS/Linux use: export OPENAI_API_KEY="your_openai_api_key"
setx OPENAI_API_KEY "your_openai_api_key"
```

For OpenAI API set-up, follow [these instructions](https://platform.openai.com/docs/quickstart?context=python).

Restart the kernel and select the `twit_img` environment. The restart matters because `setx` only affects processes started after it runs.

In [71]:
#%env OPENAI_API_KEY=...
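
Before creating the client, it can help to confirm that the key is actually visible to the kernel. The following is a minimal sketch; `api_key_is_set` is a hypothetical helper, not part of the OpenAI library, and it only checks that the variable is present and non-empty, not that the key is valid.

```python
import os


def api_key_is_set(env=os.environ, name="OPENAI_API_KEY"):
    """Return True if `name` exists in `env` and is not blank."""
    return bool(env.get(name, "").strip())


if not api_key_is_set():
    # Assumption: without a key, creating the OpenAI client is expected to fail.
    print("OPENAI_API_KEY is not set in this kernel; restart it or use %env.")
```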

### Import packages

In [72]:
import os
import base64

from io import BytesIO
from PIL import Image
import pandas as pd

from openai import OpenAI
client = OpenAI()

In [73]:
# test api:
completion = client.chat.completions.create(
  model="gpt-4-vision-preview",
  messages=[
    {"role": "system", "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."},
    {"role": "user", "content": "Compose a poem that explains the concept of recursion in programming."}
  ]
)

print(completion.choices[0].message)

ChatCompletionMessage(content="In the realm of code, a curious notion,\nA technique with a cyclical motion,\nRecursion, it's called, with repetitive grace,\nA function that calls itself, an embrace.\n\nLike a mirror reflecting another,\nInside its body, it calls its own brother,\nA base case to end the infinite stream,\nWithout which, it's a never-ending dream.\n\nDown the rabbit hole, layer by layer,\nSolving problems with recursive prayer,\nFor fib sequence or factorial tales,\nIt dives into depths without any fails.\n\nAnd when the base case is finally met,\nIt climbs back up the recursive set,\nWith each step back, it unwinds the skein,\nCompiling solutions, again and again.\n\nUntil at last, at the very first call,\nReturns the answer, the sum of it all,\nA trick of the trade, a coders' convention,\nRecursion, a loop of good intention.", role='assistant', function_call=None, tool_calls=None)


### Set your prompt

In [80]:
system_prompt = (
    "You are a visual content analyst. Your task is to help me understand every "
    "image I provide by 1) describing the events or elements captured in the "
    "image in detail and 2) identifying and listing the main themes represented "
    "in the image. Focus on delivering a comprehensive narrative that encapsulates "
    "the essence of each image, providing insights into its context, significance, "
    "and the emotions it evokes."
    )

user_prompt = (
    "Concisely describe the events or elements captured in the image. "
    "Then, separately, list the main themes represented in the image as a "
    "numbered list introduced by the phrase 'Main themes:'."
    )

### Helper functions

In [75]:
MAX_SIZE = 512

def encode_image(image_path, max_image=MAX_SIZE):
    """
    Encodes an image to a base64 string after resizing it if any dimension exceeds `max_image`.
    
    Parameters:
    - image_path (str): The file path of the image to be encoded.
    - max_image (int): The maximum allowed dimension (width or height) for the image.
    
    Returns:
    - str: The base64 encoded string representation of the possibly resized image.
    """
    with Image.open(image_path) as img:
        width, height = img.size
        max_dim = max(width, height)
        if max_dim > max_image:
            scale_factor = max_image / max_dim
            img = img.resize((int(width * scale_factor), int(height * scale_factor)))

        buffered = BytesIO()
        img.save(buffered, format="PNG")
        return base64.b64encode(buffered.getvalue()).decode("utf-8")
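
The resize rule inside `encode_image` can be checked without opening any image. The sketch below isolates the arithmetic in a hypothetical helper, `scaled_size`, which is not part of the notebook; it mirrors the same `int()` truncation, so the longest side ends up at most `max_dim` and the aspect ratio is approximately preserved.

```python
def scaled_size(width, height, max_dim=512):
    """Return the (width, height) that encode_image would resize to."""
    longest = max(width, height)
    if longest <= max_dim:
        # Small images are left untouched
        return width, height
    scale = max_dim / longest
    return int(width * scale), int(height * scale)


print(scaled_size(2048, 1024))  # (512, 256)
print(scaled_size(300, 200))    # (300, 200)
```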

In [84]:
def describe_image(image_path):
    """
    Sends an encoded image to an AI model and returns the model's textual description of the image.
    
    This function first encodes the provided image to a base64 string and then sends this encoded image
    to an AI model via a specified API client. It prints the image being processed and returns the
    AI-generated description of the image.
    
    Parameters:
    - image_path (str): The file path of the image to be described.
    
    Returns:
    - str: The AI-generated description of the image.
    
    Note: This function assumes the existence of an 'encode_image' function that encodes the image to base64,
    and a 'client' object capable of interacting with the AI model's API.
    """
    
    print(f"Describing image: {image_path}")
    
    encoded_string = encode_image(image_path, MAX_SIZE)
    api_response = client.chat.completions.with_raw_response.create(
        model="gpt-4-vision-preview",
        messages=[
            {"role": "system", "content": system_prompt},
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": user_prompt},
                    {
                        "type": "image_url",
                        # encode_image() always saves as PNG, so label the payload as PNG
                        "image_url": {"url":
                            f"data:image/png;base64,{encoded_string}"},
                    },
                ],
            },
        ],
        max_tokens=400,
    )
    debug_sent = api_response.http_request.content

    chat_completion = api_response.parse()
    
    return chat_completion.choices[0].message.content

def parse_gpt4_output(output: str):
    """
    Parses the output from a GPT-4 model to separate the image description from the main themes.
    
    This function assumes the GPT-4 output is formatted with the image description followed by "Main themes:"
    and the themes themselves. It splits the output to extract the description and themes, handling cases
    where themes might not be provided.
    
    Parameters:
    - output (str): The textual output from GPT-4, including the image description and possibly main themes.
    
    Returns:
    - tuple: A tuple containing the image description and themes as two separate strings. If no themes are found,
    the second element of the tuple will be an empty string.
    """
    image_description = output.split("Main themes:")[0].strip()
    
    try:
        image_themes = output.split("Main themes:")[1].strip()
    except IndexError:
        print("No themes found")
        image_themes = ""
    
    return image_description, image_themes
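
If the themes are needed as individual items rather than one string, the same split can go one step further. This is a sketch under assumptions: `split_description_and_themes` is a hypothetical variant of `parse_gpt4_output`, and it assumes the model numbers themes as `1.` or `1)`, one per line, which the prompt requests but the model is not guaranteed to follow.

```python
import re


def split_description_and_themes(output, marker="Main themes:"):
    """Split model output into (description, list_of_themes)."""
    # partition() returns empty strings after the marker if it is missing
    description, _, themes_text = output.partition(marker)
    themes = []
    for line in themes_text.splitlines():
        # Drop leading numbering such as "1." or "2)"
        cleaned = re.sub(r"^\s*\d+[.)]\s*", "", line).strip()
        if cleaned:
            themes.append(cleaned)
    return description.strip(), themes


sample = "A person and a cat.\n\nMain themes:\n1. Humor\n2. Serenity"
print(split_description_and_themes(sample))
```

When the marker is absent, the second element is an empty list, which plays the same role as the empty string returned by `parse_gpt4_output`.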


In [86]:
def save_dataframe_to_csv(dataframe, filename):
    """
    Saves a DataFrame to a CSV file without the index.
    
    Parameters:
    - dataframe (pd.DataFrame): The DataFrame to save.
    - filename (str): The name of the file to save the DataFrame to.
    """
    dataframe.to_csv(filename, index=False)


def add_prediction_to_dataframe(df, image_path, prediction):
    """
    Parses the GPT-4 output and adds the image path, description, and themes to the DataFrame.
    
    Parameters:
    - df (pd.DataFrame): The DataFrame to add the prediction to.
    - image_path (str): The path of the image being described.
    - prediction (str): The GPT-4 output containing the image description and themes.
    
    Returns:
    - pd.DataFrame: The updated DataFrame with the new prediction added.
    """
    image_description, image_themes = parse_gpt4_output(prediction)
    
    # Create a new row
    new_row = {"image_path": image_path,
            "image_description": image_description,
            "image_themes": image_themes}
    
    df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
    
    return df

def analyze_images(image_folder):
    """
    Analyzes images in a given folder, saves their descriptions and themes to a DataFrame, and exports the DataFrame to a CSV file.
    
    Parameters:
    - image_folder (str): The path to the folder containing images to analyze.
    """
    csv_file = "dataframe_" + os.path.basename(image_folder.replace(" ", "_")) + ".csv"
    df = pd.DataFrame(columns=["image_path", "image_description", "image_themes"])
    
    for image in os.listdir(image_folder):
        _, ext = os.path.splitext(image)
        if ext.lower() not in ['.png', '.jpeg', '.jpg']:  # Skip files that are not images
            continue
        image_path = os.path.join(image_folder, image)
        try:
            prediction = describe_image(image_path)
            df = add_prediction_to_dataframe(df, image_path, prediction)
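        except Exception as e:
            print(f"Error processing image: {image_path}, Error: {e}")
            # Move the failing file aside; create the folder first so os.rename does not fail
            os.makedirs("error_images", exist_ok=True)
            os.rename(image_path, os.path.join("error_images", image))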
            
    save_dataframe_to_csv(df, csv_file)
    print(f"Dataframe saved to {csv_file}")
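
Because `analyze_images` writes the CSV only after the loop finishes, an interrupted run loses every description collected so far. One possible alternative is to append each row as soon as it is available and skip already-processed paths on the next run. The sketch below uses only the standard library; `load_processed_paths` and `append_row` are hypothetical helpers, not part of the notebook, and the column names are taken from the DataFrame above.

```python
import csv
import os

FIELDS = ["image_path", "image_description", "image_themes"]


def load_processed_paths(csv_file):
    """Return the set of image paths already recorded in csv_file."""
    if not os.path.exists(csv_file):
        return set()
    with open(csv_file, newline="", encoding="utf-8") as f:
        return {row["image_path"] for row in csv.DictReader(f)}


def append_row(csv_file, row):
    """Append one result row, writing the header if the file is new or empty."""
    is_new = not os.path.exists(csv_file) or os.path.getsize(csv_file) == 0
    with open(csv_file, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

Inside the loop, this would amount to skipping any `image_path` found in `load_processed_paths(csv_file)` and calling `append_row` right after each successful `describe_image` call.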

### Test one image

In [85]:
image_file = r"f:\NonEnclosureProjects\twitter_image_analysis\Pixs\White Lives Matter (Post)\EZn_zU1WAAICt_q (1).jpeg"
parse_gpt4_output(describe_image(image_file))

Describing image: f:\NonEnclosureProjects\twitter_image_analysis\Pixs\White Lives Matter (Post)\EZn_zU1WAAICt_q (1).jpeg
The image is a 2x2 grid of photos comparing a person and a cat. In the top left photo, the person is looking to the left with a neutral expression. In the top right photo, a ginger cat is gazing directly at the camera with a similar neutral expression. In the bottom left photo, the cat appears again with its eyes nearly closed in a content or relaxed expression. Correspondingly, the bottom right photo features the person with eyes closed and a serene expression, echoing the cat's demeanor.

Main themes:
1. Human-animal similarities
2. Humor through mimicry
3. Expressive comparison
4. Serenity and relaxation


("The image is a 2x2 grid of photos comparing a person and a cat. In the top left photo, the person is looking to the left with a neutral expression. In the top right photo, a ginger cat is gazing directly at the camera with a similar neutral expression. In the bottom left photo, the cat appears again with its eyes nearly closed in a content or relaxed expression. Correspondingly, the bottom right photo features the person with eyes closed and a serene expression, echoing the cat's demeanor.",
 '1. Human-animal similarities\n2. Humor through mimicry\n3. Expressive comparison\n4. Serenity and relaxation')

### Test directory of images

In [87]:
# analyze_images(r"f:\NonEnclosureProjects\twitter_image_analysis\Pixs\White Lives Matter (Post)")
analyze_images(r"f:\NonEnclosureProjects\twitter_image_analysis\Pixs\White Lives Matter (Pre)")

Describing image: f:\NonEnclosureProjects\twitter_image_analysis\Pixs\White Lives Matter (Pre)\Screenshot 2024-04-05 at 10.28.30 AM.png


KeyboardInterrupt: 