## LLM Emotion + Liking Rating & Explanation for Paintings

### Model information: Gemini 2.5 Flash
### Two prompts are used: a simple prompt, and an enhanced prompt that adds rating-score anchors and explicitly asks the model to attend to low-level visual features
### Outputs: "emotion_output.csv" from the simple prompt and "emotion_output_enhanced.csv" from the enhanced prompt

In [None]:
#import libraries

from google import genai
from google.genai import types
from google.api_core import exceptions as google_exceptions


import os
import re
import time
import random
import logging, sys
from pathlib import Path
import mimetypes

import pandas as pd
from tqdm import tqdm
from PIL import Image
from functools import reduce


%run "api_key.ipynb" #import API key


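logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,  # the root logger defaults to WARNING, which would hide the logging.info progress messages
    format="%(asctime)s [%(levelname)s] %(message)s",
    datefmt="%H:%M:%S")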

Exception: File `'api_key.ipynb'` not found.

### Testing with prompts (numerical evaluation + text response explanation) for 20 images with Gemini
#### Each emotion gets its own prompt and model call, plus one additional call for the aesthetic (liking) rating.

In [None]:
# Configurations
IMAGE_FOLDER = Path("/Users/Stella/Desktop/EmotionArt/emotion-art/data/40_test")
OUTPUT_CSV   = Path("emotion_model_output.csv")
MODEL_NAME   = "gemini-2.5-flash"  
MAX_RETRIES  = 6
BACKOFF_BASE = 5.0  # seconds
DELAY_BETWEEN_CALLS = 0.5  # throttle to avoid hitting rate limits
BATCH_SIZE = 10
MAX_OUTPUT = 2049

# Accept only known image extensions so hidden files such as .DS_Store are skipped
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".tiff"}

#init client 
client = genai.Client(api_key = api_key)


def is_image_file(path: Path):
    return path.is_file() and path.suffix.lower() in IMAGE_EXTS and not path.name.startswith(".")

# preload & filter image paths
all_images = sorted(p for p in IMAGE_FOLDER.iterdir() if is_image_file(p))
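
A minimal sketch of how this filter behaves, run against a throwaway temporary folder. The file names and the `list_images` helper are hypothetical and only illustrate the rule: a known extension is required, and dot-prefixed files are skipped even when they carry an image extension.

```python
import tempfile
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".tiff"}

def is_image_file(path: Path) -> bool:
    # Same rule as above: regular file, known extension, not hidden
    return path.is_file() and path.suffix.lower() in IMAGE_EXTS and not path.name.startswith(".")

def list_images(folder: Path) -> list[str]:
    # Hypothetical helper: sorted names of the files that pass the filter
    return sorted(p.name for p in folder.iterdir() if is_image_file(p))

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Made-up file names: two images, two hidden files, one text file
    for name in ["b.PNG", "a.jpg", ".DS_Store", "._a.jpg", "notes.txt"]:
        (folder / name).touch()
    found = list_images(folder)

print(found)  # ['a.jpg', 'b.PNG']
```

The extension check is case-insensitive because of `suffix.lower()`, so `b.PNG` is kept.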

In [None]:
all_images

[PosixPath('/Users/Stella/Desktop/EmotionArt/emotion-art/ImagesAttentionFrame/10_test/1 1.47.27 AM.jpg'),
 PosixPath('/Users/Stella/Desktop/EmotionArt/emotion-art/ImagesAttentionFrame/10_test/a-grey-day-carmel.jpg'),
 PosixPath('/Users/Stella/Desktop/EmotionArt/emotion-art/ImagesAttentionFrame/10_test/alya-1964 1.47.27 AM.jpg'),
 PosixPath('/Users/Stella/Desktop/EmotionArt/emotion-art/ImagesAttentionFrame/10_test/coloured-wall-2003 1.47.27 AM.jpg'),
 PosixPath('/Users/Stella/Desktop/EmotionArt/emotion-art/ImagesAttentionFrame/10_test/explosion-lyrique-no-c-1918.jpg'),
 PosixPath('/Users/Stella/Desktop/EmotionArt/emotion-art/ImagesAttentionFrame/10_test/man-s-head.jpg'),
 PosixPath('/Users/Stella/Desktop/EmotionArt/emotion-art/ImagesAttentionFrame/10_test/movement-in-white-umber-and-cobalt-green-1950.png'),
 PosixPath('/Users/Stella/Desktop/EmotionArt/emotion-art/ImagesAttentionFrame/10_test/thursday-jeudi.jpg'),
 PosixPath('/Users/Stella/Desktop/EmotionArt/emotion-art/ImagesAttentionFr

In [None]:
#prompts
def build_emotion_prompt(emotion: str) -> str:
    return (
        f"You are an art expert describing your emotional response to a painting.\n"
        f"Evaluate **{emotion.lower()}** independently, without reference to any other feeling. "
        "Do not assume anything about other possible emotional reactions — focus only on this one emotion.\n\n"
        "Provide your response using the following structure:\n"
        "1. A **numeric score** between 0 and 100 (on a continuous scale — do not round to nearest 5 or 10 unless warranted)\n"
        "2. A **detailed explanation** supporting the reason behind the rating you provided. Please try to be as detailed as possible.\n\n"
        f"Use the format exactly:\n"
        f"{emotion}: [score]\n"
        "Explanation: ..."
    )

# Prompt dictionary keyed by label (six emotions plus "Liking"), used when looping over model calls later on:
prompt_dict = {emotion: build_emotion_prompt(emotion) for emotion in ["Joy", "Sadness", "Fear", "Anger", "Disgust", "Surprise"]}
prompt_dict["Liking"] = (
    "You are an art expert evaluating how much you like a painting.\n"
    "Rate your **personal aesthetic preference** for the painting, based only on what is visually presented.\n"
    "Provide your response using the following structure:\n"
    "1. A **numeric score** between 0 and 100 (on a continuous scale — do not round unless appropriate)\n"
    "2. A **detailed explanation** supporting the reason behind the rating you provided. Please try to be as detailed as possible.\n\n"
    "Use the format exactly:\n"
    "Liking: [score]\n"
    "Explanation: ..."
)


In [None]:
# Enhanced prompt that defines clear scale anchors (0, 50, 100)
def build_emotion_prompt_enhanced(emotion: str) -> str:
    return (
        "You are an expert art critic and psychologist, trained to assess a viewer's emotional response "
        f"to a painting. Focus **only** on **{emotion.lower()}**—do not blend in any other feeling.\n\n"
        "**Scale definition (0-100):**  \n"
        f"- **0** means “no sense of {emotion.lower()} at all.”  \n"
        "- **50** means “a moderate, everyday level—what most people might feel in a typical scene.”  \n"
        f"- **100** means “an overwhelming, emotionally extreme sense of {emotion.lower()}.”\n\n"
        "**Instructions:**  \n"
        "1. Look closely at composition, color palette, lighting, brushwork, subject matter, and style.  \n"
        "2. Compare what you see to the anchors above—if it's slightly more than “everyday,” pick something like 60-70; if it barely registers, choose 5-10.  \n"
        f"3. Avoid clustering at 50: if the painting truly feels neutral for “{emotion.lower()},” explain why and use exactly 50; otherwise pick a number that reflects the visual evidence.  \n"
        "4. If you choose above 85 or below 15, you must justify why it crosses into “extreme” territory.  \n"
        "5. **Write exactly five complete sentences** in your explanation—no more, no fewer.  \n\n"
        "Provide your response using the following structure:\n"
        "1. A **numeric score** between 0 and 100 (on a continuous scale — do not round unless appropriate)\n"
        "2. A **detailed explanation**: a description of the visual elements (e.g., “the high-contrast reds and jagged lines give a surge of …”) that led you to that score.\n\n"
        "**Output format (exactly):**  \n"
        f"{emotion}: [score]\n"
        "Explanation: ..."
    )

prompt_dict_enhanced = {
    emotion: build_emotion_prompt_enhanced(emotion)
    for emotion in ["Joy", "Sadness", "Fear", "Anger", "Disgust", "Surprise"]
}

prompt_dict_enhanced["Liking"] = (
    "You are an expert art critic rating your own **aesthetic preference** for a painting on a 0-100 scale.\n\n"
    "**Scale definition (0-100):**  \n"
    "- **0** means “I wouldn't want this in my home or collection.”  \n"
    "- **50** means “it's average—interesting but not memorable.”  \n"
    "- **100** means “I find it utterly compelling and would absolutely display it.”\n\n"
    "**Instructions:**  \n"
    "1. Consider composition, color harmony, technique, originality, and emotional impact on *you*.  \n"
    "2. Anchor your number to the scale above—if your preference is tepid, choose 30-40; if you love it, choose 80-95.  \n"
    "3. Avoid mid-range clustering—only use 50 if it truly feels neutral.  \n"
    "4. If you go above 90 or below 10, explain why it's so extremely likable or unlikable.  \n"
    "5. **Write exactly five complete sentences** in your explanation—no more, no fewer.  \n\n"
    "Provide your response using the following structure:\n"
    "1. A **numeric score** between 0 and 100 (on a continuous scale — do not round unless appropriate)\n"
    "2. A **detailed explanation** supporting the reason behind the rating you provided. Please try to be as detailed as possible.\n\n"
    f"**Output format (exactly):**  \n"
    "Liking: [score]\n"
    "Explanation: ..."
)
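
The enhanced prompts ask for exactly five sentences, so a rough compliance check on the returned explanations can be useful. The sketch below is an assumption, not part of the original pipeline: `count_sentences` is a hypothetical helper that splits on sentence-ending punctuation followed by whitespace, and it will miscount abbreviations such as "e.g. " that are followed by a space.

```python
import re

def count_sentences(explanation: str) -> int:
    """Naive sentence count: split after '.', '!' or '?' when whitespace follows."""
    parts = re.split(r"(?<=[.!?])\s+", explanation.strip())
    return len([p for p in parts if p])

# Made-up explanation text, used only to exercise the helper
sample = (
    "The palette is muted. The brushwork is loose! Is the horizon tilted? "
    "The figures are small. The overall mood is quiet."
)

n_sentences = count_sentences(sample)
print(n_sentences, n_sentences == 5)  # 5 True
```

Applied to the `*_explanation` columns, a count like this could flag responses that ignored the five-sentence instruction.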


In [None]:

# Regex to pull out the score and explanation
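# Capture a well-formed number only, so float() cannot fail on stray dots such as "..." or "1.2.3"
_SCORE_RE = re.compile(r"^[A-Za-z]+:\s*(\d+(?:\.\d+)?)", re.MULTILINE)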
_EXPL_RE  = re.compile(r"Explanation:\s*(.*)", re.DOTALL)

def parse_response(raw: str) -> tuple[float|None, str]:
    # 1) Coerce into a str
    if raw is None:
        text = ""
    elif isinstance(raw, bytes):
        text = raw.decode("utf-8", errors="ignore")
    else:
        text = str(raw)

    # 2) Extract score and explanation
    m_score = _SCORE_RE.search(text)
    m_expl  = _EXPL_RE.search(text)
    if not (m_score and m_expl):
        # fallback: return entire text as explanation
        return None, text.strip()

    score       = float(m_score.group(1))
    explanation = m_expl.group(1).strip()
    return score, explanation
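
To illustrate the parsing and its fallback, here is a standalone sketch with its own copies of similar patterns. `SCORE_RE`, `EXPL_RE` and `parse` are hypothetical names, and the sample responses are made up. It also shows one limitation: a label wrapped in markdown bold does not match, so the whole text comes back as the explanation with no score.

```python
import re

SCORE_RE = re.compile(r"^[A-Za-z]+:\s*(\d+(?:\.\d+)?)", re.MULTILINE)
EXPL_RE = re.compile(r"Explanation:\s*(.*)", re.DOTALL)

def parse(text: str):
    # Return (score, explanation); fall back to (None, full text) if either part is missing
    m_score = SCORE_RE.search(text)
    m_expl = EXPL_RE.search(text)
    if not (m_score and m_expl):
        return None, text.strip()
    return float(m_score.group(1)), m_expl.group(1).strip()

well_formed = "Sadness: 78\nExplanation: The palette is somber.\nIt feels heavy."
no_score = "I would rather not give a number for this painting."
bold_label = "**Sadness**: 78\nExplanation: Dark tones dominate."

print(parse(well_formed))  # (78.0, 'The palette is somber.\nIt feels heavy.')
print(parse(no_score))     # (None, 'I would rather not give a number for this painting.')
print(parse(bold_label))   # score is None because the label line starts with '**'
```

Rows with a missing score can then be inspected by hand instead of being silently dropped.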


In [None]:
# Identify mime types of each image to be processed in the model

def get_mime_type(image_path: str) -> str:
    # Guess type, default to 'application/octet-stream' but we want image types
    mime_type, _ = mimetypes.guess_type(image_path)
    # Fallback for common cases:
    if mime_type is None:
        ext = os.path.splitext(image_path)[1].lower()
        if ext in ['.jpg', '.jpeg']:
            mime_type = 'image/jpeg'
        elif ext == '.png':
            mime_type = 'image/png'
        else:
            mime_type = 'application/octet-stream'
    return mime_type
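
A short sketch of what the standard-library lookup returns for typical file names. `guess_image_mime` is a hypothetical, condensed variant of the function above that goes straight to the generic fallback.

```python
import mimetypes

def guess_image_mime(filename: str) -> str:
    # mimetypes.guess_type returns (type, encoding); type is None for unknown extensions
    mime_type, _ = mimetypes.guess_type(filename)
    return mime_type or "application/octet-stream"

for name in ["a-grey-day-carmel.jpg",
             "movement-in-white-umber-and-cobalt-green-1950.png",
             "no_extension"]:
    print(name, "->", guess_image_mime(name))
```

A file that resolves to `application/octet-stream` is worth checking before it is sent, since that type does not tell the API which image format it is receiving.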


In [None]:

# Wrapped call with retry/backoff
def emotion_model(image_path: str, prompt_text: str, emotion_label: str) -> dict:

    with open(image_path,"rb") as img_file:
        image_bytes = img_file.read()
        
    mime_type = get_mime_type(image_path)


    last_exc = None
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            resp = client.models.generate_content(
                model=MODEL_NAME,
                contents=[types.Part.from_bytes(data=image_bytes, mime_type=mime_type),
                          prompt_text],
                config = types.GenerateContentConfig(
                    max_output_tokens= MAX_OUTPUT
                )
            )

            # resp.text joins the text parts of the first candidate and is None
            # when the response has no candidates or no text parts
            raw = resp.text or ""

            score, explanation = parse_response(raw)

            return {
                "image": Path(image_path).name,
                f"{emotion_label.lower()}_rating":      score,
                f"{emotion_label.lower()}_explanation": explanation
            }
        
        except google_exceptions.ServiceUnavailable as e:
            last_exc = e
            backoff = min(BACKOFF_BASE*2**(attempt-1), 30) + random.random()
            logging.warning(f"503 overload (try {attempt}), sleeping {backoff:.1f}s")
            time.sleep(backoff)

        except Exception as e:
            last_exc = e
            logging.warning(f"Error on try {attempt}: {e}")
            time.sleep(min(BACKOFF_BASE*2**(attempt-1), 30) + random.random())

    # If we get here, every retry failed (503s or other errors).
    # Return an error record rather than raising, so one image does not abort the run.
    return {
        "image": Path(image_path).name,
        f"{emotion_label.lower()}_rating":      None,
        f"{emotion_label.lower()}_explanation": f"ERROR: failed after {MAX_RETRIES} tries: {last_exc}"
    }
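
The retry loop waits with capped exponential backoff plus up to one second of random jitter. The sketch below computes only the deterministic part of that schedule for the configured values. `BACKOFF_CAP` and `backoff_schedule` are hypothetical names; the cap stands in for the literal `30` used in the function.

```python
MAX_RETRIES = 6
BACKOFF_BASE = 5.0   # seconds
BACKOFF_CAP = 30.0   # hypothetical name for the literal 30 used in the retry loop

def backoff_schedule(retries: int = MAX_RETRIES) -> list[float]:
    # Wait before the next attempt: base * 2^(attempt-1), capped (jitter excluded)
    return [min(BACKOFF_BASE * 2 ** (attempt - 1), BACKOFF_CAP)
            for attempt in range(1, retries + 1)]

schedule = backoff_schedule()
print(schedule)       # [5.0, 10.0, 20.0, 30.0, 30.0, 30.0]
print(sum(schedule))  # 125.0 seconds of waiting in the worst case, before jitter
```

With these settings, a single image that keeps failing costs roughly two minutes of waiting, which is worth keeping in mind when sizing a run.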


In [None]:
#helper function to split a list into batches 
def chunker(seq, size):
    for i in range(0, len(seq), size):
        yield seq[i:i+size]
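
A quick illustration of the batching, using made-up file names. The last batch is simply shorter when the list length is not a multiple of the batch size, and the ceiling division gives the same batch count the generator produces.

```python
def chunker(seq, size):
    # Yield consecutive slices of at most `size` items
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

images = [f"img_{n}.jpg" for n in range(7)]  # hypothetical file names
batches = list(chunker(images, 3))
n_batches = (len(images) + 3 - 1) // 3       # ceiling division

print([len(b) for b in batches])  # [3, 3, 1]
print(n_batches)                  # 3
```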

In [None]:
def get_response_by_emotion(
    emotion_label: str,
    prompt_dict,
    batch_size: int = BATCH_SIZE
):
    """
    Runs emotion_model on all_images in batches for the given emotion,
    logs progress, and returns the results as a DataFrame to be merged later.
    """
    records = []
    total = len(all_images)
    total_batches = (total + batch_size - 1) // batch_size

    for b_idx, batch in enumerate(chunker(all_images, batch_size), start=1):
        logging.info(f"Starting {emotion_label} batch {b_idx}/{total_batches} (size={len(batch)})")

        for i, img_path in enumerate(batch, start=1):
            idx = (b_idx - 1) * batch_size + i
            name = img_path.name
            logging.info(f"[{idx}/{total}] {emotion_label}: Processing {name}")

            try:
                rec = emotion_model(str(img_path), prompt_dict[emotion_label], emotion_label)
                logging.info(f"→ {emotion_label} Success: {name} → rating={rec[f'{emotion_label.lower()}_rating']}")
            except Exception as e:
                logging.error(f"❌ {emotion_label} error on {name}: {e}", exc_info=True)
                rec = {
                    "image":               name,
                    f"{emotion_label.lower()}_rating":      None,
                    f"{emotion_label.lower()}_explanation": f"ERROR: {e}"
                }

            records.append(rec)
            time.sleep(DELAY_BETWEEN_CALLS)  # throttle between calls to stay under rate limits

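    # Collect the records into a DataFrame; the CSV is written after all emotions are merged
    output = pd.DataFrame(records)
    # Optional per-emotion CSV (disabled; `output_dir` would need to be defined first):
    #out_path = output_dir / f"{emotion_label.lower()}_ratings.csv"
    #output.to_csv(out_path, index=False)
    #logging.info(f"✅ Saved {emotion_label} results to {out_path}")
    print(f"✅ {emotion_label} results successfully produced")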
    return output

In [None]:
joy_output = get_response_by_emotion('Joy',  prompt_dict = prompt_dict)

KeyboardInterrupt: 

In [None]:
sadness_output = get_response_by_emotion('Sadness', prompt_dict = prompt_dict)

14:42:06 [INFO] Starting Sadness batch 1/1 (size=10)
14:42:06 [INFO] [1/10] Sadness: Processing 1 1.47.27 AM.jpg
14:42:06 [INFO] AFC is enabled with max remote calls: 10.
14:42:10 [INFO] HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
14:42:10 [INFO] AFC remote call 1 is done.
Sadness: 78
Explanation: The painting's somber color palette, dominated by deep reds, blues, and blacks, evokes a strong sense of sadness. The lack of clear definition or discernible imagery further contributes to this feeling, creating a sense of ambiguity and unease. The way the colors bleed into one another suggests a blurring of emotions, possibly representing grief or sorrow. The overall composition appears heavy and weighted down, adding to the overall melancholic tone.
14:42:10 [INFO] → Sadness Success: 1 1.47.27 AM.jpg → rating=78.0
14:42:10 [INFO] [2/10] Sadness: Processing a-grey-day-carmel.jpg
14:42:10 [INFO] AFC is enabled w

In [None]:
fear_output = get_response_by_emotion('Fear', prompt_dict = prompt_dict)

14:33:24 [INFO] Starting Fear batch 1/1 (size=10)
14:33:24 [INFO] [1/10] Fear: Processing 1 1.47.27 AM.jpg
14:33:24 [INFO] AFC is enabled with max remote calls: 10.
14:33:28 [INFO] HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
14:33:28 [INFO] AFC remote call 1 is done.
Fear: 10
Explanation: While the painting is somber and uses dark colors, which can sometimes evoke a sense of unease, it doesn't strongly trigger fear. The color blocks are static and lack sharp edges or dynamic elements that are more typically associated with fear-inducing imagery. The colors are not particularly threatening.

14:33:28 [INFO] → Fear Success: 1 1.47.27 AM.jpg → rating=10.0
14:33:28 [INFO] [2/10] Fear: Processing a-grey-day-carmel.jpg
14:33:28 [INFO] AFC is enabled with max remote calls: 10.
14:33:30 [INFO] HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200

In [None]:
anger_output = get_response_by_emotion('Anger', prompt_dict = prompt_dict)

14:34:39 [INFO] Starting Anger batch 1/1 (size=10)
14:34:39 [INFO] [1/10] Anger: Processing 1 1.47.27 AM.jpg
14:34:39 [INFO] AFC is enabled with max remote calls: 10.
14:34:43 [INFO] HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
14:34:43 [INFO] AFC remote call 1 is done.
Anger: 15
Explanation: The painting evokes a sense of solemnity and contemplation, rather than direct agitation or frustration. The deep reds and blues can suggest a weighty or somber mood, but not necessarily an angry one. The lack of sharp lines or aggressive brushstrokes further diffuses any immediate sense of anger, contributing to a more subdued feeling overall. The colors themselves are not inherently angry colors, and the hazy, soft edges are not associated with strong emotions such as anger.
14:34:43 [INFO] → Anger Success: 1 1.47.27 AM.jpg → rating=15.0
14:34:43 [INFO] [2/10] Anger: Processing a-grey-day-carmel.jpg
14:34:43 [INFO] 

In [None]:
disgust_output = get_response_by_emotion('Disgust', prompt_dict = prompt_dict)

14:35:35 [INFO] Starting Disgust batch 1/1 (size=10)
14:35:35 [INFO] [1/10] Disgust: Processing 1 1.47.27 AM.jpg
14:35:35 [INFO] AFC is enabled with max remote calls: 10.
14:35:39 [INFO] HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
14:35:39 [INFO] AFC remote call 1 is done.
Disgust: 15
Explanation: The painting's colors, a dark brown and blue, aren't colors that inherently evoke disgust. The composition is also quite simple. However, there is something about the murkiness and dullness that inspires slight unpleasantness. It's not overwhelming, but I'm not excited to look at it. This mild aversion translates into a low level of disgust. It lacks any sort of vibrancy that would make it interesting, and just feels like something I'd rather not stare at. The colors are also reminiscent of something rotting.
14:35:39 [INFO] → Disgust Success: 1 1.47.27 AM.jpg → rating=15.0
14:35:39 [INFO] [2/10] Disgust: Proces

In [None]:
surprise_output = get_response_by_emotion('Surprise', prompt_dict = prompt_dict)

14:36:21 [INFO] Starting Surprise batch 1/1 (size=10)
14:36:21 [INFO] [1/10] Surprise: Processing 1 1.47.27 AM.jpg
14:36:21 [INFO] AFC is enabled with max remote calls: 10.
14:36:25 [INFO] HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
14:36:25 [INFO] AFC remote call 1 is done.
Surprise: 20
Explanation: The painting has a very simple, minimalist composition consisting of three horizontal blocks of color, with a border of the same blue as the middle section encompassing the whole image. The colors are dark and somber with no unexpected or unusual elements. It is possible to feel surprise upon initial viewing because it is rather abstract, but after some contemplation, the painting doesn't provide anything particularly shocking or unusual, thereby reducing the degree of surprise it elicits.
14:36:25 [INFO] → Surprise Success: 1 1.47.27 AM.jpg → rating=20.0
14:36:25 [INFO] [2/10] Surprise: Processing a-grey-day

In [None]:
liking_output = get_response_by_emotion('Liking', prompt_dict = prompt_dict)

14:37:12 [INFO] Starting Liking batch 1/1 (size=10)
14:37:12 [INFO] [1/10] Liking: Processing 1 1.47.27 AM.jpg
14:37:12 [INFO] AFC is enabled with max remote calls: 10.
14:37:15 [INFO] HTTP Request: POST https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent "HTTP/1.1 200 OK"
14:37:15 [INFO] AFC remote call 1 is done.
Liking: 75
Explanation: The painting is a color field painting featuring three horizontal bands of color: a dark red-brown at the top, a medium blue in the middle, and a darker blue at the bottom. The painting has a slightly muted, somber feel. I find the color composition pleasing. The subtle variations in tone within each band add depth and visual interest. The overall effect is calming and contemplative, which I appreciate. I'm not giving it a higher score simply because while I think the painting works well, I wouldn't be excited to display it.
14:37:15 [INFO] → Liking Success: 1 1.47.27 AM.jpg → rating=75.0
14:37:15 [INFO] [2/10] Lik

In [None]:
#merge all emotion rating dfs into one and save to csv
combined_output_df = [joy_output, sadness_output, fear_output, anger_output, disgust_output, surprise_output, liking_output]
emotion_output = reduce(
    lambda left, right: pd.merge(left,right, on="image", how="outer"),
    combined_output_df
)

emotion_output.to_csv("../output/emotion_output.csv", index=False)
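
To make the merge step concrete, here is a small sketch with made-up ratings for two hypothetical images. It uses the same `reduce` over `pd.merge(..., on="image", how="outer")`, so an image missing from one frame stays in the result with `NaN` in that frame's columns.

```python
from functools import reduce
import pandas as pd

# Illustrative values only; these are not results from the model
joy = pd.DataFrame({"image": ["a.jpg", "b.jpg"], "joy_rating": [62.0, 18.0]})
sadness = pd.DataFrame({"image": ["a.jpg", "b.jpg"], "sadness_rating": [10.0, 74.0]})
fear = pd.DataFrame({"image": ["a.jpg"], "fear_rating": [5.0]})  # b.jpg is missing here

merged = reduce(
    lambda left, right: pd.merge(left, right, on="image", how="outer"),
    [joy, sadness, fear],
)
print(merged)
```

The outer join keeps one row per image, so a failed or missing call for one emotion does not drop that image's other ratings.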


### Use enhanced prompt

In [None]:
joy_output_enhanced = get_response_by_emotion('Joy', prompt_dict = prompt_dict_enhanced)



In [None]:
sadness_output_enhanced = get_response_by_emotion('Sadness', prompt_dict = prompt_dict_enhanced)



In [None]:
fear_output_enhanced = get_response_by_emotion('Fear', prompt_dict = prompt_dict_enhanced)



In [None]:
anger_output_enhanced = get_response_by_emotion('Anger', prompt_dict = prompt_dict_enhanced)

In [None]:
disgust_output_enhanced = get_response_by_emotion('Disgust', prompt_dict = prompt_dict_enhanced)

In [None]:
surprise_output_enhanced = get_response_by_emotion('Surprise', prompt_dict = prompt_dict_enhanced)



In [None]:
liking_output_enhanced = get_response_by_emotion('Liking', prompt_dict = prompt_dict_enhanced)



In [None]:
#merge all emotion rating dfs into one and save to csv
combined_output_df_enhanced = [joy_output_enhanced, sadness_output_enhanced, fear_output_enhanced, anger_output_enhanced, disgust_output_enhanced, surprise_output_enhanced, liking_output_enhanced]
emotion_output_enhanced = reduce(
    lambda left, right: pd.merge(left,right, on="image", how="outer"),
    combined_output_df_enhanced
)

emotion_output_enhanced.to_csv("../output/emotion_output_enhanced.csv", index=False)
