## Imager Demo

- This demo includes 10 sample images in the `images` directory.
- The code provides the following capabilities:
  - Extracting key metadata from the images, including:
    - Filename
    - File path
    - Image size, format, and mode
    - EXIF timestamps (`DateTime`, `DateTimeOriginal`, `DateTimeDigitized`)
    - Geolocation coordinates (from EXIF GPS data)
    - Other EXIF data
  - Generating a description and relevant tags for each image with Claude 3 Haiku
  - Searching the image collection by filename, metadata values, AI-generated descriptions, and tags

In [26]:
# Import required libraries
import os
import base64
from dotenv import load_dotenv
from anthropic import Anthropic, RateLimitError, BadRequestError

# Load environment variables from .env file
load_dotenv()

# Constants
API_KEY = os.getenv("CLAUDE_API_KEY")
MODEL_NAME = "claude-3-haiku-20240307"

# Verbose prompt for image description (overridden by the shorter prompt assigned below)
prompt = """I am in possession of an image that beckons for a description so rich, layered, and immersive that it transcends mere visual representation to evoke a multi-sensory, emotionally resonant experience. To fully capture the essence and intricacies of this image, I invite you to embark on a journey of meticulous observation, profound interpretation, and creative expression, considering the following aspects:

Comprehensive Observation and Documentation:
Begin by thoroughly examining every element of the image, documenting your observations with utmost specificity and detail. Describe the objects, people, animals, and any other components present, noting their appearance, qualities, and significance. If any text is present in the image, whether as captions, labels, signs, or integrated elements, carefully transcribe and analyze its content, font, style, and placement, considering how it contributes to the overall meaning and context of the image. Analyze the setting, background, and context of the scene, painting a vivid picture of the environment, atmosphere, and surroundings. Discuss the colors, lighting, shadows, and textures, exploring how they contribute to the overall visual impact, mood, and sensory experience of the image.

Composition, Perspective, and Visual Dynamics:
Study the composition of the image, considering the arrangement, balance, and interaction of elements within the frame. Examine the perspective, angle, and point of view from which the image is captured, and how this influences the viewer's perception and engagement with the scene. Analyze the visual dynamics, such as the use of lines, shapes, forms, and negative space, and how they guide the viewer's eye and create a sense of movement, depth, or tension within the image.

Actions, Interactions, and Narrative Implications:
Explore the actions, interactions, and relationships between the elements in the image, whether explicit or implied. Discuss the body language, expressions, and gestures of any characters present, interpreting the emotions, intentions, and dynamics they convey. Consider the potential narratives, stories, or events suggested by the scene, and how the image captures a moment in time that implies a larger context or sequence of actions.

Textual Elements and Their Significance:
If the image contains any text, dedicate attention to its analysis and interpretation. Examine the content, meaning, and tone of the text, and how it relates to the visual elements and overall theme of the image. Consider the choice of font, style, color, and placement of the text, and how these factors influence its impact and readability. Reflect on how the presence of text adds layers of information, context, or symbolism to the image, and how it guides the viewer's understanding and engagement with the work.

Sensory Evocation and Emotional Resonance:
Delve into the multi-sensory aspects of the image, going beyond the visual to describe the implied textures, scents, sounds, and even tastes that the scene conjures. Use vivid and evocative language to create a sense of immersion, inviting the reader to imagine the tactile sensations, olfactory notes, auditory ambiance, and gustatory hints that the image suggests. Explore the emotional atmosphere and tone of the image, discussing how the various elements work together to evoke specific feelings, moods, or psychological states in the viewer.

Symbolism, Metaphor, and Thematic Depth:
Investigate the symbolic and metaphorical layers woven into the fabric of the image, deciphering the deeper meanings, themes, and messages they convey. Consider the cultural, historical, social, or personal context that may inform the image's content and significance, and how this context enriches the viewer's interpretation and connection to the work. Explore the universal human experiences, emotions, or ideas that the image taps into, and how it resonates with the viewer on a profound, archetypal level.

Technical Artistry, Medium, and Creative Vision:
Appreciate the technical skill, creativity, and artistic vision behind the image's creation, analyzing the chosen medium, such as painting, photography, digital art, or any other form. Discuss the artist's techniques, style, and mastery, and how these contribute to the overall effect, impact, and uniqueness of the piece. Examine the use of color, light, texture, form, and composition, and how these elements demonstrate the artist's intention, originality, and creative voice.

Viewer Engagement and Personal Reflection:
Consider how the image engages the viewer, inviting them to participate in the scene, story, or experience it depicts. Discuss how the image may evoke personal memories, associations, or reflections in the viewer, creating a sense of connection or self-discovery. Reflect on the questions, emotions, or ideas the image sparks in the viewer's mind, and how it may inspire further contemplation, creativity, or action beyond the initial viewing experience.

As you craft your description, aim to create a vivid, immersive, and emotionally resonant account that captures the full richness and complexity of the image, including any textual elements present. Use specific, descriptive, and evocative language to engage the reader's senses, imagination, and intellect, transporting them into the world of the image and revealing its layers of meaning, beauty, and significance. Your description should not only document the visual and textual elements but also explore the deeper themes, contexts, and human experiences that the image embodies, providing a comprehensive, insightful, and moving interpretation that celebrates the power and artistry of visual expression."""

prompt = """Examine the provided image carefully and describe in detail:
- All objects, people, animals, text, and other elements you observe
- The setting, background, and context of the scene
- Colors, lighting, shadows, and textures
- Any actions, interactions, or implied story being depicted
- The overall composition, style, and apparent intent of the image
Be as specific, thorough, and comprehensive as possible in documenting everything you notice about the contents and characteristics of the image."""

In [22]:
def encode_image(image_file_path):
    """Encode an image file as base64."""
    try:
        with open(image_file_path, "rb") as img_file:
            return base64.b64encode(img_file.read()).decode("utf-8")
    except IOError as e:
        print(f"Error reading image file: {e}")
        return None


def get_image_media_type(image_file_path):
    """Get the media type of the image file based on its extension."""
    _, extension = os.path.splitext(image_file_path)
    extension = extension.lower()
    if extension in [".jpg", ".jpeg", ".jfif"]:
        return "image/jpeg"
    elif extension == ".png":
        return "image/png" 
    elif extension == ".webp":
        return "image/webp"
    else:
        return None


def generate_image_description(image_file_path):
    """Generate an image description using the Anthropic API."""
    img_base64 = encode_image(image_file_path)
    if img_base64 is None:
        return None

    media_type = get_image_media_type(image_file_path)
    if media_type is None:
        print(f"Unsupported image format for file: {image_file_path}")
        return None

    client = Anthropic(api_key=API_KEY)

    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": media_type,
                        "data": img_base64,
                    },
                },
                {"type": "text", "text": prompt},
            ],
        }
    ]

    try:
        response = client.messages.create(
            model=MODEL_NAME, max_tokens=4000, messages=messages
        )
        return response.content[0].text
    except Exception as e:
        print(f"Error generating image description: {e}")
        return None
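
`get_image_media_type` trusts the file extension, so a renamed or mislabeled file would be sent with the wrong media type. As a rough sketch of an alternative, the media type can be guessed from the file's leading bytes. The names `sniff_media_type` and `sniff_media_type_from_file` are hypothetical and not part of the original notebook, and only the three formats handled above are covered.

```python
def sniff_media_type(header):
    """Guess an image media type from the first bytes of a file.

    Hypothetical helper: covers only JPEG, PNG, and WebP signatures.
    """
    if header.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    # WebP files are RIFF containers with "WEBP" at byte offset 8
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "image/webp"
    return None


def sniff_media_type_from_file(image_file_path):
    """Read the first 12 bytes of the file and guess its media type."""
    with open(image_file_path, "rb") as img_file:
        return sniff_media_type(img_file.read(12))
```

One possible use is to fall back to `sniff_media_type_from_file` when the extension is unknown, or to compare both results and warn on a mismatch.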

In [23]:
# Assuming the images are in the './images/' folder
img_folder = os.path.join(os.getcwd(), "images")
print(img_folder)

# Set the path to the image file
image_file_path = os.path.join(img_folder, "_c12044f1-73a3-4e9e-980c-452eacf6c06c.jpg")

# Generate image description
description = generate_image_description(image_file_path)
if description is not None:
    print(f"Image description for '{os.path.basename(image_file_path)}':")
    print(description)

d:\A\Anthropic\imager\images
Image description for '_c12044f1-73a3-4e9e-980c-452eacf6c06c.jpg':
The image depicts an abstract, geometric design featuring a swirling, three-dimensional spiral shape. The spiral is composed of overlapping bands and arcs in a vibrant color palette of oranges, pinks, reds, and yellows. The design has a dynamic, kinetic quality, with the spiral elements appearing to twist and rotate within the frame.

The background is a pale, neutral tone that provides a clean, minimal setting to showcase the bold, colorful spiral motif. The lighting and shading within the spiral create a sense of depth and dimensionality, adding to the optical illusion of movement and energy.

The overall composition is symmetrical and centered, with the spiral shape occupying a prominent position at the center of the frame. This strong, balanced design, combined with the striking colors and patterns, gives the image a bold, eye-catching aesthetic.

The text "CHALLENGER" appears in simple,

In [21]:
# Set the path to the image file
image_file_path = os.path.join(img_folder, "jetpack.jpg")

# Generate image description
description = generate_image_description(image_file_path)
if description is not None:
    print(f"Image description for '{os.path.basename(image_file_path)}':")
    print(description)

Image description for 'jetpack.jpg':
The image appears to be a hand-drawn diagram or illustration of a "Jetpack Backpack" that is designed to fit a 13-inch laptop. The backpack is depicted as lightweight and looking like a normal backpack, with a padded strap support system.

The key features highlighted in the diagram include:
- USB-C charging
- 15-minute battery life
- Retractable boosters
- Steam-powered, green/clean propulsion

The text describes the backpack as "lightweight" and states that it "looks like a normal backpack." The diagram also mentions that the backpack "fits 13" laptop," indicating its intended use.

The overall style is a simple, hand-drawn sketch with text annotations providing additional details about the backpack's functionality and design. There are no people or other elements depicted beyond the backpack illustration itself.


In [28]:
from time import sleep

from PIL import Image
from PIL.ExifTags import TAGS

def extract_metadata(image_path):
    """Extract metadata from the image."""
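    image = Image.open(image_path)
    # _getexif() is a private Pillow helper that not every image plugin provides
    get_exif = getattr(image, "_getexif", None)
    exif_data = get_exif() if get_exif else None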

    metadata = {
        "Filename": os.path.basename(image_path),
        "Filepath": os.path.abspath(image_path),
        "ImageSize": image.size,
        "ImageFormat": image.format,
        "ImageMode": image.mode,
    }

    if exif_data:
        for tag_id, value in exif_data.items():
            tag_name = TAGS.get(tag_id, tag_id)
            metadata[tag_name] = value

        # The timestamp tags (DateTime, DateTimeOriginal, DateTimeDigitized) are
        # already copied into metadata under those names by the loop above.

        # Extract geolocation (GPS tags: 1 = latitude ref, 2 = latitude,
        # 3 = longitude ref, 4 = longitude)
        if "GPSInfo" in metadata:
            gps_info = metadata["GPSInfo"]
            if all(tag in gps_info for tag in (1, 2, 3, 4)):

                def to_float(rational):
                    # Depending on the Pillow version, values are either rational
                    # objects or (numerator, denominator) tuples.
                    try:
                        return float(rational)
                    except TypeError:
                        return rational[0] / rational[1]

                def to_degrees(dms):
                    # Convert (degrees, minutes, seconds) to decimal degrees
                    degrees, minutes, seconds = (to_float(part) for part in dms)
                    return degrees + minutes / 60 + seconds / 3600

                lat = to_degrees(gps_info[2])
                if gps_info[1] == "S":
                    lat = -lat
                lon = to_degrees(gps_info[4])
                if gps_info[3] == "W":
                    lon = -lon
                metadata["GPSLatitude"] = lat
                metadata["GPSLongitude"] = lon

    return metadata

def process_images(folder_path, max_images=None):
    """Process images in the specified folder and return metadata and descriptions."""
    image_files = [
        f
        for f in os.listdir(folder_path)
        if f.lower().endswith((".jpg", ".jpeg", ".jfif", ".png", ".webp"))
    ]

    if max_images is not None:
        image_files = image_files[:max_images]

    results = []

    for image_file in image_files:
        image_file_path = os.path.join(folder_path, image_file)
        metadata = extract_metadata(image_file_path)
        
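        # generate_image_description() catches errors raised by the API call and
        # returns None, so RateLimitError and BadRequestError do not normally
        # reach the handlers below.
        try:
            description = generate_image_description(image_file_path)
            sleep(1)
        except RateLimitError: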
            print(f"Rate limit exceeded for image: {image_file}. Skipping description generation.")
            description = None
        except BadRequestError as e:
            print(f"Bad request error for image: {image_file}. {str(e)}. Skipping description generation.")
            description = None
        except Exception as e:
            print(f"Error generating description for image: {image_file}. {str(e)}")
            description = None

        result = {
            "filename": image_file,
            "metadata": metadata,
            "description": description
        }
        results.append(result)

    return results

# Assuming the images are in the './images/' folder
img_folder = os.path.join(os.getcwd(), "images")
print(img_folder)
max_images = 10

image_data = process_images(img_folder, max_images)

# Access the metadata and descriptions
for data in image_data:
    filename = data["filename"]
    metadata = data["metadata"]
    description = data["description"]

    print(f"Filename: {filename}")
    print("Metadata:")
    for key, value in metadata.items():
        print(f"{key}: {value}")
    print(f"Description: {description}")
    print()

d:\A\Anthropic\imager\images
Error generating image description: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'messages.0.content.0.image.source.base64: image exceeds 5 MB maximum: 9277832 bytes > 5242880 bytes'}}
Filename: DALL·E 2024-03-26 19.38.00 - In a lush, vibrant forest setting, an anime-style scene unfolds where a variety of anthropomorphic animals gather in a clearing. These characters, dra.webp
Metadata:
Filename: DALL·E 2024-03-26 19.38.00 - In a lush, vibrant forest setting, an anime-style scene unfolds where a variety of anthropomorphic animals gather in a clearing. These characters, dra.webp
Filepath: d:\A\Anthropic\imager\images\DALL·E 2024-03-26 19.38.00 - In a lush, vibrant forest setting, an anime-style scene unfolds where a variety of anthropomorphic animals gather in a clearing. These characters, dra.webp
ImageSize: (1024, 1024)
ImageFormat: WEBP
ImageMode: RGB
Description: This image depicts a vibrant and fantastical fo
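
The 400 error above reports an image larger than the 5 MB (5242880 byte) limit. A minimal sketch of a pre-flight check that skips such files before any request is made is shown below. The helper names are hypothetical, and it is an assumption that the limit applies to the base64-encoded payload; checking the encoded length is simply the more conservative choice, since base64 is always at least as long as the raw file.

```python
import base64
import os

# Limit quoted in the error message above (5 MB)
MAX_IMAGE_BYTES = 5 * 1024 * 1024


def base64_length(raw_size):
    """Length of the padded base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def fits_size_limit(image_file_path, limit=MAX_IMAGE_BYTES):
    """Return True if the encoded image would stay within the limit.

    Assumption: the limit is compared against the base64-encoded size.
    """
    raw_size = os.path.getsize(image_file_path)
    return base64_length(raw_size) <= limit
```

Calling `fits_size_limit(image_file_path)` at the top of `generate_image_description` would allow oversized files to be reported and skipped, or resized first, without spending an API call.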

In [35]:
def generate_tags(description):
    """Generate tags from the image description using Claude 3 Haiku."""
    client = Anthropic(api_key=API_KEY)

    messages = [
        {
            "role": "user",
            "content": f"Generate relevant tags for the following image description:\n\n<description>{description}</description>\n\nTags:"
        }
    ]

    try:
        response = client.messages.create(
            model=MODEL_NAME, max_tokens=100, messages=messages
        )
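            # Split on commas and drop surrounding whitespace and empty entries
            text = response.content[0].text
            return [tag.strip() for tag in text.split(",") if tag.strip()]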
    except Exception as e:
        print(f"Error generating tags: {e}")
        return []

# Access the metadata, descriptions, and generate tags
for data in image_data:
    filename = data["filename"]
    metadata = data["metadata"]
    description = data["description"]
    
    if description:
        tags = generate_tags(description)
        sleep(2)
    else:
        tags = []
    # Store the tags on the record so that search_images() can find them later
    data["tags"] = tags

    print(f"Filename: {filename}")
    print("Metadata:")
    for key, value in metadata.items():
        print(f"{key}: {value}")
    print(f"Description: {description}")
    print(f"Tags: {', '.join(tags)}")
    print()

Filename: DALL·E 2024-03-26 19.38.00 - In a lush, vibrant forest setting, an anime-style scene unfolds where a variety of anthropomorphic animals gather in a clearing. These characters, dra.webp
Metadata:
Filename: DALL·E 2024-03-26 19.38.00 - In a lush, vibrant forest setting, an anime-style scene unfolds where a variety of anthropomorphic animals gather in a clearing. These characters, dra.webp
Filepath: d:\A\Anthropic\imager\images\DALL·E 2024-03-26 19.38.00 - In a lush, vibrant forest setting, an anime-style scene unfolds where a variety of anthropomorphic animals gather in a clearing. These characters, dra.webp
ImageSize: (1024, 1024)
ImageFormat: WEBP
ImageMode: RGB
Description: This image depicts a vibrant and fantastical forest scene filled with a diverse cast of anthropomorphized animal characters engaging in a lively social gathering. The forest setting is lush and verdant, with tall trees casting dappled sunlight across the grassy clearing where the characters are gathered.
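
`generate_tags` splits the model's reply on commas, which assumes the reply is a single comma-separated line. The model may also return one tag per line, a bulleted or numbered list, or hashtags. The sketch below shows one way to normalize such replies; `parse_tags` is a hypothetical helper, not part of the original notebook, and it does not remove a preamble sentence if the model adds one.

```python
import re


def parse_tags(text):
    """Split a model reply into a clean, de-duplicated list of lowercase tags."""
    tags = []
    seen = set()
    # Tags may be separated by commas or by newlines
    for piece in re.split(r"[,\n]", text):
        # Drop list markers such as "- ", "* ", "1. " and a leading "#"
        tag = re.sub(r"^\s*(?:[-*•]|\d+[.)])?\s*#?", "", piece).strip().lower()
        if tag and tag not in seen:
            seen.add(tag)
            tags.append(tag)
    return tags


print(parse_tags("forest, Animals,forest\n- anime\n2. #clearing"))
# → ['forest', 'animals', 'anime', 'clearing']
```

Lowercasing at parse time also keeps the exact-match tag comparison in a later search step simple.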


In [36]:
import re

def search_images(image_data, search_query):
    results = []
    
    for data in image_data:
        filename = data["filename"]
        metadata = data["metadata"]
        description = data["description"]
        tags = data.get("tags", [])
        
        # Case 1: Search by filename
        if search_query.lower() in filename.lower():
            results.append(data)
            continue
        
        # Case 2: Search by metadata field
        # (continue after a match so the same image is not appended twice)
        if any(search_query.lower() in str(value).lower() for value in metadata.values()):
            results.append(data)
            continue
        
        # Case 3: Search by description
        if description and search_query.lower() in description.lower():
            results.append(data)
            continue
        
        # Case 4: Search by tags
        if search_query.lower() in [tag.lower() for tag in tags]:
            results.append(data)
    
    return results

# Example usage

# Search by filename
query1 = "jet"
results1 = search_images(image_data, query1)
print(f"Search results for '{query1}':")
for result in results1:
    print(result["filename"])
print()

# Search by metadata field
query2 = "PNG"
results2 = search_images(image_data, query2)
print(f"Search results for '{query2}':")
for result in results2:
    print(result["filename"])
print()

# Search by description
query3 = "flower"
results3 = search_images(image_data, query3)
print(f"Search results for '{query3}':")
for result in results3:
    print(result["filename"])
print()

# Search by tags
query4 = "yellow"
results4 = search_images(image_data, query4)
print(f"Search results for '{query4}':")
for result in results4:
    print(result["filename"])

Search results for 'jet':
jetpack.jpg

Search results for 'PNG':
fhubhijgc_Highly_advanced_alien_technology_observing_Earth_from_468a88cb-0376-482c-90b3-251d2f5f44cd.png
lizziegriffith_56404_In_a_cyberpunk_future_a_group_of_punks_gat_a3dd04d9-752f-4a1f-b2f2-8f171651f7f9.png
lizziegriffith_56404_Narrative_of_human_struggle_and_alien_surv_ec3d393e-9271-49ed-a106-9d690165479c.png
lizziegriffith_56404_Shiba_Inu_and_a_yellow_balloon_with_a_deli_097ce893-8b68-4c12-93d8-d0e20c6640e6.png

Search results for 'flower':
GHJ4js6acAAFkBy.jfif
GHMDVmoa4AAeKmk.jfif

Search results for 'yellow':
DALL·E 2024-04-15 06.21.37 - Design a minimalist logo for a fictional company called 'Imager'. The logo should feature a highly stylized and symmetrical photo album icon, predomin.webp
lizziegriffith_56404_Shiba_Inu_and_a_yellow_balloon_with_a_deli_097ce893-8b68-4c12-93d8-d0e20c6640e6.png
_c12044f1-73a3-4e9e-980c-452eacf6c06c.jpg
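
The results above do not show which field produced each match. For instance, the `yellow` query matches one file by its filename alone. A small sketch that reports the matching fields for a single record follows. `matching_fields` is a hypothetical helper that mirrors the four cases in `search_images`, and the sample record is made up for illustration.

```python
def matching_fields(record, search_query):
    """Return the names of the record fields that match the query."""
    query = search_query.lower()
    matched = []
    if query in record["filename"].lower():
        matched.append("filename")
    if any(query in str(value).lower() for value in record["metadata"].values()):
        matched.append("metadata")
    description = record.get("description")
    if description and query in description.lower():
        matched.append("description")
    # Tags are compared exactly, as in search_images
    if query in [tag.lower() for tag in record.get("tags", [])]:
        matched.append("tags")
    return matched


# Hypothetical record, shaped like the entries in image_data
record = {
    "filename": "yellow_balloon.png",
    "metadata": {"ImageFormat": "PNG"},
    "description": "A dog holding a balloon.",
    "tags": ["dog", "balloon"],
}
print(matching_fields(record, "yellow"))  # → ['filename']
print(matching_fields(record, "dog"))     # → ['description', 'tags']
```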
