<a href="https://colab.research.google.com/github/elephant-xyz/photo-meta-data-notebook/blob/main/PhotoMedtaData.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 🔄 Load Existing `.env` File

This step loads environment variables from a `.env` file that you have already uploaded to the Colab environment.

The following variables are expected:

| Variable Name           | Purpose                                  |
|-------------------------|------------------------------------------|
| `OPENAI_API_KEY`        | Access to the OpenAI API                 |
| `AWS_ACCESS_KEY_ID`     | AWS access key ID                        |
| `AWS_SECRET_ACCESS_KEY` | AWS secret access key                    |
| `S3_BUCKET_NAME`        | Name of the target S3 bucket             |
| `IMAGES_DIR`            | Local folder that contains the images    |



> ✅ Make sure `.env` is present in the file list on the left sidebar.





In [1]:
# Install dotenv support
!pip install -q python-dotenv

import os
from dotenv import load_dotenv

# Load the .env file from current directory
dotenv_path = ".env"

if os.path.exists(dotenv_path):
    load_dotenv(dotenv_path)
    print("✅ Environment variables loaded.\n")

    # Check specific keys (without printing sensitive values)
    for key in ['OPENAI_API_KEY', 'AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY', 'S3_BUCKET_NAME', 'IMAGES_DIR']:
        val = os.getenv(key)
        if val:
            print(f"{key}: ✅ Loaded")
        else:
            print(f"{key}: ❌ Missing or not set in .env")
else:
    print("❌ `.env` file not found. Please upload it via the file browser.")



✅ Environment variables loaded.

OPENAI_API_KEY: ✅ Loaded
AWS_ACCESS_KEY_ID: ✅ Loaded
AWS_SECRET_ACCESS_KEY: ✅ Loaded
S3_BUCKET_NAME: ✅ Loaded
IMAGES_DIR: ✅ Loaded


# 🏠 Photo Metadata AI - AWS Rekognition Photo Categorizer

## 📋 What It Does

This tool automatically analyzes and categorizes real estate photos with AWS Rekognition. It uploads images from local folders to S3, detects objects and scenes in each image, and sorts the images into categories such as kitchen, bedroom, and bathroom.
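
To make the detection step concrete, here is a minimal sketch of how labels for one S3 image could be requested from Rekognition. It is not taken from the tool's source: the function name, the default thresholds, and the idea of passing the client in as a parameter are assumptions made for illustration. The only AWS call used is the `detect_labels` operation of the boto3 Rekognition client.

```python
from typing import Any


def detect_labels_for_s3_image(
    rekognition_client: Any,
    bucket: str,
    key: str,
    max_labels: int = 10,          # assumed default, not the tool's setting
    min_confidence: float = 70.0,  # assumed default, not the tool's setting
) -> list[tuple[str, float]]:
    """Return (label name, confidence) pairs for one image stored in S3."""
    response = rekognition_client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    labels = [(item["Name"], item["Confidence"]) for item in response["Labels"]]
    # Highest-confidence labels first, so callers can take the top few.
    return sorted(labels, key=lambda pair: pair[1], reverse=True)
```

In a notebook this would typically be called with `boto3.client("rekognition")` as the first argument. Passing the client in keeps the function easy to exercise with a stub when no AWS credentials are available.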

## 🎯 Categories

- 🍳 **Kitchen**: Appliances, cabinets, countertops
- 🛏️ **Bedroom**: Beds, furniture, sleeping areas  
- 🚿 **Bathroom**: Toilets, showers, sinks, mirrors
- 🛋️ **Living Room**: Sofas, TVs, fireplaces
- 🍽️ **Dining Room**: Dining tables, chairs
- 🏠 **Exterior**: Building exteriors, architecture
- 🚗 **Garage**: Cars, vehicles, parking
- 💼 **Office**: Desks, computers, work areas
- 👕 **Laundry**: Washing machines, dryers
- 🪜 **Stairs**: Staircases, railings
- 👔 **Closet**: Wardrobes, clothing storage
- 🏊 **Pool**: Swimming pools, water features
- 🌿 **Balcony**: Terraces, patios, decks
- 📦 **Other**: Unmatched items
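
One way such a mapping from detected labels to categories could work is sketched below. The keyword sets are a hypothetical subset loosely based on the list above, and the scoring rule (sum of confidences per category) is an assumption; the tool's actual rules may differ.

```python
# Hypothetical keyword map covering only a few of the categories above.
CATEGORY_KEYWORDS = {
    "kitchen": {"kitchen", "oven", "refrigerator", "cabinet", "countertop"},
    "bedroom": {"bedroom", "bed"},
    "bathroom": {"bathroom", "toilet", "shower", "sink", "mirror"},
    "living_room": {"living room", "couch", "sofa", "tv", "fireplace"},
    "garage": {"garage", "car", "vehicle"},
}


def categorize(labels):
    """Pick a category for a list of (label name, confidence) pairs.

    Each matching label adds its confidence to the category's score.
    Images with no matching label fall back to "other".
    """
    scores = {}
    for name, confidence in labels:
        for category, keywords in CATEGORY_KEYWORDS.items():
            if name.lower() in keywords:
                scores[category] = scores.get(category, 0.0) + confidence
    if not scores:
        return "other"
    return max(scores, key=scores.get)


print(categorize([("Toilet", 90.0), ("Mirror", 80.0), ("Bed", 85.0)]))
print(categorize([("Tree", 99.0)]))
```

Summing confidences favors categories supported by several labels over a single strong match, which is one reasonable choice among several.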

## 📁 Required Folder Structure

```
images/
├── property-123/
│   ├── kitchen1.jpg
│   ├── bedroom1.jpg
│   └── bathroom1.jpg
├── property-456/
│   ├── exterior1.jpg
│   └── garage1.jpg
└── property-789/
    ├── office1.jpg
    └── dining1.jpg
```
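
Before uploading, it can help to check that the local folder matches this layout. The sketch below lists each property folder and its image files. The function name and the set of accepted extensions are assumptions. Hidden folders such as `.ipynb_checkpoints`, which Jupyter and Colab may create, are skipped so they are not mistaken for properties.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_property_images(images_dir):
    """Map each property folder name to a sorted list of its image file names."""
    root = Path(images_dir)
    result = {}
    for folder in sorted(root.iterdir()):
        # Skip plain files and hidden folders such as .ipynb_checkpoints.
        if not folder.is_dir() or folder.name.startswith("."):
            continue
        result[folder.name] = sorted(
            path.name
            for path in folder.iterdir()
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
        )
    return result
```

Calling `find_property_images("images")` on the tree above would return one entry per `property-*` folder.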

## 🔧 Usage Options

When you run `photo-categorizer`, you'll get three options:

1. **📤 Upload Only**: Upload images from local folder to S3
2. **🔍 Categorize Only**: Process existing images in S3  
3. **🚀 Upload + Categorize**: Complete workflow (recommended)

## 📊 Results

- ✅ **Organized Images**: Sorted into category folders in S3
- 📈 **JSON Reports**: Detailed analysis with confidence scores
- 📋 **Summary**: Breakdown of images by category
- 🔍 **Labels**: Top detected objects for each image
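
As an illustration of the summary step, the sketch below counts images per category from a list of per-image results. The record keys `image` and `category` are hypothetical and do not describe the tool's actual JSON report format.

```python
from collections import Counter


def summarize(results):
    """Count images per category, most frequent category first.

    `results` is a list of dicts with the hypothetical keys
    "image" and "category".
    """
    counts = Counter(item["category"] for item in results)
    return dict(counts.most_common())


example = [
    {"image": "kitchen1.jpg", "category": "kitchen"},
    {"image": "kitchen2.jpg", "category": "kitchen"},
    {"image": "bedroom1.jpg", "category": "bedroom"},
]
print(summarize(example))
```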

## 🛠️ Requirements

- ✅ AWS Account with S3 and Rekognition access
- ✅ AWS credentials configured
- ✅ S3 bucket created
- ✅ Images in proper folder structure

## 🔐 Security Notes

- ⚠️ Never commit AWS credentials to version control
- 🔒 Use IAM roles with minimal required permissions
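
A minimal-permission policy for this kind of workflow might look like the sketch below, built as a Python dictionary and printed as JSON. The exact set of actions the tool needs is an assumption (for example, it may also need `s3:DeleteObject` if it moves files rather than copying them), and `my-example-bucket` is a placeholder name.

```python
import json


def build_minimal_policy(bucket_name):
    """Build an IAM policy document limited to one bucket plus label detection."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {   # Reading and writing apply to objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {   # Label detection is not scoped to a specific resource here.
                "Effect": "Allow",
                "Action": ["rekognition:DetectLabels"],
                "Resource": "*",
            },
        ],
    }


print(json.dumps(build_minimal_policy("my-example-bucket"), indent=2))
```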

In [2]:
import os

# Read the path from the environment variable
images_dir = os.getenv("IMAGES_DIR")

# Check if the variable is set
if images_dir is None:
    raise ValueError("Environment variable 'IMAGES_DIR' is not set.")

# Create the images directory if it does not already exist
os.makedirs(images_dir, exist_ok=True)


In [3]:
# Install the tool from GitHub
!pip install git+https://github.com/elephant-xyz/photo-meta-data-ai.git@main --force-reinstall --no-cache-dir


# Run the photo categorizer
!photo-categorizer

Collecting git+https://github.com/elephant-xyz/photo-meta-data-ai.git@main
  Cloning https://github.com/elephant-xyz/photo-meta-data-ai.git (to revision main) to /tmp/pip-req-build-mzf_7x3r
  Running command git clone --filter=blob:none --quiet https://github.com/elephant-xyz/photo-meta-data-ai.git /tmp/pip-req-build-mzf_7x3r
  Resolved https://github.com/elephant-xyz/photo-meta-data-ai.git to commit 551e0c5db931d955317f64554babe4846083a529
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting boto3>=1.26.0 (from photo-metadata-ai==1.0.0)
  Downloading boto3-1.39.10-py3-none-any.whl.metadata (6.7 kB)
Collecting botocore>=1.29.0 (from photo-metadata-ai==1.0.0)
  Downloading botocore-1.39.10-py3-none-any.whl.metadata (5.7 kB)
Collecting jmespath<2.0.0,>=0.7.1 (from boto3>=1.26.0->photo-metadata-ai==1.0.0)
  Downloading jmespath-1.0.1-py3-none-any.whl.metadata (7

AWS Rekognition Photo Categorizer
Target S3 Bucket: photo-metadata-ai

1. Authenticating with AWS...
✓ AWS S3 authentication successful
✓ AWS Rekognition client initialized
✓ Using S3 bucket: photo-metadata-ai

2. Auto-processing all properties...
✓ AWS S3 authentication successful

📤 Uploading all images to S3...
Starting upload for all properties...
✓ Found 2 property folders

Processing Property 1/2: .ipynb_checkpoints
Starting upload for property ID: .ipynb_checkpoints
✓ Found property folder: images/.ipynb_checkpoints
✓ Found 1 image files to upload

[1/1] Processing: 007-2558GardensPkwy-PalmBeachGardens-FULL.jpg
Size: 9,513,707 bytes (9.07 MB)
  Uploading to S3: .ipynb_checkpoints/007-2558GardensPkwy-PalmBeachGardens-FULL.jpg
  ✓ Successfully uploaded 007-2558GardensPkwy-PalmBeachGardens-FULL.jpg

Upload Summary for Property .ipynb_checkpoints
Total files found: 1
Successful uploads: 1
Failed uploads: 0
Total size uploaded: 9,513,707 bytes (9.07 MB)
S3 location: s3://photo-metada

In [3]:
# Install the tool from GitHub
!pip install git+https://github.com/elephant-xyz/photo-meta-data-ai.git@main --force-reinstall --no-cache-dir


!ai-analyzer --all-properties

Collecting git+https://github.com/elephant-xyz/photo-meta-data-ai.git@main
  Cloning https://github.com/elephant-xyz/photo-meta-data-ai.git (to revision main) to /tmp/pip-req-build-pnknc9nx
  Running command git clone --filter=blob:none --quiet https://github.com/elephant-xyz/photo-meta-data-ai.git /tmp/pip-req-build-pnknc9nx
  Resolved https://github.com/elephant-xyz/photo-meta-data-ai.git to commit 17deb779a153eeadc27aef6d2814113fbbdb9ddf
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting boto3>=1.26.0 (from photo-metadata-ai==1.0.0)
  Downloading boto3-1.39.10-py3-none-any.whl.metadata (6.7 kB)
Collecting botocore>=1.29.0 (from photo-metadata-ai==1.0.0)
  Downloading botocore-1.39.10-py3-none-any.whl.metadata (5.7 kB)
Collecting openai>=1.0.0 (from photo-metadata-ai==1.0.0)
  Downloading openai-1.97.0-py3-none-any.whl.metadata (29 kB)
Collecting python-d

✓ Loaded environment variables from .env file
✓ Set AWS_DEFAULT_REGION to us-east-1 (default)
✓ All required environment variables are set

🚀 Starting AI Image Analysis
📁 Output directory: output
🔧 Batch size: 5
👥 Max workers: 3
🚀 Starting optimized real estate image processing with S3 and IPFS integration...
🖼️  Image optimization: Max size 1024x1024, JPEG quality 85%
🚀 Single call processing: All images processed in one API call per folder
☁️  S3 Bucket: photo-metadata-ai
🌐 IPFS Schemas: 6 schemas + 1 relationship schema
📁 Output Structure: All files go directly to output/property_id/ (no subfolders)
🔄 Data Merging: Updates existing files instead of creating new ones
🔗 Individual Relationships: Creates separate relationship files with IPFS format

[→] Loading schemas from IPFS...
2025-07-21 21:52:11,269 - INFO - Loading schemas from IPFS...
2025-07-21 21:52:11,269 - INFO - Fetching lot schema from IPFS CID: bafkreigy3tsgcwtgz4nu5jc7cnkb6bizpbxbn3rh6ectz44z6f3tqfjdum
2025-07-21 21:52: