06: Add API service for the-mcculloughs.org integration by swmcc · Pull Request #6 · swmcc/indexatron

swmcc · 2026-04-06T01:00:37Z

Summary

Complete AI photo analysis service that integrates with the-mcculloughs.org:

Features

Fetch & Analyze - Downloads pending photos, analyzes with LLaVA vision model
Context-Aware Analysis - Uses title, caption, date, gallery info for better results
Family Nicknames - Maps nicknames to real names (e.g., "Mamie" → "Isobel McCullough")
Embeddings - Generates 768-dim vectors with nomic-embed-text for semantic search
Era Override - Uses actual date_taken instead of AI guessing
Category Enrichment - Hierarchical tags (puppy → pet → animal)
Safety Filters - Blocks inappropriate terms, limits category count

CLI Usage

# Process pending uploads
python scripts/run.py --limit 10

# Reprocess a specific photo
python scripts/run.py --shortcode ABC123 --debug

# Test connections
python scripts/run.py test

Family Nickname Mappings

"Wee Mamie" / "Mamie" → Isobel McCullough (Mum)
"The Oul Man" / "The Oul Fella" → Edmund McCullough (Dad)
"The Leech" → Christina McCullough (sister)
"Asshole" / "The Bro" → John McCullough (brother)

Configuration

# .env.development
INDEXATRON_API_KEY=sk_xxx
INDEXATRON_API_BASE_URL=http://localhost:3000
INDEXATRON_VISION_MODEL=llava:7b
INDEXATRON_DEBUG=true

Test plan

python scripts/run.py test passes
Process a photo with --shortcode
Verify nickname resolution works
Check era uses actual date when available

🤖 Generated with Claude Code

- pydantic-settings for environment-based config
- httpx for async HTTP client to the-mcculloughs.org API
- pytest-httpx for testing HTTP calls
- Add CLI entry point for indexatron command

Co-Authored-By: Stephen McCullough <stephen@swm.cc>

- fetch_pending_uploads(): Get photos needing analysis
- download_image(): Download images to temp dir
- post_analysis(): Submit analysis + embedding
- test_connection(): Verify API key works
- Full debug logging support

Co-Authored-By: Stephen McCullough <stephen@swm.cc>

IndexatronService:
- Fetches pending uploads from API
- Downloads images to temp directory
- Analyzes with LLaVA, generates embeddings
- Posts results back to API
- Progress bar with rich output
- Connection verification for API and Ollama

Co-Authored-By: Stephen McCullough <stephen@swm.cc>

Commands:
- run: Process pending uploads (default)
- test: Verify API and Ollama connections
- config: Show current configuration

Options:
- --debug: Enable verbose output
- --limit N: Process max N uploads
- --env: Override environment
- --dry-run: Fetch but don't process

Co-Authored-By: Stephen McCullough <stephen@swm.cc>
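
For reference, the command and option layout described in this commit could be sketched with the standard library's argparse. This is only an illustration: the project may well use a different CLI library, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the commands and options listed above."""
    parser = argparse.ArgumentParser(prog="indexatron")
    # The command is optional; "run" is the default, as the commit message states.
    parser.add_argument(
        "command", nargs="?", default="run", choices=["run", "test", "config"]
    )
    parser.add_argument("--debug", action="store_true", help="enable verbose output")
    parser.add_argument("--limit", type=int, default=None, metavar="N",
                        help="process at most N uploads")
    parser.add_argument("--env", default=None, help="override environment")
    parser.add_argument("--dry-run", action="store_true",
                        help="fetch but don't process")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Making the command a positional with `nargs="?"` keeps `python scripts/run.py --limit 10` and `python scripts/run.py test` both valid without separate subparsers.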

- Update default vision_model to llama3.2-vision in config
- Update PhotoAnalysis model default
- Make PhotoAnalyzer use settings.vision_model dynamically
- Pass actual model used to analysis results
- Fix linting issues (bare except, line length, import order)

Llama 3.2 Vision provides better era detection for family photos spanning from 1920s to present.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Build analysis prompt with title, caption, and date_taken context
- AI can now identify people by name from captions (e.g., "John's wedding")
- Confirm/refine era estimates when actual date is known
- Extract location hints from metadata
- Add name field to PersonInfo model for identified people
- Display context in console output during analysis

This makes the AI much smarter for family photos with existing metadata.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Vision models don't need full resolution to understand photo content. Resizing to 1024px max dimension before sending to Ollama significantly reduces inference time with no meaningful loss of analysis quality. Also converts all images to JPG for better model compatibility.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
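
The "1024px max dimension" rule comes down to a small aspect-ratio calculation. The sketch below shows only that arithmetic in plain Python; `fit_within` is a hypothetical helper, not the function used in this PR. With Pillow, `Image.thumbnail((1024, 1024))` performs a comparable in-place resize.

```python
def fit_within(width: int, height: int, max_dim: int = 1024) -> tuple:
    """Scale (width, height) so the longer side is at most max_dim.

    The aspect ratio is preserved and small images are never upscaled.
    """
    longest = max(width, height)
    if longest <= max_dim:
        return width, height
    scale = max_dim / longest
    # Guard against a zero-sized side on extremely thin images.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_within(4000, 3000))
    print(fit_within(800, 600))
```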

- Add blocklist for inappropriate terms (grooming, beauty, etc.)
- Sanitize descriptions to remove inappropriate phrases
- Detect and handle repetition loops in model output
- Limit categories to 20 max to prevent runaway repetition
- Filter blocked terms from categories

Llama 3.2 Vision was producing inappropriate descriptions and endless category loops for family photos.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
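
A rough sketch of how filters like these might look, assuming categories arrive as a flat list of strings. The function names and the repetition heuristic are illustrative assumptions; only the two blocked terms and the cap of 20 come from the commit message.

```python
# Only the two terms named in the commit message; the source implies more ("etc.").
BLOCKED_TERMS = {"grooming", "beauty"}
MAX_CATEGORIES = 20


def clean_categories(categories: list) -> list:
    """Drop blocked terms and duplicates, then cap the list length."""
    cleaned = []
    for raw in categories:
        tag = raw.strip().lower()
        if not tag or tag in cleaned:
            continue  # repeated tags are how a loop shows up in a category list
        if any(term in tag for term in BLOCKED_TERMS):
            continue
        cleaned.append(tag)
    return cleaned[:MAX_CATEGORIES]


def has_repetition_loop(text: str, min_repeats: int = 4) -> bool:
    """Return True if any single word occurs min_repeats times in a row."""
    words = text.lower().split()
    run = 1
    for prev, cur in zip(words, words[1:]):
        run = run + 1 if cur == prev else 1
        if run >= min_repeats:
            return True
    return False
```

Deduplicating before truncating means a looping model cannot crowd out the genuine tags that came first.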

- Extract names from caption/title and tell model to use them: "IMPORTANT: This photo includes Emily. Use this name..."
- Extract decade from date and force high confidence: "IMPORTANT: This photo is from 2015 (2010s). Use this era."
- Simplified JSON schema to reduce repetition loops
- Added explicit "Max 10 categories. No repetition." rule

Should make the model actually use the metadata instead of ignoring it.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
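
To illustrate the idea, the hint lines could be assembled roughly as follows. `build_context_hints` is a hypothetical name, and the name hint is cut short on purpose because the rest of that instruction is elided in the commit message.

```python
from datetime import date
from typing import Optional


def build_context_hints(names: list, date_taken: Optional[date]) -> list:
    """Turn known metadata into explicit instructions for the vision model."""
    hints = []
    if names:
        # The remainder of this instruction is elided in the commit message.
        hints.append(f"IMPORTANT: This photo includes {', '.join(names)}.")
    if date_taken is not None:
        decade = date_taken.year // 10 * 10
        hints.append(
            f"IMPORTANT: This photo is from {date_taken.year} ({decade}s). Use this era."
        )
    hints.append("Max 10 categories. No repetition.")
    return hints


if __name__ == "__main__":
    print("\n".join(build_context_hints(["Emily"], date(2015, 6, 1))))
```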

- New CLI option: python scripts/run.py --shortcode ABC123
- Fetches single upload via new API endpoint
- Uses gallery_description for additional AI context
- Useful for reprocessing photos after improving prompts

Example: python scripts/run.py -s M3JnZV --debug

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Maps family nicknames to real names so AI identifies people correctly:
- "Wee Mamie" / "Mamie" → Isobel McCullough (Mum)
- "The Oul Man" / "The Oul Fella" → Edmund McCullough (Dad)
- "The Leech" → Christina McCullough (sister)
- "Asshole" / "The Bro" → John McCullough (brother)

When caption says "Mamie at Christmas", AI will output: "people": [{"name": "Isobel McCullough", ...}]

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
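
A minimal sketch of the lookup, assuming nicknames are matched as whole phrases, case-insensitively, in the caption text. The mapping is copied from the list above; `resolve_nicknames` is a hypothetical name and the real matching logic may differ.

```python
import re

NICKNAMES = {
    "Wee Mamie": "Isobel McCullough",
    "Mamie": "Isobel McCullough",
    "The Oul Man": "Edmund McCullough",
    "The Oul Fella": "Edmund McCullough",
    "The Leech": "Christina McCullough",
    "Asshole": "John McCullough",
    "The Bro": "John McCullough",
}


def resolve_nicknames(text: str) -> list:
    """Return the real name for each nickname found in text, without duplicates."""
    found = []
    # Check longer nicknames first so "Wee Mamie" is seen before "Mamie".
    for nickname in sorted(NICKNAMES, key=len, reverse=True):
        pattern = r"\b" + re.escape(nickname) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            name = NICKNAMES[nickname]
            if name not in found:
                found.append(name)
    return found


if __name__ == "__main__":
    print(resolve_nicknames("Mamie at Christmas"))
```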

When we have the real date_taken, don't trust the AI's era estimate. Override it with the actual decade from the date, with "high" confidence and reasoning that explains it came from the actual date.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
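
The override amounts to something like the sketch below. The dictionary keys (`decade`, `confidence`, `reasoning`) and the function name are assumptions made for illustration; the actual model fields are not shown in this PR description.

```python
from datetime import date
from typing import Optional


def apply_era_override(era: dict, date_taken: Optional[date]) -> dict:
    """Replace the model's era estimate when the real date is known."""
    if date_taken is None:
        return era  # no real date available, so keep the AI estimate
    decade = f"{date_taken.year // 10 * 10}s"
    return {
        "decade": decade,
        "confidence": "high",
        "reasoning": f"Derived from actual date_taken ({date_taken.isoformat()})",
    }


if __name__ == "__main__":
    guess = {"decade": "1990s", "confidence": "low", "reasoning": "clothing style"}
    print(apply_era_override(guess, date(2015, 6, 1)))
```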

swmcc and others added 27 commits April 6, 2026 02:00

⚙️ Add environment-based configuration module

31a9f7d

- pydantic-settings for type-safe config
- Load from .env.{INDEXATRON_ENV} files
- Support debug mode, API settings, Ollama config
- Auto-create download directory

Co-Authored-By: Stephen McCullough <stephen@swm.cc>
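
The `.env.{INDEXATRON_ENV}` convention can be illustrated without pydantic-settings. The standard-library sketch below is a stand-in for what that library does, not the PR's code; the `development` fallback and both function names are assumptions.

```python
import os


def env_file_name(environ=os.environ) -> str:
    """Pick the env file for the current environment, e.g. .env.development."""
    env = environ.get("INDEXATRON_ENV", "development")  # assumed default
    return f".env.{env}"


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


if __name__ == "__main__":
    sample = "# .env.development\nINDEXATRON_DEBUG=true\n"
    print(env_file_name(), parse_env_text(sample))
```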

🔍 Add debug logging module with rich output

f488cc8

- Debug HTTP requests/responses with masked secrets
- Debug LLaVA prompts and responses
- Debug embedding generation with previews
- Pretty config display in debug mode

Co-Authored-By: Stephen McCullough <stephen@swm.cc>
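
Masking secrets before logging could look roughly like this. The helper names, the set of sensitive header names, and the choice to keep the first four characters visible are all assumptions for the sake of the example.

```python
SENSITIVE_HEADERS = {"authorization", "x-api-key"}  # assumed header names


def mask_secret(value: str, visible: int = 4) -> str:
    """Keep the first few characters and replace the rest with asterisks."""
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)


def mask_headers(headers: dict) -> dict:
    """Return a copy of headers that is safe to print in debug output."""
    return {
        name: mask_secret(value) if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


if __name__ == "__main__":
    print(mask_headers({"Authorization": "Bearer tok", "Accept": "application/json"}))
```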

📝 Add environment configuration templates

5a3045e

- config/.env.example with all options documented
- Update .gitignore for env files

Co-Authored-By: Stephen McCullough <stephen@swm.cc>

📦 Export main components from package

54c58f8

Co-Authored-By: Stephen McCullough <stephen@swm.cc>

🚀 Add convenience run script

b8dbe75

Scripts wrapper for development use.

Co-Authored-By: Stephen McCullough <stephen@swm.cc>

📁 Move .env.example to project root

c9bdeea

pydantic-settings looks for env files in the working directory.

Co-Authored-By: Stephen McCullough <stephen@swm.cc>

🐛 Fix debug_request to accept params argument

9b12212

Co-Authored-By: Stephen McCullough <stephen@swm.cc>

🐛 Enable follow_redirects for Active Storage URLs

ed760c4

Rails Active Storage uses redirects to serve files.

Co-Authored-By: Stephen McCullough <stephen@swm.cc>

🐛 Convert WebP to JPG before LLaVA analysis

8296d90

LLaVA 7b crashes on WebP images. Convert to JPG using Pillow after download. Also improved extension detection using Content-Type header instead of URL.

Co-Authored-By: Stephen McCullough <stephen@swm.cc>
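
Extension detection from the Content-Type header might be sketched as below. The mapping table, the `.jpg` fallback, and both function names are assumptions; the Pillow conversion itself is left out so the example stays standard-library only.

```python
# Assumed mapping; the actual set of supported types is not listed in this PR.
CONTENT_TYPE_EXTENSIONS = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/webp": ".webp",
    "image/gif": ".gif",
}


def extension_from_content_type(content_type: str, default: str = ".jpg") -> str:
    """Map a Content-Type header value to a file extension."""
    # Drop parameters such as "; charset=binary" before the lookup.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return CONTENT_TYPE_EXTENSIONS.get(media_type, default)


def needs_jpg_conversion(extension: str) -> bool:
    """WebP files are the ones reported to crash LLaVA 7b."""
    return extension == ".webp"


if __name__ == "__main__":
    ext = extension_from_content_type("image/webp")
    print(ext, needs_jpg_conversion(ext))
```

Relying on the header rather than the URL matters here because redirected Active Storage URLs do not necessarily end in a meaningful file extension.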

🏷️ Add hierarchical category enrichment

fae27fe

- Improved prompt to ask for hierarchical tags (puppy → dog → pet)
- Post-process categories with CATEGORY_HIERARCHY mapping
- Covers pets, holidays, celebrations, family members

Co-Authored-By: Stephen McCullough <stephen@swm.cc>
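
As an illustration of the post-processing step, assume CATEGORY_HIERARCHY maps each tag to its direct parent. Only the puppy → dog → pet chain comes from the commit message; the other entries and the function name are hypothetical, and the real mapping may be shaped differently.

```python
CATEGORY_HIERARCHY = {
    "puppy": "dog",
    "dog": "pet",
    "kitten": "cat",  # hypothetical entry
    "cat": "pet",     # hypothetical entry
}


def enrich_categories(categories: list) -> list:
    """Add every ancestor of each tag, keeping order and avoiding duplicates."""
    enriched = []
    for tag in categories:
        current = tag.strip().lower()
        # Walk up the chain; stop at the root or at a tag already present,
        # which also protects against accidental cycles in the mapping.
        while current and current not in enriched:
            enriched.append(current)
            current = CATEGORY_HIERARCHY.get(current)
    return enriched


if __name__ == "__main__":
    print(enrich_categories(["puppy"]))
```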

🐛 Fix category enrichment for nested lists

46a546b

LLaVA sometimes returns nested lists for categories. Flatten before processing.

Co-Authored-By: Stephen McCullough <stephen@swm.cc>
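
Flattening can be done with a short recursive helper. This is a sketch with a hypothetical name; dropping non-string items is an added assumption about how malformed output should be handled.

```python
def flatten_categories(items: list) -> list:
    """Flatten arbitrarily nested lists of category strings."""
    flat = []
    for item in items:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten_categories(item))
        elif isinstance(item, str):
            flat.append(item)
        # Anything else (numbers, None) is silently dropped.
    return flat


if __name__ == "__main__":
    print(flatten_categories(["dog", ["pet", ["animal"]], "beach"]))
```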

📝 Add logging for metadata context

388c6d7

Shows what title/caption/date is being used for AI context. Helps debug when metadata isn't being picked up.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

✨ Use gallery_name for additional AI context

6827908

Gallery names like "Moffett Family" provide context about who's in photos. Now extracts names from gallery name + title + caption combined.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

🔄 Switch back to LLaVA 7b

fb6feb1

Llama 3.2 Vision was producing garbage output (repetition loops, inappropriate content). LLaVA 7b is more reliable for this use case.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

✨ Add spinner while waiting for vision model

e743332

Shows a spinner with model name while waiting for Ollama response, instead of just hanging with HTTP trace messages. Also shows elapsed time when response completes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

swmcc merged commit 649010a into main Apr 6, 2026

swmcc changed the title ~~Add API service for the-mcculloughs.org integration~~ 06: Add API service for the-mcculloughs.org integration Apr 6, 2026

swmcc merged 27 commits into main from feature/api-service
