Search by meaning, not metadata
Halo is a multimodal Retrieval-Augmented Generation (RAG) system for semantic photo search with automatic album generation. Users can search their photo library by describing vibes and moods, or let AI automatically organize photos into intelligent albums.
- CLIP-based embeddings for images, text queries, and BLIP-generated captions
- Optional BLIP captioning and hybrid scoring for better vibe/mood recall
- LLM-powered query expansion plus optional explanation mode for search hits
- Search-by-example: upload a reference photo and retrieve visually similar shots
- Metadata filters (date range + GPS bounding box) to narrow the search space
- Local-only vector store (ChromaDB) for privacy-preserving retrieval
- AI-powered clustering: Automatically organizes photos into meaningful albums
- Three clustering methods:
- Visual: Groups photos by appearance similarity using K-means on CLIP embeddings
- Temporal: Groups photos by time periods based on capture dates
- Hybrid: Combines visual similarity + timestamps + GPS location for intelligent grouping
- LLM-generated titles: Creative album names and descriptions powered by Gemini/GPT
- Customizable parameters: Adjust target number of albums and minimum photos per album
- Persistent storage: Albums save to JSON and reload on app restart
- Intelligent storytelling: Uses the Gemini API to generate relevant stories for auto-generated photo albums
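The hybrid scoring mentioned above can be sketched as a weighted blend of CLIP image similarity and BLIP-caption similarity. This is a minimal illustration, not Halo's actual implementation; the `0.7` weight is an assumption for the sketch:

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def hybrid_score(query_emb, image_emb, caption_emb, alpha=0.7):
    """Blend image-embedding similarity with caption-embedding similarity.
    alpha is an assumed weight, not Halo's configured value."""
    return alpha * cosine_sim(query_emb, image_emb) + (1 - alpha) * cosine_sim(query_emb, caption_emb)
```

A photo whose caption matches the query's mood can outrank a photo that is only visually close, which is what improves vibe/mood recall.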
# Clone and navigate to project
git clone https://github.com/PeterMcMaster/Halo.git
cd Halo
# Create virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Install the project in editable mode so `halo` is on your Python path
pip install -e .

Create a .env file (or copy .env.example) and set your preferred LLM provider:
For Gemini (Free tier available):
LLM_PROVIDER=gemini
GEMINI_API_KEY=your-google-ai-studio-key
GEMINI_MODEL=gemini-1.5-flash

For OpenAI:
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-your-key
OPENAI_MODEL=gpt-4o-mini

Get an API key from Google AI Studio (Gemini) or the OpenAI dashboard.
streamlit run src/halo/ui.py

The app will open in your browser at http://localhost:8501
- Go to the "Index Photos" section in the sidebar
- Enter the path to your photo folder
- Toggle "Generate BLIP captions" (recommended for better search)
- Click "Index Photos"
- Wait for completion (~1-2 seconds per photo with BLIP)
Text Search:
- Navigate to "Text Search" tab
- Enter a description: "moody nighttime cityscapes", "cozy indoor warm lighting"
- Toggle "LLM query expansion" for richer descriptions
- Optional: Apply date range or GPS filters
- Click "Run text search"
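When LLM query expansion is enabled, the short query is rewritten into a richer visual description before embedding. A hedged sketch of the idea — the prompt wording and the `llm` callable are illustrative assumptions, not Halo's actual code:

```python
def build_expansion_prompt(query: str) -> str:
    """Construct a prompt asking an LLM to enrich a vibe/mood query.
    The wording here is illustrative, not Halo's actual prompt."""
    return (
        "Rewrite the photo-search query below as a richer visual description. "
        "Mention lighting, colors, setting, and mood in one sentence.\n"
        f"Query: {query}"
    )

def expand_query(query: str, llm=None) -> str:
    """Call an LLM if one is provided; otherwise fall back to the raw query."""
    if llm is None:
        return query  # graceful fallback when no provider is configured
    return llm(build_expansion_prompt(query))
```

Falling back to the raw query keeps search working even when no API key is set.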
Search by Example:
- Navigate to "Search by Example" tab
- Upload a reference photo
- Click "Find similar"
- View visually similar images with similarity scores
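Under the hood, search-by-example reduces to a nearest-neighbor lookup over stored embeddings. A minimal NumPy sketch (in the app this lookup is delegated to ChromaDB):

```python
import numpy as np

def find_similar(query_emb: np.ndarray, index_embs: np.ndarray, top_k: int = 3):
    """Return (index, similarity) pairs for the top_k most similar vectors.
    Assumes embeddings are L2-normalized, so dot product == cosine similarity."""
    sims = index_embs @ query_emb
    order = np.argsort(sims)[::-1][:top_k]
    return [(int(i), float(sims[i])) for i in order]
```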
- Go to the "Albums" tab
- Choose Clustering Method:
- Visual: Groups similar-looking photos (beaches, mountains, portraits)
- Temporal: Groups by time periods (trips, events, seasons)
- Hybrid: Smart grouping using visual + temporal + location data (recommended)
- Set Target # of Albums (2-10)
- Set Min Photos/Album (2-10)
- Click "🎬 Generate Albums"
- Browse generated albums with AI-generated titles
Album Features:
- Each album has a creative title and description
- Photos displayed in grid layout
- Albums automatically saved to photos/albums.json
- Load previously generated albums with "Load Saved Albums"
The Streamlit UI can load a custom React results grid. Build once and Streamlit will serve the static assets:
cd react_components/result-grid
npm install
npm run build

For live development, run the dev server and point Streamlit to it:
npm run dev # at react_components/result-grid (defaults to http://localhost:5173)
export RESULT_GRID_DEV_URL=http://localhost:5173
streamlit run src/halo/ui.py

If RESULT_GRID_DEV_URL is unset, Streamlit will load the built bundle from react_components/result-grid/dist.
1. Feature Extraction
- Each photo is represented as a 512-dimensional CLIP embedding vector
- Optional temporal features from EXIF timestamps
- Optional spatial features from GPS coordinates
2. Clustering Algorithms
Visual Clustering:
- Uses K-means algorithm on CLIP embeddings
- Groups photos with similar visual content
- Ideal for collections with distinct visual themes
Temporal Clustering:
- Extracts capture timestamps from EXIF metadata
- Groups photos taken within similar time periods
- Simple time-based binning approach
- Ideal for organizing by trips and events
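One way to realize the time-based binning described above is gap-based grouping: sort capture times and start a new bin whenever the gap to the previous photo exceeds a threshold. A hedged sketch — the 2-day gap is an illustrative default, not Halo's actual threshold:

```python
from datetime import timedelta

def bin_by_time(timestamps, max_gap=timedelta(days=2)):
    """Group capture times into events: a new bin starts whenever the gap
    to the previous photo exceeds max_gap (assumed default: 2 days)."""
    if not timestamps:
        return []
    ordered = sorted(timestamps)
    bins, current = [], [ordered[0]]
    for ts in ordered[1:]:
        if ts - current[-1] > max_gap:
            bins.append(current)
            current = [ts]
        else:
            current.append(ts)
    bins.append(current)
    return bins
```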
Hybrid Clustering:
- Combines multiple features into unified feature space:
- CLIP embeddings (70% weight)
- Normalized timestamp (15% weight)
- GPS coordinates (15% weight)
- Uses K-means on standardized combined features
- Most intelligent method for general photo libraries
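The weighted combination above can be sketched with scikit-learn: standardize each feature block, scale it by its weight, stack, and run K-means. The toy data and function name are illustrative, but the 70/15/15 weights come from the description:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def hybrid_features(clip_embs, times, gps, w=(0.70, 0.15, 0.15)):
    """Stack standardized CLIP, timestamp, and GPS features,
    weighted 70/15/15 as described above."""
    blocks = (np.asarray(clip_embs), np.asarray(times).reshape(-1, 1), np.asarray(gps))
    return np.hstack([weight * StandardScaler().fit_transform(b) for b, weight in zip(blocks, w)])

# Toy collection: two photos in New York, two in Los Angeles
feats = hybrid_features(
    clip_embs=np.array([[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]),
    times=[0.0, 0.1, 10.0, 10.1],  # normalized timestamps
    gps=np.array([[40.7, -74.0], [40.7, -74.0], [34.0, -118.2], [34.0, -118.2]]),
)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(feats)
```

Standardizing each block first keeps the 512-dimensional CLIP vector from drowning out the single timestamp and two GPS columns.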
3. LLM-Powered Naming
- Analyzes cluster metadata (dates, locations, photo count)
- Generates creative album titles (max 6 words)
- Creates descriptive 2-3 sentence summaries
- Falls back to generic names if LLM unavailable
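A sketch of the naming step: assemble the cluster metadata into an LLM prompt, with a generic fallback when no provider is configured. The prompt wording and helper names are assumptions for illustration:

```python
def album_prompt(dates: str, location: str, count: int) -> str:
    """Build a naming prompt from cluster metadata.
    The wording is an assumption, not Halo's actual prompt."""
    return (
        f"Name a photo album of {count} photos taken {dates} near {location}. "
        "Give a creative title of at most 6 words "
        "and a 2-3 sentence description."
    )

def fallback_name(index: int) -> str:
    """Generic name used when the LLM is unavailable."""
    return f"Album {index + 1}"
```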
4. Persistence
- Albums saved as JSON to photos/albums.json
- Includes all metadata: titles, descriptions, photo paths
- Reloadable across sessions
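The persistence step is plain JSON round-tripping. A minimal sketch; the title/description/photos schema is an assumed shape, not necessarily Halo's exact format:

```python
import json
from pathlib import Path

def save_albums(albums, path="photos/albums.json"):
    """Persist albums as JSON (assumed schema: title/description/photos)."""
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(json.dumps(albums, indent=2))

def load_albums(path="photos/albums.json"):
    """Reload previously generated albums, or an empty list if none exist."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else []
```

Returning an empty list when the file is missing lets the app start cleanly on first run.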
| Method | Best For | Algorithm | Features Used |
|---|---|---|---|
| Visual | Similar-looking photos | K-means | CLIP embeddings only |
| Temporal | Events & trips | Time binning | Timestamps only |
| Hybrid | Smart organization | K-means | CLIP + Time + GPS |
Hybrid clustering is recommended as it produces the most meaningful albums by considering both visual content and contextual metadata.
If you need quick test photos:
source venv/bin/activate
# Download 40 sample photos from Picsum
python scripts/download_sample_photos.py --clean --limit 40
# Run smoke test
python scripts/smoke_test.py --folder photos/sample_dataset

Script Options:
- --limit: Number of images (1-100)
- --width/--height: Image dimensions
- --clean: Remove old photos first
- --no-expand: Skip LLM query expansion
Results written to smoke_results.json
See notebooks/evaluation.ipynb for:
- Ablation studies (CLIP-only vs hybrid scoring)
- Latency measurements across collection sizes
- UMAP visualization of embedding space
- Qualitative assessments for reports
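Latency measurements like those in the notebook can be reproduced with a small timing helper; this is a generic sketch, not the notebook's actual code, and the function name is an assumption:

```python
import time
import statistics

def measure_latency(fn, runs=5):
    """Time repeated calls to fn and return the median seconds per call.
    The median damps warm-up outliers such as first-call model loading."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```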
Launch via Jupyter/VS Code after activating the virtual environment:
jupyter notebook notebooks/evaluation.ipynb

| Collection Size | Albums Generated | Processing Time |
|---|---|---|
| 40 photos | 3-5 albums | ~10 seconds |
| 100 photos | 5-10 albums | ~20 seconds |
| 500 photos | 15-25 albums | ~60 seconds |
Processing time includes clustering and LLM name generation
- `torch`, `torchvision` - PyTorch for neural networks
- `transformers` - Hugging Face models (CLIP, BLIP)
- `chromadb` - Vector database for embeddings
- `pillow` - Image processing
- `streamlit` - Web UI framework
- `openai` - OpenAI API (GPT models)
- `google-generativeai` - Google Gemini API
- `numpy` - Numerical computing
- `scikit-learn` - Clustering algorithms (K-means, StandardScaler)
- `umap-learn` - Dimensionality reduction for visualization
- `matplotlib` - Plotting and visualization
- `python-dotenv` - Environment variable management
- `exifread` - Extract EXIF metadata from photos
- `tqdm` - Progress bars
See requirements.txt for complete dependency list with versions.