Nazariya is a local visual search and clustering tool for photo archives.
The current goal is to help find photos that are visually close enough to support a realistic style transfer with human input, but not so similar that copying and pasting edit settings would be enough.
In practice, Nazariya helps answer questions like:
- Which candidate sets are visually close to each other?
- Which photos share similar lighting, palette, environment, or editability?
- Which sets should I inspect in Lightroom when building stronger delivery groups?
Nazariya currently works as a Python CLI, with a small Lightroom Classic helper plugin for exporting catalog metadata to CSV.
Use the included Lightroom plugin to export a CSV of RAW/DNG photos that are in your candidate keyword sets.
The plugin exports metadata only. It does not export or copy the RAW files.
Example output:
$MATRIX/packages/nazariya/data/inputs/candidates.csv
This CSV includes source paths, candidate keys, keywords, ratings, labels, capture time, camera/lens metadata, and location fields when Lightroom exposes them.
For this project, the current candidate set contains roughly 39,397 RAW photos across candidate keys such as c001 through c325.
Create a CSV where each candidate set can define white balance and exposure normalization settings.
./scripts/nazariya make-overrides-template \
--input data/inputs/candidates.csv \
--output data/config/candidate_overrides.csv
Most rows can use the same default settings. Override only sets that need adjustment.
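To make the idea of the template concrete, here is a minimal sketch of how such a file could be produced: one row of default settings per unique candidate key. The column names (`candidate_key`, `white_balance`, `exposure_ev`) and the default values are assumptions for illustration only; the file actually written by `make-overrides-template` may use different columns.

```python
import csv

# Hypothetical column names and defaults; the real template may differ.
DEFAULTS = {"white_balance": "camera", "exposure_ev": "0.0"}


def make_overrides_template(candidates_csv, output_csv, key_column="candidate_key"):
    """Write one row of default settings per unique candidate key."""
    with open(candidates_csv, newline="", encoding="utf-8") as handle:
        keys = sorted({row[key_column] for row in csv.DictReader(handle)})

    with open(output_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=[key_column, *DEFAULTS])
        writer.writeheader()
        for key in keys:
            # Every candidate starts from the same defaults; edit rows by hand later.
            writer.writerow({key_column: key, **DEFAULTS})
    return keys
```

Starting every row from identical defaults keeps the file easy to diff: only the candidate sets that were adjusted by hand stand out.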
For fast iteration, sample a few images from each candidate set.
./scripts/nazariya sample \
--input data/inputs/candidates.csv \
--output data/inputs/candidates_sample_003_seed_42.csv \
--per-candidate 3 \
--seed 42
This produces a manageable visual review set. Three images per candidate is usually a good starting point because one image can be misleading.
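The sampling step can be pictured as a seeded draw within each candidate group. The sketch below is an illustration under assumptions, not the CLI's actual implementation: the `candidate_key` column name is hypothetical, and rows are assumed to be dictionaries such as those produced by `csv.DictReader`.

```python
import random
from collections import defaultdict


def sample_per_candidate(rows, per_candidate=3, seed=42, key_column="candidate_key"):
    """Pick up to `per_candidate` rows from each candidate set, reproducibly."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_column]].append(row)

    rng = random.Random(seed)  # dedicated RNG so the seed fully determines the draw
    sampled = []
    for key in sorted(groups):  # sorted keys keep the draw order stable across runs
        members = groups[key]
        count = min(per_candidate, len(members))  # small sets contribute all their rows
        sampled.extend(rng.sample(members, count))
    return sampled
```

Using a dedicated `random.Random(seed)` instance rather than the global generator means the same seed and input order yield the same sample, which makes it possible to name output files after the seed as in the command above.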
Nazariya reads the original RAW files, applies analysis-only white balance and exposure normalization, then writes JPEG previews.
The purpose is not to make the previews look beautiful. The purpose is to make visually comparable previews so similar images land close together in feature space.
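As an illustration of what analysis-only normalization can mean, the sketch below applies a gray-world white balance followed by a single exposure gain. This is a stand-in technique chosen for clarity, not a description of what Nazariya does internally. It assumes a linear RGB float array with values in [0, 1], and the target mean of 0.18 (middle gray) is an assumed value.

```python
import numpy as np


def normalize_for_analysis(rgb, target_mean=0.18, eps=1e-6):
    """Gray-world white balance followed by a global exposure gain.

    `rgb` is an array of shape (H, W, 3) holding linear values in [0, 1].
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    channel_means = rgb.reshape(-1, 3).mean(axis=0)

    # Gray-world: scale each channel so all three share the same mean.
    wb_gains = channel_means.mean() / np.maximum(channel_means, eps)
    balanced = rgb * wb_gains

    # Exposure: one gain that moves the mean luminance (Rec. 709 weights) to the target.
    luminance = balanced @ np.array([0.2126, 0.7152, 0.0722])
    exposure_gain = target_mean / max(luminance.mean(), eps)

    return np.clip(balanced * exposure_gain, 0.0, 1.0)
```

The output is not meant to look good. It removes two large sources of variation, color cast and overall brightness, so that the remaining differences between previews are more likely to reflect the scene itself.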
This step is iterative:
- Build previews.
- Generate contact sheets.
- Inspect exposure and white balance consistency.
- Adjust data/config/candidate_overrides.csv.
- Swap bad random samples when a picked frame does not represent the set.
- Rebuild previews.
See the detailed workflow:
Once the previews look usable, generate embeddings and color/light features.
Nazariya currently combines:
- CLIP image embeddings
- LAB/HSV color and tonal histograms
- simple color/light statistics
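As a rough illustration of the histogram component in the list above, the following sketch builds a concatenated HSV histogram for a small preview. It is a simplified assumption-based example: the bin count is arbitrary, the input is assumed to be an RGB float array in [0, 1], and the per-pixel `colorsys` conversion is only practical for thumbnail-sized images.

```python
import colorsys

import numpy as np


def hsv_histogram(rgb, bins=8):
    """Concatenated, normalized H, S, V histograms for an (H, W, 3) image in [0, 1]."""
    pixels = np.asarray(rgb, dtype=np.float64).reshape(-1, 3)
    # Convert pixel by pixel with the standard library; slow but dependency-free.
    hsv = np.array([colorsys.rgb_to_hsv(*pixel) for pixel in pixels])

    parts = []
    for channel in range(3):
        counts, _ = np.histogram(hsv[:, channel], bins=bins, range=(0.0, 1.0))
        parts.append(counts / max(len(pixels), 1))  # each channel's histogram sums to 1
    return np.concatenate(parts)
```

Normalizing each channel's histogram by the pixel count makes the feature independent of preview resolution, so previews of different sizes remain comparable.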
For edit-family search, color and lighting usually matter more than semantic similarity, so a color-heavy weighting can be useful.
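One way to realize such a weighting is to L2-normalize each feature block, scale it by the square root of its weight, and concatenate. With that construction, the dot product of two combined vectors equals the weighted sum of the per-block cosine similarities. This is a sketch of the general idea; whether the `--clip-weight` and `--color-weight` options work exactly this way is an assumption.

```python
import numpy as np


def l2_normalize(matrix, eps=1e-12):
    """Scale each row to unit length, guarding against all-zero rows."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / np.maximum(norms, eps)


def combine_features(clip_embeddings, color_features, clip_weight=0.35, color_weight=0.65):
    """Concatenate two normalized blocks so that
    dot(combined_i, combined_j) = clip_weight * cos_clip + color_weight * cos_color.
    """
    clip_block = l2_normalize(np.asarray(clip_embeddings, dtype=np.float64))
    color_block = l2_normalize(np.asarray(color_features, dtype=np.float64))
    # sqrt(weight) on each side of the dot product multiplies out to the weight itself.
    return np.hstack([np.sqrt(clip_weight) * clip_block, np.sqrt(color_weight) * color_block])
```

Normalizing each block before weighting matters because a CLIP embedding and a histogram vector have very different lengths and scales; without it, the longer or larger-valued block would dominate regardless of the chosen weights.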
Example:
./scripts/nazariya extract-features \
--preview-map "$WHISK_ML_DATASETS/nazariya/previews/sample_003_seed_42_review/preview_map.csv" \
--output "$WHISK_ML_DATASETS/nazariya/features/sample_003_seed_42_review/features_clip035_color065.npz" \
--metadata "$WHISK_ML_DATASETS/nazariya/features/sample_003_seed_42_review/features_clip035_color065.csv" \
--clip-weight 0.35 \
--color-weight 0.65 \
--batch-size 32
Then generate one neighbor contact sheet per candidate set:
./scripts/nazariya neighbor-sheets \
--features "$WHISK_ML_DATASETS/nazariya/features/sample_003_seed_42_review/features_clip035_color065.npz" \
--preview-map "$WHISK_ML_DATASETS/nazariya/previews/sample_003_seed_42_review/preview_map.csv" \
--output "$WHISK_ML_DATASETS/nazariya/debug_neighbors/by_candidate_clip035_color065" \
--top-k 10 \
--exclude-same-candidate \
--thumb-size 260
See the detailed workflow:
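Independent of that workflow document, the lookup behind a neighbor sheet can be sketched as a cosine-similarity ranking that skips the query image and, optionally, every image from the same candidate set. The function name and arguments below are hypothetical and only mirror the `--top-k` and `--exclude-same-candidate` options conceptually.

```python
import numpy as np


def top_k_neighbors(features, candidate_keys, query_index, top_k=10, exclude_same_candidate=True):
    """Return indices of the rows most cosine-similar to the query row, best first."""
    features = np.asarray(features, dtype=np.float64)
    keys = np.asarray(candidate_keys)

    norms = np.linalg.norm(features, axis=1)
    denom = np.maximum(norms * norms[query_index], 1e-12)
    similarities = (features @ features[query_index]) / denom

    allowed = np.ones(len(features), dtype=bool)
    allowed[query_index] = False  # never return the query itself
    if exclude_same_candidate:
        allowed &= keys != keys[query_index]  # skip frames from the query's own set

    # Disallowed rows get -inf so they sort last, then are filtered out.
    ranked = np.argsort(-np.where(allowed, similarities, -np.inf))
    return [int(i) for i in ranked[:top_k] if allowed[i]]
```

Excluding the query's own candidate set is what makes the result useful for regrouping: frames from the same set are expected to be close, so the interesting neighbors are the ones that come from elsewhere.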
The Lightroom helper plugin lives here:
lightroom/nazariya.lrplugin
Current menu item:
Library > Plug-in Extras > Export Candidate CSV
It exports metadata for the selected Lightroom photos to the CSV path configured in lightroom/nazariya.lrplugin/Config.lua.
Recommended local repo layout:
data/
  config/
    candidate_overrides.csv
  inputs/
    candidates.csv
    candidates_backup.csv
    candidates_sample_003_seed_42.csv
    candidates_sample_003_seed_42_swapped.csv
  previews/
Generated previews, features, and contact sheets can be written to a larger dataset volume such as:
$WHISK_ML_DATASETS/nazariya/
For long jobs, local scratch is often more reliable than a network volume. You can render locally, then copy results to the dataset volume afterward.
./scripts/nazariya --version
./scripts/nazariya hello
./scripts/nazariya init
./scripts/nazariya sample
./scripts/nazariya swap-sample
./scripts/nazariya make-overrides-template
./scripts/nazariya build-previews
./scripts/nazariya contact-sheets
./scripts/nazariya extract-features
./scripts/nazariya neighbor-sheets
This project uses uv.
uv sync
./scripts/nazariya --help
Build:
uv build
Publish:
uv publish
Nazariya is still experimental. The current approach is intentionally practical:
- Use Lightroom to identify candidate pools.
- Use Python to normalize RAW previews and compute features.
- Use contact sheets for human visual review.
- Use nearest-neighbor suggestions to rebuild better photo groups manually.
The human review step is part of the design. The tool suggests promising neighborhoods; the final grouping still depends on taste, context, and the editing goal.