Automated inventory cataloging system for large media collections. Uses Google Lens image recognition, local LLM analysis, and PriceCharting API integration to catalog thousands of items with automated valuation and customs documentation.
I built this because I'm planning to move internationally with a large collection of video games, LEGO sets, comics, and other collectibles. I'm making it public because it has plenty of uses beyond mine, and I hope others find it useful. The system automates the tedious process of creating detailed inventories for customs, insurance, and tracking. It combines:
- Google Lens visual identification via SerpAPI
- Local LLM (Qwen 2.5:32b) for intelligent data synthesis and regional variant detection
- Multimodal vision LLM for reading grades directly from slab images (graded comics & cards)
- PriceCharting API for authoritative video game, LEGO, comic book, and trading card pricing
- QR-coded tote tracking for physical organization
- Automated file organization with sequence-numbered filenames
- Image auto-cropping to remove scanner backgrounds
Perfect for collectors, expats, or anyone needing professional inventory documentation for international customs, insurance claims, or estate planning.
- ✅ Live scanning workflow - Watch directory for new book scanner images
- ✅ Graded item scanning - Two-pass LLM reads grades from slabs, then identifies items
- ✅ Batch processing - Process manually photographed large items
- ✅ QR code tote tracking - Organize items by physical storage container
- ✅ Intelligent item identification - Google Lens + LLM synthesis
- ✅ Regional variant detection - Distinguishes NTSC-J, NTSC-U, and PAL versions in most cases
- ✅ Platform-based condition defaults - Reasonable per-platform assumptions (cartridge vs. disc systems): newer systems tend to ship in plastic cases that owners keep, unlike the cardboard boxes of the 8/16-bit generation.
- ✅ Duplicate handling - Sequence numbers in filenames uniquely identify each item, even duplicates
- ✅ Validation & auto-correction - Catches and fixes LLM output errors
- ✅ Manual review flagging - Highlights uncertain identifications
- ✅ Dry-run test mode - Test the API → LLM pipeline without tote scans or disk writes
- 📊 Dual output formats - Detailed JSON + spreadsheet CSV
- 💰 Valuation tracking - Per-item and total collection value
- 🏷️ Zebra label generation - QR-coded tote labels (ZPL format)
- 🔐 Security seal tracking - Associate numbered seals to containers
- 🗑️ Item removal utility - Clean deletion with automatic backup
- 🔄 PriceCharting updater - Batch update pricing data periodically. Requires a subscription which is $50 a month (woof).
- 📤 PriceCharting collection export - Generate bulk-upload text files for PriceCharting collections (video games, LEGO, comics, trading cards)
- inventory.json - Complete detailed records with AI analysis
- inventory.csv - Simplified spreadsheet summary
- /TOTE-XXX/ItemName_001_TOTE-XXX.jpg - Organized, cropped, sequence-numbered images by container
- seal_tracking.json - Physical security seal associations
- /pricecharting/ - PriceCharting collection upload files (videogames.txt, legos.txt, comics.txt, cards.txt)
- Detailed item-by-item inventory with valuations
- Photos of each item for verification
- Exportable to customs-required formats
- Built-in personal effects eligibility flagging based on product release date
- Complete photographic evidence
- Current market valuations
- Easily updatable pricing data
- Professional documentation format
- Comprehensive collection catalog
- Current fair market values (PriceCharting only; the FMV data from Google is hot garbage)
- Organized by storage location
- Shareable CSV format
Czur Scanner → QR Code Detection → Google Lens → PriceCharting (optional) →
LLM Analysis → Auto-Crop → Rename → Organize → Save to Inventory
Czur Scanner → QR Code Detection → Vision LLM (reads grade from slab) →
Google Lens → PriceCharting (optional) → Text LLM (synthesizes all data) →
Auto-Crop → Rename → Organize → Save to Inventory
Scripts:
- automated_inventory.py - Live automated scanning (supports --dry-run)
- automated_graded_inventory.py - Live scanning for graded comics & trading cards, two-pass LLM (supports --dry-run)
- batch_inventory.py - Batch process manual photos
- generate_labels.py - Create QR-coded tote labels
- manage_seals.py - Track security seals
- update_pricecharting.py - Batch pricing updates
- remove_item.py - Item removal with backup
- pricecharting_collection_generator.py - Generate PriceCharting collection upload files
Data Flow:
- Scan images → Identify with Google Lens → Optionally query PriceCharting → LLM synthesizes results (I use Qwen 2.5 32B) → Validates output → Saves to JSON/CSV → Organizes files
- Book scanner (e.g., Czur) or camera for photographing items
- Black mat for consistent backgrounds (optional, enables auto-cropping)
- Zebra label printer (optional, for QR labels)
- Python 3.8+
- Ollama with a capable model (recommended: qwen2.5:32b, plus a vision model like deepseek-ocr for graded items)
- ImageMagick (for auto-cropping)
- ngrok (free or paid, for exposing local images to Google Lens)
- SerpAPI - Google Lens image search ($50/month for 5,000 searches)
- PriceCharting - Optional ($50/month for 10,000 requests)
git clone https://github.com/prosolis/Ditto.git
cd Ditto
apt install python3-requests python3-dotenv python3-watchdog python3-pyzbar python3-willow
# ImageMagick (for auto-cropping)
sudo apt install imagemagick
# Ollama (for LLM analysis)
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5:32b
# Download from https://ngrok.com/download
# Or via package manager:
brew install ngrok # macOS
# Or snap install ngrok on Linux
# Copy example config
cp .env.example .env
# Edit with your settings
nano .env
Required configuration in .env:
SERPAPI_KEY=your_serpapi_key_here
NGROK_URL=https://your-url.ngrok-free.app
SCAN_DIR=/path/to/scanner/output
ORGANIZED_DIR=/path/to/organized/inventory
See .env.example for all available options.
# Generate 50 QR-coded labels
python generate_labels.py 50
# Print labels to Zebra printer
cat zpl_labels/print_all.zpl | lp -d YourZebraPrinter
# Terminal 1: Start HTTP server
cd <parent of SCAN_DIR>
python3 -m http.server 8000
# Terminal 2: Start ngrok tunnel
ngrok http 8000
# Copy the https URL to your .env file
# Terminal 3: Start Ollama
ollama run qwen2.5:32b
# Terminal 4: Start inventory scanner
python automated_inventory.py
- Scan tote QR label - Sets context for following items
- Scan items - Use foot pedal for rapid scanning
- System automatically:
  - Identifies each item
  - Queries PriceCharting (if enabled)
  - Synthesizes data with LLM
  - Crops and renames image
  - Moves to organized folder
  - Appends to inventory
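The "scan a tote label, then scan its items" convention boils down to a small piece of state. Here is a minimal sketch of that idea, assuming the QR payload is the tote ID itself (for example TOTE-001) and that each scan has already been decoded into an optional QR string. The function name `assign_totes` and the tuple shape are hypothetical and not taken from the scripts.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Assumption: tote labels encode IDs shaped like TOTE-001.
TOTE_PATTERN = re.compile(r"^TOTE-\d+$")


def assign_totes(scans: Iterable[Tuple[str, Optional[str]]]) -> List[Tuple[str, str]]:
    """Pair each item image with the most recently scanned tote label.

    `scans` yields (image_path, qr_payload) in scan order; qr_payload is
    None when no QR code was detected in the image.
    """
    current_tote: Optional[str] = None
    assigned: List[Tuple[str, str]] = []
    for image_path, qr_payload in scans:
        if qr_payload and TOTE_PATTERN.match(qr_payload):
            current_tote = qr_payload  # a label scan only switches context
            continue
        if current_tote is None:
            raise ValueError(f"{image_path} was scanned before any tote label")
        assigned.append((image_path, current_tote))
    return assigned
```

Rejecting items scanned before any label is one possible policy; holding them in a queue until a label appears would be another.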
For professionally graded (slabbed) comics and trading cards, use the graded inventory scanner. Same workflow as above, but with a vision model that reads grades from slab labels.
# Terminal 1-2: Same HTTP server + ngrok setup as above
# Terminal 3: Start both models
ollama run qwen2.5:32b
# In another terminal:
ollama run deepseek-ocr
# Terminal 5: Start graded inventory scanner
python automated_graded_inventory.py
The graded scanner performs two LLM passes per item:
- Pass 1 (Vision): Sends a DPI-downscaled image to the multimodal LLM to read the grade, grading authority (CGC, PSA, etc.), certification number, and label color from the slab
- Pass 2 (Text): Feeds the vision-extracted grade info alongside Google Lens results and PriceCharting data into Qwen 2.5 for final structured JSON output
Organized filenames include grade info: Amazing_Spider-Man_300_CGC_98_001_TOTE-001.jpg
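As an illustration of the naming scheme, the sketch below rebuilds both filename shapes shown in this README. It is an assumption about how the pieces fit together (name, optional grade, zero-padded sequence, tote ID); `organized_filename` is a hypothetical helper, not the function the scripts use.

```python
import re
from typing import Optional


def organized_filename(item_name: str, sequence: int, tote_id: str,
                       grade: Optional[str] = None, ext: str = ".jpg") -> str:
    """Build ItemName[_GRADE]_SEQ_TOTE-XXX.ext from its parts."""
    parts = [item_name]
    if grade:
        parts.append(grade.replace(".", ""))  # "CGC 9.8" becomes "CGC 98"
    # Keep letters, digits and hyphens; collapse every other run to "_".
    stem = "_".join(re.sub(r"[^A-Za-z0-9-]+", "_", part).strip("_") for part in parts)
    return f"{stem}_{sequence:03d}_{tote_id}{ext}"


print(organized_filename("Super Metroid", 42, "TOTE-001"))
print(organized_filename("Amazing Spider-Man #300", 1, "TOTE-001", grade="CGC 9.8"))
```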
Test the identification pipeline end-to-end without tote scanning, auto-cropping, file organization, or saving to disk. Accepts one or more image paths and prints JSON results to stdout.
# Test standard inventory (API → LLM)
python automated_inventory.py --dry-run photo1.jpg photo2.jpg
# Test graded inventory (Vision LLM → API → Text LLM)
python automated_graded_inventory.py --dry-run slab1.jpg slab2.jpg
This is useful for verifying your API keys, ngrok tunnel, and Ollama models are working correctly before starting a full scanning session.
# For items too large for book scanner
mkdir /photos/TOTE-005
# Take photos, copy to folder
python batch_inventory.py /photos/TOTE-005
API Keys:
SERPAPI_KEY=required
PRICECHARTING_API_KEY=optional
Paths:
SCAN_DIR=/path/to/scanner/output
ORGANIZED_DIR=/path/to/inventory
INVENTORY_JSON=organized/inventory.json # Override if stored elsewhere
INVENTORY_CSV=organized/inventory.csv # Override if stored elsewhere
BACKUP_DIR=organized/backups # Override if stored elsewhere
Processing:
AUTOCROP_ENABLED=true # Auto-crop images
AUTOCROP_FUZZ=10 # ImageMagick fuzz tolerance
PRICECHARTING_MAX_RESULTS=5 # PriceCharting options per item
MAX_RETRIES=2 # Retry attempts on network errors (timeouts, drops)
VERBOSE_LOGGING=false # Show detailed QR detection and debug output
LLM:
LLM_MODEL=qwen2.5:32b
OLLAMA_TIMEOUT=120
Vision Model (graded inventory only):
VISION_MODEL=deepseek-ocr # Multimodal model for reading slab labels
VISION_TIMEOUT=120 # Timeout for vision model requests
DOWNSCALE_DPI=72 # Target DPI for images sent to vision model
See .env.example for complete documentation.
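For readers wiring these settings into their own tooling, here is a minimal sketch of reading a few of them from the process environment. It assumes the .env file has already been loaded (for instance with python-dotenv's `load_dotenv()`), and it treats the example values above as defaults, which is an assumption except for MAX_RETRIES, whose default of 2 is documented. The `Settings` class and `load_settings` function are hypothetical names.

```python
import os
from dataclasses import dataclass


def _env_bool(name: str, default: bool) -> bool:
    """Interpret common truthy spellings; fall back to the default if unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    serpapi_key: str
    llm_model: str
    ollama_timeout: int
    max_retries: int
    autocrop_enabled: bool
    autocrop_fuzz: int


def load_settings() -> Settings:
    key = os.environ.get("SERPAPI_KEY", "")
    if not key:
        raise RuntimeError("SERPAPI_KEY is required; see .env.example")
    return Settings(
        serpapi_key=key,
        llm_model=os.environ.get("LLM_MODEL", "qwen2.5:32b"),     # assumed default
        ollama_timeout=int(os.environ.get("OLLAMA_TIMEOUT", "120")),  # assumed default
        max_retries=int(os.environ.get("MAX_RETRIES", "2")),
        autocrop_enabled=_env_bool("AUTOCROP_ENABLED", True),      # assumed default
        autocrop_fuzz=int(os.environ.get("AUTOCROP_FUZZ", "10")),  # assumed default
    )
```

Failing fast on a missing SERPAPI_KEY mirrors the troubleshooting advice in this README: nothing else in the pipeline works without it.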
Complete detailed record per item:
{
"timestamp": "2024-02-07T19:03:37.607670",
"tote_id": "TOTE-001",
"item_sequence": 42,
"item_name": "Super Metroid",
"image_file": "Super_Metroid_042_TOTE-001.jpg",
"ai_analysis": {
"item_name": "Super Metroid",
"platform": "SNES",
"region": "NTSC-U",
"confidence": "HIGH",
"estimated_value_usd": 75.00,
"pricing_basis": "LOOSE_CART",
"category": "Video Game Software",
"pricecharting_match_confidence": "HIGH"
},
"pricecharting_data": [...],
"status": "success"
}
Simplified spreadsheet:
tote_id,item_sequence,item_name,category,estimated_value_usd,confidence,manual_review,status
TOTE-001,42,Super Metroid,Video Game Software,75.00,HIGH,NO,success
The system intelligently detects regional variants from search results:
- NTSC-J (Japan): Japanese text indicators, Super Famicom, PC Engine
- NTSC-U (USA/Canada): ESRB ratings, English text, SNES, Genesis
- PAL (Europe): PEGI ratings, multi-language, Mega Drive
Matches PriceCharting listings to the correct regional variant for accurate pricing.
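In the actual system the LLM makes this call from the search results. Purely to illustrate the kind of signals listed above, here is a rule-based sketch that scores a text snippet against a few of those indicators. The hint lists are a hand-picked subset and `guess_region` is a hypothetical helper; it returns None when the evidence is absent or tied, which is where a manual review flag would make sense.

```python
from typing import Optional

# Assumption: a small subset of the indicators named in this README.
REGION_HINTS = {
    "NTSC-J": ("super famicom", "pc engine"),
    "NTSC-U": ("esrb", "snes", "genesis"),
    "PAL": ("pegi", "mega drive"),
}


def guess_region(text: str) -> Optional[str]:
    """Return the region with the most keyword hits, or None if unclear."""
    lowered = text.lower()
    scores = {
        region: sum(hint in lowered for hint in hints)
        for region, hints in REGION_HINTS.items()
    }
    best = max(scores.values())
    winners = [region for region, score in scores.items() if score == best]
    if best == 0 or len(winners) > 1:
        return None  # no evidence, or conflicting evidence
    return winners[0]
```

Keyword matching like this is brittle (Mega Drive is also the Japanese name for the console, for instance), which is a good reason to leave the real decision to the LLM and a human reviewer.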
Smart assumptions based on gaming platform:
- 8/16-bit cartridges (NES, SNES, Genesis, Master System, Game Boy/GBC/GBA, TurboGrafx-16, Atari, Neo Geo, Neo Geo Pocket/Color, WonderSwan/Color, Virtual Boy, Game Gear) → Default: LOOSE_CART
- Disc-based systems (PlayStation, Xbox, GameCube, Saturn, Dreamcast, Sega CD, 3DO, CDi, PC Engine CD) → Default: COMPLETE_IN_BOX
- Modern cartridges (DS, 3DS, Switch, PS Vita) → Default: COMPLETE_IN_BOX
Overrides defaults only when search results explicitly indicate a different condition.
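The defaults above map naturally onto a lookup table. The sketch below shows one way to express that, using a subset of the platforms listed; the set names and `default_pricing_basis` are hypothetical, and only the two pricing_basis values that appear in this README are used.

```python
from typing import Optional

# Subset of the platform groups described above.
LOOSE_CART_PLATFORMS = {
    "NES", "SNES", "Genesis", "Master System", "Game Boy", "Game Boy Color",
    "Game Boy Advance", "TurboGrafx-16", "Neo Geo", "Virtual Boy", "Game Gear",
}
COMPLETE_IN_BOX_PLATFORMS = {
    "PlayStation", "Xbox", "GameCube", "Saturn", "Dreamcast", "Sega CD", "3DO",
    "Nintendo DS", "Nintendo 3DS", "Switch", "PS Vita",
}


def default_pricing_basis(platform: str,
                          stated_condition: Optional[str] = None) -> Optional[str]:
    """Pick a pricing basis, letting an explicit condition win over defaults."""
    if stated_condition:
        return stated_condition
    if platform in LOOSE_CART_PLATFORMS:
        return "LOOSE_CART"
    if platform in COMPLETE_IN_BOX_PLATFORMS:
        return "COMPLETE_IN_BOX"
    return None  # unknown platform: leave the decision to review
```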
Catches common LLM errors:
- ✅ Swapped min/max value ranges → Auto-fixes
- ✅ Multiple pricing_basis values → Takes first, flags for review
- ✅ Hallucinated PriceCharting option numbers → Sets to null, flags
- ✅ Invalid enums → Rejects with clear error
- ✅ Missing required fields → Rejects
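The checks above can be sketched as a single pass over the LLM's parsed JSON. This is an illustration only: the field names `value_min_usd`, `value_max_usd`, and `pricecharting_option`, the required-field list, and the set of confidence levels are assumptions, since this README does not spell out the schema.

```python
from typing import Any, Dict, List, Tuple

REQUIRED_FIELDS = ("item_name", "category", "confidence", "pricing_basis")  # assumed
CONFIDENCE_LEVELS = {"HIGH", "MEDIUM", "LOW"}  # assumed enum values


def validate_analysis(analysis: Dict[str, Any],
                      option_count: int) -> Tuple[Dict[str, Any], List[str]]:
    """Return a corrected copy plus review flags; raise on unfixable output."""
    fixed = dict(analysis)
    flags: List[str] = []

    missing = [f for f in REQUIRED_FIELDS if fixed.get(f) in (None, "")]
    if missing:
        raise ValueError(f"missing required fields: {missing}")

    # Multiple pricing_basis values: keep the first, flag for review.
    if isinstance(fixed["pricing_basis"], list):
        if not fixed["pricing_basis"]:
            raise ValueError("pricing_basis is an empty list")
        fixed["pricing_basis"] = fixed["pricing_basis"][0]
        flags.append("multiple pricing_basis values")

    if fixed["confidence"] not in CONFIDENCE_LEVELS:
        raise ValueError(f"invalid confidence: {fixed['confidence']!r}")

    # Swapped min/max: fix silently.
    low, high = fixed.get("value_min_usd"), fixed.get("value_max_usd")
    if low is not None and high is not None and low > high:
        fixed["value_min_usd"], fixed["value_max_usd"] = high, low

    # An option number outside the list PriceCharting returned was invented.
    option = fixed.get("pricecharting_option")
    if option is not None and not (isinstance(option, int) and 1 <= option <= option_count):
        fixed["pricecharting_option"] = None
        flags.append("hallucinated PriceCharting option")

    return fixed, flags
```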
Automatically flags items needing human verification:
- Condition drastically affects value (10x+ difference)
- LLM uncertain about regional variant
- Conflicting Google search results
- PriceCharting match questionable
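A sketch of how some of these triggers could be computed is shown below. It covers three of the four conditions; detecting conflicting search results is left out because it depends on data this README does not describe. The `condition_prices` mapping and the function name `review_reasons` are hypothetical, while `region` and `pricecharting_match_confidence` follow the sample record shown earlier.

```python
from typing import Any, Dict, List, Optional


def review_reasons(analysis: Dict[str, Any],
                   condition_prices: Dict[str, Optional[float]]) -> List[str]:
    """Collect reasons an item should be checked by a human."""
    reasons: List[str] = []

    # Compare the cheapest and most expensive known condition prices.
    prices = [p for p in condition_prices.values() if p and p > 0]
    if len(prices) >= 2 and max(prices) >= 10 * min(prices):
        reasons.append("condition changes value by 10x or more")

    if not analysis.get("region"):
        reasons.append("regional variant uncertain")

    if analysis.get("pricecharting_match_confidence") == "LOW":
        reasons.append("PriceCharting match questionable")

    return reasons
```

An empty list means no flag; any entry would set manual_review to YES in the CSV.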
Run periodically to refresh valuations:
# Update all items
python update_pricecharting.py
# Dry run (preview changes)
python update_pricecharting.py --dry-run
# Only update items without PriceCharting data
python update_pricecharting.py --new-only
# Only update specific categories
python update_pricecharting.py --categories "Video Game Software" "LEGO"
Export inventory to text files for PriceCharting's bulk collection upload. One file per category with formatted search strings:
# Generate files for entire inventory
python pricecharting_collection_generator.py
# Generate files for a single tote (incremental upload)
python pricecharting_collection_generator.py --tote TOTE-003
# Custom inventory path
python pricecharting_collection_generator.py --inventory /path/to/inventory.json
# Custom output directory
python pricecharting_collection_generator.py --output-dir /path/to/output
Output files:
| File | Categories | Format Example |
|---|---|---|
| videogames.txt | Video Game Software, Console, Accessory, Handheld | "Chrono Trigger SFC CIB" |
| cards.txt | Trading Cards | "Charizard Pokemon Base Set #4" |
| comics.txt | Comic Books | "Action Comics #13 1939 CGC 8" |
| legos.txt | LEGO | "Fire Mario #71370 LEGO Super Mario" |
When using --tote, filenames include the tote ID (e.g., videogames-tote-003.txt) for incremental uploads without re-importing the entire collection.
Platform names are automatically normalized to abbreviated forms (NES, SNES, SFC, PSP, etc.) and redundant platform references embedded in item names by the LLM are stripped. Japanese games (NTSC-J) use regional platform names (SFC, FC, MD, PCE).
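The normalization described above can be sketched roughly as follows. The mapping tables are a small illustrative subset, and `normalize_platform` and `export_line` are hypothetical names. The sketch strips only the item's own platform name (long or short form) from the title, which is one conservative reading of "redundant platform references".

```python
import re

# Illustrative subset of long-form names and their abbreviations.
FULL_TO_SHORT = {
    "Nintendo Entertainment System": "NES",
    "Super Nintendo": "SNES",
    "Super Famicom": "SFC",
    "Famicom": "FC",
    "Mega Drive": "MD",
    "PC Engine": "PCE",
    "PlayStation Portable": "PSP",
}
# Regional names used for Japanese (NTSC-J) releases.
NTSC_J_NAMES = {"SNES": "SFC", "NES": "FC", "Genesis": "MD"}


def normalize_platform(platform: str, region: str = "") -> str:
    short = FULL_TO_SHORT.get(platform, platform)
    if region == "NTSC-J":
        short = NTSC_J_NAMES.get(short, short)
    return short


def export_line(item_name: str, platform: str, region: str, condition: str) -> str:
    """Format one search string, e.g. 'Chrono Trigger SFC CIB'."""
    short = normalize_platform(platform, region)
    names = sorted({platform, short}, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(name) for name in names) + r")\b"
    cleaned = re.sub(pattern, "", item_name, flags=re.IGNORECASE)
    cleaned = re.sub(r"[(\[]\s*[)\]]", "", cleaned)  # drop emptied brackets
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" -")
    return f"{cleaned} {short} {condition}"


print(export_line("Chrono Trigger (Super Famicom)", "Super Famicom", "NTSC-J", "CIB"))
```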
# Sequence number is in the filename: ItemName_054_TOTE-002.jpg → sequence 54
python remove_item.py TOTE-002 54
# Prompts for confirmation
# Creates backup before removal
# Regenerates CSV
# Remove all failed scan entries at once
python remove_item.py --purge-failed
# Lists each failed entry with tote, sequence, and error
# Prompts for confirmation before removing
# Original images remain in SCAN_DIR for re-scanning
# View all seal assignments
python manage_seals.py view
# Assign seal to tote
python manage_seals.py assign TOTE-001 AB123456
# Bulk assignment mode
python manage_seals.py bulk
# Crop all already-processed images
find organized/TOTE-* -type f \( -iname "*.jpg" -o -iname "*.png" \) -exec convert {} -fuzz 10% -trim +repage {} \;
- Czur book scanner: ~$100-300 (or use existing camera)
- Zebra label printer: ~$200-400 (optional)
- SerpAPI: $50/month (5,000 searches) or $0 (100 free searches/month)
- PriceCharting: $50/month (10,000 requests) - optional, can run separately
- ngrok Personal (optional): $8/month for static domain
- Total: $50-108/month while actively scanning
- Scan everything with just Google Lens first ($50/month)
- Run PriceCharting updates later in batch ($50 for a single month of subscription)
- Use free ngrok tier (requires URL update each session)
- Cancel subscriptions between scanning sessions
Example: 2,000 items scanned over 1 month = ~$100 total
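The arithmetic behind that example can be written out as a small estimator. It uses the plan prices and quotas quoted above and assumes one Google Lens search and one PriceCharting request per item, which is a simplification; `estimate_cost_usd` is a hypothetical helper.

```python
import math

SERPAPI_USD, SERPAPI_QUOTA = 50, 5_000            # per month, from the list above
PRICECHARTING_USD, PRICECHARTING_QUOTA = 50, 10_000
NGROK_STATIC_USD = 8


def estimate_cost_usd(items: int, scanning_months: int = 1,
                      use_pricecharting: bool = True,
                      static_ngrok: bool = False) -> int:
    """Rough subscription cost, assuming one API call per item per service."""
    # SerpAPI is needed for every scanning month, and for enough months
    # to cover the search quota.
    serpapi_months = max(scanning_months, math.ceil(items / SERPAPI_QUOTA))
    total = SERPAPI_USD * serpapi_months
    if use_pricecharting:
        # Pricing can run as one batch, so only the request quota matters.
        total += PRICECHARTING_USD * max(1, math.ceil(items / PRICECHARTING_QUOTA))
    if static_ngrok:
        total += NGROK_STATIC_USD * serpapi_months
    return total


print(estimate_cost_usd(2_000))  # the 2,000-item example above
```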
Nintendo: NES, Famicom, SNES, Super Famicom, N64, GameCube, Wii, Wii U, Switch, Game Boy, Game Boy Color, Game Boy Advance, Nintendo DS, Nintendo 3DS, Virtual Boy
PlayStation: PS1, PS2, PS3, PS4, PS5, PSP, PS Vita
Xbox: Xbox, Xbox 360, Xbox One, Xbox Series X
Sega: Master System, Genesis/Mega Drive, Game Gear, Saturn, Dreamcast, Sega CD, Sega 32X
SNK: Neo Geo AES, Neo Geo MVS, Neo Geo Pocket, Neo Geo Pocket Color
Other: TurboGrafx-16/PC Engine, WonderSwan, WonderSwan Color, 3DO, CDi, Atari (2600, 7800, Jaguar, Lynx)
- Video Game Software
- Video Game Consoles
- Video Game Accessories
- Handheld Game Systems
- LEGO Sets
- Comic Books
- Trading Cards
- Electronics
- Collectibles
Ensure .env file exists and contains your SerpAPI key.
Update .env with your actual ngrok tunnel URL from ngrok http 8000.
Check that ImageMagick is installed: convert --version
The scanner automatically retries on network errors (timeouts, connection drops) with exponential backoff. Increase MAX_RETRIES in .env if you have an unreliable connection (default: 2).
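For reference, retrying with exponential backoff generally looks like the sketch below. It is a generic illustration, not the scanner's actual code: `with_retries`, the one-second base delay, and the choice of built-in exception types are assumptions. With the requests library you would typically pass `requests.exceptions.Timeout` and `requests.exceptions.ConnectionError` as `retry_on` instead.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(call: Callable[[], T],
                 max_retries: int = 2,
                 base_delay: float = 1.0,
                 retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying on network-style errors with doubling delays."""
    attempt = 0
    while True:
        try:
            return call()
        except retry_on:
            if attempt >= max_retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
            attempt += 1
```

With max_retries=2 the call is attempted at most three times in total. The `sleep` parameter exists only so the delay can be swapped out in tests.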
Increase OLLAMA_TIMEOUT in .env or use a faster model.
This is expected. Sequence numbers are immutable IDs baked into filenames (ItemName_003_TOTE-001.jpg). Gaps after removal are harmless — the scanner uses the highest existing sequence to determine the next number.
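The "highest existing sequence plus one" rule can be sketched like this. The regular expression is an assumption based on the filename examples in this README, and `next_sequence` is a hypothetical helper that works on whatever list of filenames it is given (a single tote folder, for example).

```python
import re
from typing import Iterable

# Matches the trailing "_054_TOTE-002.jpg" part of an organized filename.
SEQUENCE_RE = re.compile(r"_(\d{3,})_TOTE-\d+\.\w+$")


def next_sequence(filenames: Iterable[str]) -> int:
    """Return one more than the highest sequence found; gaps are ignored."""
    highest = 0
    for name in filenames:
        match = SEQUENCE_RE.search(name)
        if match:
            highest = max(highest, int(match.group(1)))
    return highest + 1
```

Because only the maximum matters, removing ItemName_002 from a folder holding 001 through 003 still yields 4 as the next number, so existing filenames never need renaming.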
Try adjusting PRICECHARTING_MAX_RESULTS to 10 for more options.
The system generates inventory suitable for international customs:
- Item-by-item listing with photos
- Fair market valuations in USD
- Personal effects eligibility flags
- Exportable to spreadsheet formats
For customs submission:
- Export inventory.csv to Excel
- Group similar items if permitted by destination country
- Translate to destination language if required
- Have certified by consulate if required
- Include high-value items individually
Consult the destination country's customs requirements for the specific format.
.
├── automated_inventory.py # Live scanning
├── automated_graded_inventory.py # Live scanning for graded items (two-pass LLM)
├── batch_inventory.py # Batch processing
├── generate_labels.py # QR label generation
├── manage_seals.py # Seal tracking
├── update_pricecharting.py # Price updates
├── pricecharting_collection_generator.py # PriceCharting collection export
├── remove_item.py # Item removal
├── .env.example # Config template
└── README.md # Documentation
Contributions welcome! Please open an issue first to discuss changes.
Use --dry-run to verify the pipeline without tote scans or disk writes:
python automated_inventory.py --dry-run test_image.jpg
python automated_graded_inventory.py --dry-run test_slab.jpg
Then test with small batches in live mode:
- Generate 5 test labels
- Scan 10-20 items
- Verify accuracy before bulk processing
MIT License - See LICENSE file for details
- SerpAPI for Google Lens API access
- PriceCharting for video game pricing data
- Ollama for local LLM infrastructure
- Anthropic for Claude AI assistance in development
For issues or questions:
- Open a GitHub issue
- Check .env.example for configuration help
- Review troubleshooting section above
Note: This system uses AI for automated identification. Always verify high-value items manually. The developers are not responsible for customs compliance - consult your destination country's requirements.