An Android application that uses on-device machine learning to detect objects through the camera and estimate their prices in real-time.
See howto/ for the canonical documentation entry point. The howto/ directory
consolidates all operational documentation, runbooks, architecture guides, and deployment
procedures.
- Run KMP unit tests: `./gradlew :shared:core-models:test :shared:core-tracking:test`
- Run Android unit tests: `./gradlew :androidApp:testDebugUnitTest`
- Run all tests: `./gradlew test`
- Run coverage: `./gradlew koverVerify`
- Instrumented tests (emulator/device required): `./gradlew :androidApp:connectedDebugAndroidTest`
- JVM tests (shared modules): `./gradlew :shared:core-models:jvmTest :shared:core-tracking:jvmTest`
- Pre-push validation: `./gradlew prePushJvmCheck`
- Install pre-push hook: `./scripts/dev/install-hooks.sh`
See howto/project/reference/DEV_GUIDE.md for details.
Scanium is a camera-first Android app that demonstrates object detection and price estimation using Google ML Kit. Point your camera at everyday items, and the app will identify them and provide estimated price ranges in EUR.
- Real-Time Object Detection: Uses Google ML Kit for on-device object detection and classification
- Domain Pack Category System: Config-driven fine-grained categorization beyond ML Kit's 5
coarse categories
- 23 specific categories (sofa, chair, laptop, TV, shoes, etc.)
- 10 extractable attributes (brand, color, material, size, condition, etc.)
- JSON-based configuration for easy extension
- Ready for CLIP, OCR, and cloud-based attribute extraction
- Visual Detection Overlay: Real-time bounding boxes and labels displayed on detected objects
- Live visualization of ML Kit detections with category labels
- Confidence scores shown on each detection
- Automatic overlay updates during continuous scanning
- Clean UI with minimal overlay text
- Intelligent Object Tracking: Multi-frame tracking system that eliminates duplicate detections
of the same object
- ML Kit trackingId-based matching with spatial fallback
- Confirmation thresholds ensure stable, confident detections
- Automatic expiry of objects that leave the frame
- Multiple Scan Modes:
- Object Detection: Point at objects and scan continuously with de-duplication
- Barcode Scanning: Scan QR codes and barcodes with instant recognition
- Document Text: Extract text from documents and images using OCR
- Dual Capture Modes:
- Tap to capture: Take a photo and detect objects in the image
- Long-press to scan: Continuous detection while holding the button
- Price Estimation: Category-based price ranges for detected objects
- Items Management: View and manage all detected items with their estimated prices
- Export-First Sharing: Export selected items to CSV/ZIP for spreadsheets, chat apps, or
marketplaces (marketplace integrations temporarily disabled)
- Multi-select items from detected list
- Export CSV summaries and ZIP bundles with images
- Share exports via standard Android share sheet
- Privacy-First: All detection processing happens on-device by default, with no cloud calls
- Debug Logging: Comprehensive detection statistics and threshold tuning in debug builds
- Developer Options (Debug): System Health diagnostics panel showing backend connectivity, network status, permissions, and device capabilities with auto-refresh and clipboard export
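The de-duplication idea behind the tracking feature above — prefer ML Kit's `trackingId` when present, fall back to spatial overlap otherwise — can be sketched as follows. This is an illustrative TypeScript sketch, not the app's Kotlin code; the names (`NormalizedRect`, `MATCH_THRESHOLD`) and the 0.5 IoU cutoff are assumptions.

```typescript
// Illustrative spatial-fallback matching: when ML Kit's trackingId is
// unavailable, compare normalized bounding boxes by intersection-over-union.
interface NormalizedRect {
  left: number;   // all coordinates in [0, 1], relative to frame size
  top: number;
  right: number;
  bottom: number;
}

const MATCH_THRESHOLD = 0.5; // assumed IoU cutoff for "same object"

function iou(a: NormalizedRect, b: NormalizedRect): number {
  const w = Math.min(a.right, b.right) - Math.max(a.left, b.left);
  const h = Math.min(a.bottom, b.bottom) - Math.max(a.top, b.top);
  if (w <= 0 || h <= 0) return 0; // boxes do not overlap
  const inter = w * h;
  const areaA = (a.right - a.left) * (a.bottom - a.top);
  const areaB = (b.right - b.left) * (b.bottom - b.top);
  return inter / (areaA + areaB - inter);
}

interface Detection {
  trackingId?: number; // ML Kit id when available
  box: NormalizedRect;
}

// True when two detections likely belong to the same physical object.
function sameObject(a: Detection, b: Detection): boolean {
  if (a.trackingId !== undefined && b.trackingId !== undefined) {
    return a.trackingId === b.trackingId;      // id-based match
  }
  return iou(a.box, b.box) >= MATCH_THRESHOLD; // spatial fallback
}
```

A tracker built on this predicate only emits an item the first time a candidate is matched, which is why each physical object appears once during continuous scanning.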
- Kotlin - Primary programming language
- Jetpack Compose - Modern declarative UI framework
- Material 3 - Material Design components and theming with Scanium branding
- CameraX - Camera API for preview and image capture
- ML Kit Object Detection - On-device object detection and classification with tracking
- ML Kit Barcode Scanning - On-device barcode and QR code scanning
- Image Analysis - Real-time video stream processing with multi-frame candidate tracking
- MVVM Pattern - ViewModel-based architecture
- Kotlin Coroutines - Asynchronous programming
- StateFlow - Reactive state management
- Navigation Compose - Type-safe navigation
- Lifecycle Components - Android lifecycle-aware components
- Kotlinx Serialization - JSON parsing for Domain Pack configuration
- Node.js + TypeScript - Backend API server
- Prisma - Database ORM and migrations
- PostgreSQL - Primary database
- Fastify - HTTP server framework
- ngrok - Development tunneling for mobile device testing
- Docker Compose - Container orchestration for local development
- Grafana - Visualization dashboards and alerting
- Alloy - OpenTelemetry (OTLP) receiver and router
- Loki - Log aggregation and storage
- Tempo - Distributed tracing backend
- Mimir - Prometheus-compatible metrics storage
- Docker Compose - Containerized observability infrastructure
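StateFlow, listed above, always holds a current value and notifies collectors on change. A minimal TypeScript analogue of that pattern (illustrative only — the app itself uses Kotlin's `StateFlow`, not this class):

```typescript
// Minimal analogue of Kotlin's StateFlow: holds a current value and
// notifies subscribers whenever it changes.
type Listener<T> = (value: T) => void;

class StateHolder<T> {
  private listeners = new Set<Listener<T>>();
  constructor(private current: T) {}

  get value(): T {
    return this.current;
  }

  // Like StateFlow, new subscribers immediately receive the current value.
  subscribe(listener: Listener<T>): () => void {
    listener(this.current);
    this.listeners.add(listener);
    return () => this.listeners.delete(listener);
  }

  // Like setting MutableStateFlow.value; skips notification when unchanged.
  set(next: T): void {
    if (next === this.current) return;
    this.current = next;
    this.listeners.forEach((l) => l(next));
  }
}
```

The conflation behavior (no re-notification for equal values) is what keeps Compose UIs from recomposing on redundant updates.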
Scanium is a full-stack mobile application consisting of three main components:
┌─────────────────────────────────────────────────────────────────┐
│ Android Application │
│ (Kotlin + Compose + ML Kit + CameraX) │
└────────────────────┬────────────────────────────────────────────┘
│ HTTPS (ngrok tunnel in dev)
│ OTLP telemetry (Alloy)
▼
┌─────────────────────────────────────────────────────────────────┐
│ Backend API Server │
│ (Node.js + TypeScript + Prisma + PostgreSQL) │
└────────────────────┬────────────────────────────────────────────┘
│ OpenTelemetry
▼
┌─────────────────────────────────────────────────────────────────┐
│ Observability Stack (LGTM + Alloy) │
│ Grafana → Loki (logs) + Tempo (traces) + Mimir (metrics) │
└─────────────────────────────────────────────────────────────────┘
The Android app follows a Simplified MVVM architecture with feature-based package organization:
androidApp/src/main/java/com/scanium/app/
├── camera/ # Camera functionality, CameraX, mode switching
├── diagnostics/ # System health diagnostics (backend, network, permissions)
├── domain/ # Domain Pack system (config, repository, category engine)
├── items/ # Detected items management and display
├── ml/ # Object detection and pricing logic
├── tracking/ # Object tracking and de-duplication system
├── selling/ # eBay marketplace integration (mock)
│ ├── data/ # API, repository, marketplace service
│ ├── domain/ # Listing models, status, conditions
│ ├── ui/ # Marketplace screen, listing VM, debug settings
│ └── util/ # Image preparation, draft mapping
├── ui/settings/ # Settings and Developer Options screens
└── navigation/ # Navigation graph setup
backend/
├── src/
│ ├── main.ts # Fastify server entry point
│ ├── routes/ # API endpoints
│ ├── services/ # Business logic
│ ├── modules/ # Feature modules (classifier, assistant)
│ └── middleware/ # Auth, validation, error handling
├── prisma/
│ ├── schema.prisma # Database schema
│ └── migrations/ # Version-controlled schema changes
└── docker-compose.yml # PostgreSQL container
monitoring/
├── docker-compose.yml # LGTM stack + Alloy services
├── grafana/
│ ├── provisioning/ # Auto-configured datasources
│ └── dashboards/ # Pre-built visualization dashboards
├── alloy/alloy.hcl # OTLP routing configuration
├── loki/loki.yaml # Log aggregation config
├── tempo/tempo.yaml # Distributed tracing config
└── mimir/mimir.yaml # Metrics storage config
For detailed architecture documentation, see howto/project/reference/ARCHITECTURE.md.
- Multi-module structure - Android app plus shared core libraries for models, tracking, and domain packs
- Hilt DI - Dagger Hilt for dependency injection (`@HiltViewModel`, `@AndroidEntryPoint`)
- Camera-first UX - App opens directly to camera screen
- On-device ML by default - Privacy-focused, with optional cloud classification when API keys are configured
- Reactive state management - StateFlow for UI state updates
- Backend integration - Node.js backend with PostgreSQL for persistence and marketplace features
- Android Studio Hedgehog (2023.1.1) or later
- JDK 17 (required) - See SETUP.md for installation instructions
- Android SDK with minimum API 24 (Android 7.0)
- Target API 34 (Android 14)
For detailed cross-platform setup instructions (macOS, Linux, Windows), see SETUP.md.
Quick Start:
- Clone the repository: `git clone <repository-url> && cd scanium`
- Ensure Java 17 is installed (see SETUP.md if needed)
- Open the project in Android Studio, or build from the command line: `./scripts/build.sh assembleDebug` (auto-detects Java 17)
- Run the app on an emulator or physical device
Prerequisites:
- Node.js 20+
- Docker (for PostgreSQL and monitoring stack)
- ngrok (for mobile device testing)
One-Command Startup (Recommended):
# Start backend + monitoring stack together
scripts/backend/start-dev.sh
# This automatically starts:
# - PostgreSQL database
# - Backend API server (port 8080)
# - ngrok tunnel (for mobile testing)
# - Observability stack (Grafana, Loki, Tempo, Mimir, Alloy)
What You Get:
- Backend API: http://localhost:8080
- ngrok Public URL: Displayed in terminal (update mobile app with this URL)
- Grafana Dashboards: http://localhost:3000
- OTLP Endpoints: localhost:4317 (gRPC), localhost:4318 (HTTP)
- Health checks and status for all services
Options:
# Skip monitoring stack
scripts/backend/start-dev.sh --no-monitoring
# Stop all services
scripts/backend/stop-dev.sh
# Stop including monitoring
scripts/backend/stop-dev.sh --with-monitoring
# View monitoring URLs and health status
scripts/monitoring/print-urls.sh
Backend Configuration:
- Copy `.env.example` to `.env` in the `backend/` directory
- Configure required environment variables
- Run database migrations: `cd backend && npm run prisma:migrate`
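A common companion to the "configure required environment variables" step is a fail-fast check at server startup. A minimal sketch, assuming hypothetical variable names (`DATABASE_URL`, `PORT`) — the project's actual contract is whatever `.env.example` lists:

```typescript
// Fail-fast validation of required environment variables at startup.
// The keys used here are illustrative assumptions, not the project's
// actual .env contract -- see backend/.env.example for the real list.
function missingEnvVars(
  env: Record<string, string | undefined>,
  required: string[],
): string[] {
  return required.filter((name) => {
    const v = env[name];
    return v === undefined || v.trim() === "";
  });
}

// Hypothetical usage at server startup:
//   const missing = missingEnvVars(process.env, ["DATABASE_URL", "PORT"]);
//   if (missing.length > 0) {
//     throw new Error(`Missing env vars: ${missing.join(", ")}`);
//   }
```

Failing before the first request makes misconfiguration obvious, instead of surfacing later as a database connection error.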
See howto/project/reference/DEV_GUIDE.md and howto/monitoring/README.md for detailed setup instructions.
./scripts/build.sh assembleDebug # Build debug APK (auto-detects Java 17)
./scripts/build.sh assembleRelease # Build release APK
./scripts/build.sh test # Run unit tests
./scripts/build.sh clean # Clean build artifacts
The build.sh script automatically finds Java 17 on your system across macOS, Linux, and Windows.
./gradlew assembleDebug # Ensure Java 17 is active
./gradlew assembleRelease
./gradlew test # Run unit tests
./gradlew connectedAndroidTest # Run instrumented tests (requires device)
- Each push to `main` builds a debug APK in the Android Debug APK workflow.
- In GitHub Actions, download the `scanium-app-debug-apk` artifact from the latest run.
- Unzip the archive and install `app-debug.apk` on your device (enable unknown sources if needed).
scanium/
├── androidApp/ # Compose UI + feature orchestration
│ ├── camera/ # CameraScreen, CameraXManager, DetectionOverlay
│ ├── items/ # ItemsListScreen, ItemDetailDialog, ItemsViewModel
│ ├── ml/ # ObjectDetectorClient, BarcodeScannerClient, PricingEngine
│ ├── navigation/ # ScaniumNavGraph + routes
│ └── ui/ # Material 3 theme and shared components
├── android-platform-adapters/ # Bitmap/Rect adapters → ImageRef/NormalizedRect
├── android-camera-camerax/ # CameraX helpers
├── android-ml-mlkit/ # ML Kit plumbing
├── core-models/ # Portable models (ScannedItem, ImageRef, NormalizedRect)
├── core-tracking/ # Tracking math (ObjectTracker, AggregatedItem)
├── core-domainpack/, core-scan/, core-contracts/ # Shared contracts
├── backend/ # Node.js API server
│ ├── src/ # TypeScript source (routes, services, middleware)
│ ├── prisma/ # Database schema and migrations
│ ├── docker-compose.yml # PostgreSQL container
│ └── .env # Environment configuration (gitignored)
├── monitoring/ # Observability stack (LGTM + Alloy)
│ ├── docker-compose.yml # Grafana, Loki, Tempo, Mimir, Alloy
│ ├── grafana/ # Dashboards and datasource provisioning
│ ├── alloy/ # OTLP routing configuration
│ ├── loki/, tempo/, mimir/ # Backend storage configs
│ └── data/ # Persistent data volumes (gitignored)
├── scripts/ # Build and development automation
│ ├── backend/ # Backend dev scripts (start-dev, stop-dev)
│ ├── monitoring/ # Monitoring scripts (start, stop, print-urls)
│ └── build.sh # Java 17 auto-detection for Android builds
├── docs/ # Project documentation
│ ├── ARCHITECTURE.md # System architecture
│ ├── DEV_GUIDE.md # Development workflow
│ └── CODEX_CONTEXT.md # Agent quickmap
├── md/ # Feature docs, fixes, testing guides
├── AGENTS.md
├── ROADMAP.md
└── README.md
- Launch the app - Camera screen opens automatically
- Select scan mode - Swipe to switch between:
- Object Detection: Detect and price everyday objects
- Barcode: Scan QR codes and barcodes
- Document: Extract text from documents
- Point at objects - Aim your camera at items you want to identify
- Capture:
- Tap the camera button to capture a single photo
- Long-press to start continuous scanning
- Double-tap to stop scanning
- View results - Detected objects appear in two ways:
- Visual Overlay: Real-time bounding boxes and labels shown on camera preview
- Items List: Tap "View Items" to see all detected objects with details
- In continuous scanning mode, each physical object appears only once (de-duplicated)
- The tracker confirms objects instantly for responsive detection
- Manage items - Tap "View Items" to see all detected objects
- Export items:
- Long-press an item to enter selection mode
- Tap to select multiple items
- Tap "Export" to open export options
- Choose CSV for spreadsheets or ZIP for images + data
- Share the export with chat apps or other tools
- Marketplace integrations are coming later (currently disabled)
- Delete items - Tap the delete icon in the top bar to clear all items
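The CSV export step above has to cope with labels that contain commas or quotes. A minimal RFC 4180-style sketch of that quoting, with an invented item shape — the app's real export schema may differ:

```typescript
// RFC 4180-style CSV field quoting: wrap in quotes when the field contains
// a comma, quote, or newline, and double any embedded quotes.
function csvField(value: string): string {
  if (/[",\n\r]/.test(value)) {
    return `"${value.replace(/"/g, '""')}"`;
  }
  return value;
}

// Hypothetical item shape for illustration only.
interface ExportItem {
  label: string;
  category: string;
  priceLowEur: number;
  priceHighEur: number;
}

function toCsv(items: ExportItem[]): string {
  const header = "label,category,price_low_eur,price_high_eur";
  const rows = items.map((i) =>
    [csvField(i.label), csvField(i.category), i.priceLowEur, i.priceHighEur].join(","),
  );
  return [header, ...rows].join("\n");
}
```

Consistent quoting is what lets the exported file open cleanly in spreadsheet apps regardless of what the detector put in the label.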
The app requires the following permission:
- Camera (`android.permission.CAMERA`) - For object detection and capture
- Mocked pricing data - Prices are generated locally based on category (API integration ready)
- Local-only storage - Items are stored on-device; backend sync is implemented but cloud sync is not yet enabled
- Marketplace disabled - eBay integration implemented but temporarily disabled in UI (export-first flow active)
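The mocked pricing noted above can be pictured as a simple category-to-range lookup. The categories and EUR figures below are invented for illustration and are not the app's real data:

```typescript
// Illustrative category-to-price-range lookup mirroring the "mocked
// pricing" limitation: ranges in EUR are invented example values.
interface PriceRange {
  lowEur: number;
  highEur: number;
}

const MOCK_PRICES: Record<string, PriceRange> = {
  sofa: { lowEur: 80, highEur: 600 },
  laptop: { lowEur: 150, highEur: 1200 },
  chair: { lowEur: 15, highEur: 200 },
};

// Fallback for categories without a configured range.
const DEFAULT_RANGE: PriceRange = { lowEur: 5, highEur: 50 };

function estimatePrice(category: string): PriceRange {
  return MOCK_PRICES[category.toLowerCase()] ?? DEFAULT_RANGE;
}
```

Keeping the lookup behind a single function is what makes the described "API integration ready" swap straightforward: only `estimatePrice` needs to change when a real pricing API arrives.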
The project includes comprehensive test coverage with 171 total tests:
Unit Tests:
- Tracking & Detection (110 tests):
- ObjectTracker (tracking and de-duplication logic)
- ObjectCandidate (spatial matching algorithms)
- TrackingPipelineIntegration (end-to-end scenarios)
- DetectionResult (overlay data model validation)
- ItemsViewModel (state management)
- PricingEngine, ScannedItem, ItemCategory
- Domain Pack System (61 tests):
- DomainPack (data model and helper methods)
- LocalDomainPackRepository (JSON loading, caching, validation)
- CategoryMapper (category conversion and validation)
- BasicCategoryEngine (ML Kit label matching, priority handling)
- DomainPackProvider (singleton initialization, thread safety)
Instrumented Tests:
- DetectionOverlay UI tests (bounding box rendering)
- ModeSwitcher UI tests
- ItemsViewModel integration tests
Test Frameworks: JUnit 4, Robolectric (SDK 28), Truth assertions, MockK, Coroutines Test, Compose Testing, Kotlinx Serialization Test
- Real pricing API integration
- Historical price tracking and analytics
- Multi-currency support
- Share detected items
- Compare prices across retailers
- Color-based object matching for improved tracking
- iOS client using shared KMP modules
- Cloud-based ML models for enhanced classification
- Adaptive tracking thresholds based on scene complexity
- End-to-end telemetry from Android app to Grafana
- Production deployment configuration (Kubernetes/Cloud Run)
- ✅ Hilt Dependency Injection: Full DI setup with @HiltViewModel, @AndroidEntryPoint for testability and modularity
- ✅ Backend Services: Node.js + TypeScript + Prisma + PostgreSQL with LGTM observability stack
- ✅ Developer Options with System Health: Debug-only diagnostics panel showing backend connectivity, network status, permissions, device capabilities, and app configuration with auto-refresh and clipboard export
- ✅ WCAG 2.1 Accessibility: TalkBack support with proper semantics, 48dp touch targets, traversal order, and screen reader announcements
- ✅ Export-First Sharing (CSV/ZIP): Export selected items for spreadsheets, chat apps, or
marketplaces
- Multi-selection UI with long-press and tap gestures
- CSV summaries + ZIP bundles with images
- Share sheet integration for quick handoff
- Marketplace integrations are disabled while export-first UX ships
- ✅ Domain Pack Category System (Track A): Config-driven fine-grained categorization with 23 categories and 10 attributes
- ✅ Cross-Platform Build System: Portable build.sh script and Java 17 toolchain for multi-machine development
- ✅ UI Refinements: Slim vertical slider, clean overlay text, minimal camera interface
- ✅ Visual Detection Overlay: Real-time bounding boxes and labels on camera preview
- ✅ Object Tracking & De-duplication: Multi-frame tracking with ML Kit integration
- ✅ Barcode/QR Code Scanning: Real-time barcode detection
- ✅ Local Persistence & History: Room-backed storage with full item change-log (not yet exposed in UI)
- ✅ Document Text Recognition: OCR for document scanning
- ✅ Comprehensive Test Suite: 175+ tests covering tracking, detection, domain pack, and selling systems
- ✅ SINGLE_IMAGE_MODE Detection: More accurate object detection for both tap and long-press
- ✅ Scanium Branding: Complete visual rebrand with new color scheme and identity
See howto/ for the complete documentation entry point and directory guide.
| Topic | Location |
|---|---|
| Architecture | Camera Pipeline, Item Aggregation |
| Releases | Release Checklist, Release & Rollback |
| Security | Security Guidelines, NAS Security |
| Monitoring | Grafana Access, Stack Changelog |
| KMP Migration | Migration Map |
- AGENTS.md - Guidelines for AI agents working on the project
- GEMINI.md - Scanium context for Gemini AI
Historical documentation from December 2025 is archived in howto/archive/2025-12/.
[Add your license here]
[Add contributing guidelines here]
[Add contact information here]