mehrotra0307/RecipeScanner
RecipeScanner

Photograph any recipe, in any language — translated to English in seconds, entirely on your phone.

RecipeScanner uses Gemma 4 E2B, Google's latest on-device language model, to scan recipe images written in 140+ languages and translate them to structured English. No internet required. No data leaves your device.


Demo

Point the camera at a recipe card, menu, or cookbook in Chinese, Japanese, Arabic, or any other language — the app extracts the text, translates it, and presents the recipe with ingredients and steps clearly laid out.

Screenshots: Home Screen · Camera · Result · Saved Recipes

Features

Phase 1 (Current)

  • Live camera capture — CameraX-powered viewfinder with one-tap capture
  • Gallery import — pick an existing recipe photo from your photo library
  • On-device OCR — ML Kit text recognition handles Latin, Chinese, and mixed scripts
  • Gemma 4 translation — LiteRT-LM runs the full 2B model locally; no API key, no internet
  • Structured output — recipe name, cuisine, ingredients with quantities, step-by-step instructions
  • Scan history — browse all previously scanned recipes with thumbnails
  • Delete / clear — manage your recipe library
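
The structured output above could map to a small Kotlin model like this sketch. Field names mirror the JSON keys described in "How It Works" (recipe_name, cuisine, ingredients, steps, original_language); the app's actual types may differ.

```kotlin
// Hypothetical data model for a scanned recipe; field names follow the
// structured JSON the app's prompt asks Gemma to return.
data class Ingredient(
    val name: String,
    val quantity: String? = null, // kept as free text: units vary by cuisine
)

data class ScannedRecipe(
    val recipeName: String,
    val cuisine: String? = null,
    val ingredients: List<Ingredient> = emptyList(),
    val steps: List<String> = emptyList(),
    val originalLanguage: String? = null,
)
```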

Phase 2 (Planned)

  • Nutrition calculator (on-device, using Gemma 4 function-calling)
  • Store finder for specialty ingredients (Maps deep-link, no internet needed from app)
  • Unit converter (metric ↔ imperial ↔ traditional Asian measurements)
  • Ingredient substitution advisor
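
As a taste of the planned unit converter, a minimal sketch using standard conversion factors. The function names are illustrative, and the gō (a traditional Japanese rice measure, ≈ 180.39 ml) stands in for the "traditional Asian measurements" mentioned above.

```kotlin
// Minimal sketch of the planned metric/imperial/traditional converter.
// Factors: 1 oz = 28.349523125 g, 1 US cup = 236.5882365 ml, 1 gō ≈ 180.39 ml.
fun gramsToOunces(grams: Double): Double = grams / 28.349523125
fun millilitersToUsCups(ml: Double): Double = ml / 236.5882365
fun goToMilliliters(go: Double): Double = go * 180.39 // Japanese rice measure
```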

Tech Stack

Layer          Technology
Language       Kotlin 2.2
UI             Jetpack Compose + Material 3
On-device AI   Gemma 4 E2B via LiteRT-LM
OCR            ML Kit Text Recognition (Latin + Chinese)
Camera         CameraX
Database       Room (SQLite)
DI             Hilt (Dagger 2)
Async          Kotlin Coroutines + Flow
Image loading  Coil
Navigation     Compose Navigation

Requirements

Device

  • Android 8.0+ (API 26+)
  • 8 GB RAM recommended (Gemma 4 E2B loads ~1.5 GB into memory at runtime)
  • ~3 GB free storage (2.6 GB for the model + app data)
  • Tested on: Pixel 7 Pro (Android 14)

Development machine

  • Android Studio Meerkat (2024.3+) or newer
  • JDK 17 (use Android Studio's bundled JDK)
  • Gradle 8.x (managed by wrapper)
  • adb on your PATH (comes with Android Studio's SDK)

Build & Run

1. Clone the repo

git clone https://github.com/yourusername/RecipeScanner.git
cd RecipeScanner

2. Download the Gemma 4 model

Download gemma-4-E2B-it.litertlm from the litert-community/gemma-4-E2B-it-litert-lm Hugging Face repo (requires a free HF account and accepting Google's Gemma terms).

File size: ~2.6 GB

3. Push the model to your device

Connect your Android device via USB with USB debugging enabled, then run:

adb shell mkdir -p /sdcard/Android/data/com.recipescanner.debug/files

adb push gemma-4-E2B-it.litertlm \
  /sdcard/Android/data/com.recipescanner.debug/files/gemma_model.litertlm

This push takes 2–5 minutes depending on your USB speed. It's a one-time setup.

4. Build and install

# Using Android Studio's bundled JDK:
JAVA_HOME="/Applications/Android Studio.app/Contents/jbr/Contents/Home" \
  ./gradlew installDebug

Or just hit the Run button in Android Studio.

5. First launch

On first launch the app copies the model from external storage to internal storage (~40 seconds). After that, the model loads in ~5 seconds on every subsequent launch.
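
The first-launch copy can be sketched in plain Kotlin. The streaming copy below is illustrative (hypothetical function name and a size check to skip repeat copies), not the app's exact code.

```kotlin
import java.io.File

// Illustrative sketch of the first-launch step: stream the model file from
// external storage into app-internal storage, skipping the copy if a file of
// the same size is already there.
fun copyModelIfNeeded(source: File, destination: File): Boolean {
    if (destination.exists() && destination.length() == source.length()) {
        return false // already copied on a previous launch
    }
    source.inputStream().use { input ->
        destination.outputStream().use { output ->
            // a 1 MiB buffer keeps the ~2.6 GB copy reasonably fast
            input.copyTo(output, bufferSize = 1 shl 20)
        }
    }
    return true
}
```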


How It Works

1. User captures or selects a recipe photo
        ↓
2. ML Kit OCR extracts raw text from the image
   (Latin + Chinese recognisers run in parallel; best result is kept)
        ↓
3. OCR text is sent to Gemma 4 (running locally via LiteRT-LM)
   Prompt: "Extract this recipe and translate to English. Return JSON."
        ↓
4. Gemma 4 returns structured JSON:
   { recipe_name, cuisine, ingredients[], steps[], original_language, ... }
        ↓
5. Result is displayed and optionally saved to Room database

Everything runs on-device. The app has no internet-dependent AI calls.
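
The prompt-and-parse hop in steps 3–4 can be sketched in plain Kotlin. The prompt text comes from the diagram above; the fence-stripping helper is an assumption (on-device models often wrap JSON in markdown fences or extra prose), not the app's actual code.

```kotlin
// Sketch of steps 3-4: build the translation prompt, then recover the JSON
// object from the model's reply by taking the outermost { ... } span.
fun buildPrompt(ocrText: String): String =
    "Extract this recipe and translate to English. Return JSON.\n\n$ocrText"

fun extractJson(modelReply: String): String? {
    val start = modelReply.indexOf('{')
    val end = modelReply.lastIndexOf('}')
    return if (start in 0 until end) modelReply.substring(start, end + 1) else null
}
```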


Project Structure

app/src/main/java/com/recipescanner/
├── MainActivity.kt              # Single Activity, Compose host
├── RecipeScannerApp.kt          # @HiltAndroidApp Application class
├── ui/
│   ├── navigation/
│   │   └── RecipeScannerNavGraph.kt   # All 4 routes + back-stack logic
│   ├── screens/
│   │   ├── MainScreen.kt        # Home: model status banner, action buttons, recipe list
│   │   ├── CameraScreen.kt      # CameraX live preview + capture
│   │   ├── ResultScreen.kt      # Scan result + saved recipe detail view
│   │   └── SavedRecipesScreen.kt # Grid view of all saved recipes
│   ├── theme/
│   │   ├── Theme.kt             # Material 3 + dynamic color
│   │   └── Type.kt              # Typography scale
│   └── viewmodel/
│       └── RecipeViewModel.kt   # Single shared ViewModel for all screens
├── data/
│   ├── RecipeEntity.kt          # Room entity (one row per recipe)
│   ├── RecipeDao.kt             # CRUD queries
│   ├── RecipeDatabase.kt        # Room singleton
│   └── RecipeRepository.kt      # Coordinates AI + DB + file storage
├── ml/
│   └── Gemma4Manager.kt         # LiteRT-LM engine + ML Kit OCR pipeline
└── di/
    └── DatabaseModule.kt        # Hilt providers for DB and DAO

Architecture

RecipeScanner follows a standard Android MVVM / Clean Architecture pattern:

UI (Compose screens)
    ↕  UiState sealed class / events
ViewModel (Hilt, viewModelScope)
    ↕  suspend functions / Flow
Repository (single source of truth)
    ↕                    ↕
Room (SQLite)     Gemma4Manager (LiteRT-LM + ML Kit OCR)
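
The "UiState sealed class" arrow in the diagram could look like this sketch; the variant names are illustrative, not the app's actual types.

```kotlin
// Hypothetical UiState hierarchy the ViewModel would expose to Compose.
sealed class ScanUiState {
    object Idle : ScanUiState()
    object ModelLoading : ScanUiState()
    data class Translating(val progressText: String) : ScanUiState()
    data class Success(val recipeName: String) : ScanUiState()
    data class Error(val message: String) : ScanUiState()
}

// Exhaustive `when` over the sealed class: the compiler flags any missed state.
fun statusLine(state: ScanUiState): String = when (state) {
    ScanUiState.Idle -> "Ready to scan"
    ScanUiState.ModelLoading -> "Loading Gemma model…"
    is ScanUiState.Translating -> state.progressText
    is ScanUiState.Success -> "Translated: ${state.recipeName}"
    is ScanUiState.Error -> "Error: ${state.message}"
}
```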

Privacy

  • No analytics, no crash reporting, no telemetry
  • Recipe photos are stored only in app-private storage on your device
  • The Gemma 4 model runs entirely on-device — no inference calls leave the phone
  • Internet permission is declared for potential future model download support; no network calls are made in Phase 1

License

Apache 2.0 — see LICENSE.
