An Android document scanner app that uses AI to extract and digitize text from images.
- Camera Capture - Take photos of documents directly
- Gallery Import - Select existing images for processing
- AI-Powered OCR - Uses Google Gemini AI for accurate text extraction
- Markdown Export - Optionally save extracted text as Markdown files
- Batch Processing - Process multiple documents at once
- Smart Filenames - AI suggests appropriate filenames based on content
- Model Selection - Choose between different Gemini AI models
| Home Screen | Processing | Results |
|---|---|---|
| Camera/Gallery options | AI extraction in progress | View and save extracted text |
- Tap Camera to capture a document or Gallery to select images
- AI processes the image and extracts text
- Review the extracted text and its Markdown conversion
- Select a target directory and customize the filename
- Toggle "Save Markdown" if you want `.md` files alongside images
- Tap Save Document to store the digitized content
This app requires a Google Gemini API key.
- Get an API key from Google AI Studio
- Create or edit `local.properties` in the project root: `apiKey=YOUR_GEMINI_API_KEY_HERE`
- Build and run the app
The secrets-gradle-plugin loads the API key from `local.properties` at build time. Keep `local.properties` out of version control so the key is never committed.
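Appending the key with `echo ... >>` adds another `apiKey=` line on every run, which can leave a stale key in the file. A minimal sketch of an idempotent alternative follows; `set_api_key` is a hypothetical helper name and is not part of the project.

```shell
# Hypothetical helper: set or replace the apiKey entry in a properties file
# without creating duplicate lines.
set_api_key() {
  file="$1"
  key="$2"
  touch "$file"
  # Keep every line except an existing apiKey entry, then append the new one.
  # grep exits non-zero when nothing is left to print, hence the `|| true`.
  grep -v '^apiKey=' "$file" > "$file.tmp" || true
  printf 'apiKey=%s\n' "$key" >> "$file.tmp"
  mv "$file.tmp" "$file"
}

# Usage, from the project root:
#   set_api_key local.properties "YOUR_GEMINI_API_KEY"
```

To confirm the file is ignored by Git, `git check-ignore -q local.properties` exits with status 0 when an ignore rule matches it.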
- `CAMERA` - Required to capture document photos
- `READ_EXTERNAL_STORAGE` - Required to access gallery images
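During development it can be convenient to grant these permissions over `adb` instead of tapping through the dialogs. The sketch below assumes the application ID is `com.example.digitizer`, inferred from the source path; the real ID in the Gradle build file may differ. `grant_permissions` is a hypothetical helper name.

```shell
# Sketch: grant the app's runtime permissions over adb during development.
# ADB can be overridden (e.g. ADB=echo) to preview the commands without a device.
ADB="${ADB:-adb}"

grant_permissions() {
  pkg="$1"   # application ID; com.example.digitizer is an assumption
  for perm in CAMERA READ_EXTERNAL_STORAGE; do
    "$ADB" shell pm grant "$pkg" "android.permission.$perm"
  done
}

# Usage (needs a connected device or emulator with the app installed):
#   grant_permissions com.example.digitizer
```

`pm grant` only succeeds for permissions the app declares in its manifest, so a failure here usually points to a wrong package name or a missing declaration.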
- Kotlin - 100% Kotlin codebase
- Jetpack Compose - Modern declarative UI
- Google Generative AI SDK - Gemini AI integration
- Material 3 - Latest Material Design components
- Secrets Gradle Plugin - Secure API key management
- Android 9.0 (API 28) or higher
- Google Gemini API key
- Download the latest APK from Releases
- Install and grant camera/storage permissions
- Note: You'll need to build from source with your own API key for full functionality
git clone https://github.com/sunil-dhaka/Digitizer.git
cd Digitizer
# Add your API key to local.properties
echo "apiKey=YOUR_GEMINI_API_KEY" >> local.properties
./gradlew assembleDebug

app/src/main/java/com/example/digitizer/
MainActivity.kt # Entry point
DocumentScannerScreen.kt # Main UI composable
BakingViewModel.kt # AI processing logic
UiState.kt # UI state definitions
FilePicker.kt # File/directory selection
PermissionHandler.kt # Runtime permissions
ui/theme/ # Material 3 theming
The app supports multiple Gemini models:
- Gemini 1.5 Flash - Fast processing (recommended)
- Gemini 1.5 Pro - Higher accuracy for complex documents
- Gemini 2.0 Flash - Latest model with improved capabilities
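To compare models on a sample image outside the app, the Gemini REST API can be called directly. This is only a sketch: the app itself goes through the Google Generative AI SDK, and the model IDs `gemini-1.5-flash`, `gemini-1.5-pro`, and `gemini-2.0-flash` are assumed to correspond to the names above. Model availability changes over time. The function names, the prompt text, and the `GEMINI_API_KEY` variable are illustrative.

```shell
# Build the generateContent endpoint for a given Gemini model ID.
gemini_endpoint() {
  printf 'https://generativelanguage.googleapis.com/v1beta/models/%s:generateContent\n' "$1"
}

# Send one JPEG plus an OCR prompt to the chosen model.
# Needs network access and a valid key in GEMINI_API_KEY.
extract_text() {
  model="$1"
  image="$2"
  # base64 wraps lines on some platforms, so strip newlines before embedding.
  data="$(base64 < "$image" | tr -d '\n')"
  # Stream the JSON body over stdin; a photo is too large for a command-line argument.
  printf '{"contents":[{"parts":[{"text":"Extract all text from this document."},{"inline_data":{"mime_type":"image/jpeg","data":"%s"}}]}]}' "$data" |
    curl -s "$(gemini_endpoint "$model")" \
      -H "x-goog-api-key: $GEMINI_API_KEY" \
      -H 'Content-Type: application/json' \
      --data-binary @-
}

# Usage:
#   GEMINI_API_KEY=... extract_text gemini-1.5-flash scan.jpg
```

Running the same image through each model ID gives a quick sense of the speed and accuracy trade-off before picking a model in the app.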
MIT License - feel free to use, modify, and distribute.
Built with Jetpack Compose by sunil-dhaka