Android AI Transcription App

GitHub Repository • Copilot Instructions

Overview

This app transcribes media files (audio or video) using AI. You can select, share, or open a media file; the app re-encodes it and sends it to the transcription model you selected. It supports the traditional Whisper-1 model as well as the newer GPT-4o Transcribe and GPT-4o-mini Transcribe models. Optional AI-powered post-processing improves transcript readability. All transcriptions and settings are stored locally, and your API key is kept in encrypted storage.

Key Features

Media Handling:
- Select, share, or open media files directly from your device.
- Direct transcription of voice messages from messaging apps like WhatsApp, Telegram, or Signal.
- Automatic re-encoding using FFmpegKit to M4A/AAC format for all transcription models.
- File size check (max. 24MB after processing) to ensure smooth operation.
Multiple Transcription Models:
- Whisper-1: Traditional audio transcription model.
- GPT-4o Transcribe: Specifically optimized for transcription tasks.
- GPT-4o-mini Transcribe: Efficient transcription model for cost-effective processing.
AI-Powered Cleanup:
- Enhance transcript readability with AI-driven cleanup using GPT-4o chat completions.
- Customizable cleanup prompts ensure the original content is preserved while improving clarity.
- Optional auto-format setting to automatically enhance readability after every transcription.
Local History & Settings:
- Maintain a local history of transcriptions with comprehensive statistics:
  - Transcript length (character count)
  - File sizes (original and uploaded)
  - Audio duration (when available)
  - Processing settings used (model, language, prompt)
- History view displays only relevant information for each entry, with expandable details.
- Secure API key storage using EncryptedSharedPreferences.
- In-app Settings allow you to:
  - Save and test your OpenAI API key.
  - Configure transcription models and language preferences.
  - Customize prompts for both transcription and AI cleanup.
  - Enable auto-format to automatically enhance transcriptions after processing.
User Experience Enhancements:
- Support for shared intents (from other apps) and direct file access.
- Retry functionality for reprocessing files with updated parameters.
- Clear, responsive UI built with Jetpack Compose and modern Android architecture practices.
Use Cases:
- Convert voice messages to text for easy reading and sharing.
- Archive and search through voice message content.
- Make voice messages accessible to users who are deaf or hard of hearing.
- Quickly transcribe meeting recordings and lectures.
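
The model choice and the file size check described under Media Handling and Multiple Transcription Models could be modeled roughly as follows. This is a minimal sketch, not the app's actual code: `TranscriptionModel`, `MAX_UPLOAD_BYTES`, and `isUploadAllowed` are hypothetical names, and reading "24MB" as 24 * 1024 * 1024 bytes is an assumption.

```kotlin
// Model identifiers as accepted by OpenAI's audio transcription endpoint.
enum class TranscriptionModel(val apiId: String) {
    WHISPER_1("whisper-1"),
    GPT_4O_TRANSCRIBE("gpt-4o-transcribe"),
    GPT_4O_MINI_TRANSCRIBE("gpt-4o-mini-transcribe")
}

// Assumption: "24MB" means 24 * 1024 * 1024 bytes; the app may define it differently.
val MAX_UPLOAD_BYTES: Long = 24L * 1024 * 1024

// Hypothetical helper: true when the re-encoded file is non-empty and within the limit.
fun isUploadAllowed(processedSizeBytes: Long): Boolean =
    processedSizeBytes in 1L..MAX_UPLOAD_BYTES
```

Checking the size after re-encoding, rather than before, matches the feature list: the limit applies to the processed M4A/AAC file, not to the original media.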

How It Works

Initialization & Setup:
- Enter and test your OpenAI API key in the Settings screen.
- Choose your transcription model and set any custom prompts or language preferences.
Media Processing:
- Select a media file, or share one to the app.
- The file is copied, re-encoded to AAC format, and its size is validated.
Transcription & Cleanup:
- The processed file is uploaded to the selected transcription API.
- Once transcribed, the text is optionally enhanced with an AI cleanup process.
- The final transcript is displayed and stored locally with comprehensive statistics:
  - File sizes (original and processed)
  - Transcript length
  - All processing parameters used
History & Management:
- View past transcriptions with detailed statistics (file sizes, transcript length, duration).
- Each entry shows only relevant processing settings that were actually used.
- Copy transcriptions with or without detailed statistics.
- Delete individual entries or clear all history as needed.
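
One way to picture a history entry that shows only the settings actually used, and that can be copied with or without statistics, is sketched below. The `HistoryEntry` class, its fields, and `toClipboardText` are illustrative assumptions, not the app's real data model.

```kotlin
// Hypothetical history record; optional fields are null when they were not used or not available.
data class HistoryEntry(
    val transcript: String,
    val model: String,
    val originalSizeBytes: Long,
    val uploadedSizeBytes: Long,
    val durationSeconds: Int? = null, // null when the duration could not be determined
    val language: String? = null,     // null when no language preference was set
    val prompt: String? = null        // null when no custom prompt was used
)

// Returns the text to copy: the transcript alone, or the transcript followed by statistics.
fun HistoryEntry.toClipboardText(includeStats: Boolean): String {
    if (!includeStats) return transcript
    val stats = buildList<String> {
        add("Model: $model")
        add("Length: ${transcript.length} characters")
        add("Original size: $originalSizeBytes bytes")
        add("Uploaded size: $uploadedSizeBytes bytes")
        // Only settings that were actually used are listed.
        durationSeconds?.let { add("Duration: ${it}s") }
        language?.takeIf { it.isNotBlank() }?.let { add("Language: $it") }
        prompt?.takeIf { it.isNotBlank() }?.let { add("Prompt: $it") }
    }
    return transcript + "\n\n" + stats.joinToString("\n")
}
```

Nullable fields keep unused settings out of the output without needing separate "was set" flags.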

Flowchart

graph TD
    A[Start] --> B[Enter & Test API Key in Settings]
    B --> C[Select Transcription Model & Set Prompts]
    C --> D[Choose Media File / Share to App]
    D --> E["Copy & Re-encode File (M4A/AAC)"]
    E --> F["File Size Check (<= 24MB)"]
    F --> G[Upload to Selected Transcription API]
    G --> H{Response Successful?}
    H -- Yes --> I[Display & Store Transcript]
    I --> J[Optional: AI Cleanup for Readability]
    J --> K[Update History]
    H -- No --> L[Show Error Message]
    L --> M
    K --> M[User Can Retry or View History]
    M --> N[End]
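
The branching in the flowchart can also be expressed as a small function. The sketch below is a simplification under stated assumptions: `TranscriptionOutcome` and `runTranscriptionFlow` are hypothetical names, the network call is passed in as a lambda rather than implemented, a failed size check is treated as an error (the chart does not show that branch), and cleanup is applied before the result is returned rather than as a separate step after display.

```kotlin
// Hypothetical result type for one transcription attempt.
sealed interface TranscriptionOutcome {
    data class Success(val transcript: String) : TranscriptionOutcome
    data class Failure(val message: String) : TranscriptionOutcome
}

// Size check -> transcribe -> optional cleanup, mirroring the chart's main path.
fun runTranscriptionFlow(
    processedSizeBytes: Long,
    maxBytes: Long = 24L * 1024 * 1024, // assumption: 24MB read as 24 * 1024 * 1024 bytes
    transcribe: () -> Result<String>,   // stands in for the upload to the selected API
    cleanup: ((String) -> String)? = null // null when AI cleanup is disabled
): TranscriptionOutcome {
    if (processedSizeBytes > maxBytes) {
        return TranscriptionOutcome.Failure("File too large after re-encoding")
    }
    return transcribe().fold(
        onSuccess = { raw -> TranscriptionOutcome.Success(cleanup?.invoke(raw) ?: raw) },
        onFailure = { err -> TranscriptionOutcome.Failure(err.message ?: "Transcription failed") }
    )
}
```

Passing the transcription and cleanup steps as lambdas keeps the sketch free of networking code and makes each branch easy to exercise on its own.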

Privacy & Legal Notice

Third-Party Processing:
Your media files are sent to OpenAI’s servers. Ensure you have permission to share and transcribe them.
Sensitive Data:
Do not transcribe content with sensitive personal or confidential information.
User Responsibility:
Use your own OpenAI API key. You are responsible for any costs incurred, and you must comply with OpenAI’s Terms of Service.

Testing

Developers and testers who want to build the app with a pre-configured OpenAI API key should see the Testing Guide (TESTING.md). The approach described there lets you create builds with an embedded API key without committing secrets to the repository.

Next Steps

Future enhancements include:

- Refactoring UI state management into ViewModels.
- Consolidating duplicated code and further adopting Kotlin coroutines with Retrofit’s suspend functions.
- Enhancing the UI and user interaction flow.

