Whisper Android — Prebuilt on-device speech-to-text AAR

Run whisper.cpp on Android with one Gradle line. No NDK, no source build.

whisper-android is a prebuilt AAR that bundles whisper.cpp — a fast, on-device speech-to-text engine — behind a clean Kotlin API. Everything runs locally on the device: no network, no cloud, no API keys. 99 languages, optional translation to English.

You bring an audio file and a model file → you get text with timestamps.

Install

Pick one of the three methods.

A) Maven Central (recommended)

// build.gradle.kts (module)
dependencies {
    implementation("dev.ffmpegkit-maintained:whisper-android:0.1.2")
}

B) JitPack

// settings.gradle.kts
dependencyResolutionManagement {
    repositories {
        google()
        mavenCentral()
        maven { url = uri("https://jitpack.io") } // add this
    }
}

// build.gradle.kts (module)
dependencies {
    implementation("com.github.ffmpegkit-maintained.whisper:whisper-android:v0.1.2")
}

C) Direct AAR download

Grab whisper-android-<version>.aar from the Releases page, drop it in app/libs/, and add implementation(files("libs/whisper-android-0.1.2.aar")).
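For option C, the dependency line goes in the module build file, just like the other two methods. A minimal sketch, assuming the AAR was copied to app/libs/ under the file name shown above:

```kotlin
// build.gradle.kts (module)
dependencies {
    // Local file dependency: the path is resolved relative to the module directory.
    implementation(files("libs/whisper-android-0.1.2.aar"))
}
```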

Quick Start

A complete, copy-paste example — even if you have never touched whisper.cpp or the NDK.

1. Add the dependency (see Install above).

2. Download a model (see Model Download) and ship it, or push it during dev:

adb push ggml-base.en.bin /sdcard/Android/data/<your.app.id>/files/models/

3. Transcribe:

import android.os.Bundle
import android.util.Log
import androidx.appcompat.app.AppCompatActivity
import androidx.lifecycle.lifecycleScope
import dev.ffmpegkit.whisper.Whisper
import dev.ffmpegkit.whisper.WhisperConfig
import kotlinx.coroutines.launch
import java.io.File

class MainActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)

        lifecycleScope.launch {
            // Audio can be WAV/MP3/FLAC at any sample rate (decoded + resampled internally).
            val modelPath = File(getExternalFilesDir("models"), "ggml-base.en.bin").absolutePath
            val audioPath = File(getExternalFilesDir(null), "speech.wav").absolutePath

            val model  = Whisper.loadModel(this@MainActivity, modelPath)
            val result = Whisper.transcribe(model, audioPath, WhisperConfig(language = "en"))

            Log.i("Whisper", "Text: ${result.text}")
            result.segments.forEach { s ->
                Log.i("Whisper", "[${s.startMs}–${s.endMs} ms] ${s.text}")
            }

            Whisper.releaseModel(model)
        }
    }
}

That's it. Whisper.loadModel and Whisper.transcribe are suspend functions, so call them from a coroutine.

Audio input: WAV, MP3 or FLAC at any sample rate — the library decodes and resamples to 16 kHz mono automatically (via whisper.cpp's built-in miniaudio decoder, no FFmpeg). No manual conversion needed.
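Because each segment carries a start time, an end time and text, the result maps naturally onto subtitle formats. The sketch below turns segments into SRT text. The `Segment` data class is a hypothetical stand-in for the library's segment type: only the `startMs`, `endMs` and `text` names come from the Quick Start, and the `Long` type is an assumption.

```kotlin
import java.util.Locale

// Hypothetical stand-in for the library's segment type (field names taken from Quick Start).
data class Segment(val startMs: Long, val endMs: Long, val text: String)

// Formats a millisecond offset as an SRT timestamp: HH:MM:SS,mmm
fun srtTime(ms: Long): String {
    val hours = ms / 3_600_000
    val minutes = (ms / 60_000) % 60
    val seconds = (ms / 1_000) % 60
    val millis = ms % 1_000
    // Locale.ROOT keeps the digits ASCII regardless of the device locale.
    return "%02d:%02d:%02d,%03d".format(Locale.ROOT, hours, minutes, seconds, millis)
}

// Builds SRT cues: 1-based index, time range, trimmed text, then a blank separator line.
fun toSrt(segments: List<Segment>): String =
    segments.withIndex().joinToString("\n") { (i, seg) ->
        "${i + 1}\n${srtTime(seg.startMs)} --> ${srtTime(seg.endMs)}\n${seg.text.trim()}\n"
    }
```

In an app you would map `result.segments` to this shape (or adapt the function to the library's own type) and write the string to a `.srt` file.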

Model Download

The model is not bundled in the AAR (models are 75 MB – 1.5 GB — far too big). Download the one that fits your speed/quality/size budget from Hugging Face:

Model	Size	Speed	Quality	Languages	Download
`tiny.en`	~75 MB	⚡⚡⚡ fastest	★★	English only	ggml-tiny.en.bin
`base`	~142 MB	⚡⚡ fast	★★★	99 languages	ggml-base.bin
`base.en`	~142 MB	⚡⚡ fast	★★★	English only	ggml-base.en.bin
`small`	~466 MB	⚡ slower	★★★★	99 languages	ggml-small.bin

Which one? Start with base (or base.en for English-only) — the best speed/quality trade-off on a phone. Use tiny.en if you need real-time-ish speed on low-end devices, or small when accuracy matters more than latency.
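The recommendation above can be written down as a small lookup. This is only an illustrative sketch: the `Priority` enum and `pickModel` function are hypothetical names, not part of the library, and the file names are the ones listed in the table.

```kotlin
// Hypothetical helper, not part of whisper-android.
enum class Priority { SPEED, BALANCED, ACCURACY }

// Returns a model file name from the table above.
fun pickModel(englishOnly: Boolean, priority: Priority): String = when (priority) {
    // The table lists tiny only in its English variant, so multilingual falls back to base.
    Priority.SPEED -> if (englishOnly) "ggml-tiny.en.bin" else "ggml-base.bin"
    Priority.BALANCED -> if (englishOnly) "ggml-base.en.bin" else "ggml-base.bin"
    // The table lists small only as a multilingual model.
    Priority.ACCURACY -> "ggml-small.bin"
}
```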

Ship the model with your app (assets or a first-run download), then load it with Whisper.loadModel(context, path) or Whisper.loadModelFromAsset(context, "models/ggml-base.bin").
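For a first-run download, it helps to make sure a half-written file never ends up at the path you later pass to Whisper.loadModel. One possible approach, sketched here with plain java.io (the `saveModel` name is hypothetical and the stream source is up to you), is to write to a temporary `.part` file and rename it only after the copy finishes:

```kotlin
import java.io.File
import java.io.IOException
import java.io.InputStream

// Hypothetical helper: streams `source` to `dest` via a temporary ".part" file,
// so `dest` only appears once the whole model has been written.
fun saveModel(source: InputStream, dest: File) {
    dest.parentFile?.mkdirs()
    val tmp = File(dest.parentFile, dest.name + ".part")
    try {
        source.use { input ->
            tmp.outputStream().use { output -> input.copyTo(output) }
        }
    } catch (e: IOException) {
        tmp.delete() // do not leave a partial file behind
        throw e
    }
    if (dest.exists()) dest.delete()
    if (!tmp.renameTo(dest)) {
        tmp.delete()
        throw IOException("Could not move ${tmp.path} to ${dest.path}")
    }
}
```

Call it from a background dispatcher, then pass `dest.absolutePath` to Whisper.loadModel. Models bundled as assets do not need this step, since Whisper.loadModelFromAsset reads them directly.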

Compatibility


ABI	`arm64-v8a` (covers >90% of modern Android devices)
Android	API 24+ (Android 7.0 and up)
Android 15	✅ 16 KB page size aligned
NEON	✅ enabled
compileSdk / targetSdk	35

Need x86_64 (emulators, Chromebooks), real-time streaming, VAD, or quantized models? Those are in the Pro build — see jokobee.com.
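Since this build ships only arm64-v8a and requires API 24+, an app that also runs on other devices may want to check before offering transcription. A minimal sketch follows; `isWhisperSupported` is a hypothetical helper, written as a pure function so it can be unit-tested off-device.

```kotlin
// Hypothetical helper, not part of whisper-android.
// Mirrors the Compatibility table: arm64-v8a ABI and API level 24 or higher.
fun isWhisperSupported(supportedAbis: Array<String>, sdkInt: Int): Boolean =
    "arm64-v8a" in supportedAbis && sdkInt >= 24

// On a device, pass the platform values:
// isWhisperSupported(android.os.Build.SUPPORTED_ABIS, android.os.Build.VERSION.SDK_INT)
```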

Documentation

Full guides on the Wiki: Installation · Quick Start · Model Download · FAQ · Troubleshooting.

API at a glance

object Whisper {
    suspend fun loadModel(context: Context, modelPath: String): WhisperModel
    suspend fun loadModelFromAsset(context: Context, assetName: String): WhisperModel
    suspend fun transcribe(model: WhisperModel, audioPath: String, config: WhisperConfig = WhisperConfig()): WhisperResult
    fun releaseModel(model: WhisperModel)
    fun getSystemInfo(): String
}
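Since loadModel and releaseModel come as a pair, it can be convenient to wrap them so the model is released even when transcription throws. The helper below is a sketch, not part of the library: `withModel` is a hypothetical name, and it is generic so that it compiles and can be tested without the AAR on the classpath.

```kotlin
// Hypothetical helper: loads a model, runs `block`, and always releases the model.
// It is inline, so the lambdas may call suspend functions when used inside a coroutine.
inline fun <M, R> withModel(load: () -> M, release: (M) -> Unit, block: (M) -> R): R {
    val model = load()
    try {
        return block(model)
    } finally {
        release(model) // runs on success and on failure
    }
}

// Usage inside a coroutine, with whisper-android on the classpath:
// val result = withModel(
//     load = { Whisper.loadModel(context, modelPath) },
//     release = { Whisper.releaseModel(it) },
// ) { model -> Whisper.transcribe(model, audioPath) }
```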

Publisher

Jokobee · https://www.jokobee.com · contact@jokobee.com

Maintained under the ffmpegkit-maintained organisation.
