
feat(android/voice): on-device STT and LLM via SpeechRecognizer + Gemini Nano#27

Merged
tstapler merged 10 commits into main from stelekit-voice on Apr 26, 2026

Conversation

@tstapler
Owner

Summary

  • Fix microphone permission: Android was silently denying mic access with no prompt. Now wires ActivityResultContracts.RequestPermission through MainActivity using the same CompletableDeferred pattern as the folder picker — first tap shows the system dialog.
  • On-device STT (AndroidSpeechRecognizerProvider): wraps android.speech.SpeechRecognizer as a DirectSpeechProvider that handles both recording and transcription in one step. Uses EXTRA_PREFER_OFFLINE=true (Pixel on-device model, no network), 3s silence timeout, and feeds RMS to the existing waveform animation. No Whisper API key needed.
  • On-device LLM (MlKitLlmFormatterProvider): formats transcripts using Gemini Nano via ML Kit Prompt API (genai-prompt:1.0.0-beta2). Handles all four FeatureStatus states; returns a friendly "downloading, try again" message instead of blocking during model download. Requires Pixel 9+ or AICore-enabled OEM flagships.
  • Settings UI: toggles for both on-device options appear only on eligible devices. When device LLM is enabled, API key fields are hidden and a 256-token output cap note is shown.
  • Pipeline priority: device LLM → Anthropic Claude → OpenAI → no-op (raw transcript).
  • minSdk bumped 24→26 (required by genai-prompt).
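The fallback order in the pipeline-priority bullet can be sketched as a null-coalescing chain. This is an illustrative sketch only, not the actual VoicePipelineFactory code; every name here besides the priority order is an assumption:

```kotlin
// Illustrative sketch of the formatter fallback described above.
// Names are hypothetical; only the priority order comes from the PR.
fun interface TranscriptFormatter {
    fun format(transcript: String): String
}

// No-op fallback: returns the raw transcript unchanged.
val noOp = TranscriptFormatter { it }

fun selectFormatter(
    deviceLlm: TranscriptFormatter?, // Gemini Nano via ML Kit, if eligible and enabled
    claude: TranscriptFormatter?,    // Anthropic, if an API key is configured
    openAi: TranscriptFormatter?,    // OpenAI, if an API key is configured
): TranscriptFormatter = deviceLlm ?: claude ?: openAi ?: noOp

fun main() {
    // With nothing configured, the raw transcript passes through untouched.
    println(selectFormatter(null, null, null).format("buy milk")) // prints "buy milk"
}
```

The first non-null provider wins, so enabling the device LLM toggle shadows any configured cloud keys without removing them.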

Architecture

New DirectSpeechProvider interface in commonMain supports pipelines where a single component handles both recording and transcription (bypassing the two-step record→STT path). VoiceCaptureViewModel branches on pipeline.directSpeechProvider before falling back to the existing AudioRecorder + SpeechToTextProvider path.
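The branch described above can be sketched minimally. The real interface is suspend-based; this synchronous version is for illustration only, and all names besides DirectSpeechProvider are assumptions:

```kotlin
// Sketch of the capture branch described above; not the actual ViewModel code.
interface DirectSpeechProvider {
    // Records AND transcribes in one step (e.g. android.speech.SpeechRecognizer).
    fun listen(): String?
}

class Pipeline(val directSpeechProvider: DirectSpeechProvider? = null)

// Use the integrated provider when present; otherwise fall back to the
// two-step record -> STT path.
fun capture(pipeline: Pipeline, recordAndTranscribe: () -> String?): String? {
    val direct = pipeline.directSpeechProvider
    return if (direct != null) direct.listen() else recordAndTranscribe()
}

fun main() {
    val direct = object : DirectSpeechProvider {
        override fun listen() = "on-device transcript"
    }
    println(capture(Pipeline(direct)) { "two-step transcript" }) // integrated path
    println(capture(Pipeline()) { "two-step transcript" })       // fallback path
}
```

Note the explicit null check: an integrated provider that returns an empty result should not silently fall through to the recorder path.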

Test plan

  • First launch with no mic permission: tap mic button → system permission dialog appears → grant → recording starts
  • Settings → Voice Capture → "Use on-device speech recognition" toggle visible on Pixel (hidden on unsupported devices)
  • With device STT enabled: tap mic, speak, stop → transcript inserted without Whisper key
  • Settings → Voice Capture → "Use on-device LLM (Gemini Nano)" toggle visible on Pixel 9+ (hidden otherwise)
  • With device LLM enabled: voice note is formatted into bullet points without Claude/OpenAI key
  • With device LLM downloading: friendly error returned, not a crash or hang
  • Desktop/JVM: no regressions — NoOpAudioRecorder and no device toggles shown
  • Existing Whisper + Claude cloud path still works when device toggles are off

🤖 Generated with Claude Code

…ini Nano

- Fix Android microphone runtime permission — was silently denied with no prompt;
  now uses ActivityResultContracts.RequestPermission wired through MainActivity
- Add AndroidSpeechRecognizerProvider: wraps android.speech.SpeechRecognizer as a
  DirectSpeechProvider (combined record+transcribe, no audio upload, works offline)
  with EXTRA_PREFER_OFFLINE=true and RMS amplitude for waveform animation
- Add MlKitLlmFormatterProvider: on-device LLM formatting via ML Kit Prompt API
  (Gemini Nano through AICore); handles AVAILABLE/DOWNLOADABLE/DOWNLOADING states;
  returns user-friendly error while model downloads rather than blocking
- Add DirectSpeechProvider interface to commonMain to support integrated pipelines
  that bypass the two-step record→STT path
- Both on-device options are configurable via Settings → Voice Capture toggles;
  toggles only appear on devices that report availability
- Pipeline priority: device LLM > Anthropic Claude > OpenAI > no-op
- Bump minSdk 24→26 (required by com.google.mlkit:genai-prompt:1.0.0-beta2)
- Add genai-prompt dependency to kmp androidMain

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 24, 2026 15:48
@github-actions
Contributor

github-actions Bot commented Apr 24, 2026

Benchmark Results

Comparing ab57368 (this PR) vs 2a85c6d (baseline)
Graph config: small — ? pages

Metric                  This PR   Baseline   Delta
Phase 1 TTI             10ms      11ms       -1ms
Phase 2                 3ms       3ms        0ms
Phase 3 index           9ms       9ms        0ms
Total                   20ms      22ms       -2ms
Write p95 (baseline)    26ms      26ms       -
Write p95 (under load)  -1ms      -1ms       -
Jank factor             -0.04x    -0.04x     -

Flamegraph not available

Top allocation hotspots (this PR)

59.1% byte[]
12.1% java.lang.Object[]
5.6% java.lang.String
3.0% jdk.internal.org.objectweb.asm.SymbolTable$Entry[]
2.5% jdk.internal.org.objectweb.asm.SymbolTable$Entry


Copilot AI left a comment


Pull request overview

Adds an Android on-device voice pipeline by introducing a “direct” speech path (record+transcribe in one component) and an optional on-device LLM formatter, with settings-driven routing and Android-specific permission handling.

Changes:

  • Introduces DirectSpeechProvider + pipeline/view-model branching to support integrated listen→transcript providers (Android SpeechRecognizer).
  • Adds on-device LLM formatting provider (ML Kit Prompt API / Gemini Nano) and settings UI toggles gated by device capability.
  • Wires Android mic permission prompting via ActivityResultContracts.RequestPermission, and updates build config for the new ML Kit dependency / minSdk bump.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.

Summary per file:

kmp/src/commonMain/kotlin/dev/stapler/stelekit/voice/VoiceSettings.kt
    Adds persisted toggles for on-device STT/LLM.
kmp/src/commonMain/kotlin/dev/stapler/stelekit/voice/VoicePipelineFactory.kt
    Extends pipeline factory to accept direct speech + device LLM providers and prioritize them.
kmp/src/commonMain/kotlin/dev/stapler/stelekit/voice/VoicePipelineConfig.kt
    Adds directSpeechProvider and effectiveAmplitudeFlow for waveform routing.
kmp/src/commonMain/kotlin/dev/stapler/stelekit/voice/VoiceCaptureViewModel.kt
    Branches capture flow to use direct speech provider when present.
kmp/src/commonMain/kotlin/dev/stapler/stelekit/voice/DirectSpeechProvider.kt
    New interface for integrated listen/transcribe providers (+ optional amplitude).
kmp/src/commonMain/kotlin/dev/stapler/stelekit/ui/components/settings/VoiceCaptureSettings.kt
    Adds on-device STT/LLM toggles and hides cloud key inputs when enabled.
kmp/src/commonMain/kotlin/dev/stapler/stelekit/ui/components/settings/SettingsDialog.kt
    Plumbs device capability flags into voice settings UI.
kmp/src/commonMain/kotlin/dev/stapler/stelekit/ui/App.kt
    Plumbs device capability flags through app UI and switches waveform to effectiveAmplitudeFlow.
kmp/src/androidMain/kotlin/dev/stapler/stelekit/voice/MlKitLlmFormatterProvider.kt
    New ML Kit Prompt API formatter provider with eligibility/status handling.
kmp/src/androidMain/kotlin/dev/stapler/stelekit/voice/AndroidSpeechRecognizerProvider.kt
    New SpeechRecognizer-backed DirectSpeechProvider with RMS amplitude mapping.
kmp/src/androidMain/kotlin/dev/stapler/stelekit/voice/AndroidAudioRecorder.kt
    Adds suspendable mic permission request hook before recording starts.
kmp/build.gradle.kts
    Adds ML Kit genai-prompt dependency to androidMain.
androidApp/src/main/kotlin/dev/stapler/stelekit/MainActivity.kt
    Implements mic permission flow, device capability detection, and new pipeline construction.
androidApp/build.gradle.kts
    Bumps app minSdk to 26.


```kotlin
    if (deviceSttAvailable && voiceSettings.getUseDeviceStt()) deviceSttProvider else null,
    if (deviceLlmAvailable && voiceSettings.getUseDeviceLlm()) mlKitProvider else null,
)
var voicePipeline by remember { mutableStateOf(buildPipeline()) }
```

Copilot AI Apr 24, 2026


deviceLlmAvailable is resolved asynchronously in LaunchedEffect, but voicePipeline is only built once and then rebuilt only via onRebuildVoicePipeline. If a user already has “Use on-device LLM” enabled, they’ll still start with the cloud/no-op pipeline until they manually open Settings and hit Save. Consider rebuilding the pipeline automatically when deviceLlmAvailable flips to true (or pass mlKitProvider regardless of deviceLlmAvailable and let the provider handle unsupported/downloading states).

Suggested change:

```kotlin
var voicePipeline by remember { mutableStateOf(buildPipeline()) }
LaunchedEffect(deviceLlmAvailable) {
    voicePipeline = buildPipeline()
}
```

Owner Author


Fixed in commit edbf46c: added LaunchedEffect(deviceLlmAvailable) in MainActivity that calls onRebuildVoicePipeline() whenever the flag flips, so the pipeline is rebuilt automatically without requiring a manual settings round-trip.

Comment on lines 58 to 63
```kotlin
if (requestMicPermission != null && !requestMicPermission()) {
    return@withContext PlatformAudioFile("")
}

stopRequested = false
pauseRequested = false
```

Copilot AI Apr 24, 2026


The mic permission await happens before stopRequested is reset to false. If the user taps the mic, the permission dialog shows, and they tap again to stop/cancel while the dialog is up, stopRecording() can set stopRequested=true but it will be overwritten after permission is granted, causing an unexpected recording session to start. Consider checking stopRequested/cancellation before resetting flags, or moving the flag reset earlier and short-circuiting after permission if a stop was requested.

Suggested change:

```kotlin
stopRequested = false
pauseRequested = false
if (requestMicPermission != null && !requestMicPermission()) {
    return@withContext PlatformAudioFile("")
}
if (stopRequested) {
    return@withContext PlatformAudioFile("")
}
```

Owner Author


Fixed in commit edbf46c: flag resets (isRecording = false, stopRequested = false) moved to before the permission await, and a stopRequested guard added immediately after permission is granted, so cancel-during-permission-dialog races are handled correctly.

Comment on lines +38 to +95
```kotlin
mainHandler.post {
    val recognizer = SpeechRecognizer.createSpeechRecognizer(context)
    activeRecognizer = recognizer

    recognizer.setRecognitionListener(object : RecognitionListener {
        override fun onReadyForSpeech(params: Bundle?) {}
        override fun onBeginningOfSpeech() {}
        override fun onBufferReceived(buffer: ByteArray?) {}
        override fun onEndOfSpeech() {}
        override fun onEvent(eventType: Int, params: Bundle?) {}
        override fun onPartialResults(partialResults: Bundle?) {}

        override fun onRmsChanged(rmsdB: Float) {
            // Map roughly -2..10 dB → 0..1
            _amplitudeFlow.value = ((rmsdB + 2f) / 12f).coerceIn(0f, 1f)
        }

        override fun onResults(results: Bundle?) {
            _amplitudeFlow.value = 0f
            activeRecognizer = null
            recognizer.destroy()
            if (!cont.isActive) return
            val text = results
                ?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
                ?.firstOrNull()
            Log.d(TAG, "onResults: text=${text?.take(80)}")
            if (text.isNullOrBlank()) cont.resume(TranscriptResult.Empty)
            else cont.resume(TranscriptResult.Success(text))
        }

        override fun onError(error: Int) {
            _amplitudeFlow.value = 0f
            activeRecognizer = null
            recognizer.destroy()
            if (!cont.isActive) return
            Log.w(TAG, "onError: code=$error")
            cont.resume(mapError(error))
        }
    })

    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
        putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
        putExtra(RecognizerIntent.EXTRA_PREFER_OFFLINE, true)
        putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 1)
        putExtra(RecognizerIntent.EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS, 3_000L)
        putExtra(RecognizerIntent.EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS, 1_500L)
    }
    recognizer.startListening(intent)

    cont.invokeOnCancellation {
        mainHandler.post {
            activeRecognizer?.let {
                it.cancel()
                it.destroy()
                activeRecognizer = null
            }
            _amplitudeFlow.value = 0f
        }
```

Copilot AI Apr 24, 2026


listen() creates and starts SpeechRecognizer inside a Handler.post without any try/catch. If createSpeechRecognizer(...) or startListening(...) throws (e.g., service not available), the coroutine will never be resumed/cancelled, leaving the UI stuck in Recording. Wrap creation/start in a try/catch and resume with a failure TranscriptResult (and ensure activeRecognizer is cleared/destroyed).

Suggested change:

```kotlin
mainHandler.post {
    var recognizer: SpeechRecognizer? = null
    try {
        recognizer = SpeechRecognizer.createSpeechRecognizer(context)
        activeRecognizer = recognizer
        recognizer.setRecognitionListener(object : RecognitionListener {
            override fun onReadyForSpeech(params: Bundle?) {}
            override fun onBeginningOfSpeech() {}
            override fun onBufferReceived(buffer: ByteArray?) {}
            override fun onEndOfSpeech() {}
            override fun onEvent(eventType: Int, params: Bundle?) {}
            override fun onPartialResults(partialResults: Bundle?) {}

            override fun onRmsChanged(rmsdB: Float) {
                // Map roughly -2..10 dB → 0..1
                _amplitudeFlow.value = ((rmsdB + 2f) / 12f).coerceIn(0f, 1f)
            }

            override fun onResults(results: Bundle?) {
                _amplitudeFlow.value = 0f
                activeRecognizer = null
                recognizer?.destroy()
                if (!cont.isActive) return
                val text = results
                    ?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
                    ?.firstOrNull()
                Log.d(TAG, "onResults: text=${text?.take(80)}")
                if (text.isNullOrBlank()) cont.resume(TranscriptResult.Empty)
                else cont.resume(TranscriptResult.Success(text))
            }

            override fun onError(error: Int) {
                _amplitudeFlow.value = 0f
                activeRecognizer = null
                recognizer?.destroy()
                if (!cont.isActive) return
                Log.w(TAG, "onError: code=$error")
                cont.resume(mapError(error))
            }
        })
        val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
            putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
            putExtra(RecognizerIntent.EXTRA_PREFER_OFFLINE, true)
            putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 1)
            putExtra(RecognizerIntent.EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS, 3_000L)
            putExtra(RecognizerIntent.EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS, 1_500L)
        }
        recognizer.startListening(intent)
    } catch (t: Throwable) {
        _amplitudeFlow.value = 0f
        activeRecognizer = null
        recognizer?.destroy()
        Log.w(TAG, "Failed to start speech recognition", t)
        if (cont.isActive) {
            cont.resume(mapError(SpeechRecognizer.ERROR_CLIENT))
        }
    }
}
```
Owner Author


Fixed in commits edbf46c and 9c0825e: invokeOnCancellation moved before mainHandler.post, recognizer creation/start wrapped in try/catch inside the handler, and an isActive guard added after activeRecognizer = recognizer to destroy immediately if cancellation fired during the post.

Comment on lines +61 to +66
```kotlin
_state.value = VoiceCaptureState.Recording
val transcriptResult = if (pipeline.directSpeechProvider != null) {
    pipeline.directSpeechProvider.listen()
} else {
    recordAndTranscribe()
}
```

Copilot AI Apr 24, 2026


VoiceCaptureViewModel now has a new direct-speech branch (directSpeechProvider.listen() / stopListening()), but the existing VoiceCaptureViewModelTest suite doesn’t cover it. Adding tests for success + permission denied/empty cases via a fake DirectSpeechProvider would help prevent regressions in the new pipeline path.

Owner Author


Fixed in commit edbf46c: added 4 tests in VoiceCaptureViewModelTest covering the DirectSpeechProvider path — success, PermissionDenied, Empty, and cancel.
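A hedged sketch of how one of those tests might be shaped with a fake provider. The real VoiceCaptureViewModelTest class and its state names are not shown in this PR excerpt, so everything below is illustrative:

```kotlin
// Hypothetical fake for exercising the direct-speech branch in isolation.
// TranscriptResult variants mirror the cases the PR says are covered.
sealed interface TranscriptResult {
    data class Success(val text: String) : TranscriptResult
    object Empty : TranscriptResult
    object PermissionDenied : TranscriptResult
}

class FakeDirectSpeechProvider(private val result: TranscriptResult) {
    var listenCalls = 0
        private set

    // Stands in for the real suspend listen(); returns a canned result.
    fun listen(): TranscriptResult {
        listenCalls++
        return result
    }
}

fun main() {
    // Success case: the fake is invoked exactly once and yields its transcript.
    val fake = FakeDirectSpeechProvider(TranscriptResult.Success("hello world"))
    val result = fake.listen()
    check(fake.listenCalls == 1)
    check(result is TranscriptResult.Success && result.text == "hello world")

    // Error case: PermissionDenied surfaces as-is, with no retry.
    val denied = FakeDirectSpeechProvider(TranscriptResult.PermissionDenied)
    check(denied.listen() == TranscriptResult.PermissionDenied)
}
```

In the real suite, each canned result would be fed through the view model and the resulting state transition (Done, error-at-recording, error-at-transcribing, back-to-Idle) asserted.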

Comment thread kmp/build.gradle.kts
Comment on lines 176 to 181
```kotlin
    // Encrypted SharedPreferences for API key storage
    implementation("androidx.security:security-crypto:1.1.0-alpha06")

    // On-device LLM via Gemini Nano (Pixel 9+ and AICore-enabled OEM flagships)
    implementation("com.google.mlkit:genai-prompt:1.0.0-beta2")
}
```

Copilot AI Apr 24, 2026


Adding com.google.mlkit:genai-prompt to the kmp Android source set likely requires raising this module’s android.defaultConfig.minSdk as well. kmp/build.gradle.kts still declares minSdk = 24 (near the bottom of the file), which can cause a Gradle manifest merge failure if genai-prompt has minSdk >= 26 even though the app module is already bumped. Please align the kmp module minSdk with the new requirement.

Owner Author


Fixed in commit b72c420: kmp/build.gradle.kts android { defaultConfig { minSdk } } raised to 26 to satisfy the genai-prompt:1.0.0-beta2 manifest requirement.
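For reference, the alignment amounts to something like the following in kmp/build.gradle.kts (a sketch; the surrounding android block is abridged):

```kotlin
// kmp/build.gradle.kts (sketch): the library module's minSdk must meet the
// manifest requirement of genai-prompt:1.0.0-beta2, or manifest merging fails.
android {
    defaultConfig {
        minSdk = 26  // was 24; raised to match com.google.mlkit:genai-prompt
    }
}
```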

tstapler and others added 3 commits April 24, 2026 11:16
genai-prompt:1.0.0-beta2 declares minSdkVersion 26 in its manifest.
The kmp library module had its own android { defaultConfig { minSdk } }
block at 24, causing processDebugUnitTestManifest to fail.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Main

PerformanceDashboard.kt used Dispatchers.IO directly in commonMain,
which is a JVM-only symbol. compileCommonMainKotlinMetadata fails when
it encounters it because Dispatchers.IO is absent from the multiplatform
metadata API surface. queryPercentiles() is a blocking SQLite call so
PlatformDispatcher.DB is the correct dispatcher per the project's own
dispatcher matrix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Kotlin incremental compilation stores per-project metadata in
~/.gradle/caches/kotlin-build-*/. When a previous CI run compiled
PlatformSettings.kt with `: Settings` (commit dc1b51b) the IC state
was saved in the Gradle home cache restored by setup-gradle. Subsequent
runs restore that stale metadata even though the source no longer has
the supertype, causing a spurious expect/actual mismatch:

  expect: PlatformSettings : Settings
  actual: PlatformSettings : Any

Excluding caches/kotlin-build-* forces a clean IC state per run,
eliminating cross-branch metadata pollution.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@tstapler
Owner Author

@copilot Address all of the comments that you left

Copilot AI and others added 2 commits April 25, 2026 00:12
- MainActivity: rebuild voice pipeline automatically when deviceLlmAvailable
  flips via LaunchedEffect(deviceLlmAvailable)
- AndroidAudioRecorder: move flag resets before permission check and add
  stopRequested guard after permission grant to prevent spurious recordings
- AndroidSpeechRecognizerProvider: move invokeOnCancellation before
  mainHandler.post and wrap recognizer creation/start in try/catch so
  thrown exceptions don't leave the coroutine stuck forever
- VoiceCaptureViewModelTest: add 4 tests covering DirectSpeechProvider path
  (success, PermissionDenied, Empty, cancel)
- kmp/build.gradle.kts: minSdk=26 already aligned (no change needed)

Agent-Logs-Url: https://github.com/tstapler/stelekit/sessions/c6a0afad-d76c-48da-90b0-3565a04aeb3f

Co-authored-by: tstapler <3860386+tstapler@users.noreply.github.com>
Add an isActive check inside mainHandler.post after setting activeRecognizer
so that if cancellation fires before the post runs (cleanup was a no-op),
the recognizer is destroyed immediately rather than starting a zombie session.

Agent-Logs-Url: https://github.com/tstapler/stelekit/sessions/c6a0afad-d76c-48da-90b0-3565a04aeb3f

Co-authored-by: tstapler <3860386+tstapler@users.noreply.github.com>
Contributor

Copilot AI commented Apr 25, 2026

@copilot Address all of the comments that you left

All four review comments addressed in commits edbf46c and 9c0825e:

  1. MainActivity.kt — Added LaunchedEffect(deviceLlmAvailable) so the pipeline rebuilds automatically when device LLM eligibility is determined, no manual Settings save needed.

  2. AndroidAudioRecorder.kt — Moved stopRequested = false / pauseRequested = false flag resets to before the permission check, then added a stopRequested guard after the permission grant. This prevents a tap-during-dialog from being ignored.

  3. AndroidSpeechRecognizerProvider.kt — Moved invokeOnCancellation before mainHandler.post so it's registered immediately (not after the post runs), wrapped recognizer creation/start in try/catch to resume the coroutine with ERROR_CLIENT on unexpected failures, and added an isActive check inside the post after activeRecognizer is set so a cancellation that fires before the post runs still cleans up the recognizer.

  4. VoiceCaptureViewModelTest.kt — Added four tests for the DirectSpeechProvider path: success (reaches Done), PermissionDenied (error at RECORDING), Empty (error at TRANSCRIBING), and cancel during Recording (resets to Idle).

  5. kmp/build.gradle.kts: minSdk = 26 was already aligned; no change needed.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • dl.google.com
    • Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.prefs/java.util.prefs=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.prefs/java.util.prefs=ALL-UNNAMED --add-opens=java.base/java.nio.charset=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.xml/javax.xml.namespace=ALL-UNNAMED -XX:+UseParallelGC -Xmx4g -Dfile.encoding=UTF-8 -Duser.country -Duser.language=en -Duser.variant -cp (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

tstapler and others added 4 commits April 24, 2026 21:28
Brings in:
- fix(resilience): loading screen hang + Settings interface (dc1b51b)
- test(ci): JVM test job + loading state regression tests (d036af1)
- ci: Gradle cache / smoke-test improvements
- chore: v0.9.4 release

Conflict resolutions:
- PerformanceDashboard.kt: keep PlatformDispatcher.DB (correct for SQLite
  reads, and required for iOS compileCommonMainKotlinMetadata)
- PlatformSettings.android.kt: add `: Settings` and `override` modifiers
  to match the expect class which now implements the Settings interface
- FakeRepositories.kt: mark FakeFileSystem `open` so
  StelekitViewModelLoadingTest can subclass it

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Brings in:
- perf(android): decompose Phase 3 chunk writes (b5a51ab)
- chore(main): release 0.9.5 (#30)

Conflict resolution:
- build.gradle.kts: keep minSdk=26 (required by genai-prompt:1.0.0-beta2)
  and take testInstrumentationRunner from main

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Brings in:
- feat(android): SteleKitApplication class + shared GraphManager (0e7915e)
- feat(fdroid): fix repo version history + version in Settings (9bb8a77)
- fix(search): create-page in link picker now inserts link (2a85c6d)
- ci: benchmark retry with rebase (962b647)
- chore(main): release 0.10.0 (#34)

Conflict resolutions:
- build.gradle.kts: keep genai-prompt + add glance dependencies from main
- MainActivity.kt: keep our fileSystem init with SAF folder picker
  and AudioRecorder with mic permission callback

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Contributor

Android Benchmark Results

Comparing ab57368 (this PR) vs a1dfe06 (baseline)
Device: API 30 x86_64 emulator — 25 pages

Metric                      This PR   Baseline   Delta
Phase 1 TTI                 48ms      60ms       -12ms
Phase 3 index               31ms      35ms       -4ms
Write p95 (baseline)        2ms       4ms        -
Write p95 (during phase 3)  2ms       12ms       -10ms
Jank factor                 1x        3x         -
Concurrent writes           1         1          -

@tstapler tstapler merged commit a275dbe into main Apr 26, 2026
9 checks passed