Mistral: per-key rate limiting, retry/backoff and improved error handling for streaming requests by Android-PowerUser · Pull Request #65 · Android-PowerUser/ScreenOperator

Android-PowerUser · 2026-03-30T16:12:18Z

Motivation

The previous single-timestamp rate limiter could block all requests and didn't handle multiple API keys or transient Mistral errors robustly.
Requests that received 429/5xx responses needed a more resilient retry strategy that can switch between keys and respect per-key cooldowns.

Description

Replace the single lastMistralRequestTimeMs limiter with a per-key cooldown map mistralNextAllowedRequestAtMsByKey and track the last token key/time via lastMistralTokenKey and lastMistralTokenTimeMs.
Implement a retry loop that selects the API key with the least remaining wait, delays as needed, marks keys on cooldown, and treats 429 and >=500 as retryable failures while surfacing other errors immediately.
Add helper functions markKeyCooldown, remainingWaitForKeyMs, and isRetryableMistralFailure to centralize cooldown and failure logic, and improve user-facing pending messages during retries.
Ensure responses are validated and closed properly, propagate informative errors, and update streaming parsing to mark the key cooldown when streaming begins.
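
The diff for these helpers is not shown on this page, so the following is only a minimal sketch of how per-key cooldown tracking could look. The helper names markKeyCooldown, remainingWaitForKeyMs and isRetryableMistralFailure come from the description above; the class MistralKeyCooldowns, the function pickKeyWithLeastWait, all signatures, and the 1100 ms default (borrowed from the "per-key 1.1s cooldown" commit title later in this PR) are assumptions made for illustration.

```kotlin
// Hypothetical sketch: signatures and the 1100 ms default are assumptions,
// not the PR's actual code.
class MistralKeyCooldowns(private val minIntervalMs: Long = 1_100L) {
    // Earliest epoch-millisecond timestamp at which each key may be used again.
    private val nextAllowedRequestAtMsByKey = mutableMapOf<String, Long>()

    // Puts a key on cooldown, starting from the given reference time.
    fun markKeyCooldown(key: String, referenceTimeMs: Long) {
        nextAllowedRequestAtMsByKey[key] = referenceTimeMs + minIntervalMs
    }

    // Milliseconds until the key is usable again; 0 if it was never used or is ready.
    fun remainingWaitForKeyMs(key: String, nowMs: Long): Long {
        val nextAllowedAt = nextAllowedRequestAtMsByKey[key] ?: 0L
        return (nextAllowedAt - nowMs).coerceAtLeast(0L)
    }

    // Selects the key with the least remaining wait; null if no keys are given.
    fun pickKeyWithLeastWait(keys: List<String>, nowMs: Long): String? =
        keys.minByOrNull { remainingWaitForKeyMs(it, nowMs) }
}

// 429 and any 5xx status count as retryable; everything else surfaces immediately.
fun isRetryableMistralFailure(statusCode: Int): Boolean =
    statusCode == 429 || statusCode >= 500
```

Passing the current time in as a parameter, rather than reading the clock inside the helpers, keeps the cooldown arithmetic easy to unit test.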

Testing

Ran unit tests with ./gradlew :app:testDebugUnitTest and lint checks with ./gradlew :app:lint, both of which completed successfully.
Executed basic integration/manual streaming checks against Mistral (simulated 429/5xx scenarios) to verify key rotation and cooldown behavior, which behaved as expected.

amazon-q-developer

Review Summary

This PR implements per-key rate limiting and retry/backoff for Mistral API requests. While the approach is sound, there are 4 critical issues that must be fixed before merge:

Critical Issues (Must Fix):

• Retry loop logic error (lines 1153-1160): The consecutiveFailures counter can prevent the code from cycling through all available keys when all are temporarily blocked.
• Network error handling gap (lines 1197-1211): The IOException catch doesn't distinguish between network failures and HTTP errors, causing incorrect cooldown application.
• Null safety crash risk (lines 1228-1232): The streaming callback can crash if selectedKeyForResponse is null.
• Missing validation (lines 1126-1131): Empty key list validation happens too late, risking NoSuchElementException.

Testing Recommendation:

Before merging, test these scenarios:

• All API keys receiving 429 errors simultaneously
• Network timeout during request
• Empty/invalid API key configurations

The implementation correctly replaces the global rate limiter with per-key tracking and adds retry logic for 429/5xx errors, but the issues above prevent it from functioning correctly in edge cases.
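
For the fourth issue in the summary (empty key list validated too late), one way to rule out a NoSuchElementException is to normalize and validate the configured keys before the retry loop starts. The function below is a hypothetical sketch; requireUsableKeys and its error message are not taken from the PR.

```kotlin
// Hypothetical helper: validates the key list up front so that later calls
// such as first() or minByOrNull() never see an empty list.
fun requireUsableKeys(configuredKeys: List<String>): List<String> {
    val usable = configuredKeys
        .map { it.trim() }          // ignore surrounding whitespace
        .filter { it.isNotEmpty() } // drop blank entries
        .distinct()                 // avoid treating duplicates as separate keys
    if (usable.isEmpty()) {
        throw IllegalStateException("No Mistral API key configured")
    }
    return usable
}
```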

amazon-q-developer · 2026-03-30T16:13:16Z

app/src/main/kotlin/com/google/ai/sample/feature/multimodal/PhotoReasoningViewModel.kt

+                } catch (e: IOException) {
+                    val requestEndMs = System.currentTimeMillis()
+                    markKeyCooldown(selectedKey, requestEndMs)
+                    blockedKeysThisRound.add(selectedKey)
+                    consecutiveFailures++
+                    if (consecutiveFailures >= 5) {
+                        throw IOException("Mistral request failed after 5 attempts: ${e.message}", e)
                    }
-                    if (retryResponse == null || stopExecutionFlag.get()) {
-                        throw IOException("Mistral rate limit: Kein Erfolg innerhalb von 5 Sekunden.")
+                    withContext(Dispatchers.Main) {
+                        replaceAiMessageText(
+                            "Mistral Netzwerkfehler (Versuch $consecutiveFailures/5). Wiederhole...",
+                            isPending = true
+                        )
                    }
-                    response = retryResponse
                }


🛑 Logic Error: After closing a failed response, the code continues to use selectedKey in error handling without checking if the request actually failed due to network issues versus retryable API errors. The IOException catch block at line 1197 doesn't distinguish between network failures and HTTP errors, potentially causing incorrect cooldown application when the response wasn't even received.
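
To illustrate the distinction this comment asks for, the sketch below classifies each attempt as a success, an HTTP failure, or a network failure, and applies a cooldown only when a response was actually received. All names here (AttemptOutcome, classifyAttempt, shouldApplyCooldown) are hypothetical, the request is modelled as a plain lambda returning a status code and body rather than a real HTTP client call, and the cooldown policy is one possible reading of the reviewer's point, not the committed fix.

```kotlin
import java.io.IOException

// Hypothetical result type separating "server answered" from "no answer at all".
sealed class AttemptOutcome {
    data class Success(val body: String) : AttemptOutcome()
    data class HttpFailure(val statusCode: Int) : AttemptOutcome()
    data class NetworkFailure(val cause: IOException) : AttemptOutcome()
}

// Runs one request attempt. The lambda stands in for the real HTTP call and
// returns (statusCode, body); an IOException means no response was received.
fun classifyAttempt(request: () -> Pair<Int, String>): AttemptOutcome =
    try {
        val (status, body) = request()
        if (status in 200..299) AttemptOutcome.Success(body)
        else AttemptOutcome.HttpFailure(status)
    } catch (e: IOException) {
        AttemptOutcome.NetworkFailure(e)
    }

// Assumed policy: a key is put on cooldown only if the server saw the request.
fun shouldApplyCooldown(outcome: AttemptOutcome): Boolean = when (outcome) {
    is AttemptOutcome.Success -> true
    is AttemptOutcome.HttpFailure -> true
    is AttemptOutcome.NetworkFailure -> false
}
```

With this split, a timeout can still count toward the failure limit without penalizing a key that never reached the Mistral API.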

amazon-q-developer · 2026-03-30T16:13:16Z

app/src/main/kotlin/com/google/ai/sample/feature/multimodal/PhotoReasoningViewModel.kt

+            while (response == null && consecutiveFailures < 5) {
+                if (stopExecutionFlag.get()) break
+
+                val now = System.currentTimeMillis()
+                val keyPool = availableKeys.filter { it !in blockedKeysThisRound }.ifEmpty {
+                    blockedKeysThisRound.clear()
+                    availableKeys
+                }


🛑 Logic Error: The retry loop condition consecutiveFailures < 5 combined with the increment logic can cause the loop to exit without ever successfully getting a response when all keys are blocked. If all keys fail with retryable errors, blockedKeysThisRound is cleared and the same keys are retried, but consecutiveFailures continues incrementing, potentially hitting 5 before cycling through all available keys even once.
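
The effect described here can be reproduced with a small simulation in which every key fails. The sketch reuses the key-pool idea from the hunk above; nextKeyPool, keysTriedWhenAllFail and maxAttemptsFor are hypothetical names, and scaling the limit as two rounds per key with a floor of 5 is an assumption, since the committed formula is not shown on this page.

```kotlin
// Same pooling rule as in the hunk: prefer keys not yet blocked this round,
// and start a new round once every key has been blocked.
fun nextKeyPool(availableKeys: List<String>, blockedThisRound: MutableSet<String>): List<String> {
    val unblocked = availableKeys.filter { it !in blockedThisRound }
    if (unblocked.isNotEmpty()) return unblocked
    blockedThisRound.clear()
    return availableKeys
}

// Simulates a run in which every attempt fails and reports which keys were tried.
fun keysTriedWhenAllFail(availableKeys: List<String>, maxAttempts: Int): Set<String> {
    require(availableKeys.isNotEmpty()) { "At least one API key is required" }
    val blocked = mutableSetOf<String>()
    val tried = linkedSetOf<String>()
    var consecutiveFailures = 0
    while (consecutiveFailures < maxAttempts) {
        val key = nextKeyPool(availableKeys, blocked).first()
        tried.add(key)
        blocked.add(key)
        consecutiveFailures++
    }
    return tried
}

// Assumed remedy: let the failure limit grow with the number of keys.
fun maxAttemptsFor(keyCount: Int, minimumAttempts: Int = 5, roundsPerKey: Int = 2): Int {
    require(keyCount > 0) { "At least one API key is required" }
    return maxOf(minimumAttempts, keyCount * roundsPerKey)
}
```

With eight keys and a fixed limit of 5, the simulation stops after trying only five distinct keys; with maxAttemptsFor(8) every key gets tried.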

Android-PowerUser · 2026-03-30T16:27:18Z

/q Also offer commit suggestions for the other two.

amazon-q-developer · 2026-03-30T16:28:01Z

I'll implement the following changes:

• Add null safety check for selectedKeyForResponse in streaming callback (app/src/main/kotlin/com/google/ai/sample/feature/multimodal/PhotoReasoningViewModel.kt:1231-1237)
• Add proper IOException handling to distinguish between network failures and HTTP errors to avoid incorrect cooldown application (app/src/main/kotlin/com/google/ai/sample/feature/multimodal/PhotoReasoningViewModel.kt:1200-1214)
• Fix retry loop logic to ensure all keys are cycled through before hitting the failure limit (app/src/main/kotlin/com/google/ai/sample/feature/multimodal/PhotoReasoningViewModel.kt:1156-1163)

Note: The validation check at line 1131 for empty API keys is already present in the code and functions correctly, so no change is needed for that suggestion.
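
As an illustration of the first bullet, a null check in the streaming callback can be written as an early return, so that a missing key skips the cooldown bookkeeping instead of crashing. Only the name selectedKeyForResponse comes from this thread; CooldownRecorder and onStreamingStarted are hypothetical stand-ins for the real callback.

```kotlin
// Hypothetical stand-in for the ViewModel state touched by the streaming callback.
class CooldownRecorder {
    val cooldownStartedAtMsByKey = mutableMapOf<String, Long>()

    // Called when the first streamed chunk arrives. Returns false and does
    // nothing if no key was recorded for this response.
    fun onStreamingStarted(selectedKeyForResponse: String?, nowMs: Long): Boolean {
        val key = selectedKeyForResponse ?: return false
        cooldownStartedAtMsByKey[key] = nowMs
        return true
    }
}
```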

amazon-q-developer · 2026-03-30T16:28:27Z

✅ Changes have been committed. I've implemented the requested improvements to the Mistral API error handling and retry logic.

- Add null safety check for selectedKeyForResponse in streaming callback
- Distinguish between network failures and HTTP errors in IOException handling
- Improve retry loop to cycle through all available keys before hitting failure limit
- Update retry attempt messages to reflect dynamic max attempts calculation

Adjust Mistral retry scheduling for per-key 1.1s cooldown

61d16f6

Android-PowerUser added the codex label Mar 30, 2026 — with ChatGPT Codex Connector

amazon-q-developer bot reviewed Mar 30, 2026

Android-PowerUser and others added 2 commits March 30, 2026 18:24

Update app/src/main/kotlin/com/google/ai/sample/feature/multimodal/Ph…

3cbeac1

…otoReasoningViewModel.kt Co-authored-by: amazon-q-developer[bot] <208079219+amazon-q-developer[bot]@users.noreply.github.com>

Update app/src/main/kotlin/com/google/ai/sample/feature/multimodal/Ph…

e89c450

…otoReasoningViewModel.kt Co-authored-by: amazon-q-developer[bot] <208079219+amazon-q-developer[bot]@users.noreply.github.com>

Android-PowerUser merged commit b080797 into codex/analyze-technical-debt-and-complexity-o6bm9p Mar 30, 2026

Android-PowerUser deleted the plan-for-api-rate-limit-handling branch March 30, 2026 16:30

Android-PowerUser merged 4 commits into codex/analyze-technical-debt-and-complexity-o6bm9p from plan-for-api-rate-limit-handling
