replicate#87
- Updated environment variables documentation to include Replicate as a TTS provider option.
- Added Replicate to the sidebar for TTS provider guides.
- Included Replicate as a dependency in package.json and pnpm-lock.yaml.
- Enhanced audiobook chapter generation to normalize native speed settings based on the TTS provider.
- Improved error handling in TTS API routes to provide retry information for rate-limited responses.
- Updated AudiobookExportModal to reflect native speed support for Replicate models.
- Modified SettingsModal to set default model for Replicate.
- Enhanced SpeedControl component to conditionally render native speed controls based on provider support.
- Updated TTSContext to utilize effective native speed for TTS requests.
- Implemented Replicate request handling in the TTS generation logic.
- Added new documentation for configuring Replicate as a TTS provider.
📝 Walkthrough
Adds Replicate as a first-class TTS provider: docs, package dependency, provider catalog and voice resolution, server-side Replicate request implementation with retry/cooldown, UI and settings changes for provider/model selection, and native-model-speed capability checks propagated through caching and playback logic.
Sequence Diagram
sequenceDiagram
actor User
participant UI as Settings / Player UI
participant Client as Browser Client
participant Server as OpenReader Server
participant Replicate as Replicate API
participant Cache as Audio/Schema Cache
User->>UI: Select provider/model (Replicate)
UI->>UI: supportsNativeModelSpeed() → set effectiveNativeSpeed
User->>UI: Request TTS playback
UI->>Client: POST /api/tts (includes effectiveNativeSpeed)
Client->>Server: Forward TTS request
Server->>Server: resolveTTSRequest & supportsNativeModelSpeed()
Server->>Cache: Check audio cache (key uses effectiveNativeSpeed)
alt Cache hit
Cache-->>Server: Return cached audio
else Cache miss
Server->>Replicate: runReplicateRequest (through gate/cooldown)
alt 200 OK
Replicate-->>Server: Audio
Server->>Cache: Store audio
else 429 (Rate limited)
Replicate-->>Server: 429 + Retry-After?
Server->>Server: applyReplicateCooldown(), respond 429 with retryAfterSeconds
else 5xx / error
Replicate-->>Server: Error
Server->>Server: Retry/backoff (uses Retry-After or default)
end
end
Server-->>Client: Return audio or ProblemDetails
Client-->>UI: Play audio / show retry info
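The cache step in the diagram depends on the speed that is actually sent to the model, not the speed the user asked for. A minimal sketch of that idea follows, assuming hypothetical helpers `effectiveNativeSpeed` and `audioCacheKey` (neither name is confirmed by this PR; only `supportsNativeModelSpeed` appears in the thread, and it is passed in here as a plain callback).

```typescript
// Callback standing in for the PR's supportsNativeModelSpeed(); its real
// signature is not shown in this thread, so this shape is an assumption.
type NativeSpeedCheck = (provider: string, model: string) => boolean;

// Hypothetical helper: fall back to 1 when the model cannot apply speed natively.
function effectiveNativeSpeed(
  provider: string,
  model: string,
  requestedSpeed: number,
  supports: NativeSpeedCheck,
): number {
  return supports(provider, model) ? requestedSpeed : 1;
}

interface CacheKeyParts {
  provider: string;
  model: string;
  voice: string;
  text: string;
  nativeSpeed: number;
}

// Hypothetical helper: build the key from the effective speed so that two
// requests producing identical audio share one cache entry.
function audioCacheKey(parts: CacheKeyParts): string {
  return JSON.stringify([parts.provider, parts.model, parts.voice, parts.nativeSpeed, parts.text]);
}
```

With this shape, a request for speed 1.5 against a model without native speed support resolves to the same key as a request for speed 1, so the cached audio is reused instead of regenerated.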
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d0c0ab7420
const audioUrl = typeof output === 'string' ? output : String(output);
const audioResponse = await fetch(audioUrl, { signal });
Read Replicate FileOutput instead of coercing to URL string
replicate@1.x returns FileOutput objects (and sometimes arrays of them) by default, but this code assumes every non-string result can be converted to a fetchable URL via String(output). For file outputs this becomes values like [object ReadableStream] (or comma-joined object strings), so fetch(audioUrl) fails and Replicate TTS requests return errors instead of audio. This breaks the new provider path for models that emit file outputs unless the client is configured with useFileOutput: false or the stream/object is read directly.
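One way to address this is to resolve the output shape explicitly before fetching. The sketch below is a duck-typed illustration, not the PR's code: `resolveReplicateAudioUrl` and `FileOutputLike` are hypothetical names, and the `url()` method is assumed to mirror the accessor on the replicate client's FileOutput object.

```typescript
// Duck-typed view of a file output. The url() accessor is an assumption based
// on the replicate client's FileOutput; it is not verified against this PR.
interface FileOutputLike {
  url(): { toString(): string };
}

function isFileOutputLike(value: unknown): value is FileOutputLike {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as { url?: unknown }).url === 'function'
  );
}

// Hypothetical helper: return a fetchable URL string, or throw instead of
// silently producing something like "[object ReadableStream]".
function resolveReplicateAudioUrl(output: unknown): string {
  // Some models return an array of outputs; this sketch takes the first one.
  const first: unknown = Array.isArray(output) ? output[0] : output;
  if (typeof first === 'string' && first.length > 0) return first;
  if (isFileOutputLike(first)) return String(first.url());
  throw new Error('Unsupported Replicate output shape');
}
```

Throwing on an unknown shape keeps the failure close to its cause; the alternative mentioned in the comment, constructing the client with `useFileOutput: false`, avoids the object handling altogether.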
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
docs-site/docs/deploy/vercel-deployment.md (1)
36-37: ⚠️ Potential issue | 🟡 Minor
Inconsistency between code example and recommendation.
The code block at lines 36-37 shows NEXT_PUBLIC_DEFAULT_TTS_PROVIDER=deepinfra and NEXT_PUBLIC_DEFAULT_TTS_MODEL=hexgrad/Kokoro-82M, but the info block at lines 52-53 recommends replicate and google/gemini-3.1-flash-tts. Consider updating the code block to match the recommendation for consistency.
📝 Proposed fix to align code example with recommendation
 # Optional client/runtime feature defaults:
 NEXT_PUBLIC_ENABLE_DOCX_CONVERSION=false
 NEXT_PUBLIC_ENABLE_DESTRUCTIVE_DELETE_ACTIONS=false
 NEXT_PUBLIC_ENABLE_TTS_PROVIDERS_TAB=false
-NEXT_PUBLIC_DEFAULT_TTS_PROVIDER=deepinfra
-NEXT_PUBLIC_DEFAULT_TTS_MODEL=hexgrad/Kokoro-82M
+NEXT_PUBLIC_DEFAULT_TTS_PROVIDER=replicate
+NEXT_PUBLIC_DEFAULT_TTS_MODEL=google/gemini-3.1-flash-tts
 NEXT_PUBLIC_SHOW_ALL_DEEPINFRA_MODELS=false
Also applies to: 52-53
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs-site/docs/deploy/vercel-deployment.md` around lines 36 - 37, The code example uses NEXT_PUBLIC_DEFAULT_TTS_PROVIDER and NEXT_PUBLIC_DEFAULT_TTS_MODEL with values deepinfra and hexgrad/Kokoro-82M which conflicts with the recommendation; update both occurrences of the environment examples where NEXT_PUBLIC_DEFAULT_TTS_PROVIDER and NEXT_PUBLIC_DEFAULT_TTS_MODEL are shown to use the recommended values (set NEXT_PUBLIC_DEFAULT_TTS_PROVIDER=replicate and NEXT_PUBLIC_DEFAULT_TTS_MODEL=google/gemini-3.1-flash-tts) so the examples and recommendation are consistent.
src/lib/server/tts/generate.ts (1)
140-154: ⚠️ Potential issue | 🟠 Major
Custom Replicate models only pass { text } to the API, silently dropping all other request parameters.
When using a custom Replicate model (not one of the four hard-coded builders: google/gemini-3.1-flash-tts, minimax/speech-2.8-turbo, qwen/qwen3-tts, inworld/tts-1.5-mini), the buildReplicateInput() function falls through to line 317 and returns { text: request.text } only. This causes:
- Voice is dropped from the API request
- Format is dropped from the API request
- Instructions are dropped upstream by resolveTTSRequest() since supportsTtsInstructions() returns false for custom models
- Speed is reset to 1 since supportsNativeModelSpeed() returns false for custom models
This makes custom Replicate models only viable if the underlying model's schema is exactly { text }. Either constrain the UI to the mapped models or add an explicit mechanism to define per-model input mappings for custom Replicate models.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/tts/generate.ts` around lines 140 - 154, The buildReplicateInput() fallback currently returns only { text: request.text } for non-mapped models, causing voice/format/instructions/speed to be dropped; update buildReplicateInput() to preserve and pass through the other parameters (voice, format, speed, instructions) when they exist or provide a clear error for unsupported custom models, and add a per-model input-mapping mechanism (e.g., a map keyed by model name) so custom Replicate models can declare the expected input shape; also ensure resolveTTSRequest() still filters instructions only when supportsTtsInstructions(model) is false and that supportsNativeModelSpeed(model) governs whether to include speed or normalize it, referencing buildReplicateInput(), resolveTTSRequest(), supportsTtsInstructions(), and supportsNativeModelSpeed() to locate the changes.
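The per-model mapping suggested above could look roughly like the following. This is a sketch under stated assumptions: the request shape, `registerReplicateInputBuilder`, and `buildCustomReplicateInput` are all hypothetical names introduced for illustration, and the real `buildReplicateInput()` is not shown in this thread.

```typescript
// Assumed request shape; the PR's actual resolved request type may differ.
interface ResolvedTtsRequest {
  text: string;
  voice?: string;
  format?: string;
  speed?: number;
  instructions?: string;
}

type ReplicateInput = Record<string, string | number>;
type InputBuilder = (request: ResolvedTtsRequest) => ReplicateInput;

// Hypothetical registry keyed by model name.
const customInputBuilders = new Map<string, InputBuilder>();

function registerReplicateInputBuilder(model: string, builder: InputBuilder): void {
  customInputBuilders.set(model, builder);
}

// Hypothetical fallback: use a registered mapping if one exists; otherwise
// refuse to silently drop parameters the caller supplied.
function buildCustomReplicateInput(model: string, request: ResolvedTtsRequest): ReplicateInput {
  const builder = customInputBuilders.get(model);
  if (builder) return builder(request);

  const dropped: string[] = [];
  if (request.voice !== undefined) dropped.push('voice');
  if (request.format !== undefined) dropped.push('format');
  if (request.instructions !== undefined) dropped.push('instructions');
  // A speed of 1 is treated as "no speed requested" in this sketch.
  if (request.speed !== undefined && request.speed !== 1) dropped.push('speed');

  if (dropped.length > 0) {
    throw new Error(`No input mapping for Replicate model "${model}"; would drop: ${dropped.join(', ')}`);
  }
  return { text: request.text };
}
```

Failing loudly is only one of the two options the comment raises; constraining the UI to the mapped models would make this fallback unreachable instead.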
🧹 Nitpick comments (3)
src/lib/shared/tts-provider-catalog.ts (1)
136-144: Consider edge case for custom Replicate models.
For custom Replicate models (not in either set), supportsNativeModelSpeed returns false (line 160). This is a safe default, but users with custom models that do support native speed won't be able to use it. Consider documenting this behavior or adding a mechanism for custom model speed support if needed in the future.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/shared/tts-provider-catalog.ts` around lines 136 - 144, The current default in supportsNativeModelSpeed returns false for any Replicate model not in REPLICATE_MODELS_WITH_NATIVE_SPEED or REPLICATE_MODELS_WITHOUT_NATIVE_SPEED, which prevents custom models that do support native speed from being used; update the code or docs to handle this by either documenting the behavior clearly and advising users to add their custom model IDs to REPLICATE_MODELS_WITH_NATIVE_SPEED, or implement a configurable mechanism (e.g., an exported addReplicateModelWithNativeSpeed(modelId) function, a runtime-configurable Set, or reading a REPLICATE_NATIVE_SPEED_MODELS env/config list) and reference the sets REPLICATE_MODELS_WITH_NATIVE_SPEED, REPLICATE_MODELS_WITHOUT_NATIVE_SPEED and the supportsNativeModelSpeed function so callers can opt their custom models into native-speed support.
src/components/SettingsModal.tsx (1)
565-565: Consider simplifying the base URL visibility condition.
The condition localTTSProvider !== 'replicate' && (localTTSProvider === 'custom-openai' || !localBaseUrl || localBaseUrl === '') works but is slightly convoluted. Consider extracting to a helper for clarity.
♻️ Optional: Extract to named variable
+ const showBaseUrlInput = localTTSProvider !== 'replicate' &&
+   (localTTSProvider === 'custom-openai' || !localBaseUrl);
- {localTTSProvider !== 'replicate' && (localTTSProvider === 'custom-openai' || !localBaseUrl || localBaseUrl === '') && (
+ {showBaseUrlInput && (
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/components/SettingsModal.tsx` at line 565, The JSX conditional for showing the base URL is complex; extract it into a named boolean (e.g., shouldShowBaseUrl) defined in SettingsModal (above the render/return) that computes: localTTSProvider !== 'replicate' && (localTTSProvider === 'custom-openai' || !localBaseUrl || localBaseUrl === ''), then replace the inline expression in the JSX with that variable (referencing localTTSProvider and localBaseUrl where computed).
src/app/api/tts/route.ts (1)
62-71: Handle HTTP-date Retry-After values in addition to delta-seconds.
Current parsing only supports numeric values, so valid date-form headers are ignored.
🔧 Suggested parser improvement
 function getUpstreamRetryAfterSeconds(error: unknown): number | undefined {
   if (typeof error !== 'object' || error === null) return undefined;
   const rec = error as Record<string, unknown>;
   const response = rec.response as { headers?: { get?: (name: string) => string | null } } | undefined;
   const retryAfterHeader = response?.headers?.get?.('retry-after');
   if (!retryAfterHeader) return undefined;
-  const parsed = Number(retryAfterHeader);
-  if (!Number.isFinite(parsed) || parsed <= 0) return undefined;
-  return Math.ceil(parsed);
+  const deltaSeconds = Number(retryAfterHeader);
+  if (Number.isFinite(deltaSeconds) && deltaSeconds > 0) {
+    return Math.ceil(deltaSeconds);
+  }
+
+  const retryAtMs = Date.parse(retryAfterHeader);
+  if (!Number.isFinite(retryAtMs)) return undefined;
+  const remainingSeconds = Math.ceil((retryAtMs - Date.now()) / 1000);
+  return remainingSeconds > 0 ? remainingSeconds : undefined;
 }
Also applies to: 237-257
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/app/api/tts/route.ts` around lines 62 - 71, The current getUpstreamRetryAfterSeconds function only treats the Retry-After header as delta-seconds; update it to also handle HTTP-date values: keep the existing numeric parsing (Number(retryAfterHeader)), and if that fails, call Date.parse(retryAfterHeader) to get a timestamp, compute (parsedDateMs - Date.now())/1000, treat only positive results, and return Math.ceil(seconds). Apply the same change to the other equivalent parser in the file (the duplicate implementation around the later block) so both numeric and HTTP-date formats are supported.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs-site/docs/configure/tts-providers.md`:
- Around line 22-28: The documentation uses the inconsistent casing "Deepinfra"
in the TTS providers section; update every instance of "Deepinfra" to the
canonical "DeepInfra" (e.g., in the sentence "For `OpenAI`, `Deepinfra`, and
`Replicate`..." and in the model list entry) so the provider name matches the
rest of the docs and branding.
In `@src/app/api/audiobook/chapter/route.ts`:
- Around line 96-100: The normalization function normalizeNativeSpeedForSettings
is only used for mismatch checks, so when data.settings is absent the runtime
can still read nativeSpeed from raw existingSettings and bypass normalization;
update the code paths that build the runtime settings (the merge of
existingSettings and data.settings / incoming settings used for generation) to
always pass the merged AudiobookGenerationSettings through
normalizeNativeSpeedForSettings before using it for generation or comparison,
ensuring any downstream reads (generation, supportsNativeModelSpeed checks, and
mismatch logic) use the normalized settings; apply the same change wherever
merged settings are constructed (including the logic that currently reads
existingSettings directly) so nativeSpeed is enforced consistently.
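The "merge first, then always normalize" order described in this comment can be sketched as below. The settings fields and both function bodies are stand-ins written for illustration; only the names `AudiobookGenerationSettings`, `normalizeNativeSpeedForSettings`, and `supportsNativeModelSpeed` come from the thread, and `resolveRuntimeSettings` is hypothetical.

```typescript
// Assumed subset of the settings type; the real interface likely has more fields.
interface AudiobookGenerationSettings {
  ttsProvider: string;
  ttsModel: string;
  voice: string;
  nativeSpeed: number;
}

type NativeSpeedSupport = (provider: string, model: string) => boolean;

// Stand-in for the PR's normalizer: force nativeSpeed to 1 when unsupported.
function normalizeNativeSpeedForSettings(
  settings: AudiobookGenerationSettings,
  supports: NativeSpeedSupport,
): AudiobookGenerationSettings {
  return supports(settings.ttsProvider, settings.ttsModel)
    ? settings
    : { ...settings, nativeSpeed: 1 };
}

// Hypothetical helper: every caller gets settings that were merged and then
// normalized, whether or not the request carried its own settings.
function resolveRuntimeSettings(
  existing: AudiobookGenerationSettings,
  incoming: Partial<AudiobookGenerationSettings> | undefined,
  supports: NativeSpeedSupport,
): AudiobookGenerationSettings {
  const merged: AudiobookGenerationSettings = { ...existing, ...(incoming ?? {}) };
  return normalizeNativeSpeedForSettings(merged, supports);
}
```

Routing both generation and mismatch checks through one function like this removes the path where raw stored settings reach the generator unnormalized.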
In `@src/lib/server/tts/generate.ts`:
- Around line 119-136: runWithReplicateGate currently serializes all requests by
awaiting the global replicateQueue (variable replicateQueue) so every call waits
for the previous one; change it so the function no longer awaits prev and thus
does not create a global one-at-a-time queue. Keep the cooldown logic based on
replicateBlockedUntilMs and sleepWithSignal(signal) to delay start when
necessary, but remove the await prev and the replicateQueue chaining; instead
let operations run concurrently while each checks waitMs = Math.max(0,
replicateBlockedUntilMs - Date.now()) and sleeps if needed, and only update
replicateBlockedUntilMs when you receive a 429 backoff in the code that calls
operation. Alternatively, if limited concurrency is desired, replace the global
chain with a bounded semaphore (not replicateQueue) to permit N concurrent
operations; update runWithReplicateGate (and references to release) accordingly.
- Around line 334-355: When the audio fetch fails (the audioResponse.ok branch
after fetch(audioUrl, { signal })), don't throw a plain Error; instead throw or
propagate an error object that preserves the upstream status and headers so
getUpstreamStatus(error) and getUpstreamRetryAfterSeconds(error) can read them.
Modify the audio fetch failure path that currently does throw new Error(`Failed
to fetch Replicate audio: ${audioResponse.status}`) to attach the response
status and relevant headers (or the response object) to the thrown error (or
create a custom Error subclass) so the subsequent logic around
getUpstreamStatus, getUpstreamRetryAfterSeconds, applyReplicateCooldown and
retry logic works correctly.
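For the first generate.ts comment above, the bounded-semaphore alternative might be shaped like this. It is a sketch, not the PR's `runWithReplicateGate`: the `ReplicateGate` class and its method names are hypothetical, and abort-signal handling (the thread's `sleepWithSignal`) is left out to keep the example short.

```typescript
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical gate: at most maxConcurrent operations run at once, and every
// operation waits out a shared cooldown that is only set after a 429.
class ReplicateGate {
  private readonly maxConcurrent: number;
  private active = 0;
  private waiters: Array<() => void> = [];
  private blockedUntilMs = 0;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  // Called by the code that sees a 429; extends but never shortens the cooldown.
  applyCooldown(seconds: number): void {
    this.blockedUntilMs = Math.max(this.blockedUntilMs, Date.now() + seconds * 1000);
  }

  async run<T>(operation: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      const waitMs = Math.max(0, this.blockedUntilMs - Date.now());
      if (waitMs > 0) await sleep(waitMs);
      return await operation();
    } finally {
      this.release();
    }
  }

  private acquire(): Promise<void> {
    if (this.active < this.maxConcurrent) {
      this.active++;
      return Promise.resolve();
    }
    return new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  private release(): void {
    const next = this.waiters.shift();
    if (next) next(); // hand the slot to the next waiter; active stays the same
    else this.active--;
  }
}
```

An operation holds its slot while it sleeps through the cooldown, which keeps the burst after a 429 bounded by `maxConcurrent`; releasing the slot during the sleep would be a reasonable alternative with different trade-offs.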
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 540f391f-2502-4d20-8d39-4f3e4aa2e515
⛔ Files ignored due to path filters (1)
pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (17)
- README.md
- docs-site/docs/configure/tts-provider-guides/replicate.md
- docs-site/docs/configure/tts-providers.md
- docs-site/docs/deploy/vercel-deployment.md
- docs-site/docs/docker-quick-start.md
- docs-site/docs/introduction.md
- docs-site/docs/reference/environment-variables.md
- docs-site/sidebars.ts
- package.json
- src/app/api/audiobook/chapter/route.ts
- src/app/api/tts/route.ts
- src/components/AudiobookExportModal.tsx
- src/components/SettingsModal.tsx
- src/components/player/SpeedControl.tsx
- src/contexts/TTSContext.tsx
- src/lib/server/tts/generate.ts
- src/lib/shared/tts-provider-catalog.ts
Align default Replicate model to the versioned Kokoro model across UI, docs, and server logic. Refactor TTS settings merging for audiobooks to ensure consistent normalization. Improve Retry-After header handling for upstream rate limits. Expand Replicate voice resolution to use model schemas when available, with test coverage for custom and built-in models. Update documentation to reflect new Replicate defaults, model selection, and configuration guidance.
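As an illustration of schema-based voice resolution, the sketch below looks for a voice-like input name in a model's OpenAPI schema. The candidate key list and the function name `findVoiceInputKey` are hypothetical, and the `components.schemas.Input.properties` path is an assumption about how Replicate model schemas are laid out; the PR's actual resolver is not shown in this thread.

```typescript
// Hypothetical candidate names, checked in priority order.
const CANDIDATE_VOICE_KEYS = ['voice', 'speaker', 'voice_id'];

function asRecord(value: unknown): Record<string, unknown> | undefined {
  return typeof value === 'object' && value !== null
    ? (value as Record<string, unknown>)
    : undefined;
}

// Returns the first candidate key the schema declares, or null when the
// schema is missing, malformed, or has no voice-like input.
function findVoiceInputKey(openApiSchema: unknown): string | null {
  const components = asRecord(asRecord(openApiSchema)?.components);
  const schemas = asRecord(components?.schemas);
  const input = asRecord(schemas?.Input);
  const properties = asRecord(input?.properties);
  if (!properties) return null;
  return CANDIDATE_VOICE_KEYS.find((key) => key in properties) ?? null;
}
```

Returning null rather than throwing lets the caller fall back to a text-only input when a custom model publishes no usable schema.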
🧹 Nitpick comments (3)
src/lib/shared/tts-provider-catalog.ts (1)
127-128: Unbounded in-memory caches may cause memory growth over time.
replicateVoiceInputKeyCache and replicateOpenApiSchemaPromiseCache grow indefinitely as new models are queried. In a long-running server with many custom Replicate models, this could accumulate memory.
Consider using an LRU cache with a size limit, similar to ttsAudioCache in generate.ts.
♻️ Example using LRUCache
+import { LRUCache } from 'lru-cache';
+
-const replicateVoiceInputKeyCache = new Map<string, ReplicateVoiceInputKey>();
-const replicateOpenApiSchemaPromiseCache = new Map<string, Promise<unknown | null>>();
+const replicateVoiceInputKeyCache = new LRUCache<string, ReplicateVoiceInputKey>({ max: 100 });
+const replicateOpenApiSchemaPromiseCache = new LRUCache<string, Promise<unknown | null>>({ max: 100 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/shared/tts-provider-catalog.ts` around lines 127 - 128, The two unbounded maps replicateVoiceInputKeyCache and replicateOpenApiSchemaPromiseCache can grow without limit; replace them with a bounded LRU cache (or use the existing LRU approach used by ttsAudioCache in generate.ts) so entries are evicted when capacity is reached; specifically swap the Map instantiations for an LRUCache (or a small custom LRUMap) with a sensible max size, preserve the same key/value types (string -> ReplicateVoiceInputKey and string -> Promise<unknown|null>), and ensure all code that reads/writes these caches (references to replicateVoiceInputKeyCache and replicateOpenApiSchemaPromiseCache) continues to work with the LRU API.
src/app/api/audiobook/chapter/route.ts (1)
312-316: Type assertion on partial object merge.
The cast to AudiobookGenerationSettings assumes the merged object has all required fields. If existingSettings from storage is malformed or incomplete, runtime errors could occur when accessing properties like ttsProvider or ttsModel.
Consider adding validation or using a type guard before the cast.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/app/api/audiobook/chapter/route.ts` around lines 312 - 316, The mergedSettings creation dangerously casts a possibly-partial object to AudiobookGenerationSettings; instead validate and ensure required fields (e.g., ttsProvider, ttsModel, any other non-optional properties) exist on the merged object before calling normalizeNativeSpeedForSettings. Update the logic around mergedSettings/normalizedExistingSettings/incomingSettings to either 1) run a type guard that checks required keys and shapes and only then cast and call normalizeNativeSpeedForSettings, 2) or fill missing required fields with safe defaults and log/throw a controlled error if validation fails; ensure normalizeNativeSpeedForSettings is only invoked with a verified AudiobookGenerationSettings instance.
src/app/api/tts/route.ts (1)
62-77: Duplicate getUpstreamRetryAfterSeconds with subtle inconsistency.
This implementation ceils numeric Retry-After values (Line 70), while the identical function in generate.ts returns them unmodified. Consider extracting to a shared utility for consistency.
♻️ Extract shared utility
Move getUpstreamStatus and getUpstreamRetryAfterSeconds to a shared module (e.g., src/lib/server/tts/errors.ts) and import in both files:
// src/lib/server/tts/errors.ts
export function getUpstreamStatus(error: unknown): number | undefined { ... }
export function getUpstreamRetryAfterSeconds(error: unknown): number | undefined { ... }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/app/api/tts/route.ts` around lines 62 - 77, Duplicate logic: extract getUpstreamStatus and getUpstreamRetryAfterSeconds into a shared utility and make their behavior consistent with the implementation in generate.ts (remove the Math.ceil on numeric Retry-After values so numeric values are returned unchanged). Create a new module exporting getUpstreamStatus and getUpstreamRetryAfterSeconds, update both route.ts and generate.ts to import and use those exported functions, and ensure getUpstreamRetryAfterSeconds still parses header strings and date values, only returning undefined for non-positive or unparsable values but not altering valid numeric values.
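A shared module along these lines might look like the sketch below. The function names come from the review; the bodies are illustrative, and `UpstreamHttpError` plus the optional `nowMs` parameter are hypothetical additions (the error class shows one way a failed audio fetch could keep its status and headers readable by these helpers).

```typescript
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical error type that keeps the upstream status and headers.
class UpstreamHttpError extends Error {
  readonly status: number;
  readonly response: { status: number; headers: HeaderReader };

  constructor(message: string, status: number, headers: HeaderReader) {
    super(message);
    this.name = 'UpstreamHttpError';
    this.status = status;
    this.response = { status, headers };
  }
}

function getUpstreamStatus(error: unknown): number | undefined {
  if (typeof error !== 'object' || error === null) return undefined;
  const rec = error as Record<string, unknown>;
  if (typeof rec.status === 'number') return rec.status;
  const response = rec.response as { status?: unknown } | undefined;
  return typeof response?.status === 'number' ? response.status : undefined;
}

// Accepts delta-seconds or an HTTP-date. Numeric values are returned
// unchanged, as the review asks; rounding is left to the caller.
function getUpstreamRetryAfterSeconds(error: unknown, nowMs: number = Date.now()): number | undefined {
  if (typeof error !== 'object' || error === null) return undefined;
  const rec = error as Record<string, unknown>;
  const response = rec.response as { headers?: { get?: (name: string) => string | null } } | undefined;
  const header = response?.headers?.get?.('retry-after');
  if (!header) return undefined;

  const deltaSeconds = Number(header);
  if (Number.isFinite(deltaSeconds) && deltaSeconds > 0) return deltaSeconds;

  const retryAtMs = Date.parse(header);
  if (!Number.isFinite(retryAtMs)) return undefined;
  const remainingSeconds = (retryAtMs - nowMs) / 1000;
  return remainingSeconds > 0 ? remainingSeconds : undefined;
}
```

Passing the clock in as `nowMs` is a testing convenience in this sketch, not something the review requests.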
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 056db415-b6d9-4c14-88e7-415bb0dd68a7
📒 Files selected for processing (13)
- README.md
- docs-site/docs/configure/tts-provider-guides/replicate.md
- docs-site/docs/configure/tts-providers.md
- docs-site/docs/deploy/vercel-deployment.md
- docs-site/docs/introduction.md
- src/app/api/audiobook/chapter/route.ts
- src/app/api/tts/route.ts
- src/components/SettingsModal.tsx
- src/lib/server/tts/generate.ts
- src/lib/shared/kokoro.ts
- src/lib/shared/tts-provider-catalog.ts
- tests/unit/kokoro.spec.ts
- tests/unit/tts-provider-catalog.spec.ts
✅ Files skipped from review due to trivial changes (3)
- README.md
- docs-site/docs/introduction.md
- docs-site/docs/configure/tts-provider-guides/replicate.md
🚧 Files skipped from review as they are similar to previous changes (3)
- docs-site/docs/deploy/vercel-deployment.md
- docs-site/docs/configure/tts-providers.md
- src/components/SettingsModal.tsx
…ngs validation
Move getUpstreamStatus and getUpstreamRetryAfterSeconds to a shared utility module for consistent upstream error handling across TTS endpoints. Strengthen audiobook chapter API by introducing runtime validation for incoming settings payloads, ensuring type safety and error reporting for malformed requests. Replace in-memory Map caches with LRUMap for Replicate voice and schema lookups, improving memory management and eviction logic.
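The eviction behavior an LRUMap provides can be sketched as follows. This is an illustrative implementation built on Map insertion order, not the class added in the commit, whose code is not shown in this thread.

```typescript
// Minimal LRU map: a Map iterates in insertion order, so the first key is
// always the least recently used as long as reads re-insert their entry.
class LRUMap<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxSize: number;

  constructor(maxSize: number) {
    if (maxSize < 1) throw new RangeError('maxSize must be at least 1');
    this.maxSize = maxSize;
  }

  get size(): number {
    return this.entries.size;
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert to mark this key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): this {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.maxSize) {
      // Evict the least recently used entry (the first key in iteration order).
      const oldest = this.entries.keys().next();
      if (!oldest.done) this.entries.delete(oldest.value);
    }
    this.entries.set(key, value);
    return this;
  }
}
```

Because `has()` does not refresh recency here, only `get()` and `set()` count as use; a cache of in-flight schema promises would typically go through `get()` anyway.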