STT/TTS: Smart Defaults for Config Params in Native and Kaapi Abstracted Flow#678
Conversation
Note: Reviews paused
It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings.
📝 Walkthrough
Adds centralized LLM constants and mappings, introduces defaults for STT/TTS model/voice/input_language, makes KaapiCompletionConfig.provider optional with auto-assignment for stt/tts, adds provider-specific model/voice whitelisting and validation, and updates mappers and providers to use the new constants and defaults.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant Validator
    participant Constants
    participant Mapper
    participant Provider
    Client->>Validator: submit KaapiCompletionConfig + params
    Validator->>Constants: read DEFAULT_* and SUPPORTED_* maps
    Validator-->>Client: default provider="google" if type in (stt, tts) and provider missing
    Validator->>Validator: validate model/voice against SUPPORTED_* for (provider, type)
    Client->>Mapper: request param mapping (type, model, voice, languages)
    Mapper->>Constants: map BCP47 -> provider language/voice ids
    Mapper-->>Provider: mapped provider-specific request (audio format, voice_id, language_code)
    Provider->>Provider: call external API / convert PCM->WAV if needed
    Provider-->>Client: AudioOutput / Transcript + metadata
```
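The PCM->WAV wrap mentioned in the last provider step needs only the stdlib wave module. This is a minimal sketch, assuming Gemini's documented 24 kHz / 16-bit / mono raw PCM output (the real helper lives in audio_utils.py; the signature here is illustrative):

```python
import io
import wave


def pcm_to_wav(
    pcm: bytes,
    sample_rate: int = 24000,  # Gemini TTS output rate (assumption from the review)
    sample_width: int = 2,     # 16-bit samples
    channels: int = 1,         # mono
) -> bytes:
    """Wrap raw little-endian PCM in a WAV container without re-encoding."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)
    return buf.getvalue()
```

Because the defaults are baked in, a change in the provider's output format would silently produce malformed WAV, which is the risk one of the review comments calls out.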
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/models/llm/request.py (1)
227-250: ⚠️ Potential issue | 🔴 Critical
Gate the ElevenLabs provider literals until the runtime supports them.
These additions let both native and Kaapi configs validate with elevenlabs/elevenlabs-native, but backend/app/services/llm/mappers.py:267-309 still has no ElevenLabs mapping and backend/app/services/llm/providers/registry.py:12-27 has no registry entry. Requests will pass schema validation and then fail as unsupported providers at runtime.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/models/llm/request.py` around lines 227 - 250, The schema currently allows "elevenlabs-native" and "elevenlabs" which will pass validation but have no runtime support; update the provider Literals to remove those values so requests can't declare ElevenLabs until the mapper and registry are implemented: change the provider field Literal in the Request model (the provider: Literal["openai-native", "google-native", "sarvamai-native", "elevenlabs-native"]) to exclude "elevenlabs-native", and likewise update KaapiCompletionConfig.provider (Literal["openai", "google", "sarvamai", "elevenlabs"] | None) to exclude "elevenlabs" (leaving openai/google/sarvamai), or alternatively replace the ElevenLabs entries with a clearly documented TODO/flag guarding them until backend/services/llm/mappers.py and providers/registry.py have the ElevenLabs mapping and registry entry.
🧹 Nitpick comments (1)
backend/app/models/llm/request.py (1)
262-298: Add a return annotation to validate_params.
This touched validator still lacks an explicit return type. `-> "KaapiCompletionConfig"` keeps it aligned with the repo's Python typing rule.
As per coding guidelines: "Always add type hints to all function parameters and return values in Python code".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/models/llm/request.py` around lines 262 - 298, The validate_params model validator is missing an explicit return type; update its signature to include the forward-reference return annotation -> "KaapiCompletionConfig" (i.e., change def validate_params(self) to def validate_params(self) -> "KaapiCompletionConfig":) so the function that returns self is properly typed; keep the existing `@model_validator`(mode="after") decorator and all logic using param_models, SUPPORTED_MODELS, and SUPPORTED_VOICES unchanged.
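The requested change is just the annotation; a stripped-down, Pydantic-free sketch (class and method names taken from the review, the body is simplified):

```python
from typing import Any


class KaapiCompletionConfig:
    """Simplified stand-in for the real Pydantic model."""

    def __init__(self, type: str, params: dict[str, Any]) -> None:
        self.type = type
        self.params = params

    def validate_params(self) -> "KaapiCompletionConfig":
        # The forward-reference return annotation is the whole fix;
        # the validator still just validates and returns self.
        if self.type not in ("stt", "tts", "text"):
            raise ValueError(f"Unknown config type '{self.type}'")
        return self
```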
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/models/llm/request.py`:
- Around line 88-89: The validation is currently checking the raw input instead
of the normalized/filled value, so when TTSLLMParams defaults voice to
DEFAULT_TTS_VOICE (e.g., "Kore") providers like sarvamai/elevenlabs can bypass
validation and fail later; update the validation logic to read and validate the
normalized voice from the TTSLLMParams instance (use the filled/normalized
attribute on the params object rather than the original raw input), and apply
the same change in the second validation block referenced around the 288-295
region so both places validate the final voice value returned by TTSLLMParams.
- Around line 276-297: The normalization currently replaces self.params with
validated.model_dump(exclude_none=True), dropping any caller-provided
provider_specific data; restore provider_specific after validation by preserving
it from the original params if present. Locate the validation block around
model_class.model_validate(...) and the assignment self.params =
validated.model_dump(...), then merge the original provider_specific (if it
exists and is not None) into the dumped dict (e.g., set
dumped_params["provider_specific"] = original_params.get("provider_specific") if
present) before assigning back to self.params so downstream code in
llm/mappers.py and providers/gai.py can access it.
In `@backend/app/services/llm/mappers.py`:
- Around line 149-156: The current branch that handles unsupported Google TTS
locales returns an empty params dict and a warning which gets treated as
non-fatal downstream; instead, make unsupported locales fail fast by raising an
exception (e.g., ValueError or a domain-specific BadRequestError) when
google_supported_lang is None. Update the block that checks
BCP47_LOCALE_TO_GEMINI_LANG (the google_supported_lang/google_params logic) to
raise a clear error containing the unsupported language and the supported list
rather than returning ({}, [...]) so execution stops and the request is
rejected.
In `@backend/app/services/llm/providers/gai.py`:
- Around line 154-156: The logger.info call currently prints the full
merged_instruction and raw response fields (merged_instruction, output_language,
input_language) which leaks sensitive prompts and violates logging conventions;
update all such log statements in gai.py (including the call using
merged_instruction and the similar ones around lines 253-257, 261-263, 293) to
instead prefix messages with the function name in square brackets (e.g.,
"[<function_name>]"), only include non-sensitive identifiers, and mask any
sensitive strings via the existing mask_string helper (e.g.,
mask_string(merged_instruction)) so no raw prompt or full response objects are
logged. Ensure you replace direct interpolation of response objects with
concise, non-sensitive summaries (IDs/status) and follow the repo logging prefix
convention for all modified logger.info invocations.
---
Outside diff comments:
In `@backend/app/models/llm/request.py`:
- Around line 227-250: The schema currently allows "elevenlabs-native" and
"elevenlabs" which will pass validation but have no runtime support; update the
provider Literals to remove those values so requests can't declare ElevenLabs
until the mapper and registry are implemented: change the provider field Literal
in the Request model (the provider: Literal["openai-native", "google-native",
"sarvamai-native", "elevenlabs-native"]) to exclude "elevenlabs-native", and
likewise update KaapiCompletionConfig.provider (Literal["openai", "google",
"sarvamai", "elevenlabs"] | None) to exclude "elevenlabs" (leaving
openai/google/sarvamai), or alternatively replace the ElevenLabs entries with a
clearly documented TODO/flag guarding them until backend/services/llm/mappers.py
and providers/registry.py have the ElevenLabs mapping and registry entry.
---
Nitpick comments:
In `@backend/app/models/llm/request.py`:
- Around line 262-298: The validate_params model validator is missing an
explicit return type; update its signature to include the forward-reference
return annotation -> "KaapiCompletionConfig" (i.e., change def
validate_params(self) to def validate_params(self) -> "KaapiCompletionConfig":)
so the function that returns self is properly typed; keep the existing
`@model_validator`(mode="after") decorator and all logic using param_models,
SUPPORTED_MODELS, and SUPPORTED_VOICES unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7a3ee8db-0a32-4822-9ea4-e0398919bc73
⛔ Files ignored due to path filters (1)
backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
backend/app/models/llm/request.py
backend/app/services/llm/mappers.py
backend/app/services/llm/providers/gai.py
```python
validated = model_class.model_validate(self.params)

if provider is not None:
    key = (provider, self.type)

    allowed_models = SUPPORTED_MODELS.get(key)
    if allowed_models and validated.model not in allowed_models:
        raise ValueError(
            f"Model '{validated.model}' is not supported for provider='{provider}' type='{self.type}'. "
            f"Allowed: {allowed_models}"
        )

    if self.type == "tts":
        voice = self.params.get("voice")
        allowed_voices = SUPPORTED_VOICES.get(key)
        if allowed_voices and voice and voice not in allowed_voices:
            raise ValueError(
                f"Voice '{voice}' is not supported for provider='{provider}'. "
                f"Allowed: {allowed_voices}"
            )

self.params = validated.model_dump(exclude_none=True)
```
provider_specific is dropped by this normalization step.
Line 297 rewrites self.params from the validated *LLMParams model, but none of those models declare provider_specific. Kaapi callers therefore can't reach the new passthrough in backend/app/services/llm/mappers.py:166-169 or the director_notes lookup in backend/app/services/llm/providers/gai.py:269-272.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/models/llm/request.py` around lines 276 - 297, The normalization
currently replaces self.params with validated.model_dump(exclude_none=True),
dropping any caller-provided provider_specific data; restore provider_specific
after validation by preserving it from the original params if present. Locate
the validation block around model_class.model_validate(...) and the assignment
self.params = validated.model_dump(...), then merge the original
provider_specific (if it exists and is not None) into the dumped dict (e.g., set
dumped_params["provider_specific"] = original_params.get("provider_specific") if
present) before assigning back to self.params so downstream code in
llm/mappers.py and providers/gai.py can access it.
It's intentional because we don't support those params in v1.
@Prajna1999, understood — dropping provider_specific (and any other non-modelled params) during validation is a deliberate v1 scoping decision. I'll keep that in mind for future reviews.
✏️ Learnings added
Learnt from: Prajna1999
Repo: ProjectTech4DevAI/kaapi-backend PR: 678
File: backend/app/models/llm/request.py:275-296
Timestamp: 2026-03-17T11:26:38.657Z
Learning: In `backend/app/models/llm/request.py`, the normalization step `self.params = validated.model_dump(exclude_none=True)` intentionally drops any caller-provided fields not declared on `STTLLMParams`/`TTSLLMParams`/`TextLLMParams` (e.g., `provider_specific`). This is by design for v1 — those params are explicitly not supported yet. Do not flag this as a bug in reviews.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: nishika26
Repo: ProjectTech4DevAI/kaapi-backend PR: 502
File: backend/app/models/collection.py:29-32
Timestamp: 2025-12-17T10:16:16.173Z
Learning: In backend/app/models/collection.py, treat provider as the LLM provider name (e.g., 'openai') and llm_service_name as the specific service from that provider. These fields serve different purposes and should remain non-redundant. Document their meanings, add clear type hints (e.g., provider: str, llm_service_name: str), and consider a small unit test or validation to ensure they are distinct and used appropriately, preventing accidental aliasing or duplication across the model or serializers.
backend/app/services/llm/mappers.py
Outdated
```python
google_supported_lang = BCP47_LOCALE_TO_GEMINI_LANG.get(language)
if google_supported_lang is not None:
    google_params["language"] = language
else:
    supported = ", ".join(BCP47_LOCALE_TO_GEMINI_LANG.keys())
    return {}, [
        f"Unsupported language '{language}' for Google TTS. Supported: {supported}"
    ]
```
Unsupported Google TTS locales should fail the request, not downgrade to a warning.
Returning {} here does not stop execution. backend/app/services/llm/jobs.py:370-390 only appends warnings to metadata, and backend/app/services/llm/providers/gai.py:251-264 then falls back to default TTS params with no language constraint. The bad locale is silently ignored instead of rejected.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/services/llm/mappers.py` around lines 149 - 156, The current
branch that handles unsupported Google TTS locales returns an empty params dict
and a warning which gets treated as non-fatal downstream; instead, make
unsupported locales fail fast by raising an exception (e.g., ValueError or a
domain-specific BadRequestError) when google_supported_lang is None. Update the
block that checks BCP47_LOCALE_TO_GEMINI_LANG (the
google_supported_lang/google_params logic) to raise a clear error containing the
unsupported language and the supported list rather than returning ({}, [...]) so
execution stops and the request is rejected.
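The fail-fast behaviour the prompt describes can be sketched as follows (the two-entry language map is an illustrative subset, not the real BCP47_LOCALE_TO_GEMINI_LANG table, and the helper name is hypothetical):

```python
# Illustrative subset; the real table lives in the constants module.
BCP47_LOCALE_TO_GEMINI_LANG: dict[str, str] = {"en-IN": "en", "hi-IN": "hi"}


def map_google_tts_language(language: str) -> str:
    """Raise instead of returning a warning, so bad locales reject the request."""
    google_lang = BCP47_LOCALE_TO_GEMINI_LANG.get(language)
    if google_lang is None:
        supported = ", ".join(sorted(BCP47_LOCALE_TO_GEMINI_LANG))
        raise ValueError(
            f"Unsupported language '{language}' for Google TTS. Supported: {supported}"
        )
    return google_lang
```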
Can we move the constants into a separate folder, like constants/provider.py holding provider-level defaults and config.py for configuration-level defaults?
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/app/services/llm/providers/gai.py (1)
340-343: ⚠️ Potential issue | 🟡 Minor
Log message reports wrong byte count after conversion.
Line 341 logs len(raw_audio_bytes), but the message says "Audio conversion successful". After conversion, you should log the converted size, not the original PCM size, to accurately reflect the output.
🐛 Log correct byte count

```diff
 logger.info(
-    f"[GoogleAIProvider._execute_tts] Audio conversion successful: {actual_format.upper()} ({len(raw_audio_bytes)} bytes) | provider={provider}"
+    f"[GoogleAIProvider._execute_tts] Audio conversion successful: {actual_format.upper()} ({len(converted_bytes)} bytes from {len(raw_audio_bytes)} PCM bytes) | provider={provider}"
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/providers/gai.py` around lines 340 - 343, The log in GoogleAIProvider._execute_tts incorrectly reports len(raw_audio_bytes) after conversion; update the logger to report the size of the converted output buffer (the byte array produced by the conversion step, e.g., converted_audio_bytes or whatever variable holds the converted audio) instead of raw_audio_bytes, keeping the rest of the message and actual_format/response_mime_type unchanged so the log reflects the post-conversion byte count.
backend/app/models/llm/request.py (1)
300-304: ⚠️ Potential issue | 🔴 Critical
Discriminated union will fail at runtime due to incompatible field syntax.
KaapiCompletionConfig.provider uses Literal[...] | None syntax, which Pydantic's discriminated union validator does not accept. When Pydantic attempts to build the schema for CompletionConfig, it will raise PydanticUserError: Model 'KaapiCompletionConfig' needs field 'provider' to be of type 'Literal'.
Change:

```python
provider: Literal["openai", "google", "sarvamai", "elevenlabs"] | None = Field(None)
```

To:

```python
provider: Literal["openai", "google", "sarvamai", "elevenlabs", None] = Field(None)
```

With this fix, provider=None correctly resolves to KaapiCompletionConfig in the discriminated union, and literal strings resolve to the appropriate config type.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/models/llm/request.py` around lines 300 - 304, The discriminated union CompletionConfig fails because KaapiCompletionConfig.provider is declared with a unioned optional type (Literal[...] | None) which Pydantic rejects; change the KaapiCompletionConfig.provider annotation so the Literal includes None as one of its allowed literal values (i.e., make the Literal list include None) and keep the Field(default) as is so that provider=None maps to KaapiCompletionConfig and the other literal strings map to their respective configs.
♻️ Duplicate comments (5)
backend/app/services/llm/providers/gai.py (3)
253-264: ⚠️ Potential issue | 🟡 Minor
Multiple logging statements lack function prefix and leak details.
Lines 254, 258, 262, and 263 log model names and language choices without the required [GoogleAIProvider._execute_tts] prefix. While these aren't as sensitive as prompts, they should follow the coding convention.
🪵 Fix log prefixes

```diff
 model = generation_params.get("model", DEFAULT_TTS_MODEL)
-logger.info(f"Model name is {model}")
+logger.debug(f"[GoogleAIProvider._execute_tts] Using model={model}")
 voice = generation_params.get("voice", DEFAULT_TTS_VOICE)
 language = generation_params.get("language")
-logger.info(f"The language of choice is {language}")
 google_language = None
 if language is not None:
     google_language = BCP47_LOCALE_TO_GEMINI_LANG.get(language)
-    logger.info(f"The google language is {google_language}")
+    logger.debug(
+        f"[GoogleAIProvider._execute_tts] Mapped language={language} to google_language={google_language}"
+    )
 if not google_language:
     return None, f"Unsupported language '{language}' for Google TTS"
```

As per coding guidelines: "Prefix all log messages with the function name in square brackets".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/providers/gai.py` around lines 253 - 264, Logging in GoogleAIProvider._execute_tts prints model and language details without the required function prefix; update the logger.info calls that print model, language and google_language (the calls currently logging "Model name is {model}", "The language of choice is {language}", and "The google language is {google_language}") to include the prefix "[GoogleAIProvider._execute_tts]" at the start of each message so they follow the project's logging convention.
294: ⚠️ Potential issue | 🟠 Major
Logging full response object is a security/privacy concern.
Line 294 logs the entire response object, which may contain sensitive data, audio bytes, or internal API details. This violates the logging guidelines and creates noisy, potentially massive log entries.
🪵 Log only safe identifiers

```diff
-logger.info(f"-----Response TTS is {response}")
+logger.debug(
+    f"[GoogleAIProvider._execute_tts] Received TTS response | "
+    f"response_id={response.response_id}"
+)
```

As per coding guidelines: "Prefix all log messages with the function name in square brackets".
154-156: ⚠️ Potential issue | 🟠 Major
Stop logging full prompts and sensitive instructions.
This log statement dumps merged_instruction, which may contain sensitive user instructions or PII. It also lacks the required function name prefix per coding guidelines.
🪵 Safer logging

```diff
-logger.info(
-    f"The merged instructions is {merged_instruction} and output language is {output_language} and input language is {input_language}"
-)
+logger.debug(
+    f"[GoogleAIProvider._execute_stt] Prepared STT request | "
+    f"input_language={input_language}, output_language={output_language}"
+)
```

As per coding guidelines: "Prefix all log messages with the function name in square brackets".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/providers/gai.py` around lines 154 - 156, The logger.info call currently prints the full merged_instruction (and lacks the required function-name prefix); remove sensitive content from logs by not including merged_instruction (or any prompt/PII) and instead log a masked/omitted message plus non-sensitive metadata, and add the function name in square brackets as a prefix; e.g., update the logger.info call that references merged_instruction, output_language and input_language to something like: logger.info(f"[<FUNCTION_NAME>] Merged instructions omitted; output_language=%s input_language=%s", output_language, input_language), replacing <FUNCTION_NAME> with the actual function name where this logger call lives.
backend/app/models/llm/request.py (2)
296-297: ⚠️ Potential issue | 🟠 Major
provider_specific is dropped during params normalization.
Line 296 replaces self.params with validated.model_dump(exclude_none=True), but none of the *LLMParams models declare a provider_specific field. This means any caller-provided provider_specific data (used in gai.py:270-273 for director_notes) is silently discarded.
🐛 Proposed fix to preserve provider_specific

```diff
-self.params = validated.model_dump(exclude_none=True)
+dumped = validated.model_dump(exclude_none=True)
+# Preserve provider_specific if present in original params
+if "provider_specific" in self.params:
+    dumped["provider_specific"] = self.params["provider_specific"]
+self.params = dumped
 return self
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/models/llm/request.py` around lines 296 - 297, Normalization replaces self.params with validated.model_dump(exclude_none=True), which drops any caller-provided provider_specific data; preserve provider_specific by merging it back into the normalized params (e.g., after computing validated.model_dump(exclude_none=True) assign to params = ... and if self.params.get("provider_specific") is present, set params["provider_specific"] = self.params["provider_specific"] before assigning self.params = params) or alternatively ensure the LLM params model includes a provider_specific field so validated contains it; reference self.params, validated.model_dump(exclude_none=True), and provider_specific (used by gai.py:270-273 / director_notes) when applying the fix.
287-295: ⚠️ Potential issue | 🟠 Major
Voice validation still uses raw self.params.get("voice") instead of validated.voice.
When voice is omitted from the request, TTSLLMParams fills in DEFAULT_TTS_VOICE ("Kore"). However, the validation on line 288 reads from self.params.get("voice"), which may be None if the caller didn't provide it. This means:
- The voice and voice not in allowed_voices condition short-circuits when voice is None
- The Gemini default "Kore" could be serialized for providers like "sarvamai" or "elevenlabs" without validation
Use validated.voice to validate the actual voice that will be used.
🐛 Proposed fix

```diff
 if self.type == "tts":
-    voice = self.params.get("voice")
+    voice = validated.voice
     allowed_voices = SUPPORTED_VOICES.get(key)
     if allowed_voices and voice and voice not in allowed_voices:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/models/llm/request.py` around lines 287 - 295, The TTS voice validation is reading raw self.params.get("voice") instead of the resolved value from the validated params, so replace that use with the resolved value (validated.voice) inside the self.type == "tts" block (the code that checks allowed_voices = SUPPORTED_VOICES.get(key) and raises ValueError), i.e., validate validated.voice against allowed_voices and raise the same ValueError using provider/context if it is not allowed.
🧹 Nitpick comments (5)
backend/app/services/llm/providers/gai.py (4)
343: response_mime_type uses response_format, which may not match actual_format.
Line 339 sets response_format = "wav" for unsupported formats, but actual_format is the authoritative value after conversion. Line 343 constructs response_mime_type from response_format, which could be stale if the fallback occurred.
However, looking closely: line 339 sets response_format = "wav" and actual_format would also remain "wav" (set on line 308). So they should match. But using actual_format would be more consistent and defensive.
♻️ Use actual_format for mime_type

```diff
-response_mime_type = f"audio/{response_format}"
+response_mime_type = f"audio/{actual_format}"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/providers/gai.py` at line 343, The code builds response_mime_type from response_format which can be stale when a fallback conversion occurred; change it to use the authoritative actual_format instead (i.e., construct response_mime_type = f"audio/{actual_format}") so the MIME reflects the true converted audio format; update any references around response_mime_type to rely on actual_format (ensure actual_format is available where response_mime_type is set).
318-334: Format conversion error handling could leak empty bytes on partial failure.
When conversion fails (convert_error is set), the code returns an error. However, on lines 322-324 and 331-333, converted_bytes or b"" is used, which means if converted_bytes is None but convert_error is also None (edge case), an empty audio file would be returned silently.
The current logic appears safe because the if convert_error: check should catch failures, but the or b"" fallback is defensive in a way that could mask issues.
♻️ More explicit error handling

```diff
 if response_format == "mp3":
     converted_bytes, convert_error = convert_pcm_to_mp3(raw_audio_bytes)
     if convert_error:
         return None, f"Failed to convert audio to MP3: {convert_error}"
-    encoded_content = base64.b64encode(converted_bytes or b"").decode(
-        "ascii"
-    )
+    if not converted_bytes:
+        return None, "MP3 conversion returned empty bytes"
+    encoded_content = base64.b64encode(converted_bytes).decode("ascii")
     actual_format = "mp3"
 elif response_format == "ogg":
     converted_bytes, convert_error = convert_pcm_to_ogg(raw_audio_bytes)
     if convert_error:
         return None, f"Failed to convert audio to OGG: {convert_error}"
-    encoded_content = base64.b64encode(converted_bytes or b"").decode(
-        "ascii"
-    )
+    if not converted_bytes:
+        return None, "OGG conversion returned empty bytes"
+    encoded_content = base64.b64encode(converted_bytes).decode("ascii")
     actual_format = "ogg"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/providers/gai.py` around lines 318 - 334, The conversion branches using convert_pcm_to_mp3 and convert_pcm_to_ogg currently fall back to base64.b64encode(converted_bytes or b"") which can silently return an empty audio payload if converted_bytes is None while convert_error is also None; update the logic in the response_format handling (the blocks that call convert_pcm_to_mp3 and convert_pcm_to_ogg) to explicitly check that converted_bytes is not None and non-empty after conversion and, if it is None/empty, return an explicit error (e.g., "Conversion returned empty result") instead of encoding b""; keep the existing convert_error check but remove the defensive "or b''" usage and ensure actual_format is only set when converted_bytes is valid.
171-175: Hardcoded thinking_budget=1000 may be excessive or insufficient.
The ThinkingConfig with thinking_budget=1000 and include_thoughts=True is hardcoded. This could:
- Increase latency and token costs for simple transcriptions
- Be insufficient for complex translation tasks
Consider making this configurable via params or using a more conservative default. Also, the comment on line 169 mentions "switch back default thinking configs for reasoning supported models in future"; clarify when this should happen.
♻️ Make thinking config configurable

```diff
+# Extract thinking config from params, with sensible defaults for STT
+include_thoughts = generation_params.get("include_thoughts", False)
+thinking_budget = generation_params.get("thinking_budget", 500) if include_thoughts else None
+
 response: GenerateContentResponse = self.client.models.generate_content(
     model=model,
     contents=contents,
-    # switch back default thinking configs for reasoning supported models in future
     config=GenerateContentConfig(
-        thinking_config=ThinkingConfig(
-            include_thoughts=True, thinking_budget=1000
-        ),
+        thinking_config=ThinkingConfig(
+            include_thoughts=include_thoughts,
+            thinking_budget=thinking_budget,
+        ) if include_thoughts else None,
         temperature=temperature,
     ),
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/providers/gai.py` around lines 171 - 175, Replace the hardcoded ThinkingConfig(include_thoughts=True, thinking_budget=1000) used in the GAI provider with a configurable parameter and a safer default: expose a new argument (e.g., thinking_budget and include_thoughts) on the surrounding function or class initializer that constructs ThinkingConfig (refer to the ThinkingConfig usage in gai.py where thinking_config is passed along with temperature), set a conservative default (e.g., thinking_budget=100) and allow callers to override for heavy reasoning tasks, and update the inline comment about "switch back default thinking configs for reasoning supported models in future" to state under what condition or model version the config should revert (or remove the TODO if not planned). Ensure callers are updated to pass through the new params or rely on the new defaults.
307-310: PCM-to-WAV conversion assumes fixed audio parameters.
The pcm_to_wav function is called with default parameters (24kHz, 16-bit, mono) based on the assumption that Gemini TTS always outputs in this format. Per the context snippet from audio_utils.py:50-75, these defaults are documented as "Gemini TTS output specifications."
If Gemini ever changes its output format or if different models output different formats, the resulting WAV will be silently malformed. Consider adding a comment documenting this dependency or validating the assumption.
📝 Add documentation comment

```diff
 # Post-process audio format conversion if needed
 # Gemini TTS natively outputs 24kHz 16-bit raw PCM — wrap in WAV container
+# IMPORTANT: pcm_to_wav defaults (24kHz, 16-bit, mono) must match Gemini's output specs.
+# If Gemini changes output format, update audio_utils.pcm_to_wav defaults accordingly.
 actual_format = "wav"
 wav_bytes = pcm_to_wav(raw_audio_bytes)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/providers/gai.py` around lines 307 - 310, The PCM-to-WAV conversion currently assumes Gemini TTS outputs 24kHz, 16-bit, mono and calls pcm_to_wav(raw_audio_bytes) without explicit params; update the callsite around actual_format / wav_bytes to either (1) pass explicit audio parameters to pcm_to_wav (sample_rate=24000, sample_width=2, channels=1) so the dependency is explicit, or (2) add a clear inline comment referencing audio_utils.py and the Gemini TTS output spec that documents this assumption and that changing models may require updating these params; reference the symbols pcm_to_wav, raw_audio_bytes and actual_format when applying the change so the intent is obvious.
backend/app/models/llm/request.py (1)
12-40: Consider moving provider defaults and capability maps to dedicated modules for separation of concerns.

The SUPPORTED_MODELS and SUPPORTED_VOICES constants are well-structured, but per prior feedback, moving these to a dedicated `constants/provider.py` module (along with DEFAULT_STT_MODEL, DEFAULT_TTS_MODEL, DEFAULT_TTS_VOICE) would keep `request.py` focused on request/response models and improve maintainability.

Note: The OpenAI model names in SUPPORTED_MODELS (gpt-5.4, gpt-5.1, gpt-5-mini, gpt-5-nano) are current as of March 2026.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/models/llm/request.py` around lines 12 - 40, Extract DEFAULT_STT_MODEL, DEFAULT_TTS_MODEL, DEFAULT_TTS_VOICE, SUPPORTED_MODELS, and SUPPORTED_VOICES from backend/app/models/llm/request.py into a new module constants/provider.py, export them with the same names, and replace their original definitions with imports in request.py (referencing DEFAULT_STT_MODEL, DEFAULT_TTS_MODEL, DEFAULT_TTS_VOICE, SUPPORTED_MODELS, SUPPORTED_VOICES). Update any local references in request.py to use the imported constants and adjust any tests or modules that import these symbols to import from constants.provider instead of models.llm.request. Ensure no runtime behavior changes and preserve the exact values and keys in SUPPORTED_MODELS and SUPPORTED_VOICES.
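A minimal sketch of what the suggested `constants/provider.py` could look like. The module path is the reviewer's suggestion, not existing code, and the values shown are a small illustrative subset drawn from this review's context rather than the full maps.

```python
# constants/provider.py (hypothetical module suggested by the review)

# Defaults applied when STT/TTS params omit these fields.
DEFAULT_STT_MODEL = "gemini-2.5-pro"  # assumed value, for illustration
DEFAULT_TTS_MODEL = "gemini-2.5-pro"  # assumed value, for illustration
DEFAULT_TTS_VOICE = "Kore"

# (provider, completion_type) -> whitelist; subset shown for illustration.
SUPPORTED_VOICES = {
    ("google", "tts"): ["Kore", "Orus"],
}
```

`request.py` would then import these names (exact import path depends on the package layout), so existing callers keep working while the request module stays focused on request/response models.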
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@backend/app/models/llm/request.py`:
- Around line 300-304: The discriminated union CompletionConfig fails because
KaapiCompletionConfig.provider is declared with a unioned optional type
(Literal[...] | None) which Pydantic rejects; change the
KaapiCompletionConfig.provider annotation so the Literal includes None as one of
its allowed literal values (i.e., make the Literal list include None) and keep
the Field(default) as is so that provider=None maps to KaapiCompletionConfig and
the other literal strings map to their respective configs.
In `@backend/app/services/llm/providers/gai.py`:
- Around line 340-343: The log in GoogleAIProvider._execute_tts incorrectly
reports len(raw_audio_bytes) after conversion; update the logger to report the
size of the converted output buffer (the byte array produced by the conversion
step, e.g., converted_audio_bytes or whatever variable holds the converted
audio) instead of raw_audio_bytes, keeping the rest of the message and
actual_format/response_mime_type unchanged so the log reflects the
post-conversion byte count.
---
Duplicate comments:
In `@backend/app/models/llm/request.py`:
- Around line 296-297: Normalization replaces self.params with
validated.model_dump(exclude_none=True), which drops any caller-provided
provider_specific data; preserve provider_specific by merging it back into the
normalized params (e.g., after computing validated.model_dump(exclude_none=True)
assign to params = ... and if self.params.get("provider_specific") is present,
set params["provider_specific"] = self.params["provider_specific"] before
assigning self.params = params) or alternatively ensure the LLM params model
includes a provider_specific field so validated contains it; reference
self.params, validated.model_dump(exclude_none=True), and provider_specific
(used by gai.py:270-273 / director_notes) when applying the fix.
- Around line 287-295: The TTS voice validation is reading raw
self.params.get("voice") instead of the resolved value from the validated
params, so replace that use with the resolved value (validated.voice) inside the
self.type == "tts" block (the code that checks allowed_voices =
SUPPORTED_VOICES.get(key) and raises ValueError), i.e., validate validated.voice
against allowed_voices and raise the same ValueError using provider/context if
it is not allowed.
In `@backend/app/services/llm/providers/gai.py`:
- Around line 253-264: Logging in GoogleAIProvider._execute_tts prints model and
language details without the required function prefix; update the logger.info
calls that print model, language and google_language (the calls currently
logging "Model name is {model}", "The language of choice is {language}", and
"The google language is {google_language}") to include the prefix
"[GoogleAIProvider._execute_tts]" at the start of each message so they follow
the project's logging convention.
- Around line 154-156: The logger.info call currently prints the full
merged_instruction (and lacks the required function-name prefix); remove
sensitive content from logs by not including merged_instruction (or any
prompt/PII) and instead log a masked/omitted message plus non-sensitive
metadata, and add the function name in square brackets as a prefix; e.g., update
the logger.info call that references merged_instruction, output_language and
input_language to something like: logger.info(f"[<FUNCTION_NAME>] Merged
instructions omitted; output_language=%s input_language=%s", output_language,
input_language) — replacing <FUNCTION_NAME> with the actual function name where
this logger call lives.
---
Nitpick comments:
In `@backend/app/models/llm/request.py`:
- Around line 12-40: Extract DEFAULT_STT_MODEL, DEFAULT_TTS_MODEL,
DEFAULT_TTS_VOICE, SUPPORTED_MODELS, and SUPPORTED_VOICES from
backend/app/models/llm/request.py into a new module constants/provider.py,
export them with the same names, and replace their original definitions with
imports in request.py (referencing DEFAULT_STT_MODEL, DEFAULT_TTS_MODEL,
DEFAULT_TTS_VOICE, SUPPORTED_MODELS, SUPPORTED_VOICES). Update any local
references in request.py to use the imported constants and adjust any tests or
modules that import these symbols to import from constants.provider instead of
models.llm.request. Ensure no runtime behavior changes and preserve the exact
values and keys in SUPPORTED_MODELS and SUPPORTED_VOICES.
In `@backend/app/services/llm/providers/gai.py`:
- Line 343: The code builds response_mime_type from response_format which can be
stale when a fallback conversion occurred; change it to use the authoritative
actual_format instead (i.e., construct response_mime_type =
f"audio/{actual_format}") so the MIME reflects the true converted audio format;
update any references around response_mime_type to rely on actual_format (ensure
actual_format is available where response_mime_type is set).
- Around line 318-334: The conversion branches using convert_pcm_to_mp3 and
convert_pcm_to_ogg currently fall back to base64.b64encode(converted_bytes or
b"") which can silently return an empty audio payload if converted_bytes is None
while convert_error is also None; update the logic in the response_format
handling (the blocks that call convert_pcm_to_mp3 and convert_pcm_to_ogg) to
explicitly check that converted_bytes is not None and non-empty after conversion
and, if it is None/empty, return an explicit error (e.g., "Conversion returned
empty result") instead of encoding b""; keep the existing convert_error check
but remove the defensive "or b''" usage and ensure actual_format is only set
when converted_bytes is valid.
- Around line 171-175: Replace the hardcoded
ThinkingConfig(include_thoughts=True, thinking_budget=1000) used in the GAI
provider with a configurable parameter and a safer default: expose a new
argument (e.g., thinking_budget and include_thoughts) on the surrounding
function or class initializer that constructs ThinkingConfig (refer to the
ThinkingConfig usage in gai.py where thinking_config is passed along with
temperature), set a conservative default (e.g., thinking_budget=100) and allow
callers to override for heavy reasoning tasks, and update the inline comment
about "switch back default thinking configs for reasoning supported models in
future" to state under what condition or model version the config should revert
(or remove the TODO if not planned). Ensure callers are updated to pass through
the new params or rely on the new defaults.
- Around line 307-310: The PCM-to-WAV conversion currently assumes Gemini TTS
outputs 24kHz, 16-bit, mono and calls pcm_to_wav(raw_audio_bytes) without
explicit params; update the callsite around actual_format / wav_bytes to either
(1) pass explicit audio parameters to pcm_to_wav (sample_rate=24000,
sample_width=2, channels=1) so the dependency is explicit, or (2) add a clear
inline comment referencing audio_utils.py and the Gemini TTS output spec that
documents this assumption and that changing models may require updating these
params; reference the symbols pcm_to_wav, raw_audio_bytes and actual_format when
applying the change so the intent is obvious.
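Option (1) from the pcm_to_wav comment above — passing the audio parameters explicitly — can be sketched with the stdlib `wave` module. The `pcm_to_wav` signature mirrors the one described for `audio_utils.py`, but this body is illustrative, not the project's implementation.

```python
import io
import wave

def pcm_to_wav(
    pcm: bytes,
    sample_rate: int = 24000,  # Gemini TTS output spec per audio_utils.py
    sample_width: int = 2,     # 16-bit samples
    channels: int = 1,         # mono
) -> bytes:
    """Wrap raw PCM samples in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)
    return buf.getvalue()

# Explicit callsite: the dependency on Gemini's 24kHz/16-bit/mono output is
# visible where the conversion happens, instead of hidden in defaults.
raw_audio_bytes = b"\x00\x00" * 2400  # 0.1s of silence at 24kHz, illustrative
wav_bytes = pcm_to_wav(raw_audio_bytes, sample_rate=24000, sample_width=2, channels=1)
```

If a future Gemini model changes its output format, the mismatch is then caught at the callsite review rather than producing a silently malformed WAV.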
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ca6c2220-61dd-4c49-892e-be88ae967ed1
⛔ Files ignored due to path filters (1)
`backend/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (3)
- backend/app/models/llm/request.py
- backend/app/services/llm/mappers.py
- backend/app/services/llm/providers/gai.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/services/llm/mappers.py
```python
language = generation_params.get("language")
if not language:
    return None, "Missing 'language' in native params"
logger.info(f"The language of choice is {language}")
```
we are doing lots of logging which eventually lead to cost in cloudwatch, can we reduce it to only necessary logs
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/tests/services/llm/providers/test_eai.py (1)
309-326: ⚠️ Potential issue | 🔴 Critical

Test assertion conflicts with implementation change.

The test `test_tts_default_output_format` asserts that the default `output_format` is `"mp3_44100_128"` (Line 325) and expects a `mime_type` of `"audio/mpeg"` (Line 326). However, `eai.py` Line 179 changed the default to `"wav_24000"`. This test will fail. Update it to match the new default:
💚 Proposed fix

```diff
 def test_tts_default_output_format(self, provider, mock_client, query_params):
-    """Test TTS defaults to mp3_44100_128 when output_format is not specified."""
+    """Test TTS defaults to wav_24000 when output_format is not specified."""
     config = NativeCompletionConfig(
         provider="elevenlabs-native",
         type="tts",
         params={
             "model_id": "eleven_v3",
             "voice_id": "JBFqnCBsd6RMkjVDRZzb",
         },
     )
     mock_client.text_to_speech.convert.return_value = iter([b"audio"])
     result, error = provider.execute(config, query_params, "Test")
     assert error is None
     call_kwargs = mock_client.text_to_speech.convert.call_args.kwargs
-    assert call_kwargs["output_format"] == "mp3_44100_128"
-    assert result.response.output.content.mime_type == "audio/mpeg"
+    assert call_kwargs["output_format"] == "wav_24000"
+    assert result.response.output.content.mime_type == "audio/wav"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/services/llm/providers/test_eai.py` around lines 309 - 326, The test test_tts_default_output_format needs to be updated to match the implementation change in eai.py (default output_format changed to "wav_24000"); update the assertion that inspects mock_client.text_to_speech.convert.call_args.kwargs["output_format"] to expect "wav_24000" and change the expected result.response.output.content.mime_type from "audio/mpeg" to "audio/wav" so the test aligns with the new default in the provider's TTS logic.
♻️ Duplicate comments (2)
backend/app/services/llm/providers/gai.py (1)
154-156: ⚠️ Potential issue | 🟠 Major

Sensitive data in log message.

This log statement outputs the full merged instruction text, which may contain sensitive prompt content. The past review flagged this issue and it was marked as addressed, but the code still logs the full `merged_instruction`.
🪵 Proposed fix

```diff
-        logger.info(
-            f"The merged instructions is {merged_instruction} and output language is {output_language} and input language is {input_language}"
-        )
+        logger.info(
+            f"[GoogleAIProvider._execute_stt] Prepared STT request | "
+            f"input_language={input_language}, output_language={output_language}"
+        )
```

As per coding guidelines: "Prefix all log messages with the function name in square brackets: `logger.info(f\"[function_name] Message {mask_string(sensitive_value)}\")`".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/providers/gai.py` around lines 154 - 156, The logger.info call currently emits the full merged_instruction (sensitive) along with languages; update the logging to avoid exposing prompt content by replacing merged_instruction with a masked version using the existing mask_string helper (e.g., mask_string(merged_instruction)), and prefix the message with the function name in square brackets per guideline; specifically modify the logger.info that references merged_instruction, output_language, and input_language so it reads something like logger.info(f"[<function_name>] Message {mask_string(merged_instruction)} {output_language} {input_language}") ensuring you use the actual function name in place of <function_name>.backend/app/models/llm/request.py (1)
264-271: ⚠️ Potential issue | 🟠 Major

Validate the normalized TTS voice, not the raw input.

When `voice` is omitted, `TTSLLMParams` fills in `DEFAULT_TTS_VOICE` ("Kore"). However, Line 265 reads from `self.params.get("voice")`, which is the raw input before normalization. This means:

- User omits voice → `self.params.get("voice")` returns `None` → validation skipped
- But `validated.voice` contains "Kore" (Google's default)
- For `sarvamai` or `elevenlabs` providers, "Kore" is not a valid voice and will fail downstream

🐛 Proposed fix

```diff
         if self.type == "tts":
-            voice = self.params.get("voice")
+            voice = validated.voice
             allowed_voices = SUPPORTED_VOICES.get(key)
             if allowed_voices and voice and voice not in allowed_voices:
                 raise ValueError(
                     f"Voice '{voice}' is not supported for provider='{provider}'. "
                     f"Allowed: {allowed_voices}"
                 )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/models/llm/request.py` around lines 264 - 271, The validation currently checks the raw input via self.params.get("voice") which can be None while TTSLLMParams later normalizes to DEFAULT_TTS_VOICE; update the check inside the self.type == "tts" branch to use the normalized value (e.g., validated.voice or the normalized/filled voice from TTSLLMParams) when comparing against SUPPORTED_VOICES.get(key), and raise the same ValueError if that normalized voice is not allowed for the resolved provider (keep the provider and allowed_voices wording intact).
🧹 Nitpick comments (6)
backend/app/services/llm/providers/sai.py (2)
76-78: Remove unreachable code after `or` fallback.

Line 76 uses `params.get("model") or "saaras:v3"`, which guarantees `model` is always truthy. The subsequent `if not model:` check on Lines 77-78 can never be true.
♻️ Proposed fix

```diff
-        model = params.get("model") or "saaras:v3"
-        if not model:
-            return None, "Missing 'model' in native params for SarvamAI STT"
+        model = params.get("model") or "saaras:v3"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/providers/sai.py` around lines 76 - 78, The model variable is assigned with a truthy fallback using params.get("model") or "saaras:v3", so the subsequent guard if not model: in the SarvamAI STT setup is unreachable; remove that redundant check (the lines referencing model = params.get("model") or "saaras:v3" and the following if not model: return None, "Missing 'model'...") or instead change the assignment to params.get("model") without the fallback if you intend to validate absence—update the code around the model variable in the SarvamAI provider (the model assignment and the unreachable if block) accordingly.
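A quick demonstration of why the guard is dead code — with the `or` fallback the result is always truthy (note the fallback also swallows empty strings, which may or may not be intended):

```python
def resolve_model(params: dict) -> str:
    # The fallback guarantees a truthy return value, so a subsequent
    # `if not model:` check can never fire.
    return params.get("model") or "saaras:v3"
```

The same reasoning applies to the TTS path below with its `"bulbul:v3"` fallback.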
170-172: Remove unreachable code after `or` fallback.

Same issue as STT: Line 170 uses `or "bulbul:v3"`, which guarantees `model` is always truthy. The check on Lines 171-172 is unreachable.
♻️ Proposed fix

```diff
-        model = params.get("model") or "bulbul:v3"
-        if not model:
-            return None, "Missing 'model' in native params for SarvamAI TTS"
+        model = params.get("model") or "bulbul:v3"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/providers/sai.py` around lines 170 - 172, The check for a missing model is unreachable because model is assigned with params.get("model") or "bulbul:v3"; remove the redundant if-block that returns None and the error string. In the SarvamAI TTS provider where model is set (variable model using params.get("model") or "bulbul:v3"), keep the fallback assignment and delete the subsequent if not model: return None, ... branch so the default "bulbul:v3" is used.

backend/app/models/llm/request.py (1)
246-250: Document the implicit provider defaulting behavior.

When `provider` is `None` for `stt`/`tts` types, it silently defaults to `"google"`. This affects model validation since Google's supported models differ from other providers. Consider adding a docstring or inline comment explaining this defaulting behavior so API consumers understand why their config was accepted.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/models/llm/request.py` around lines 246 - 250, Document that when self.type is "stt" or "tts" and provider is None the code implicitly sets self.provider = "google" (see the provider/self.provider assignment block), since this default affects model validation and supported model lists; add a concise inline comment or update the method/class docstring near the provider assignment to explain the defaulting behavior and its impact on model validation so API consumers know why "google" is chosen when no provider is supplied.

backend/app/services/llm/jobs.py (1)
552-554: Unconditional logging of `result.error` creates noise on successful executions.

This log statement executes for every job, including successful ones where `result.error` is `None`. This produces log entries like `"Error if any during execution of job: None"`, which adds noise without value. Consider conditionally logging only when there's an actual error:
🪵 Proposed fix

```diff
-        logger.info(
-            f"[execute_job] Error if any during execution of job: {result.error}"
-        )
+        if result.error:
+            logger.warning(
+                f"[execute_job] Error during execution of job: {result.error}"
+            )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/jobs.py` around lines 552 - 554, The current logger.info call that prints f"[execute_job] Error if any during execution of job: {result.error}" runs even when result.error is None; update the code in the execute_job handling (the block using variables result and logger) to only log when result.error is truthy—e.g., wrap the logging in an if result.error: branch and call logger.error (or logger.exception if you have an exception object/stack) with a clear message including result.error and job identifiers; ensure no "None" messages are emitted for successful runs.

backend/app/tests/services/llm/test_mappers.py (2)
215-315: Use factory fixtures for repeated STT/TTS setup.

The new tests repeatedly construct similar `TTSLLMParams`/`STTLLMParams` objects; extracting factories/fixtures will reduce duplication and future drift.

As per coding guidelines for `backend/app/tests/**/*.py`: use the factory pattern for test fixtures in `backend/app/tests/`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/services/llm/test_mappers.py` around lines 215 - 315, The tests repeat constructing TTSLLMParams and STTLLMParams; introduce factory fixtures (e.g., tts_params_factory and stt_params_factory in the tests package or conftest) that return pre-populated TTSLLMParams/STTLLMParams with sensible defaults and allow overrides, then update test functions test_tts_empty_params_uses_smart_defaults, test_stt_empty_params_uses_smart_defaults, test_tts_language_missing_uses_auto_detect, test_tts_unsupported_language_warns_and_auto_detects, test_tts_supported_language_maps_correctly, and test_stt_with_input_and_output_language to call those factories (and .model_dump(exclude_none=True) where needed) instead of constructing params inline; ensure factories expose parameters for voice/language/response_format/input_language/output_language so each test can override only what it needs.
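One possible shape for the suggested factories — a plain callable that a pytest fixture in `conftest.py` could return. The defaults mirror the tests in this PR; the factory name and exact fields are illustrative, not existing code:

```python
def tts_params_factory(**overrides) -> dict:
    """Build TTS param dicts with sensible defaults; tests override per-case."""
    defaults: dict = {
        "model": "gemini-2.5-pro",
        "voice": "Kore",
        "response_format": "wav",
        "language": None,  # stripped below, mimicking model_dump(exclude_none=True)
    }
    defaults.update(overrides)
    # Drop None values so the dict matches exclude_none=True output.
    return {k: v for k, v in defaults.items() if v is not None}

# Each test overrides only what it needs:
params = tts_params_factory(voice="Orus", language="hi-IN")
```

Wrapped in a `@pytest.fixture` that returns the callable, each test requests the factory by name and the shared defaults live in one place.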
224-226: Assert default voice using the imported constant.

Line 225 hardcodes `"Kore"` while `DEFAULT_TTS_VOICE` is already imported; using the constant makes this test resilient to future default changes.
Proposed diff

```diff
-        assert result["voice"] == "Kore"  # DEFAULT_TTS_VOICE
+        assert result["voice"] == DEFAULT_TTS_VOICE
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/services/llm/test_mappers.py` around lines 224 - 226, The test currently asserts the default TTS voice using a hardcoded string ("Kore"); update the assertion to use the imported DEFAULT_TTS_VOICE constant instead so the test remains correct if the default changes—replace the assertion checking result["voice"] in test_mappers.py with an assertion that compares to DEFAULT_TTS_VOICE.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/services/llm/mappers.py`:
- Around line 315-316: The log message lacks the required function-name prefix;
update the logger.info call that logs sarvam_params (the line
"logger.info(f\"Sarvam params {sarvam_params}\")" before "return sarvam_params,
warnings") to prepend the current function name in square brackets (e.g.,
logger.info(f\"[FUNCTION_NAME] Sarvam params {sarvam_params}\")). Replace
FUNCTION_NAME with the actual enclosing function name and keep the rest of the
message and variables unchanged so the logged output and return (sarvam_params,
warnings) remain the same.
In `@backend/app/tests/services/llm/test_mappers.py`:
- Around line 215-315: The test methods
(test_tts_empty_params_uses_smart_defaults,
test_stt_empty_params_uses_smart_defaults,
test_tts_language_missing_uses_auto_detect,
test_tts_unsupported_language_warns_and_auto_detects,
test_tts_supported_language_maps_correctly,
test_stt_with_input_and_output_language) lack type hints on the self parameter
and return type; update each signature to include a typed self (e.g., self: Any)
and a return annotation -> None, and add "from typing import Any" at the top of
the test file if not already imported. Ensure you only modify the function
signatures (and imports) so the tests' behavior is unchanged.
---
Outside diff comments:
In `@backend/app/tests/services/llm/providers/test_eai.py`:
- Around line 309-326: The test test_tts_default_output_format needs to be
updated to match the implementation change in eai.py (default output_format
changed to "wav_24000"); update the assertion that inspects
mock_client.text_to_speech.convert.call_args.kwargs["output_format"] to expect
"wav_24000" and change the expected result.response.output.content.mime_type
from "audio/mpeg" to "audio/wav" so the test aligns with the new default in the
provider's TTS logic.
---
Duplicate comments:
In `@backend/app/models/llm/request.py`:
- Around line 264-271: The validation currently checks the raw input via
self.params.get("voice") which can be None while TTSLLMParams later normalizes
to DEFAULT_TTS_VOICE; update the check inside the self.type == "tts" branch to
use the normalized value (e.g., validated.voice or the normalized/filled voice
from TTSLLMParams) when comparing against SUPPORTED_VOICES.get(key), and raise
the same ValueError if that normalized voice is not allowed for the resolved
provider (keep the provider and allowed_voices wording intact).
In `@backend/app/services/llm/providers/gai.py`:
- Around line 154-156: The logger.info call currently emits the full
merged_instruction (sensitive) along with languages; update the logging to avoid
exposing prompt content by replacing merged_instruction with a masked version
using the existing mask_string helper (e.g., mask_string(merged_instruction)),
and prefix the message with the function name in square brackets per guideline;
specifically modify the logger.info that references merged_instruction,
output_language, and input_language so it reads something like
logger.info(f"[<function_name>] Message {mask_string(merged_instruction)}
{output_language} {input_language}") ensuring you use the actual function name
in place of <function_name>.
---
Nitpick comments:
In `@backend/app/models/llm/request.py`:
- Around line 246-250: Document that when self.type is "stt" or "tts" and
provider is None the code implicitly sets self.provider = "google" (see the
provider/self.provider assignment block), since this default affects model
validation and supported model lists; add a concise inline comment or update the
method/class docstring near the provider assignment to explain the defaulting
behavior and its impact on model validation so API consumers know why "google"
is chosen when no provider is supplied.
In `@backend/app/services/llm/jobs.py`:
- Around line 552-554: The current logger.info call that prints f"[execute_job]
Error if any during execution of job: {result.error}" runs even when
result.error is None; update the code in the execute_job handling (the block
using variables result and logger) to only log when result.error is truthy—e.g.,
wrap the logging in an if result.error: branch and call logger.error (or
logger.exception if you have an exception object/stack) with a clear message
including result.error and job identifiers; ensure no "None" messages are
emitted for successful runs.
In `@backend/app/services/llm/providers/sai.py`:
- Around line 76-78: The model variable is assigned with a truthy fallback using
params.get("model") or "saaras:v3", so the subsequent guard if not model: in the
SarvamAI STT setup is unreachable; remove that redundant check (the lines
referencing model = params.get("model") or "saaras:v3" and the following if not
model: return None, "Missing 'model'...") or instead change the assignment to
params.get("model") without the fallback if you intend to validate
absence—update the code around the model variable in the SarvamAI provider (the
model assignment and the unreachable if block) accordingly.
- Around line 170-172: The check for a missing model is unreachable because
model is assigned with params.get("model") or "bulbul:v3"; remove the redundant
if-block that returns None and the error string. In the SarvamAI TTS provider
where model is set (variable model using params.get("model") or "bulbul:v3"),
keep the fallback assignment and delete the subsequent if not model:return
None,... branch so the default "bulbul:v3" is used.
In `@backend/app/tests/services/llm/test_mappers.py`:
- Around line 215-315: The tests repeat constructing TTSLLMParams and
STTLLMParams; introduce factory fixtures (e.g., tts_params_factory and
stt_params_factory in the tests package or conftest) that return pre-populated
TTSLLMParams/STTLLMParams with sensible defaults and allow overrides, then
update test functions test_tts_empty_params_uses_smart_defaults,
test_stt_empty_params_uses_smart_defaults,
test_tts_language_missing_uses_auto_detect,
test_tts_unsupported_language_warns_and_auto_detects,
test_tts_supported_language_maps_correctly, and
test_stt_with_input_and_output_language to call those factories (and
.model_dump(exclude_none=True) where needed) instead of constructing params
inline; ensure factories expose parameters for
voice/language/response_format/input_language/output_language so each test can
override only what it needs.
- Around line 224-226: The test currently asserts the default TTS voice using a
hardcoded string ("Kore"); update the assertion to use the imported
DEFAULT_TTS_VOICE constant instead so the test remains correct if the default
changes—replace the assertion checking result["voice"] in test_mappers.py with
an assertion that compares to DEFAULT_TTS_VOICE.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2132727e-92c6-4dda-937a-281c5c9d318e
📒 Files selected for processing (11)
- backend/app/models/llm/constants.py
- backend/app/models/llm/request.py
- backend/app/services/llm/__init__.py
- backend/app/services/llm/jobs.py
- backend/app/services/llm/mappers.py
- backend/app/services/llm/providers/__init__.py
- backend/app/services/llm/providers/eai.py
- backend/app/services/llm/providers/gai.py
- backend/app/services/llm/providers/sai.py
- backend/app/tests/services/llm/providers/test_eai.py
- backend/app/tests/services/llm/test_mappers.py
```python
logger.info(f"Sarvam params {sarvam_params}")
return sarvam_params, warnings
```
🛠️ Refactor suggestion | 🟠 Major
Add function name prefix to log message.
The log message is missing the required function name prefix per coding guidelines.
🪵 Proposed fix
```diff
- logger.info(f"Sarvam params {sarvam_params}")
+ logger.info(f"[map_kaapi_to_sarvam_params] Mapped params: {sarvam_params}")
```

As per coding guidelines: "Prefix all log messages with the function name in square brackets."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/services/llm/mappers.py` around lines 315 - 316, The log message
lacks the required function-name prefix; update the logger.info call that logs
sarvam_params (the line "logger.info(f\"Sarvam params {sarvam_params}\")" before
"return sarvam_params, warnings") to prepend the current function name in square
brackets (e.g., logger.info(f\"[FUNCTION_NAME] Sarvam params
{sarvam_params}\")). Replace FUNCTION_NAME with the actual enclosing function
name and keep the rest of the message and variables unchanged so the logged
output and return (sarvam_params, warnings) remain the same.
```python
def test_tts_empty_params_uses_smart_defaults(self):
    """Test TTS with empty params uses smart defaults (voice, response_format)."""
    # Simulate empty params after exclude_none=True strips None values
    kaapi_params = {"model": "gemini-2.5-pro"}

    result, warnings = map_kaapi_to_google_params(
        kaapi_params, completion_type="tts"
    )

    assert result["model"] == "gemini-2.5-pro"
    assert result["voice"] == "Kore"  # DEFAULT_TTS_VOICE
    assert result["response_format"] == "wav"  # Default
    assert "language" not in result  # None = auto-detect, not set
    assert warnings == []

def test_stt_empty_params_uses_smart_defaults(self):
    """Test STT with empty params uses smart defaults (input_language=auto)."""
    kaapi_params = {"model": "gemini-2.5-pro"}

    result, warnings = map_kaapi_to_google_params(
        kaapi_params, completion_type="stt"
    )

    assert result["model"] == "gemini-2.5-pro"
    assert result["input_language"] == "auto"  # Default
    assert "output_language" not in result  # Optional, not set
    assert "instructions" not in result  # Optional, not set
    assert warnings == []

def test_tts_language_missing_uses_auto_detect(self):
    """Test TTS with missing language parameter (auto-detect behavior)."""
    kaapi_params = TTSLLMParams(
        model="gemini-2.5-pro", voice="Orus", response_format="mp3"
    )
    # language is None by default, gets stripped by exclude_none=True

    result, warnings = map_kaapi_to_google_params(
        kaapi_params.model_dump(exclude_none=True), completion_type="tts"
    )

    assert result["model"] == "gemini-2.5-pro"
    assert result["voice"] == "Orus"
    assert result["response_format"] == "mp3"
    assert "language" not in result  # Not set = auto-detect
    assert warnings == []

def test_tts_unsupported_language_warns_and_auto_detects(self):
    """Test TTS with unsupported language generates warning and uses auto-detect."""
    kaapi_params = TTSLLMParams(
        model="gemini-2.5-pro",
        voice="Kore",
        language="fr-FR",  # French not in BCP47_LOCALE_TO_GEMINI_LANG
    )

    result, warnings = map_kaapi_to_google_params(
        kaapi_params.model_dump(exclude_none=True), completion_type="tts"
    )

    assert result["model"] == "gemini-2.5-pro"
    assert result["voice"] == "Kore"
    assert result["response_format"] == "wav"  # Default
    assert "language" not in result  # Not set, falls back to auto-detect
    assert len(warnings) == 1
    assert "Unsupported language 'fr-FR'" in warnings[0]
    assert "auto-detect" in warnings[0]

def test_tts_supported_language_maps_correctly(self):
    """Test TTS with supported BCP-47 language maps to Gemini language code."""
    kaapi_params = TTSLLMParams(
        model="gemini-2.5-pro", voice="Kore", language="hi-IN"
    )

    result, warnings = map_kaapi_to_google_params(
        kaapi_params.model_dump(exclude_none=True), completion_type="tts"
    )

    assert result["model"] == "gemini-2.5-pro"
    assert result["voice"] == "Kore"
    assert (
        result["language"] == "hi"
    )  # BCP47_LOCALE_TO_GEMINI_LANG maps hi-IN -> hi
    assert result["response_format"] == "wav"  # Default
    assert warnings == []

def test_stt_with_input_and_output_language(self):
    """Test STT with both input and output language (translation scenario)."""
    kaapi_params = STTLLMParams(
        model="gemini-2.5-pro",
        input_language="hi-IN",
        output_language="en-IN",
    )

    result, warnings = map_kaapi_to_google_params(
        kaapi_params.model_dump(exclude_none=True), completion_type="stt"
    )

    assert result["model"] == "gemini-2.5-pro"
    assert result["input_language"] == "hi-IN"
    assert result["output_language"] == "en-IN"
    assert warnings == []
```
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
python - <<'PY'
import ast
from pathlib import Path
path = Path("backend/app/tests/services/llm/test_mappers.py")
tree = ast.parse(path.read_text())
for node in ast.walk(tree):
if isinstance(node, ast.FunctionDef) and 215 <= node.lineno <= 315:
missing_args = [a.arg for a in node.args.args if a.annotation is None]
missing_ret = node.returns is None
if missing_args or missing_ret:
print(f"Line {node.lineno}: {node.name} | missing args={missing_args} | missing return={missing_ret}")
PY
Repository: ProjectTech4DevAI/kaapi-backend
Length of output: 673
Add type hints to test methods to comply with coding guidelines.
The following test methods are missing type hints for the self parameter and return type (-> None):
- test_tts_empty_params_uses_smart_defaults (line 215)
- test_stt_empty_params_uses_smart_defaults (line 230)
- test_tts_language_missing_uses_auto_detect (line 244)
- test_tts_unsupported_language_warns_and_auto_detects (line 261)
- test_tts_supported_language_maps_correctly (line 281)
- test_stt_with_input_and_output_language (line 299)
Per coding guidelines, all Python files must include type hints on function parameters and return values.
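As a rough illustration of the requested style (class and method names mirror the tests above; the mapper call is stubbed out, and `self: Any` follows the agent prompt's suggestion rather than common practice):

```python
from typing import Any


class TestGoogleMapperDefaults:
    # Sketch of the annotated signature the guideline asks for: only the
    # `-> None` return annotation (and a typed `self`) is added; the test
    # body and its assertions stay unchanged.
    def test_tts_empty_params_uses_smart_defaults(self: Any) -> None:
        kaapi_params = {"model": "gemini-2.5-pro"}
        # ... call map_kaapi_to_google_params and assert as before ...
        assert kaapi_params["model"] == "gemini-2.5-pro"
```

Since pytest ignores annotations at collection time, this change is behavior-neutral.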
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/tests/services/llm/test_mappers.py` around lines 215 - 315, The
test methods (test_tts_empty_params_uses_smart_defaults,
test_stt_empty_params_uses_smart_defaults,
test_tts_language_missing_uses_auto_detect,
test_tts_unsupported_language_warns_and_auto_detects,
test_tts_supported_language_maps_correctly,
test_stt_with_input_and_output_language) lack type hints on the self parameter
and return type; update each signature to include a typed self (e.g., self: Any)
and a return annotation -> None, and add "from typing import Any" at the top of
the test file if not already imported. Ensure you only modify the function
signatures (and imports) so the tests' behavior is unchanged.
🧹 Nitpick comments (1)
backend/app/models/llm/request.py (1)
54-58: Line 56 bakes a Sarvam-only temperature floor into the shared STT schema.
STTLLMParams is provider-agnostic, and it is also validated directly in backend/app/services/evaluations/evaluation.py:145-165. Putting ge=0.01 on the shared field makes Sarvam's minimum apply before provider-aware validation ever runs. Keep the model-level bound generic and enforce >=0.01 only inside the provider == "sarvamai" branch.
♻️ Suggested change
```diff
 temperature: float | None = Field(
     default=None,
-    ge=0.01,  # sarvam minimum 0.01
+    ge=0.0,
     le=2.0,
     description="Temperature parameter (not supported by all STT providers)",
 )
```

```diff
 if provider is not None:
     key = (provider, self.type)
+
+    if provider == "sarvamai" and self.type == "stt":
+        temperature = validated.temperature
+        if temperature is not None and temperature < 0.01:
+            raise ValueError(
+                "Temperature must be >= 0.01 for provider='sarvamai' type='stt'."
+            )
     allowed_models = SUPPORTED_MODELS.get(key)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/models/llm/request.py` around lines 54 - 58, The shared STT schema has a Sarvam-specific lower bound on the temperature field: remove the ge=0.01 constraint from the temperature Field in STTLLMParams so the model-level schema remains provider-agnostic, and instead enforce temperature >= 0.01 only inside the Sarvam-specific validation branch (the provider == "sarvamai" logic in evaluation.py). Also update the other identical temperature declaration later in the file (the second temperature Field around the 256-290 block) to remove the provider-specific ge constraint so all provider-specific checks are centralized in the Sarvam branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@backend/app/models/llm/request.py`:
- Around line 54-58: The shared STT schema has a Sarvam-specific lower bound on
the temperature field: remove the ge=0.01 constraint from the temperature Field
in STTLLMParams so the model-level schema remains provider-agnostic, and instead
enforce temperature >= 0.01 only inside the Sarvam-specific validation branch
(the provider == "sarvamai" logic in evaluation.py). Also update the other
identical temperature declaration later in the file (the second temperature
Field around the 256-290 block) to remove the provider-specific ge constraint so
all provider-specific checks are centralized in the Sarvam branch.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: de2f3823-5cbf-49a9-9346-506c1ef0617a
📒 Files selected for processing (1)
backend/app/models/llm/request.py
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/tests/services/llm/providers/test_eai.py (1)
360-377: ⚠️ Potential issue | 🟡 Minor
Docstring does not match test expectation.
The docstring on line 363 states the default model is
eleven_turbo_v2, but the assertion on line 377 expects eleven_v3. Update the docstring to match the actual expected behavior.
📝 Proposed fix
```diff
 def test_tts_uses_default_model_when_missing(
     self, provider, mock_client, query_params
 ):
-    """Test TTS uses default model (eleven_turbo_v2) when model_id is not provided."""
+    """Test TTS uses default model (eleven_v3) when model_id is not provided."""
     config = NativeCompletionConfig(
```
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/services/llm/providers/test_eai.py` around lines 360 - 377, The test method test_tts_uses_default_model_when_missing has a docstring claiming the default model is "eleven_turbo_v2" but the test asserts the provider used "eleven_v3"; update the docstring to state the expected default model "eleven_v3" so it matches the assertion in test_tts_uses_default_model_when_missing and keep the rest of the test (mock setup and assertion on mock_client.text_to_speech.convert.call_args.kwargs["model_id"]) unchanged.
🧹 Nitpick comments (2)
backend/app/tests/services/llm/test_mappers.py (1)
26-30: Consider using imported constants in test assertions.
The constants DEFAULT_STT_MODEL, DEFAULT_TTS_MODEL, and DEFAULT_TTS_VOICE are imported, but the tests use hardcoded strings (e.g., "Kore" on line 225 instead of DEFAULT_TTS_VOICE). While explicit values make test expectations clear, consider using the constants for consistency if they are expected to remain stable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/services/llm/test_mappers.py` around lines 26 - 30, Imported constants DEFAULT_STT_MODEL, DEFAULT_TTS_MODEL, and DEFAULT_TTS_VOICE are not used in assertions; update the tests to use these symbols instead of hardcoded strings (e.g., replace occurrences like "Kore" and other literal model/voice names with DEFAULT_TTS_VOICE, DEFAULT_TTS_MODEL, DEFAULT_STT_MODEL) so expectations derive from the canonical constants; ensure the imports remain and adjust any expected-value comparisons in the test functions that map LLM config/mappers to reference these constants.
backend/app/tests/services/llm/test_jobs.py (1)
55-55: Use a fixture factory for repeated completion blobs.
The same inline completion payload shape is repeated across many tests (e.g., Line 55, Line 253, Line 1181, Line 1247). A small factory/helper would reduce duplication and future model-default churn.
As per coding guidelines
backend/app/tests/**/*.py: Use factory pattern for test fixtures in backend/app/tests/.
Also applies to: 253-253, 794-794, 844-844, 933-933, 982-982, 1021-1021, 1066-1066, 1100-1100, 1149-1149, 1181-1181, 1247-1247, 1379-1379, 1557-1557
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/services/llm/test_jobs.py` at line 55, Create a reusable fixture factory that returns the standardized completion response blob used throughout tests (the inline payload shape that includes params={"model": "gpt-4o"}), add it to the tests fixtures (e.g., conftest.py or a fixtures module) as something like completion_blob_factory, and update each test in backend/app/tests/services/llm/test_jobs.py (and the other mentioned tests) to call that factory instead of duplicating the inline payload; ensure the factory accepts overrides (e.g., model name) so callers can change params or fields without duplicating the whole blob.
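A plain factory along the lines this prompt suggests might look like the sketch below (shown without pytest so it stands alone; the payload shape is the one quoted in the review, and the override value in the usage example is hypothetical):

```python
def make_completion_blob(**overrides) -> dict:
    """Build the standard completion payload used across the job tests.

    Keyword overrides replace top-level fields, so callers can swap a
    field (e.g., params) without repeating the whole blob.
    """
    blob = {
        "provider": "openai-native",
        "type": "text",
        "params": {"model": "gpt-4o"},
    }
    blob.update(overrides)
    return blob


default_blob = make_completion_blob()
custom_blob = make_completion_blob(params={"model": "gpt-4o-mini"})
```

In the suite this would typically be exposed as a pytest fixture (e.g., a completion_blob_factory returning make_completion_blob) so a future model-default change touches one place.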
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/tests/api/routes/configs/test_version.py`:
- Line 637: Tests in test_version.py are inconsistent after migrating initial
configs to "gpt-4o": update any request payloads that still use "gpt-4-turbo" or
"gpt-4-version-{i}" so they match the migrated model string and the
SUPPORTED_MODELS whitelist (for provider key ("openai","text")). Locate the test
cases that build request payloads (the helpers or test functions around the
earlier diff showing "model": "gpt-4o" and the success-path assertions that
currently send "gpt-4-turbo" and "gpt-4-version-{i}") and change those model
fields to "gpt-4o" (or to a supported model value per SUPPORTED_MODELS) so
validation passes and expected 201 assertions remain valid.
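The whitelist check those payloads must now pass can be sketched as follows (the ("openai", "text") key comes from the comment above; the lookup behavior for unregistered keys is an assumption):

```python
# Assumed shape of the (provider, type) -> allowed-models mapping.
SUPPORTED_MODELS: dict[tuple[str, str], set[str]] = {
    ("openai", "text"): {"gpt-4o"},
}


def is_model_supported(provider: str, completion_type: str, model: str) -> bool:
    allowed = SUPPORTED_MODELS.get((provider, completion_type))
    # No whitelist registered for this (provider, type): accept any model.
    if allowed is None:
        return True
    return model in allowed
```

Under this check, payloads still sending "gpt-4-turbo" or "gpt-4-version-{i}" fail validation, which is why those tests must migrate to "gpt-4o".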
---
Outside diff comments:
In `@backend/app/tests/services/llm/providers/test_eai.py`:
- Around line 360-377: The test method test_tts_uses_default_model_when_missing
has a docstring claiming the default model is "eleven_turbo_v2" but the test
asserts the provider used "eleven_v3"; update the docstring to state the
expected default model "eleven_v3" so it matches the assertion in
test_tts_uses_default_model_when_missing and keep the rest of the test (mock
setup and assertion on
mock_client.text_to_speech.convert.call_args.kwargs["model_id"]) unchanged.
---
Nitpick comments:
In `@backend/app/tests/services/llm/test_jobs.py`:
- Line 55: Create a reusable fixture factory that returns the standardized
completion response blob used throughout tests (the inline payload shape that
includes params={"model": "gpt-4o"}), add it to the tests fixtures (e.g.,
conftest.py or a fixtures module) as something like completion_blob_factory, and
update each test in backend/app/tests/services/llm/test_jobs.py (and the other
mentioned tests) to call that factory instead of duplicating the inline payload;
ensure the factory accepts overrides (e.g., model name) so callers can change
params or fields without duplicating the whole blob.
In `@backend/app/tests/services/llm/test_mappers.py`:
- Around line 26-30: Imported constants DEFAULT_STT_MODEL, DEFAULT_TTS_MODEL,
and DEFAULT_TTS_VOICE are not used in assertions; update the tests to use these
symbols instead of hardcoded strings (e.g., replace occurrences like "Kore" and
other literal model/voice names with DEFAULT_TTS_VOICE, DEFAULT_TTS_MODEL,
DEFAULT_STT_MODEL) so expectations derive from the canonical constants; ensure
the imports remain and adjust any expected-value comparisons in the test
functions that map LLM config/mappers to reference these constants.
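Concretely, the swap from literals to constants might look like this (the constant's value is mirrored from the tests above; the mapper is a stand-in for map_kaapi_to_google_params' TTS branch):

```python
DEFAULT_TTS_VOICE = "Kore"  # value mirrored from the tests above


def map_tts_params(params: dict) -> dict:
    # Stand-in for the TTS mapping under test: fall back to the default voice.
    return {"voice": params.get("voice", DEFAULT_TTS_VOICE)}


result = map_tts_params({"model": "gemini-2.5-pro"})
# Asserting against the constant keeps the test in sync if the default moves.
assert result["voice"] == DEFAULT_TTS_VOICE
```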
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bd8fe88a-c806-43de-beda-088253a1a8dd
📒 Files selected for processing (8)
- backend/app/models/llm/constants.py
- backend/app/services/llm/providers/sai.py
- backend/app/tests/api/routes/configs/test_config.py
- backend/app/tests/api/routes/configs/test_version.py
- backend/app/tests/services/llm/providers/test_eai.py
- backend/app/tests/services/llm/providers/test_sai.py
- backend/app/tests/services/llm/test_jobs.py
- backend/app/tests/services/llm/test_mappers.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/models/llm/constants.py
```diff
 provider="openai-native",
 type="text",
-params={"model": "gpt-4"},
+params={"model": "gpt-4o"},
```
Add tests that actually exercise smart defaults for STT/TTS.
Line 55 and similar updates keep provider and params.model explicitly set, so this suite still doesn’t validate the new default-resolution path (including provider auto-assignment). Please add at least one Native and one Kaapi STT/TTS case with omitted defaultable fields and assert resolved values.
Also applies to: 253-253, 428-428, 564-564, 611-611
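A minimal shape for such a case, with the resolution logic stubbed in (provider auto-assignment to "google" for stt/tts comes from the PR walkthrough; the function and dict names are assumptions, not the real implementation):

```python
DEFAULT_PROVIDERS = {"stt": "google", "tts": "google"}


def resolve_completion_config(config: dict) -> dict:
    """Stand-in for Kaapi's config resolution: auto-assign provider for stt/tts."""
    resolved = dict(config)
    if resolved.get("provider") is None and resolved.get("type") in DEFAULT_PROVIDERS:
        resolved["provider"] = DEFAULT_PROVIDERS[resolved["type"]]
    return resolved


def test_tts_provider_auto_assigned() -> None:
    # Omit provider entirely and assert on the resolved value.
    resolved = resolve_completion_config({"type": "tts", "params": {}})
    assert resolved["provider"] == "google"


test_tts_provider_auto_assigned()
```

A real test would instead build the config through KaapiCompletionConfig with provider omitted and assert on the validated model.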
Codecov Report
❌ Patch coverage is
📢 Thoughts on this report? Let us know!
Summary
Target issue is #675
Explain the motivation for making this change. What existing problem does the pull request solve?
Checklist
Before submitting a pull request, please ensure that you mark these tasks.
Run fastapi run --reload app/main.py or docker compose up in the repository root and test.
Please add here if any other information is required for the reviewer.
Summary by CodeRabbit
New Features
Improvements
Tests