Guardrails: add rephrase support for input validators #783
Conversation

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/app/services/collections/helpers.py (1)
5-16: ⚠️ Potential issue | 🔴 Critical

Add postponed annotations to fix a runtime NameError for the CloudStorage type annotation.

The CloudStorage type is imported only under TYPE_CHECKING but used as a runtime annotation on line 70. Without postponed annotations, Python evaluates this annotation at function definition time, causing a NameError when the module is imported.

Proposed fix:

```diff
+from __future__ import annotations
+
 from typing import TYPE_CHECKING
 from uuid import UUID
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/collections/helpers.py` around lines 5-16, add postponed evaluation of annotations so the runtime NameError for CloudStorage (which is only imported under TYPE_CHECKING) is avoided: insert "from __future__ import annotations" at the very top of the module (before other imports) so any function or parameter annotated with CloudStorage (the symbol imported only under TYPE_CHECKING and referenced later in the file) is treated as a string and not evaluated at import time; this will fix the NameError without changing the CloudStorage import or other type hints.

backend/app/services/llm/jobs.py (1)
427-458: ⚠️ Potential issue | 🟠 Major

Rephrased safe_text will include prompt-template scaffolding when returned directly.

At lines 427-430 the prompt template is interpolated into query.input.content.value before guardrails run at line 432. apply_input_guardrails then calls run_guardrails_validation(query.input.content.value, …) on that already-templated string, so when rephrase_needed=True the safe_text returned to the caller is a rephrase of template + user_input, not of the user's input. Returning that directly to the end user as the final "LLM" response (lines 447-458) will expose template boilerplate/instructions in the user-facing answer.

Consider one of:
- Run input guardrails against the raw user input (before template interpolation), and only interpolate the (possibly rephrased) value into the template for the LLM path.
- In the rephrase short-circuit, return the rephrased raw input rather than the templated-then-rephrased string.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/jobs.py` around lines 427-458, the guardrails are being run on the already-interpolated string (template applied to query.input.content.value) which causes rephrased "safe_text" to include template scaffolding; preserve the raw user input (e.g., save original = query.input.content.value before the template replace), call apply_input_guardrails/run_guardrails_validation on that raw value, and if guardrail_direct_response or rephrase_needed triggers, return the rephrased raw input (not the templated string) as the short-circuit LLM response (update the block that builds llm_response/guardrail_direct_response to use the saved original/rephrased value); only perform template interpolation (template.replace...) when proceeding to the actual LLM call path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/services/collections/create_collection.py`:
- Line 27: The job total calculation currently treats missing file_size_kb as 0
and under-reports total_size_mb; update the logic to call the imported helper
calculate_total_size_kb for documents that lack file_size_kb (or when
file_size_kb is falsy) and sum that returned value into total_size_kb before
converting to total_size_mb; apply this change wherever file_size_kb is
defaulted to 0 (references: calculate_total_size_kb, total_size_mb,
file_size_kb) so both the initial total computation and the later aggregation
use the helper instead of 0.
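The suggested aggregation could be sketched roughly like this (estimate_size_kb is a hypothetical stand-in for the calculate_total_size_kb helper, and plain dicts stand in for the document objects):

```python
def estimate_size_kb(doc: dict) -> float:
    # Hypothetical stand-in for calculate_total_size_kb; the real helper
    # presumably derives a size from the stored file.
    return 512.0

def total_size_mb(documents: list[dict]) -> float:
    total_kb = 0.0
    for doc in documents:
        size_kb = doc.get("file_size_kb")
        # Fall back to the helper instead of 0 when file_size_kb is
        # missing or falsy, so the job total is not under-reported.
        total_kb += size_kb if size_kb else estimate_size_kb(doc)
    return total_kb / 1024

print(total_size_mb([{"file_size_kb": 1024.0}, {"file_size_kb": None}, {}]))
```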
In `@backend/app/services/collections/helpers.py`:
- Line 106: Replace the current fallback expression "doc_size_kb =
doc.file_size_kb or 15 * 1024" with an explicit None check and a smaller safe
fallback to avoid treating 0.0 as missing and to prevent oversized batches; for
example set a constant SAFE_FALLBACK_KB = 5 * 1024 and compute "doc_size_kb =
doc.file_size_kb if doc.file_size_kb is not None else SAFE_FALLBACK_KB"
(referencing doc.file_size_kb and doc_size_kb in helpers.py).
- Around line 79-81: The log in calculate_total_size_kb prints the raw filename
(doc.fname); replace that with a masked value using the existing mask_string
utility so sensitive filenames are not logged. Update the logger.info call in
calculate_total_size_kb to keep the same prefix "[calculate_total_size_kb]" and
interpolate mask_string(doc.fname) instead of doc.fname (refer to logger.info,
calculate_total_size_kb, doc.fname, and mask_string to locate and change the
code).
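The pitfall behind the line-106 suggestion, Python's `or` treating a legitimate 0.0 as missing, can be illustrated in isolation (SAFE_FALLBACK_KB follows the constant the review proposes; both functions are illustrative sketches):

```python
SAFE_FALLBACK_KB = 5 * 1024  # smaller fallback constant suggested in the review

def size_with_or(file_size_kb):
    # Buggy variant: `or` falls back whenever the value is falsy,
    # so a genuine 0.0 KB size is silently replaced.
    return file_size_kb or SAFE_FALLBACK_KB

def size_with_none_check(file_size_kb):
    # Suggested variant: only a missing (None) value triggers the fallback.
    return file_size_kb if file_size_kb is not None else SAFE_FALLBACK_KB

print(size_with_or(0.0))           # 5120 — wrong for a genuinely empty file
print(size_with_none_check(0.0))   # 0.0
print(size_with_none_check(None))  # 5120
```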
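The masked-logging change for lines 79-81 could look like this sketch (this mask_string is a hypothetical implementation; the project's real utility is not shown in the review):

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def mask_string(value: str, visible: int = 3) -> str:
    # Hypothetical masking utility: keep a short prefix, hide the rest.
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)

# Same "[calculate_total_size_kb]" prefix, but the filename is masked.
logger.info("[calculate_total_size_kb] processing %s",
            mask_string("quarterly-report.pdf"))
```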
In `@backend/app/services/llm/jobs.py`:
- Around line 441-458: The early-return branch that handles
guardrail_direct_response builds a synthetic LLMResponse and returns a
BlockResult but skips creating any call/audit record and skips telemetry,
execution observation, output validation, and metadata propagation; update the
guardrail_direct_response path to (1) call the existing create_llm_call (or
create a minimal audit record) and set llm_call_id so calls are auditable, (2)
invoke record_llm_call_started and record_llm_call_finished (and
observe_llm_execution) around the synthetic response to preserve telemetry
traces, (3) run apply_output_guardrails on the guardrail_direct_response output
before packaging it into LLMResponse/BlockResult, (4) include
metadata=request_metadata on the returned BlockResult, and (5) replace the
synthetic provider="guardrail" and model="guardrail" with meaningful identifiers
or a documented constant used elsewhere so downstream analytics see consistent
provider/model values (refer to symbols: guardrail_direct_response, Usage,
LLMCallResponse, LLMResponse, TextOutput, TextContent, BlockResult,
create_llm_call, record_llm_call_started, record_llm_call_finished,
observe_llm_execution, apply_output_guardrails, request_metadata).
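The shape of the suggested short-circuit path can be sketched with toy stand-ins. Every function below and the GUARDRAIL_PROVIDER value are hypothetical substitutes for the real symbols the prompt lists; an in-memory list stands in for the audit store:

```python
GUARDRAIL_PROVIDER = "guardrails-service"  # documented constant, not "guardrail"

calls: list[dict] = []  # in-memory stand-in for the audit store

def create_llm_call(provider: str, model: str) -> int:
    calls.append({"provider": provider, "model": model, "events": []})
    return len(calls) - 1  # llm_call_id

def record_llm_call_started(call_id: int) -> None:
    calls[call_id]["events"].append("started")

def record_llm_call_finished(call_id: int) -> None:
    calls[call_id]["events"].append("finished")

def apply_output_guardrails(text: str) -> str:
    return text  # stand-in for real output validation

def guardrail_short_circuit(direct_response: str, request_metadata: dict) -> dict:
    # (1) create an audit record so the synthetic call is traceable
    call_id = create_llm_call(GUARDRAIL_PROVIDER, GUARDRAIL_PROVIDER)
    # (2) keep telemetry consistent with the normal LLM path
    record_llm_call_started(call_id)
    # (3) validate the guardrail output before packaging it
    safe_output = apply_output_guardrails(direct_response)
    record_llm_call_finished(call_id)
    # (4) propagate request metadata on the returned result
    return {"llm_call_id": call_id, "output": safe_output,
            "metadata": request_metadata}

print(guardrail_short_circuit("I can't help with that request.", {"trace_id": "t-1"}))
```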
---
Outside diff comments:
In `@backend/app/services/collections/helpers.py`:
- Around line 5-16: Add postponed evaluation of annotations so the runtime
NameError for CloudStorage (which is only imported under TYPE_CHECKING) is
avoided: insert "from __future__ import annotations" at the very top of the
module (before other imports) so any function or parameter annotated with
CloudStorage (the symbol imported only under TYPE_CHECKING and referenced later
in the file) is treated as a string and not evaluated at import time; this will
fix the NameError without changing the CloudStorage import or other type hints.
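The `from __future__ import annotations` pattern this prompt describes can be sketched as follows (the CloudStorage stub and upload_document function are hypothetical stand-ins, not the project's real code):

```python
from __future__ import annotations  # all annotations become lazily evaluated strings

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Stand-in for the real CloudStorage import: a type checker sees this,
    # but at runtime TYPE_CHECKING is False and the name is never defined.
    class CloudStorage: ...

def upload_document(storage: CloudStorage, path: str) -> str:
    # Without the __future__ import, Python would evaluate the CloudStorage
    # annotation at function definition time and raise NameError on import.
    return f"uploaded {path}"

print(upload_document(None, "report.pdf"))
```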
In `@backend/app/services/llm/jobs.py`:
- Around line 427-458: The guardrails are being run on the already-interpolated
string (template applied to query.input.content.value) which causes rephrased
"safe_text" to include template scaffolding; preserve the raw user input (e.g.,
save original = query.input.content.value before the template replace), call
apply_input_guardrails/run_guardrails_validation on that raw value, and if
guardrail_direct_response or rephrase_needed triggers, return the rephrased raw
input (not the templated string) as the short-circuit LLM response (update the
block that builds llm_response/guardrail_direct_response to use the saved
original/rephrased value); only perform template interpolation
(template.replace...) when proceeding to the actual LLM call path.
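A minimal sketch of the ordering this prompt suggests, validating the raw input before any template interpolation (the template string and the rephrase behavior of run_guardrails_validation here are hypothetical stand-ins):

```python
TEMPLATE = "You are a helpful assistant. Answer this question:\n{user_input}"

def run_guardrails_validation(text: str) -> tuple[bool, str]:
    # Hypothetical validator: flags a banned word and returns a rephrase.
    if "secret" in text:
        return True, text.replace("secret", "[redacted]")
    return False, text

def build_llm_prompt(raw_user_input: str) -> tuple[bool, str]:
    # 1. Validate the RAW input, before template interpolation, so a
    #    rephrase can never contain prompt-template scaffolding.
    rephrase_needed, safe_text = run_guardrails_validation(raw_user_input)
    if rephrase_needed:
        # 2. Short-circuit: return the rephrased raw input directly.
        return True, safe_text
    # 3. Only on the real LLM path interpolate into the template.
    return False, TEMPLATE.format(user_input=safe_text)

print(build_llm_prompt("tell me the secret"))
print(build_llm_prompt("what is 2 + 2"))
```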
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c0d8a039-a979-4c6f-8ad3-5f0954490395
📒 Files selected for processing (4)
backend/app/api/docs/documents/upload.md
backend/app/services/collections/create_collection.py
backend/app/services/collections/helpers.py
backend/app/services/llm/jobs.py
ea9434f to c2b39fc (Compare)
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Summary
Target issue is #784
Notes
Summary by CodeRabbit
Release Notes
Improvements
Tests