Skip to content

Fix : Fix revisions commit endpoints silently discard unknown fields …#4415

Closed
justin212407 wants to merge 2 commits into
Agenta-AI:mainfrom
justin212407:revisions-commit-endpoint
Closed

Fix : Fix revisions commit endpoints silently discard unknown fields …#4415
justin212407 wants to merge 2 commits into
Agenta-AI:mainfrom
justin212407:revisions-commit-endpoint

Conversation

@justin212407
Copy link
Copy Markdown

Summary

What changed: Added model_config = ConfigDict(extra="forbid") to all *RevisionData classes across all five affected domains:

  • api/oss/src/core/applications/dtos.py
  • api/oss/src/core/evaluators/dtos.py
  • api/oss/src/core/queries/dtos.py
  • api/oss/src/core/testsets/dtos.py
  • sdks/python/agenta/sdk/models/workflows.py
  • sdks/python/agenta/sdk/models/testsets.py

This causes Pydantic to reject unknown fields in the data payload during revision commit, returning HTTP 422 with a validation error naming the offending field.

Why: Fixes issue #4315 : revision commit endpoints silently discarded unknown top-level fields inside data (e.g., the legacy ag_config wrapper), while still returning HTTP 200 with count: 1, creating a false sense of success. Users had no signal that their data was lost.

What problem it solves:

  • Prevents silent data loss: callers now get immediate feedback (422) if they POST with a stale envelope
  • Extends the fix to all five affected domains uniformly, as explicitly requested in the issue and by reviewer feedback (jp-agenta: "I'd extend the fix to all RevisionData, that is to all git-backed entities")
  • Enables graceful client migration: SDKs and tools can detect the 422 and update their payloads

Testing

Verified locally with curl against the running server.

Stale payload - data: {"ag_config": {...}} now returns HTTP 422:

{
  "detail": [
    {
      "type": "extra_forbidden",
      "loc": ["body", "application_revision_commit", "data", "ag_config"],
      "msg": "Extra inputs are not permitted",
      "input": { "prompt": { "messages": [{"role": "system", "content": "sys"}] } }
    }
  ]
}

Valid payload - data: {"parameters": {...}} continues to return HTTP 200 with count: 1. No extra_forbidden error.


Demo

Before - stale ag_config payload silently accepted, returned count: 1, stored data: {}

image

After - stale ag_config payload rejected with HTTP 422 and clear error naming the offending field

image Screenshot From 2026-05-25 19-40-27

Checklist

  • No UI changes (N/A for demo)
  • Relevant tests pass locally (ready to verify)
  • Relevant linting and formatting pass locally (ready to verify)
  • I have signed the CLA

Related Issues


Files Changed

  • api/oss/src/core/applications/dtos.py- Added model_config = ConfigDict(extra="forbid") to *RevisionData classes
  • api/oss/src/core/evaluators/dtos.py- Added model_config = ConfigDict(extra="forbid") to *RevisionData classes
  • api/oss/src/core/queries/dtos.py - Added model_config = ConfigDict(extra="forbid") to *RevisionData classes
  • api/oss/src/core/testsets/dtos.py - Added model_config = ConfigDict(extra="forbid") to *RevisionData classes
  • sdks/python/agenta/sdk/models/workflows.py - Added model_config = ConfigDict(extra="forbid") to WorkflowRevisionData
  • sdks/python/agenta/sdk/models/testsets.py - Added model_config = ConfigDict(extra="forbid") to *RevisionData classes

…in data

Signed-off-by: justin212407 <charlesjustin2124@gmail.com>
Copilot AI review requested due to automatic review settings May 25, 2026 14:23
@dosubot dosubot Bot added the size:S This PR changes 10-29 lines, ignoring generated files. label May 25, 2026
@vercel
Copy link
Copy Markdown

vercel Bot commented May 25, 2026

@justin212407 is attempting to deploy a commit to the agenta projects Team on Vercel.

A member of the Team first needs to authorize it.

@dosubot dosubot Bot added the bug Something isn't working label May 25, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 25, 2026

Review Change Stack

📝 Walkthrough

Summary by CodeRabbit

  • Refactor
    • Enhanced validation across data models to enforce stricter input validation and reject unexpected or invalid fields.
    • API endpoints now provide improved data integrity checks for applications, evaluators, queries, testsets, and workflows.

Walkthrough

This PR updates revision data models across the backend API and Python SDK to forbid unknown fields via Pydantic's ConfigDict(extra="forbid") configuration. Six models are affected: ApplicationRevisionData, EvaluatorRevisionData, QueryRevisionData, and TestsetRevisionData, each now rejecting stale or malformed payloads instead of silently discarding them.

Changes

Strict validation for revision data models

Layer / File(s) Summary
Forbid extra fields in backend API revision data models
api/oss/src/core/applications/dtos.py, api/oss/src/core/evaluators/dtos.py, api/oss/src/core/queries/dtos.py, api/oss/src/core/testsets/dtos.py
Pydantic imports updated to include ConfigDict. ApplicationRevisionData, EvaluatorRevisionData, QueryRevisionData, and TestsetRevisionData each add model_config = ConfigDict(extra="forbid") to reject unknown fields during validation.
Forbid extra fields in Python SDK revision data models
sdks/python/agenta/sdk/models/testsets.py, sdks/python/agenta/sdk/models/workflows.py
Pydantic imports updated to include ConfigDict. TestsetRevisionData, WorkflowRevisionData, EvaluatorRevisionData, and ApplicationRevisionData each add model_config = ConfigDict(extra="forbid") to enforce strict field validation.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title directly references the core issue being fixed: revision commit endpoints silently discarding unknown fields. It matches the primary change throughout the changeset.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, explaining what changed, why, what it solves, testing verification, and related issues.
Linked Issues check ✅ Passed The PR fully addresses the coding requirements from issue #4315: adds ConfigDict(extra='forbid') to all *RevisionData classes across all five affected domains, preventing silent field discard and returning HTTP 422 on unknown fields.
Out of Scope Changes check ✅ Passed All changes are narrowly scoped to adding model_config with extra='forbid' to *RevisionData classes across six specified files, directly addressing the linked issue with no extraneous modifications.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens revision commit validation across git-backed entities to prevent silent data loss when clients send stale/unknown top-level fields inside the data payload (e.g., legacy ag_config). It does so by configuring the relevant Pydantic *RevisionData models to forbid extra fields, causing FastAPI to return HTTP 422 with a precise validation error instead of accepting-and-dropping data.

Changes:

  • Set model_config = ConfigDict(extra="forbid") on workflow/application/evaluator revision data models in the Python SDK (which the OSS API reuses for workflows).
  • Set model_config = ConfigDict(extra="forbid") on QueryRevisionData and TestsetRevisionData DTOs in the OSS API.
  • Update Pydantic imports to include ConfigDict where needed.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
sdks/python/agenta/sdk/models/workflows.py Forbid unknown keys in workflow/application/evaluator revision data payloads via ConfigDict(extra="forbid").
sdks/python/agenta/sdk/models/testsets.py Forbid unknown keys in SDK TestsetRevisionData.
api/oss/src/core/testsets/dtos.py Forbid unknown keys in API TestsetRevisionData to avoid silent drops on commits.
api/oss/src/core/queries/dtos.py Forbid unknown keys in API QueryRevisionData to avoid silent drops on commits.
api/oss/src/core/evaluators/dtos.py Forbid unknown keys in evaluator revision data payloads.
api/oss/src/core/applications/dtos.py Forbid unknown keys in application revision data payloads.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 121 to 123
class WorkflowRevisionData(BaseModel):
model_config = ConfigDict(extra="forbid")
uri: Optional[str] = None
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The manual curl verification confirms the 422 behavior works correctly for /applications/revisions/commit. Acceptance tests for all five domains would be a good follow-up. Happy to add those here in this pr or a new one as preffered.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 353f9ae9-b40f-4677-9aa8-c06dbfd5f49c

📥 Commits

Reviewing files that changed from the base of the PR and between 3c9b1db and ec69a9f.

📒 Files selected for processing (6)
  • api/oss/src/core/applications/dtos.py
  • api/oss/src/core/evaluators/dtos.py
  • api/oss/src/core/queries/dtos.py
  • api/oss/src/core/testsets/dtos.py
  • sdks/python/agenta/sdk/models/testsets.py
  • sdks/python/agenta/sdk/models/workflows.py



class QueryRevisionData(BaseModel):
model_config = ConfigDict(extra="forbid")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether query payload sanitization swallows validation failures and removes data.
rg -n -C4 '_sanitize_persisted_query_revision_data|model_validate\(|except Exception|payload\.pop\("data"' api/oss/src/core/queries/service.py

Repository: Agenta-AI/agenta

Length of output: 1062


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Inspect QueryRevisionData definition =="
ls -la api/oss/src/core/queries/dtos.py
wc -l api/oss/src/core/queries/dtos.py
rg -n "class QueryRevisionData|model_config" api/oss/src/core/queries/dtos.py
# show around the forbid-extra config
sed -n '120,180p' api/oss/src/core/queries/dtos.py

echo
echo "== Inspect _sanitize_persisted_query_revision_data usage & surrounding flow =="
wc -l api/oss/src/core/queries/service.py
rg -n "_sanitize_persisted_query_revision_data" api/oss/src/core/queries/service.py
# show the whole sanitizer + immediate callsite area
sed -n '1,220p' api/oss/src/core/queries/service.py

echo
echo "== Check router layer behavior when 'data' is missing =="
# locate router methods handling persisted query revisions
rg -n "persist|revision|payload\\[\"data\"\\]|payload\\.pop\\(\"data\"|sanitize" api/oss/src/apis/fastapi/ api/oss/src/core/queries -S || true
# try to show relevant router/service integration
rg -n "core\\.queries|queries\\.service|QueryRevision" api/oss/src/apis/fastapi -S || true

Repository: Agenta-AI/agenta

Length of output: 50373


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Locate queries router and commit endpoint =="
ls -la api/oss/src/apis/fastapi/queries || true
rg -n "commit_query_revision|/queries/.*/revisions/commit|QueryRevisionCommit" api/oss/src/apis/fastapi/queries -S || true
rg -n "commit.*revision|revisions/commit|QueryRevisionCommit" api/oss/src/apis/fastapi/queries -S || true

echo
echo "== Inspect queries router handler body around commit =="
# Try best-effort: grab the first matching handler and print a small window
match_file="$(rg -l "commit_query_revision" api/oss/src/apis/fastapi/queries -S || true)"
if [ -n "${match_file}" ]; then
  echo "Router file: ${match_file}"
  rg -n "commit_query_revision" "${match_file}" -S | head -n 20
  line="$(rg -n "commit_query_revision" "${match_file}" -S | head -n 1 | cut -d: -f1)"
  start=$((line-40)); end=$((line+80))
  sed -n "${start},${end}p" "${match_file}" | nl -ba | head -n 200
else
  echo "No commit_query_revision call found in queries router."
fi

echo
echo "== Inspect core QueryRevisionCommit DTO and commit service flow =="
rg -n "class QueryRevisionCommit|QueryRevisionCommit\\b" api/oss/src/core/queries/dtos.py api/oss/src/core/queries/service.py -S
# Show commit_query_revision method implementation region
rg -n "async def commit_query_revision" api/oss/src/core/queries/service.py -S
l="$(rg -n "async def commit_query_revision" api/oss/src/core/queries/service.py -S | head -n 1 | cut -d: -f1)"
start=$((l-10)); end=$((l+80))
sed -n "${start},${end}p" api/oss/src/core/queries/service.py | nl -ba | head -n 200

echo
echo "== Inspect FastAPI request models for queries revision commit (extras behavior) =="
# Look for the request model used in router for commit endpoint
rg -n "RevisionCommit|QueryRevisionCommit|revisions/commit" api/oss/src/apis/fastapi/queries -S || true
# If there is a models.py, inspect it briefly
if [ -f api/oss/src/apis/fastapi/queries/models.py ]; then
  echo "Found api/oss/src/apis/fastapi/queries/models.py"
  wc -l api/oss/src/apis/fastapi/queries/models.py
  rg -n "RevisionCommit|QueryRevisionCommit|extra" api/oss/src/apis/fastapi/queries/models.py -S || true
  rg -n "class .*Commit|def .*commit|revision_commit" api/oss/src/apis/fastapi/queries/models.py -S || true
else
  echo "No api/oss/src/apis/fastapi/queries/models.py"
  ls -la api/oss/src/apis/fastapi/queries || true
fi

Repository: Agenta-AI/agenta

Length of output: 3379


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Inspect QueryRevisionCommit DTO definition (type of .data) =="
rg -n "class QueryRevisionCommit\\b" api/oss/src/core/queries/dtos.py
l="$(rg -n "class QueryRevisionCommit\\b" api/oss/src/core/queries/dtos.py | head -n 1 | cut -d: -f1)"
start=$((l-30)); end=$((l+120))
sed -n "${start},${end}p" api/oss/src/core/queries/dtos.py

echo
echo "== Inspect FastAPI request model for commit =="
rg -n "class QueryRevisionCommitRequest" api/oss/src/apis/fastapi/queries/models.py
l="$(rg -n "class QueryRevisionCommitRequest" api/oss/src/apis/fastapi/queries/models.py | head -n 1 | cut -d: -f1)"
start=$((l-20)); end=$((l+60))
sed -n "${start},${end}p" api/oss/src/apis/fastapi/queries/models.py

echo
echo "== Inspect router handler for commit =="
l="$(rg -n "async def commit_query_revision\\b" api/oss/src/apis/fastapi/queries/router.py | head -n 1 | cut -d: -f1)"
start=$((l-30)); end=$((l+90))
sed -n "${start},${end}p" api/oss/src/apis/fastapi/queries/router.py

echo
echo "== Inspect QueriesService.commit_query_revision around sanitize =="
l="$(rg -n "async def commit_query_revision\\b" api/oss/src/core/queries/service.py | head -n 1 | cut -d: -f1)"
start=$((l-40)); end=$((l+80))
sed -n "${start},${end}p" api/oss/src/core/queries/service.py

Repository: Agenta-AI/agenta

Length of output: 13704


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Find call sites of commit_query_revision (core service) =="
rg -n "commit_query_revision\\(" api/oss/src/core/queries -S
rg -n "commit_query_revision\\(" api/oss/src -S | head -n 200

echo
echo "== Inspect how QueryRevisionCommit is constructed in those call sites =="
# Show small windows around each occurrence in core/queries/service.py
rg -n "commit_query_revision\\(" api/oss/src/core/queries/service.py -n -S
for l in $(rg -n "commit_query_revision\\(" api/oss/src/core/queries/service.py -S | cut -d: -f1 | head -n 20); do
  start=$((l-20)); end=$((l+40));
  echo "--- service.py around line ${l} ---";
  sed -n "${start},${end}p" api/oss/src/core/queries/service.py | nl -ba | head -n 120
done

echo
echo "== Check router model validation config for QueryRevisionCommitRequest =="
# QueryRevisionCommitRequest just wraps QueryRevisionCommit; check if any custom config is present
l="$(rg -n "class QueryRevisionCommitRequest\\b" api/oss/src/apis/fastapi/queries/models.py | head -n 1 | cut -d: -f1)"
start=$((l-30)); end=$((l+40))
sed -n "${start},${end}p" api/oss/src/apis/fastapi/queries/models.py

Repository: Agenta-AI/agenta

Length of output: 1240


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Find QueryRevisionCommit constructions =="
rg -n "QueryRevisionCommit\\(" api/oss/src/core/queries/service.py api/oss/src/core/queries -S
rg -n "QueryRevisionCommit\\(" api/oss/src -S | head -n 200

echo
echo "== Inspect service.py around internal commit callers (~1052, ~1278) =="
sed -n '980,1120p' api/oss/src/core/queries/service.py | cat -n
echo
sed -n '1200,1360p' api/oss/src/core/queries/service.py | cat -n

echo
echo "== Search for model_construct or raw dict usage for QueryRevisionData =="
rg -n "model_construct\\(|QueryRevisionData\\s*=|QueryRevisionData\\(" api/oss/src/core/queries -S
rg -n "QueryRevisionData\\.model_validate\\(|QueryRevisionData\\(" api/oss/src -S | head -n 200

Repository: Agenta-AI/agenta

Length of output: 13125


Avoid silently dropping QueryRevisionData when payload sanitization fails

  • _sanitize_persisted_query_revision_data catches Exception from QueryRevisionData.model_validate(...) and returns None; _sanitize_query_revision_payload then does payload.pop("data", None), so sanitizer/model-validation failures get converted into “no data”.
  • For the FastAPI commit route, extra fields should already be rejected because QueryRevisionData uses ConfigDict(extra="forbid") and the router accepts QueryRevisionCommitRequest (nested QueryRevisionCommit.data: Optional[QueryRevisionData]); still, invalid data reaching the sanitizer should be surfaced/logged rather than silently dropped.



class TestsetRevisionData(BaseModel):
model_config = ConfigDict(extra="forbid")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether testset payload sanitization swallows validation failures and removes data.
rg -n -C4 '_sanitize_persisted_testset_revision_data|model_validate\(|except Exception|payload\.pop\("data"' api/oss/src/core/testsets/service.py

Repository: Agenta-AI/agenta

Length of output: 2164


Avoid silently dropping invalid testset revision payloads

TestsetRevisionData forbids extras, but api/oss/src/core/testsets/service.py wraps TestsetRevisionData.model_validate(data) in try/except Exception and returns None; the caller then executes payload.pop("data", None) when sanitization fails. This converts validation failures into a silent removal of the revision payload’s data field instead of surfacing an error.

junaway pushed a commit to justin212407/agenta that referenced this pull request May 25, 2026
Although Pydantic v2 inherits model_config from WorkflowRevisionData,
declaring extra="forbid" explicitly on ApplicationRevisionData and
EvaluatorRevisionData (both API and SDK) makes the strict-validation
contract grep-friendly and matches the per-domain expectation in Agenta-AI#4315.

Also drop the now-unused ConfigDict import from workflows/dtos.py;
WorkflowRevisionData inherits its strict config from the SDK base.

Picked up from PR Agenta-AI#4415.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@justin212407
Copy link
Copy Markdown
Author

Closing this since #4345 got merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working size:S This PR changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Revisions commit endpoints silently discard unknown fields in data

2 participants