Skip to content

Conversation

@junaway
Copy link
Contributor

@junaway junaway commented Feb 10, 2026

allcontributors bot and others added 6 commits February 10, 2026 17:21
…Paliwal

docs: add Vishesh-Paliwal as a contributor for code
Move AI-Powered Prompt Refinement and Open Observability Spans from
planned to in-progress. Add five new in-progress items: Running
Evaluators in the Playground, Enterprise SSO, US Region, Creating
Agents from the UI, and Webhooks for Deployment Linked to CI.
…ss-items

docs: update roadmap with new in-progress items
Copilot AI review requested due to automatic review settings February 10, 2026 19:28
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Feb 10, 2026
@vercel
Copy link

vercel bot commented Feb 10, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agenta-documentation Building Building Preview, Comment Feb 10, 2026 7:28pm

Request Review

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ jp-agenta
✅ mmabrouk
❌ allcontributors[bot]
You have signed the CLA already but the status is still pending? Let us recheck it.

@junaway junaway merged commit 8cfc030 into release/v0.85.3 Feb 10, 2026
7 of 9 checks passed
Copy link
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 5 potential issues.

Open in Devin Review

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Tenant isolation broken in delete_run: project_id filter is discarded by stmt reassignment

The delete_run method has the exact same tenant isolation bug that this PR fixes in close_run/open_run. At dao.py:403-409, the first stmt with project_id filter is completely overwritten by the second stmt = select(EvaluationRunDBE).filter(EvaluationRunDBE.id == run_id), discarding the tenant scope. This allows deleting a run from any project if you know its run_id.

Root Cause and Impact

The code at lines 403-409 does:

stmt = select(EvaluationRunDBE).filter(
    EvaluationRunDBE.project_id == project_id,
)

stmt = select(EvaluationRunDBE).filter(  # overwrites previous stmt!
    EvaluationRunDBE.id == run_id,
)

The second assignment creates a brand new select() statement, completely discarding the project_id filter from the first. The executed query only filters by run_id, meaning a user in project A can delete runs belonging to project B.

Impact: Cross-tenant data destruction. A user can delete evaluation runs belonging to other tenants/projects.

(Refers to lines 407-409)

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Tenant isolation broken in delete_runs: project_id filter is discarded by stmt reassignment

The delete_runs method has the same tenant isolation bug. At dao.py:435-441, the first stmt with project_id filter is overwritten by a new select() that only filters by run_ids, allowing cross-tenant bulk deletion.

Root Cause and Impact

The code at lines 435-441 does:

stmt = select(EvaluationRunDBE).filter(
    EvaluationRunDBE.project_id == project_id,
)

stmt = select(EvaluationRunDBE).filter(  # overwrites previous stmt!
    EvaluationRunDBE.id.in_(run_ids),
)

Impact: Cross-tenant bulk data destruction. A user can delete multiple evaluation runs belonging to other tenants/projects.

(Refers to lines 439-441)

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Tenant isolation broken in _get_run_flags: project_id filter is discarded by stmt reassignment

The _get_run_flags helper function at dao.py:2738-2743 has the same tenant isolation bug. The project_id filter is discarded when stmt is reassigned. This function is used as a security gate to check if a run is closed before allowing modifications to scenarios, results, and metrics.

Root Cause and Impact

The code at lines 2738-2743 does:

stmt = select(EvaluationRunDBE.flags).filter(
    EvaluationRunDBE.project_id == project_id,
)
stmt = select(EvaluationRunDBE.flags).filter(  # overwrites previous stmt!
    EvaluationRunDBE.id == run_id,
)

While this function is primarily used for the is_closed check (not directly for tenant isolation enforcement), the missing project_id filter means it could return flags from a run in a different project. If a run with the same ID exists in a different project with is_closed=True, it could incorrectly block operations, or vice versa.

Impact: The is_closed security check operates without tenant scoping, potentially allowing or blocking operations incorrectly across tenants.

(Refers to lines 2741-2743)

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Tenant isolation broken in fetch_run: project_id filter is discarded by stmt reassignment

The fetch_run method at dao.py:205-211 has the same tenant isolation bug. The project_id filter is discarded when stmt is reassigned, allowing a user to fetch evaluation runs from other tenants.

Root Cause and Impact

The code at lines 205-211 does:

stmt = select(EvaluationRunDBE).filter(
    EvaluationRunDBE.project_id == project_id,
)

stmt = select(EvaluationRunDBE).filter(  # overwrites previous stmt!
    EvaluationRunDBE.id == run_id,
)

Impact: Cross-tenant data leak. A user can read evaluation run data belonging to other tenants/projects.

(Refers to lines 209-211)

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Tenant isolation broken in edit_run: project_id filter is discarded by stmt reassignment

The edit_run method at dao.py:272-278 has the same tenant isolation bug. The project_id filter is discarded when stmt is reassigned, allowing a user to modify evaluation runs from other tenants.

Root Cause and Impact

The code at lines 272-278 does:

stmt = select(EvaluationRunDBE).filter(
    EvaluationRunDBE.project_id == project_id,
)

stmt = select(EvaluationRunDBE).filter(  # overwrites previous stmt!
    EvaluationRunDBE.id == run.id,
)

Impact: Cross-tenant data modification. A user can edit evaluation runs belonging to other tenants/projects.

(Refers to lines 276-278)

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a critical tenant isolation vulnerability in the evaluation runs DAO where SQL query filters were being improperly constructed, potentially allowing cross-tenant data access. The fix corrects 4 methods (close_run, close_runs, open_run, open_runs) by combining project_id and run_id filters in a single .filter() call instead of reassigning the statement. Additionally, the PR includes unrelated changes: moving 5 features from planned to in-progress status in the roadmap, and adding a new contributor.

Changes:

  • Fixed tenant isolation bug in 4 evaluation run methods by properly chaining SQL filters
  • Moved 5 planned features to in-progress status in the roadmap
  • Added Vishesh Paliwal as a contributor

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
api/oss/src/dbs/postgres/evaluations/dao.py Fixed tenant isolation vulnerability in close_run, close_runs, open_run, and open_runs methods by combining filters properly
docs/src/data/roadmap.ts Moved 5 features from plannedFeatures to inProgressFeatures array
README.md Updated contributor count and added Vishesh Paliwal to contributors table
.all-contributorsrc Added Vishesh Paliwal to contributors configuration

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +423 to +492
{
id: "evaluators-in-playground",
title: "Running Evaluators in the Playground",
description:
"Run evaluators directly in the playground to get immediate quality feedback on prompt changes. Evaluate outputs inline as you iterate on prompts. Scores, pass/fail results, and evaluator reasoning appear right next to the LLM response.",
githubUrl: "https://github.com/Agenta-AI/agenta/discussions/3702",
labels: [
{
name: "Playground",
color: "BCFF78",
},
{
name: "Evaluation",
color: "86B7FF",
},
],
},
{
id: "enterprise-sso",
title: "Enterprise SSO",
description:
"Single sign-on (SSO) support for enterprise customers. Integrate your identity provider using SAML or OIDC for secure, centralized authentication. Log in with existing corporate credentials and control user provisioning and access.",
githubUrl: "https://github.com/Agenta-AI/agenta/discussions/3703",
labels: [
{
name: "Security",
color: "000000",
},
],
},
{
id: "us-region",
title: "US Region for Agenta Cloud",
description:
"Agenta Cloud is adding a US-based region. Run your projects with all data stored within the United States. This helps meet data residency requirements that need data to stay in a specific geography.",
githubUrl: "https://github.com/Agenta-AI/agenta/discussions/3704",
labels: [
{
name: "Misc",
color: "000000",
},
],
},
{
id: "agents-from-ui",
title: "Creating Agents from the UI",
description:
"Build and configure AI agents directly from the Agenta UI. Define agent workflows, select tools, and set up orchestration logic without writing code. Test and iterate on agent behavior in the playground, then deploy to production with versioning and observability built in.",
githubUrl: "https://github.com/Agenta-AI/agenta/discussions/3705",
labels: [
{
name: "Playground",
color: "BCFF78",
},
],
},
{
id: "deployment-webhooks",
title: "Webhooks for Deployment Linked to CI",
description:
"Trigger CI/CD pipelines automatically when you deploy a prompt version. Connect Agenta deployments to your existing CI workflows so that deploying a new version kicks off automated tests, approval gates, or release processes.",
githubUrl: "https://github.com/Agenta-AI/agenta/discussions/3706",
labels: [
{
name: "Integration",
color: "FFA500",
},
],
},
];
Copy link

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change appears to move 5 planned features from the plannedFeatures array to the inProgressFeatures array, but the PR title and description indicate this is a bug fix for tenant isolation in evaluation runs. This discrepancy suggests that the roadmap changes may have been included in this PR by mistake, or the PR description is incomplete and does not mention these roadmap updates.

Copilot uses AI. Check for mistakes.
@dosubot dosubot bot added the bug Something isn't working label Feb 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants