[release] v0.98.1 #4262

Merged
bekossy merged 32 commits into main from release/v0.98.1 on May 5, 2026

@github-actions github-actions Bot (Contributor) commented May 5, 2026

New version v0.98.1 in

  • (web)
  • web/oss
  • web/ee
  • sdk
  • api
  • services

mmabrouk and others added 22 commits April 28, 2026 11:53
Rename the design folder to reflect the broader scope and reorganize
into a top-level RFC plus per-work-package subfolders.

- Top-level: RFC covering prompt variables, JSON value handling,
  template rendering semantics (mustache as new default for new apps,
  curly deprecated, fstring/jinja2), the per-service variable matrix,
  decisions, work-package layering (B1-B3 backend, F1-F3 frontend, D1
  docs), rollout, test plan, future directions for sharing the prompt
  template across services.
- wp-b1-runtime-foundation/: scoped to judge backend patch (provider
  /secret resolution + temperature removal) and the low-level rendering
  helper extraction. Plan, implementation notes, QA, research,
  variable-and-template analysis, and status all aligned with the RFC's
  WP-B1 scope; helper boundary explicitly excludes message-rendering
  and JSON-return rendering (those move to WP-B2).
Patches the LLM-as-a-judge runtime to share the provider/secret resolution
path with chat/completion and extracts the per-mode template substitution
logic into a single helper module so WP-B2/WP-B3 can build on it without
re-touching the judge or `PromptTemplate`.

Phase 1 — judge backend patch (`auto_ai_critique_v0`):
- resolve provider settings via `SecretsManager.ensure_secrets_in_workflow()` +
  `SecretsManager.get_provider_settings_from_workflow(model)` (custom and
  self-hosted models configured in Model Hub now reach the judge);
- raise `InvalidSecretsV0Error` with the selected model when settings are
  missing, matching chat/completion;
- route the LLM call through `mockllm.acompletion` under
  `mockllm.user_aws_credentials_from(provider_settings)` (replaces the
  module-level `litellm.openai_key = ...` pattern; scrubs ECS/Lambda role
  env vars for the duration of the call);
- stop sending `temperature=0.01`. Newer providers reject the kwarg and
  the judge has no UI to configure it.
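
The Phase 1 flow above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SDK code: `PROVIDER_SETTINGS`, `scrubbed_env`, `fake_acompletion`, `run_judge`, and the environment variable names are hypothetical stand-ins, and the `InvalidSecretsV0Error` class here only mimics the behavior described (it carries the selected model).

```python
import asyncio
import os
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List, Optional, Sequence


class InvalidSecretsV0Error(Exception):
    """Stand-in for the SDK error; carries the model that had no settings."""

    def __init__(self, model: str) -> None:
        self.model = model
        super().__init__(f"No provider settings found for model '{model}'")


# Hypothetical registry standing in for what Model Hub would provide.
PROVIDER_SETTINGS: Dict[str, Dict[str, Any]] = {
    "my-self-hosted-model": {
        "api_base": "http://localhost:8000/v1",
        "api_key": "placeholder",
    },
}

# Assumed names: the variables the real helper scrubs are not listed in the PR.
ROLE_ENV_VARS = ("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI", "AWS_SESSION_TOKEN")


@contextmanager
def scrubbed_env(names: Sequence[str]) -> Iterator[None]:
    """Temporarily remove the given env vars and restore them afterwards."""
    saved = {name: os.environ.pop(name) for name in names if name in os.environ}
    try:
        yield
    finally:
        os.environ.update(saved)


async def fake_acompletion(**kwargs: Any) -> Dict[str, Any]:
    """Echo the kwargs so a caller can inspect what would be sent."""
    return kwargs


async def run_judge(model: str, messages: List[Dict[str, str]]) -> Dict[str, Any]:
    settings: Optional[Dict[str, Any]] = PROVIDER_SETTINGS.get(model)
    if not settings:
        # Fail loudly and name the model, as chat/completion is said to do.
        raise InvalidSecretsV0Error(model)
    with scrubbed_env(ROLE_ENV_VARS):
        # No temperature kwarg is passed; provider settings are forwarded as-is.
        return await fake_acompletion(model=model, messages=messages, **settings)
```

Calling `asyncio.run(run_judge("my-self-hosted-model", [...]))` returns the kwargs that would reach the LLM client, which makes the absence of `temperature` easy to check in a test.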

Phase 2 — low-level template helper (`agenta.sdk.utils.templating`):
- `render_template(*, template, mode, context) -> str` covering `curly`,
  `fstring`, `jinja2` (mustache lands in WP-B3);
- typed `UnresolvedVariablesError(ValueError)` carries the unresolved set
  so call sites can format their preferred message text;
- both call sites — `PromptTemplate._format_with_template` (chat/completion)
  and the judge's `_format_with_template` — funnel through it. Public
  behavior is unchanged: `PromptTemplate` keeps its legacy
  `"Unreplaced variables in curly template: ['x'].{Hint}"` wording (pinned
  by a regression test); the judge keeps its silent-return-on-Jinja-error
  contract.
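
As a rough illustration of the Phase 2 helper shape, the sketch below implements only the `curly` mode with the keyword-only signature quoted above and a typed error that carries the unresolved set. It is not the code in `agenta.sdk.utils.templating`: the `_lookup` and `_to_text` helpers, the placeholder regex, and the coercion rules are assumptions made for the example.

```python
import json
import re
from typing import Any, Mapping, Set


class UnresolvedVariablesError(ValueError):
    """Carries the unresolved placeholders so call sites can word their own message."""

    def __init__(self, unresolved: Set[str]) -> None:
        self.unresolved = unresolved
        super().__init__(f"Unresolved variables: {sorted(unresolved)}")


# Assumed placeholder syntax: {{ name }} with optional inner whitespace.
_PLACEHOLDER = re.compile(r"\{\{\s*(.*?)\s*\}\}")


def _lookup(expr: str, context: Mapping[str, Any]) -> Any:
    """Resolve a literal key first, then fall back to dot notation."""
    if not expr:
        raise KeyError("empty placeholder")
    if expr in context:
        return context[expr]
    current: Any = context
    for part in expr.split("."):
        if isinstance(current, Mapping) and part in current:
            current = current[part]
        else:
            raise KeyError(expr)
    return current


def _to_text(value: Any) -> str:
    """Strings pass through; dicts and lists become compact JSON (an assumption)."""
    if isinstance(value, str):
        return value
    if isinstance(value, (dict, list)):
        return json.dumps(value, separators=(",", ":"))
    return str(value)


def render_template(*, template: str, mode: str, context: Mapping[str, Any]) -> str:
    if mode != "curly":
        raise ValueError(f"mode not covered by this sketch: {mode}")
    unresolved: Set[str] = set()

    def substitute(match: "re.Match[str]") -> str:
        expr = match.group(1)
        try:
            return _to_text(_lookup(expr, context))
        except KeyError:
            unresolved.add(expr)
            return match.group(0)  # leave the placeholder untouched for now

    rendered = _PLACEHOLDER.sub(substitute, template)
    if unresolved:
        raise UnresolvedVariablesError(unresolved)
    return rendered
```

A call site that wants its own wording can catch `UnresolvedVariablesError` and format `exc.unresolved` however it likes, which is how two callers can keep different message texts over one shared renderer.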

Tests (sdk/oss/tests/pytest/unit/, 249/249 passing):
- `test_auto_ai_critique_v0_runtime.py` — provider resolution (standard +
  custom), missing-settings error, no-temperature, response_format /
  json_schema forwarding, context aliases, result normalization;
- `test_render_template_helper.py` — each mode + JSONPath / JSON Pointer
  / literal-key-first / whole-object compact JSON / sandbox violation, plus
  call-site message-text regression tests for both `PromptTemplate` and
  the judge handler.
- status log: record Phase 1 + Phase 2 completion and the post-review
  cleanup pass (typed `UnresolvedVariablesError`, dead-helper removal,
  resolver de-duplication, message-text regression tests).
- code-review/: scope, findings, risks, questions, summary, scorecard from
  the review pass.
…ring

Two bugs surfaced while reviewing the WP-B1 rendering helper for special-character
handling:

1. Backslash doubling. _render_curly defensively called .replace("\\", "\\\\") on
   every substitution value. The defensive escape was meant to neutralize regex
   backreferences, but re.sub with a function callable does not interpret
   backslash escapes in the return value (Python's documented behavior). Net
   effect: every backslash in a user-supplied value reached the LLM doubled —
   e.g. a Windows-style path with one backslash arrived with two. Drop the
   .replace; values now round-trip correctly.

2. Empty placeholder leak. resolve_dot_notation("", data) short-circuited to
   data because the post-split(".") loop never executed, so the runtime
   serialized the whole context dict (including any secrets, ground-truth
   columns, trace fields, etc.) into the prompt whenever a template contained
   {{}}. resolve_dot_notation now raises on empty expr, which surfaces as a
   normal UnresolvedVariablesError.
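
Both bugs are easy to reproduce in isolation. The sketch below is illustrative only: `render`, `resolve_old`, and `resolve_fixed` are hypothetical reconstructions, and the way `resolve_old` skips empty segments is an assumption about how the original loop ended up not executing.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*(.*?)\s*\}\}")


def render(template, values, escape_backslashes=False):
    """Substitute {{name}} placeholders using a callable replacement."""

    def substitute(match):
        value = values[match.group(1)]
        if escape_backslashes:
            # The old defensive escape. With a callable replacement, re.sub
            # inserts the return value literally, so this only doubles backslashes.
            value = value.replace("\\", "\\\\")
        return value

    return PLACEHOLDER.sub(substitute, template)


def resolve_old(expr, data):
    """Hypothetical pre-fix resolver: an empty expr yields no path segments."""
    current = data
    for part in [p for p in expr.split(".") if p]:
        current = current[part]
    return current  # for expr == "" this is the whole context


def resolve_fixed(expr, data):
    """Post-fix behavior as described: an empty expression raises."""
    if not expr:
        raise KeyError("empty expression")
    return resolve_old(expr, data)


values = {"path": r"C:\temp\new", "tricky": r"\1 \g<0>"}
context = {"user": {"name": "Ada"}, "api_key": "secret"}

round_trip = render("open {{path}}", values)
doubled = render("open {{path}}", values, escape_backslashes=True)
backref_like = render("{{tricky}}", values)
leaked = resolve_old("", context)
```

Here `round_trip` keeps single backslashes, `doubled` shows the pre-fix output, `backref_like` comes back unchanged even though it looks like a regex backreference, and `leaked` is the entire context dict, secrets included.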

Tests:

- sdk/oss/tests/pytest/unit/test_render_template_helper.py grew from 21 to 81
  tests covering curly basics, placeholder syntax (whitespace / multiple /
  repeated / multi-line / unicode), value coercion, value safety (no recursive
  rendering, backslash round-trip, regex backref round-trip), error contract
  (unresolved set, deep misses, mid-path scalars, empty placeholder), regex
  edge cases (triple/quadruple braces, mismatched braces, embedded newlines),
  fstring (escape, format specs, index access, value safety), jinja2 (raw
  blocks, filters, conditionals, undefined behavior, sandbox violations), and
  call-site preservation. Both bug fixes are pinned by regression tests.
- Full SDK unit suite: 309/309 passing.

Docs:

- New docs/design/prompt-runtime-unification/appendix-rendering-edge-cases.md
  documents the template/value boundary, per-mode escape mechanisms, the
  curly-mode escape gap, frontend↔backend extractor mismatches, and what's
  pinned by tests.
- WP-B3 in the RFC now carries an explicit note that brace escaping for curly
  is an open question and that mustache (greenfield) is the cleanest place to
  land an explicit escape mechanism.
Companion to qa.md (which covers unit tests). Walks through a real-stack
verification of the WP-B1 changes plus the rendering review pass:

- Section A: new functionality — custom and self-hosted models in the judge,
  via UI and direct calls. Includes the temperature-removal check for
  reasoning models that previously rejected the hard-coded temperature=0.01.
- Sections B–C: regression coverage for variable rendering across chat,
  completion, and judge — every curly mode feature (top-level, nested, array,
  JSONPath, JSON Pointer, literal-key-first, whitespace, repeated, multiple),
  fstring brace-escape, jinja2 filters/conditionals/raw blocks, sandbox
  blocking. Plus the two bug-fix verifications (backslash round-trip, empty
  placeholder no longer leaks context).
- Section D: same regression matrix exercised directly via the API rather
  than the playground, to isolate transport from rendering.
- Section E: UX touch-ups for the new error paths.

Closes with a side note on SDK-direct usage of LLM-as-a-judge: the canonical
path (evaluation service / runtime) is unchanged; the only behavior shift is
for bare-script callers that previously relied on env-var key pickup instead
of bootstrapping the workflow context. Documents the risk and the mitigation
direction for WP-B2.
These were internal review notes that don't belong in the shipped design
workspace. Remove the code-review/ subfolder; everything user-facing
(plan, implementation-notes, qa, manual-qa-checklist, status, README)
stays.
…I/agenta into feat/llm-judge-chat-unification
The legacy admin_router.create_accounts endpoint and the new
fastapi/accounts/router.create_accounts both emit operation IDs that
generate the same TypeScript method name in Fern client codegen.
Excluding the legacy route from the OpenAPI schema removes the
collision at the source, eliminating the need for downstream Fern
post-processors to disambiguate the generated method.
@vercel vercel Bot commented May 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| agenta-documentation | Ready | Preview, Comment | May 5, 2026 11:04am |

@dosubot dosubot Bot added the size:XS label (This PR changes 0-9 lines, ignoring generated files.) May 5, 2026
@coderabbitai coderabbitai Bot commented May 5, 2026

Walkthrough

Centralizes prompt rendering into a new render_template helper (curly/fstring/jinja2), updates PromptTemplate and runtime handlers to use it, patches auto_ai_critique_v0 to use workflow-scoped provider settings and omit temperature from LLM calls, adds unit tests and RFC documentation, and bumps multiple package versions.

Changes

Prompt runtime unification — WP‑B1 (templating, handler, SDK tests, docs)

| Layer | File(s) | Summary |
| --- | --- | --- |
| Data / Types | sdk/agenta/sdk/utils/templating.py | Add TemplateMode, UnresolvedVariablesError, _coerce_to_str, and public render_template(template, mode, context) implementing curly, fstring, and jinja2. |
| Lookup behavior | sdk/agenta/sdk/utils/resolvers.py | resolve_dot_notation now raises KeyError for empty expressions to prevent {{}} resolving to the whole context. |
| Core integration | sdk/agenta/sdk/utils/types.py | PromptTemplate._format_with_template delegates to render_template, validates format, and maps UnresolvedVariablesError / KeyError / Jinja errors into TemplateFormatError. |
| Handler wiring | sdk/agenta/sdk/engines/running/handlers.py | _format_with_template delegates to shared renderer; auto_ai_critique_v0 now calls SecretsManager.ensure_secrets_in_workflow() + get_provider_settings_from_workflow(model), raises InvalidSecretsV0Error when missing, and invokes mockllm.acompletion under mockllm.user_aws_credentials_from(provider_settings) passing **provider_settings and omitting temperature. |
| Tests | sdk/oss/tests/pytest/unit/test_render_template_helper.py, sdk/oss/tests/pytest/unit/test_auto_ai_critique_v0_runtime.py | Add exhaustive unit tests for render_template (curly/fstring/jinja2) and runtime tests for auto_ai_critique_v0 covering secret resolution, provider settings, omission of temperature, response_format forwarding, template aliases, normalization, and error contracts. |
| Docs / Process | docs/design/prompt-runtime-unification/** | Add RFC, appendix, findings, WP‑B1 implementation notes/plan/QA/manual checklist/status/research/variable-analysis documenting rules, test plans, sequencing, and edge-case appendices. |

Miscellaneous — version bumps, API metadata, contributors, README, frontend UI change

| Layer | File(s) | Summary |
| --- | --- | --- |
| Manifest updates | api/pyproject.toml, sdk/pyproject.toml, services/pyproject.toml, web/package.json, web/ee/package.json, web/oss/package.json | Bumped package versions from 0.98.0 → 0.98.1 in six manifest files; no other manifest fields changed. |
| API metadata | api/oss/src/routers/admin_router.py | Added include_in_schema=False to the /accounts POST route decorator (no handler/signature changes). |
| Contributors / README | .all-contributorsrc, README.md | Added contributor Devarsh Prajapati and updated All Contributors badge/count and table entry. |
| Frontend UI | web/oss/src/components/GetStarted/GetStarted.tsx | Use Jotai atom setOnboardingWidgetActivationAtom to open the create-prompt onboarding widget for "test_prompt"; redirect fallback changed to "/apps" and callback deps updated. |

Sequence Diagram(s)

sequenceDiagram
    participant Handler as auto_ai_critique_v0
    participant Secrets as SecretsManager
    participant Renderer as render_template
    participant MockLLM as mockllm.acompletion

    Handler->>Secrets: ensure_secrets_in_workflow()
    Handler->>Secrets: get_provider_settings_from_workflow(model)
    Secrets-->>Handler: provider_settings or null
    alt provider_settings missing
        Handler->>Handler: raise InvalidSecretsV0Error
    else provider_settings present
        Handler->>Renderer: render messages/aliases with render_template(...)
        Handler->>MockLLM: user_aws_credentials_from(provider_settings) (enter)
        Handler->>MockLLM: acompletion(messages, response_format, **provider_settings)
        MockLLM-->>Handler: LLM response
        Handler->>Handler: normalize/parse response (JSON parsing, result normalization)
        Handler-->>Caller: evaluation result / errors
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Agenta-AI/agenta#4231: Directly related upstream work implementing WP‑B1 runtime unification and the shared render_template helper.
  • Agenta-AI/agenta#4252: Related change that also modifies the FastAPI /accounts route decorator to add include_in_schema=False.
  • Agenta-AI/agenta#4249: Related edits touching SDK runtime/template stack and PromptTemplate behavior.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.38% which is insufficient. The required threshold is 60.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title '[release] v0.98.1' clearly summarizes the main change: a version release across multiple packages. |
| Description check | ✅ Passed | The description lists the affected packages (web, web/oss, web/ee, sdk, api, services) and indicates a new version is being released, which relates to the changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


feat(sdk): prompt runtime unification + WP-B1 implementation
@dosubot dosubot Bot added the size:XXL label (This PR changes 1000+ lines, ignoring generated files.) and removed the size:XS label (This PR changes 0-9 lines, ignoring generated files.) May 5, 2026
chore(api): hide duplicate /admin/accounts route from OpenAPI
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 8


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 05208aa5-9c45-4cee-b89a-399a3038a73e

📥 Commits

Reviewing files that changed from the base of the PR and between 7e85a68 and 8a1b14d.

📒 Files selected for processing (17)
  • docs/design/prompt-runtime-unification/README.md
  • docs/design/prompt-runtime-unification/appendix-rendering-edge-cases.md
  • docs/design/prompt-runtime-unification/findings.md
  • docs/design/prompt-runtime-unification/wp-b1-runtime-foundation/README.md
  • docs/design/prompt-runtime-unification/wp-b1-runtime-foundation/implementation-notes.md
  • docs/design/prompt-runtime-unification/wp-b1-runtime-foundation/manual-qa-checklist.md
  • docs/design/prompt-runtime-unification/wp-b1-runtime-foundation/plan.md
  • docs/design/prompt-runtime-unification/wp-b1-runtime-foundation/qa.md
  • docs/design/prompt-runtime-unification/wp-b1-runtime-foundation/research.md
  • docs/design/prompt-runtime-unification/wp-b1-runtime-foundation/status.md
  • docs/design/prompt-runtime-unification/wp-b1-runtime-foundation/variable-and-template-analysis.md
  • sdk/agenta/sdk/engines/running/handlers.py
  • sdk/agenta/sdk/utils/resolvers.py
  • sdk/agenta/sdk/utils/templating.py
  • sdk/agenta/sdk/utils/types.py
  • sdk/oss/tests/pytest/unit/test_auto_ai_critique_v0_runtime.py
  • sdk/oss/tests/pytest/unit/test_render_template_helper.py
✅ Files skipped from review due to trivial changes (4)
  • docs/design/prompt-runtime-unification/wp-b1-runtime-foundation/plan.md
  • docs/design/prompt-runtime-unification/findings.md
  • docs/design/prompt-runtime-unification/wp-b1-runtime-foundation/research.md
  • docs/design/prompt-runtime-unification/wp-b1-runtime-foundation/status.md

Comment thread sdk/agenta/sdk/engines/running/handlers.py
Comment thread sdk/agenta/sdk/engines/running/handlers.py Outdated
Comment thread sdk/agenta/sdk/utils/resolvers.py
Comment thread sdk/agenta/sdk/utils/types.py
@github-actions github-actions Bot (Contributor, Author) commented May 5, 2026

Railway Preview Environment

Status Destroyed (PR closed)

Updated at 2026-05-05T11:45:02.197Z

@CLAassistant CLAassistant commented May 5, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
4 out of 5 committers have signed the CLA.

✅ mmabrouk
✅ bekossy
✅ jp-agenta
✅ Devarsh05
❌ allcontributors[bot]

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
web/oss/src/components/GetStarted/GetStarted.tsx (1)

49-63: ⚡ Quick win

Consolidate workspace-context resolution to avoid fallback drift.

The same waitForWorkspaceContext + buildPostLoginPath block is repeated in three callbacks, and fallback behavior has already diverged (/w vs /apps). Extracting a shared resolver (and a shared fallback constant) will keep navigation behavior consistent.

♻️ Suggested refactor sketch
+const WORKSPACE_FALLBACK_PATH = "/apps"
+
+const resolveWorkspacePath = useCallback(async () => {
+  const context = await waitForWorkspaceContext({
+    timeoutMs: 5000,
+    requireProjectId: true,
+    requireWorkspaceId: true,
+    requireOrgData: true,
+  })
+  return buildPostLoginPath(context)
+}, [])

 const navigateToDestination = useCallback(async () => {
   try {
-    const context = await waitForWorkspaceContext({ ... })
-    const path = buildPostLoginPath(context)
+    const path = await resolveWorkspacePath()
     router.push(path)
   } catch (e) {
     console.error("Failed to resolve workspace context", e)
-    router.push("/w")
+    router.push(WORKSPACE_FALLBACK_PATH)
   }
-}, [router])
+}, [router, resolveWorkspacePath])

 // same replacement pattern in handleSelection + handleNext

Also applies to: 71-85, 93-111


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 04fd73be-9acf-4d5b-9a84-6510043acdad

📥 Commits

Reviewing files that changed from the base of the PR and between 14d6f85 and f1dd38b.

📒 Files selected for processing (3)
  • .all-contributorsrc
  • README.md
  • web/oss/src/components/GetStarted/GetStarted.tsx
✅ Files skipped from review due to trivial changes (2)
  • .all-contributorsrc
  • README.md

@bekossy bekossy merged commit 474eec0 into main May 5, 2026
30 of 31 checks passed
@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label May 5, 2026
Labels

  • lgtm: This PR has been approved by a maintainer
  • size:XXL: This PR changes 1000+ lines, ignoring generated files.

5 participants