[feat] Extend prompt templates #4171
Conversation
mmabrouk left a comment
Backward compatibility check needed
Please verify that the new prompt fields:
`fallback_llm_configs`, `retry_policy`, `fallback_policy`, `chat_template_kwargs`
do not leak into stored config JSON for old apps that do not use them.
Cases to verify:

- No-op commit on an old app: open an old app and commit without touching any fallback or retry controls. Expected: no new keys appear in the stored JSON.
- Popover opened, but no changes made: open the Retry or Fallback popovers, then commit without changing anything. Expected: no new keys appear in the stored JSON.
- Add/remove or reset to default: add a fallback model and then remove it, or set a policy and then click Reset default, then commit. Expected: no new keys appear in the stored JSON.
- SDK round-trip compatibility: verify that `PromptTemplate(**old_config).model_dump(exclude_none=True)` does not add the new keys for old configs (a sketch follows below).
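A minimal sketch of that round-trip check as a pytest test. The import path and the shape of the old config are assumptions guessed from the SDK files touched in this PR, not the verified API:

```python
# Hypothetical regression test; import path and old-config shape are assumptions.
from agenta.sdk.utils.types import PromptTemplate  # assumed location of the model

OLD_CONFIG = {
    "messages": [{"role": "system", "content": "You are a helpful assistant."}],
    "llm_config": {"model": "gpt-4o", "temperature": 0.7},
}


def test_old_config_round_trip_adds_no_new_keys():
    dumped = PromptTemplate(**OLD_CONFIG).model_dump(exclude_none=True)

    # None of the new root-level fields should leak into the stored JSON.
    for key in ("fallback_llm_configs", "retry_policy", "fallback_policy"):
        assert key not in dumped

    # chat_template_kwargs should also stay out of the nested llm_config.
    assert "chat_template_kwargs" not in dumped.get("llm_config", {})
```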
mmabrouk left a comment
Thanks a lot for the PR @jp-agenta! I have added a bunch of comments; not all need action.
The ones I feel strongly about are:
- `chat_template_kwargs`, unless it's a user pull request / real use case
- Some weird refactoring that should be easy to resolve (unless I misunderstood something)
- There seems to be a bug in the retry logic
For the policy, I think we can keep it as is. I would neither remove it nor change it.
However, for the frontend, I don't think we should add the options as top-level options. The playground is already very complex as is; we should hide complexity in submenus for power users to discover. In this case we should have these under the model sidebar. I have created designs that I will share in the next comment for @ashrafchowdury to look at.
Finally, we should have documentation for this and test the backward compatibility (previous comment)
CleanShot.2026-04-27.at.21.36.38.mp4

@ashrafchowdury can you please take over the frontend here? And please take into account @ardaerzin's feedback.
@ashrafchowdury I have updated the design on how to deal with the kwargs (see the thread). Basically the idea is to hide the JSON editor under an advanced collapsible.

CleanShot.2026-04-28.at.11.36.36.mp4
Synced the retry/fallback findings back into the ledger and addressed the two Devin retry findings. Fixed:
@jp-agenta you did not commit
[fix] Add exponential backoff to retry in prompts
…nta-AI/agenta into feat/extend-prompt-templates
📝 Walkthrough

This PR extends prompt templates with root-level fallback models, retry configurations, and provider-specific chat template kwargs. Changes encompass design documentation, SDK runtime logic with retry/fallback execution, comprehensive test coverage, user-facing documentation, and new web UI components for configuring fallback policies and retry behavior.
Sequence Diagram

```mermaid
sequenceDiagram
participant Client
participant Handler as Prompt Handler
participant Retry as Retry Loop
participant Provider as LLM Provider
participant Fallback as Fallback Loop
Client->>Handler: Execute prompt with fallback_configs, retry_config, retry_policy, fallback_policy
rect rgba(200, 150, 255, 0.5)
Note over Handler,Provider: Primary Model Attempt
Handler->>Retry: Start retry loop (max_retries)
Retry->>Provider: Call LLM with primary llm_config
Provider-->>Retry: Response or Error
alt Provider Error
Retry->>Retry: Classify error (HTTP status, timeout)
alt Should Retry (per retry_policy & error type)
Retry->>Retry: Apply exponential backoff (base_delay)
Retry->>Provider: Retry LLM call
Note over Retry: Loop up to max_retries
else Should Not Retry (deterministic error)
Retry->>Fallback: Exhausted retries or non-retryable error
end
else Success
Retry-->>Handler: Return response
end
end
rect rgba(255, 150, 150, 0.5)
Note over Fallback,Provider: Fallback Models Attempt
Fallback->>Fallback: Check if fallback_policy permits fallback for error type
alt Policy Allows Fallback & fallback_configs not empty
Fallback->>Fallback: Select next fallback config (in order)
Fallback->>Retry: Start retry loop for fallback (same max_retries)
Retry->>Provider: Call LLM with fallback llm_config
Provider-->>Retry: Response or Error
alt Fallback Success
Retry-->>Handler: Return response
else Fallback Error
alt More Fallbacks
Fallback->>Fallback: Move to next fallback config
Fallback->>Retry: Retry with next config
Note over Fallback,Retry: Repeat until success or exhaustion
else No More Fallbacks
Fallback->>Handler: Re-raise last error
end
end
else Policy Denies or No Fallbacks
Fallback->>Handler: Re-raise error
end
end
Handler-->>Client: Final response or error
```
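For readers who prefer code to diagrams, here is a minimal Python sketch of the same flow. The helper names (`call_llm`, `classify_error`, `should_retry`, `should_fallback`) and the attribute layout on `prompt` are illustrative assumptions; the actual logic lives in `sdk/agenta/sdk/engines/running/handlers.py`:

```python
import time


def run_with_retry_and_fallback(
    prompt, call_llm, classify_error, should_retry, should_fallback
):
    """Sketch of the diagram above, not the SDK implementation."""
    configs = [prompt.llm_config, *prompt.fallback_llm_configs]
    max_retries = prompt.retry_config.max_retries  # attribute layout is assumed
    base_delay = prompt.retry_config.base_delay    # attribute layout is assumed

    last_error = None
    for index, llm_config in enumerate(configs):
        for attempt in range(max_retries + 1):
            try:
                return call_llm(llm_config)  # success short-circuits everything
            except Exception as exc:
                last_error = exc
                error_kind = classify_error(exc)  # e.g. HTTP status vs. timeout
                if attempt < max_retries and should_retry(error_kind, prompt.retry_policy):
                    time.sleep(base_delay * (2 ** attempt))  # exponential backoff
                    continue
                break  # retries exhausted or non-retryable (deterministic) error

        if index == len(configs) - 1 or not should_fallback(last_error, prompt.fallback_policy):
            raise last_error  # nothing left to try, or the policy denies fallback
        # otherwise: move on to the next fallback config and retry it the same way

    raise last_error  # unreachable: configs always contains the primary llm_config
```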
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes

🚥 Pre-merge checks: ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
web/tests/tests/fixtures/base.fixture/providerHelpers/index.ts (1)
434-449: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard against empty `currentModel` before filtering selects.

On Line 448, `hasText: currentModel || ""` makes an empty model text match any select in the Configure popover, which can pick the wrong control and create flaky tests.

Suggested fix

```diff
 const currentModel = (await modelButton.textContent())?.trim()
+if (!currentModel) {
+    throw new Error("Could not determine current model from the model selector button")
+}
 if (currentModel?.includes(MOCK_MODEL_NAME)) {
     return
 }
@@
 const modelSelect = configurePopover
     .locator(".ant-select")
-    .filter({hasText: currentModel || ""})
+    .filter({hasText: currentModel})
     .first()
```

web/packages/agenta-entity-ui/src/DrillInView/components/PlaygroundConfigSection.tsx (1)
1313-1330: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Root-level prompt schemas never reach the Configure popover.

For canonical prompt templates, the popover is supposed to hang off the top-level `messages` field, but Lines 1316-1321 return early for arrays before the `fieldKey === "messages" && promptModelInfo?.isRootLevel` branch can run. That makes model/fallback/retry editing unreachable for root-level prompt schemas, and the refine action is lost with it.

Suggested fix

```diff
-    if (
-        fieldValue === null ||
-        fieldValue === undefined ||
-        typeof fieldValue !== "object" ||
-        Array.isArray(fieldValue)
-    ) {
+    const allowArrayHeader =
+        fieldKey === "messages" && promptModelInfo?.isRootLevel
+
+    if (
+        !allowArrayHeader &&
+        (fieldValue === null ||
+            fieldValue === undefined ||
+            typeof fieldValue !== "object" ||
+            Array.isArray(fieldValue))
+    ) {
         return null
     }
```
🧹 Nitpick comments (3)
sdk/agenta/sdk/utils/types.py (1)
982-984: Replace deprecated `.dict(by_alias=True)` with `.model_dump(by_alias=True)` to align with Pydantic v2.

Line 983 uses the deprecated Pydantic v1 API `.dict(by_alias=True)`, while the rest of the file uses `.model_dump()` (e.g., line 941 uses `.model_dump(by_alias=True)` for the same purpose). Since the project requires Pydantic v2, update line 983 to use `.model_dump(by_alias=True)` for consistency.

Proposed fix

```diff
-if llm_config.response_format:
-    kwargs["response_format"] = llm_config.response_format.dict(by_alias=True)
+if llm_config.response_format:
+    kwargs["response_format"] = llm_config.response_format.model_dump(by_alias=True)
```

web/packages/agenta-entity-ui/src/DrillInView/components/PlaygroundConfigSection/AdvancedConfigFields.tsx (1)
127-140: 💤 Low value

Unsafe type cast when validating non-object JSON values.

If the user enters a valid JSON array, string, number, or boolean (e.g., `[1, 2, 3]` or `true`), `parsed` won't be a `Record<string, unknown>`. The cast at line 134 may cause `validateConfigAgainstSchema` to behave unexpectedly or fail silently.

Consider adding a type check before validation:

🛡️ Proposed fix to handle non-object JSON

```diff
 if (parsed === null) {
     setParseError(null)
     onChange(fieldKey, null)
     return
 }

+if (typeof parsed !== "object" || Array.isArray(parsed)) {
+    setParseError("Expected a JSON object")
+    return
+}
+
 const validationResult = validateConfigAgainstSchema(
     parsed as Record<string, unknown>,
     schema as Record<string, unknown>,
 )
```

docs/designs/extend-prompt-templates/initial.specs.md (1)
192-210: 💤 Low value

Runtime loop pseudocode may not match implementation.

The pseudocode shows fallback only occurs "if max_retries" (line 203), but based on the handler implementation in `handlers.py`, fallback is controlled by `fallback_policy` independently of retry configuration. The actual implementation in `_run_prompt_with_fallback` always checks `_should_fallback(exc, formatted_prompt.fallback_policy)` regardless of retry count.

Consider updating the pseudocode to reflect the actual behavior:

📝 Suggested pseudocode correction

```diff
 for current_llm_config in llm_configs:
     run current_llm_config with retry_config and retry_policy
     if success:
         return response
-    if max_retries:
-        if fallback_policy allows fallback given final error:
-            continue to next current_llm_config
+    if fallback_policy allows fallback given final error:
+        continue to next current_llm_config
 fail with final error
```
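A condensed Python sketch of that corrected control flow, where the fallback decision is made after every failed config regardless of how many retries ran. The names mirror the handler discussion above, but the signatures and bodies are illustrative, not the actual SDK code:

```python
# Illustrative only: signatures and the policy's .allows() helper are assumptions.
def _run_prompt_with_fallback(prompt, run_with_retries):
    last_exc = None
    for llm_config in [prompt.llm_config, *prompt.fallback_llm_configs]:
        try:
            return run_with_retries(llm_config, prompt.retry_config, prompt.retry_policy)
        except Exception as exc:
            last_exc = exc
            # Consulted on every failure, independent of max_retries.
            if not _should_fallback(exc, prompt.fallback_policy):
                raise
    raise last_exc  # every config failed with a fallback-eligible error


def _should_fallback(exc, fallback_policy) -> bool:
    # Stand-in body; the real error classification lives in handlers.py.
    return fallback_policy is not None and fallback_policy.allows(exc)
```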
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 16499c07-2b1b-4cb9-be35-2f4e8fd50658
⛔ Files ignored due to path filters (3)
- `api/poetry.lock` is excluded by `!**/*.lock`
- `sdk/poetry.lock` is excluded by `!**/*.lock`
- `services/poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (32)
- `api/oss/tests/pytest/unit/evaluators/test_catalog_types.py`
- `api/run-tests.py`
- `docs/designs/extend-prompt-templates/findings.md`
- `docs/designs/extend-prompt-templates/gap.md`
- `docs/designs/extend-prompt-templates/initial.specs.md`
- `docs/designs/extend-prompt-templates/plan.md`
- `docs/designs/extend-prompt-templates/proposal.md`
- `docs/designs/extend-prompt-templates/research.md`
- `docs/docs/custom-workflows/03-configuration-types.mdx`
- `docs/docs/prompt-engineering/integrating-prompts/07-fallback-models-and-retry.mdx`
- `docs/docs/prompt-engineering/integrating-prompts/08-chat-template-kwargs.mdx`
- `docs/docs/prompt-engineering/playground/05-fallback-models-and-retry.mdx`
- `docs/docs/prompt-engineering/playground/06-chat-template-kwargs.mdx`
- `docs/docs/reference/sdk/01-configuration-management.mdx`
- `docs/docs/reference/sdk/03-custom-workflow.mdx`
- `sdk/agenta/sdk/engines/running/handlers.py`
- `sdk/agenta/sdk/engines/running/interfaces.py`
- `sdk/agenta/sdk/utils/types.py`
- `sdk/oss/tests/pytest/unit/test_prompt_template_extensions.py`
- `web/oss/src/components/Playground/Components/Modals/RefinePromptModal/hooks/useRefinePrompt.ts`
- `web/oss/src/components/Playground/Components/Modals/RefinePromptModal/types.ts`
- `web/oss/tests/playwright/acceptance/deployment/index.ts`
- `web/packages/agenta-entities/src/shared/execution/requestBodyBuilder.ts`
- `web/packages/agenta-entities/src/trace/core/schema.ts`
- `web/packages/agenta-entity-ui/src/DrillInView/SchemaControls/NumberSliderControl.tsx`
- `web/packages/agenta-entity-ui/src/DrillInView/components/PlaygroundConfigSection.tsx`
- `web/packages/agenta-entity-ui/src/DrillInView/components/PlaygroundConfigSection/AdvancedConfigFields.tsx`
- `web/packages/agenta-entity-ui/src/DrillInView/components/PlaygroundConfigSection/FallbackConfigTab.tsx`
- `web/packages/agenta-entity-ui/src/DrillInView/components/PlaygroundConfigSection/ModelConfigEditor.tsx`
- `web/packages/agenta-entity-ui/src/DrillInView/components/PlaygroundConfigSection/RetryConfigTab.tsx`
- `web/packages/agenta-ui/src/Editor/plugins/code/index.tsx`
- `web/tests/tests/fixtures/base.fixture/providerHelpers/index.ts`
This PR extends prompt templates to support extra optional runtime behavior around LLM execution without changing the existing `llm_config` contract. It adds root-level `fallback_llm_configs`, `retry_policy`, and `fallback_policy`. Fallback configs reuse the same LLM config shape as the primary model, retry applies per attempted config, and fallback only happens after retries are exhausted, according to the selected fallback policy.

It also expands the reusable LLM config shape with `chat_template_kwargs` for provider-specific model parameters, including playground support requested in #3996.
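For illustration, a stored prompt configuration using the new fields might look roughly like this. The field names come from this PR; the concrete values (models, policy strings, retry parameters, kwargs) are made-up placeholders rather than documented options:

```python
# Illustrative shape only; values and the exact policy/retry structure are guesses.
prompt_config = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "{{question}}"},
    ],
    "llm_config": {
        "model": "gpt-4o",
        "temperature": 0.2,
        # Provider-specific parameters forwarded as-is to the provider.
        "chat_template_kwargs": {"enable_thinking": True},
    },
    # Same shape as llm_config; tried in order once the primary model gives up.
    "fallback_llm_configs": [
        {"model": "claude-3-5-sonnet", "temperature": 0.2},
    ],
    # Placeholder values; check the PR's docs pages for the supported options.
    "retry_policy": {"max_retries": 2, "base_delay": 1.0},
    "fallback_policy": "on_any_error",
}
```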