[ENHANCEMENT] Improve Amazon Bedrock support#125
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
edelauna
left a comment
There was a problem hiding this comment.
Thanks for the thorough write-up and the production testing context. This is solid work overall.
One structural ask: this PR is large enough (3,929 additions / 318 deletions across 37 files) and crosses enough boundaries that it is hard to review the model catalog/type changes, runtime request-shaping changes, and settings/control-plane UI changes with confidence in one pass. The max-output-token probe issue is a good example of the risk: UI-discovered state from BedrockMaxTokensProbeButton.tsx / useBedrockMaxTokensProbe.ts flows into
provider settings as awsModelMaxOutputTokens, and then affects model resolution/request construction through resolveBedrockModelInfo and AwsBedrockHandler.
It would help to split this into three sequential PRs:
PR 1 — Types & catalog (packages/types/ only)
Keep the model catalog corrections, Bedrock model metadata changes, promptCacheTtl, the Opus 4.7 entry, and shared Bedrock resolution/types/helpers in packages/types/src/providers/bedrock.ts and related type files/tests. This is mostly data, types, and pure resolution logic, so it should be low risk and easy to land first.
PR 2 — Runtime improvements, depending on PR 1
Keep the provider/runtime behavior together: adaptive thinking payload handling in src/api/providers/bedrock.ts, structured-output fallback and 30-day cache, stripBedrockStrictIncompatibleConstraints, JSON schema updates, and promptCacheTtl plumbing through the cache strategy. This keeps review focused on request construction, retry/cache behavior, and Bedrock compatibility, without also reviewing React/settings state.
PR 3 — Control-plane discovery + settings UI, depending on PR 1
Keep the discovery/probe/UI work together: src/api/providers/bedrock-discovery.ts, the @aws-sdk/client-bedrock dependency and lockfile changes, webviewMessageHandler message cases, Bedrock.tsx, BedrockMaxTokensProbeButton.tsx, useBedrockDiscovery, useBedrockMaxTokensProbe, and the related settings controls/i18n. This can then be reviewed specifically for discovery behavior, saved settings semantics, and UI state isolation.
I think that split would make the series much easier to review and reduce the chance of missing cross-layer regressions.
| supportsImages: true, | ||
| supportsPromptCache: true, | ||
| }, | ||
| "claude-4-opus": { |
There was a problem hiding this comment.
The "claude-4" entry earlier in the map matches first, so any unknown ARN containing claude-4-opus resolves to maxTokens: 8192 instead of 4096 here. Could the more-specific patterns be ordered before the generic ones?
| // parseBaseModelId strips cross-region inference prefixes (e.g. `us.`, `eu.`) and the | ||
| // synthetic `:1m` dropdown suffix. | ||
| const baseModelId = this.parseBaseModelId(modelConfig.id) | ||
| const requiresAdaptiveThinking = BEDROCK_ADAPTIVE_THINKING_MODEL_IDS.includes(baseModelId as any) |
There was a problem hiding this comment.
The rest of the file uses as (typeof BEDROCK_GLOBAL_INFERENCE_MODEL_IDS)[number] for these array-includes checks — would it be worth following the same pattern here instead of as any?
|
|
||
| const client = new BedrockClient(toBedrockClientConfig(options)) | ||
|
|
||
| const [foundationModelsResponse, inferenceProfiles] = await Promise.all([ |
There was a problem hiding this comment.
If AWS is slow or unreachable this Promise.all hangs indefinitely and blocks the settings panel. Is there a timeout or cancellation path planned here?
| } | ||
| try { | ||
| const result = await probe(apiConfiguration, targetModelId) | ||
| setApiConfigurationField("awsModelMaxOutputTokens", result.maxOutputTokens) |
There was a problem hiding this comment.
awsModelMaxOutputTokens is written as a bare number with no reference to which target was probed. If the user probes Opus 4.7 (→ 128K) and then switches to Haiku 4.5 (real cap 64K), the stale override is applied unconditionally and the next request fails. Should this be keyed by target id, or cleared when the selected target changes?
|
Thanks for these reviews! I attempted to address them with a new commit. This needs some testing. |
edelauna
left a comment
There was a problem hiding this comment.
Thanks for addressing the inline feedback — the pattern ordering fix, typed assertions, discovery timeout, and scoped max-output override all look good.
I do want to reiterate my earlier ask to split this into three sequential PRs. Having done a deeper pass through the full diff, the cross-layer concerns reinforce why it's needed:
Findings from a detailed review underscore the split:
- The highest-risk issues I'm seeing (credentials in React Query cache keys, probe cost/abort gaps, useEffect re-render cascades, missing staleTime on discovery) are all concentrated in the UI/discovery layer. Reviewing these alongside 900+ lines of model catalog changes dilutes focus on the parts that need the most scrutiny.
- The runtime layer (structured-output fallback, adaptive thinking, compile-wait retry loop) has its own review surface — retry bounds, cache TTL semantics, double 1M-context application in getModel() — that deserves isolated attention.
- The types/catalog layer is low risk and could land immediately, unblocking the other two.
- There's also minor scope drift (multi-point-strategy.ts guard changes) that would naturally separate out.
My original proposed split still holds:
- PR 1 — Types & catalog (packages/types/): model metadata, resolution helpers, expandBedrockTargetsWith1MVariants, tests. Low risk, land first.
- PR 2 — Runtime (depends on PR 1): adaptive thinking, structured-output fallback + 30-day cache,
stripBedrockStrictIncompatibleConstraints, JSON schema normalization, promptCacheTtl plumbing. - PR 3 — Discovery + settings UI (depends on PR 1): @aws-sdk/client-bedrock, bedrock-discovery.ts, probe flow, React hooks/components, webview message handler, i18n.
This doesn't change the total amount of work — it just makes each piece reviewable in isolation and reduces the chance of cross-layer regressions slipping through.
|
Thanks for the review! I’ll split it up, it’s no problem for me :) |
Related GitHub Issue
Closes: #124
Description
This PR improves Amazon Bedrock support across provider metadata, invocation target resolution, settings UI, and request handling.
Key changes:
Reviewer focus:
This PR was prepared with agentic AI assistance from the CRC/Zoo porting work, then reviewed and shaped for Zoo Code contribution. The implementation has also been tested in production business use for several weeks before submission.
This is opened as a draft because issue #124 still needs Zoo maintainer approval/assignment under the contribution process.
Test Procedure
The implementation has been exercised in production business use for several weeks.
Reviewers can reproduce the focused local validation with:
Manual verification recommended:
Pre-Submission Checklist
Screenshots / Videos
TODO: Attach screenshots of the Bedrock settings model/profile selector and max-output-token probe UI if maintainers want UI evidence before review.
Documentation Updates
Additional Notes
Related but not duplicate Zoo work:
@ai-sdk/amazon-bedrockdependency.tool_choice='required'when tools are present.Those are adjacent but do not cover the provider discovery/context/output-token behavior in this PR.
Get in Touch
Discord:
coorbin