Skip to content

fix: recognize Bedrock inference profile ARNs for prompt caching#1883

Closed
giulio-leone wants to merge 1 commit intostrands-agents:mainfrom
giulio-leone:fix/bedrock-arn-cache-support
Closed

fix: recognize Bedrock inference profile ARNs for prompt caching#1883
giulio-leone wants to merge 1 commit intostrands-agents:mainfrom
giulio-leone:fix/bedrock-arn-cache-support

Conversation

@giulio-leone
Copy link
Contributor

Issue

Closes #1705

Problem

_cache_strategy only checked for 'claude' or 'anthropic' substrings in the model_id:

def _cache_strategy(self) -> str | None:
    model_id = self.config.get('model_id', '').lower()
    if 'claude' in model_id or 'anthropic' in model_id:
        return 'anthropic'
    return None

This works for system inference profile IDs (e.g. us.anthropic.claude-haiku-4-5-20251001-v1:0) but fails for application inference profile ARNs which have the form:

arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/abc123def456

These ARNs don't contain 'claude' or 'anthropic', so automatic caching was silently disabled with a warning:

WARNING | model_id=<arn:aws:bedrock:...> | cache_config is enabled but this model does not support automatic caching

Solution

Added detection for inference profile ARNs: when model_id starts with arn: and contains inference-profile, optimistically enable caching.

Design Rationale

  • No extra API call: Avoided calling GetInferenceProfile to resolve the underlying model, which would add latency, require additional IAM permissions (bedrock:GetInferenceProfile), and create a network dependency on every converse call.
  • Safe default: Currently only Anthropic Claude models support prompt caching on Bedrock. For non-caching models, the cache point is silently ignored by the Converse API — no error is raised.
  • Covers both ARN types: Matches both application-inference-profile and inference-profile (cross-region) ARNs.

Testing

  • Added test_cache_strategy_anthropic_for_inference_profile_arn: Tests both application and cross-region inference profile ARNs
  • All 125 Bedrock tests pass (123 existing + 1 new with 2 assertions)

Changes

  • src/strands/models/bedrock.py: Extended _cache_strategy to detect inference profile ARNs
  • tests/strands/models/test_bedrock.py: Added test for ARN-based cache strategy detection

@giulio-leone
Copy link
Contributor Author

Friendly ping — adds recognition of Bedrock inference profile ARNs for prompt caching, which currently only matches standard model ARNs.

The _cache_strategy property only checked for 'claude' or 'anthropic'
substrings in the model_id.  This works for system inference profile
IDs (e.g. us.anthropic.claude-haiku-4-5-20251001-v1:0) but fails for
application inference profile ARNs which have the form:

  arn:aws:bedrock:<region>:<account>:application-inference-profile/<id>

These ARNs don't contain 'claude' or 'anthropic', so automatic
caching was silently disabled even when the user explicitly set
cache_config=CacheConfig(strategy='auto').

The fix detects ARNs that contain 'inference-profile' and
optimistically enables caching.  Currently only Anthropic Claude
models support prompt caching on Bedrock; for non-caching models the
cache point is silently ignored by the Converse API.

Closes #1705
@giulio-leone giulio-leone force-pushed the fix/bedrock-arn-cache-support branch from aebdfe0 to 5ef6a32 Compare March 23, 2026 06:04
@github-actions github-actions bot added size/s and removed size/s labels Mar 23, 2026
@giulio-leone
Copy link
Contributor Author

Refreshed onto main @ fd8168a (v1.32.0+2) — 2026-03-23

Root cause confirmed still live: _cache_strategy matches model IDs via substring ("claude" in model_id), but Bedrock application/cross-region inference profile ARNs look like arn:aws:bedrock:us-east-1::inference-profile/us.anthropic.claude-… — after .lower() the ARN contains neither "claude" nor "anthropic" as a bare substring in the expected position, so the property returns None and prompt caching is silently disabled for users who specify their model via an inference profile ARN.

Fix: After the existing substring check, add an explicit arn: prefix + inference-profile infix guard that returns "anthropic" optimistically (Bedrock will silently ignore the cache point for non-caching models).

Runtime proof on rebased branch 5ef6a32:

claude model              -> 'anthropic'  ✓
inference profile ARN     -> 'anthropic'  ✓ (was None before fix)
cross-region profile ARN  -> 'anthropic'  ✓ (was None before fix)
titan model               -> None         ✓ (unaffected)
  • test_cache_strategy_anthropic_for_inference_profile_arn: PASSED
  • All 12 cache/region tests: 12/12 PASSED

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] _supports_caching doesn't recognize Bedrock application inference profile ARNs

1 participant