fix: recognize Bedrock inference profile ARNs for prompt caching#1883
fix: recognize Bedrock inference profile ARNs for prompt caching#1883giulio-leone wants to merge 1 commit intostrands-agents:mainfrom
Conversation
4b7c9a2 to
aebdfe0
Compare
|
Friendly ping — adds recognition of Bedrock inference profile ARNs for prompt caching, which currently only matches standard model ARNs. |
The _cache_strategy property only checked for 'claude' or 'anthropic' substrings in the model_id. This works for system inference profile IDs (e.g. us.anthropic.claude-haiku-4-5-20251001-v1:0) but fails for application inference profile ARNs which have the form: arn:aws:bedrock:<region>:<account>:application-inference-profile/<id> These ARNs don't contain 'claude' or 'anthropic', so automatic caching was silently disabled even when the user explicitly set cache_config=CacheConfig(strategy='auto'). The fix detects ARNs that contain 'inference-profile' and optimistically enables caching. Currently only Anthropic Claude models support prompt caching on Bedrock; for non-caching models the cache point is silently ignored by the Converse API. Closes #1705
aebdfe0 to
5ef6a32
Compare
|
Refreshed onto Root cause confirmed still live: Fix: After the existing substring check, add an explicit Runtime proof on rebased branch
|
Issue
Closes #1705
Problem
_cache_strategyonly checked for'claude'or'anthropic'substrings in themodel_id:This works for system inference profile IDs (e.g.
us.anthropic.claude-haiku-4-5-20251001-v1:0) but fails for application inference profile ARNs which have the form:These ARNs don't contain 'claude' or 'anthropic', so automatic caching was silently disabled with a warning:
Solution
Added detection for inference profile ARNs: when
model_idstarts witharn:and containsinference-profile, optimistically enable caching.Design Rationale
GetInferenceProfileto resolve the underlying model, which would add latency, require additional IAM permissions (bedrock:GetInferenceProfile), and create a network dependency on everyconversecall.application-inference-profileandinference-profile(cross-region) ARNs.Testing
test_cache_strategy_anthropic_for_inference_profile_arn: Tests both application and cross-region inference profile ARNsChanges
src/strands/models/bedrock.py: Extended_cache_strategyto detect inference profile ARNstests/strands/models/test_bedrock.py: Added test for ARN-based cache strategy detection