Commit 7075479

feat: Introduce service tier mapping between OpenAI and Bedrock
- Added mapping of OpenAI service tiers (`priority`, `flex`, `default`) to Bedrock service tiers.
- Updated request and response handling with new `service_tier` parameter.
- Enhanced API documentation and test cases to reflect the changes.
- Removed legacy performance configuration and latency mappings.
1 parent 6629291 commit 7075479

13 files changed, +359 -81 lines changed

docs/api_openai_chat_completions.md

Lines changed: 24 additions & 21 deletions
Original file line number | Diff line number | Diff line change
@@ -79,7 +79,7 @@ This OpenAI-compatible endpoint provides access to AWS Bedrock foundation models
7979
| Output tokens | :material-check-circle:{ .success } | Billing unit |
8080
| Reasoning tokens | :material-minus-circle:{ .partial } | Estimated |
8181
| **Other** | | |
82-
| Service tiers | :material-check-circle:{ .success } | Mapped to Bedrock latency options |
82+
| Service tiers | :material-check-circle:{ .success } | Mapped to Bedrock service tiers and latency options |
8383
| `store` / `metadata` | :material-close-circle:{ .unsupported } | OpenAI-specific features |
8484
| `safety_identifier` / `user` | :material-minus-circle:{ .partial } | Logged |
8585
| Bedrock Guardrails | :material-plus-circle:{ .extra-feature } | Content safety policies |
@@ -226,44 +226,47 @@ Simply reference your S3 images using the `s3://` URI scheme in `image_url` fiel
226226
- Performance - Optimized data transfer within AWS infrastructure
227227
- Large images - No size limitations of data URIs or base64 encoding
228228

229-
### AWS Bedrock Guardrails
229+
## Available Request Headers
230230

231-
Protect your applications with content filtering and safety policies using AWS Bedrock Guardrails. This implementation supports the same guardrails integration as AWS Bedrock's native OpenAI-compatible endpoint.
231+
This endpoint supports standard Bedrock headers for enhanced control over your requests. All headers are optional and can be combined as needed.
232232

233-
!!! info "Documentation"
234-
See [AWS Bedrock OpenAI Chat Completions API - Include a guardrail in a chat completion](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-chat-completions.html#inference-chat-completions-guardrails) for detailed configuration instructions.
233+
### Content Safety (Guardrails)
235234

236-
**How to Use:**
235+
| Header | Purpose | Valid Values |
236+
|--------------------------------------|------------------------------------|---------------------------------------|
237+
| `X-Amzn-Bedrock-GuardrailIdentifier` | Guardrail ID for content filtering | Your guardrail identifier |
238+
| `X-Amzn-Bedrock-GuardrailVersion` | Guardrail version | Version number (e.g., `1`) |
239+
| `X-Amzn-Bedrock-Trace` | Guardrail trace level | `disabled`, `enabled`, `enabled_full` |
240+
241+
### Performance Optimization
242+
243+
| Header | Purpose | Valid Values |
244+
|--------------------------------------------|------------------------|-------------------------------|
245+
| `X-Amzn-Bedrock-Service-Tier` | Service tier selection | `priority`, `default`, `flex` |
246+
| `X-Amzn-Bedrock-PerformanceConfig-Latency` | Latency optimization | `standard`, `optimized` |
237247

238-
Add guardrail headers to your chat completion requests to apply your configured safety policies:
248+
**Example with all headers:**
239249

240250
```bash
241251
curl -X POST "$BASE/v1/chat/completions" \
242252
-H "Authorization: Bearer $OPENAI_API_KEY" \
243253
-H "Content-Type: application/json" \
244254
-H "X-Amzn-Bedrock-GuardrailIdentifier: your-guardrail-id" \
245255
-H "X-Amzn-Bedrock-GuardrailVersion: 1" \
246-
-H "X-Amzn-Bedrock-Trace: ENABLED" \
256+
-H "X-Amzn-Bedrock-Trace: enabled" \
257+
-H "X-Amzn-Bedrock-Service-Tier: priority" \
258+
-H "X-Amzn-Bedrock-PerformanceConfig-Latency: optimized" \
247259
-d '{
248260
"model": "anthropic.claude-sonnet-4-5-20250929-v1:0",
249261
"messages": [{"role": "user", "content": "Hello!"}]
250262
}'
251263
```
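
For readers calling the endpoint from Python rather than cURL, the same request can be assembled with the standard library. This is a minimal sketch: the helper name `build_chat_request`, the base URL, and the API key are placeholders, and the request is only built, not sent, so the snippet runs offline.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request with the optional headers."""
    body = {
        "model": "anthropic.claude-sonnet-4-5-20250929-v1:0",
        "messages": [{"role": "user", "content": "Hello!"}],
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Optional Bedrock headers from the tables above.
        "X-Amzn-Bedrock-GuardrailIdentifier": "your-guardrail-id",
        "X-Amzn-Bedrock-GuardrailVersion": "1",
        "X-Amzn-Bedrock-Trace": "enabled",
        "X-Amzn-Bedrock-Service-Tier": "priority",
        "X-Amzn-Bedrock-PerformanceConfig-Latency": "optimized",
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Placeholder values; replace with your deployment URL and key.
req = build_chat_request("https://api.example.com", "sk-example")
# Passing `req` to urllib.request.urlopen would send it.
```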
252264

253-
**Headers:**
254-
255-
- **`X-Amzn-Bedrock-GuardrailIdentifier`** (required): The ID of your configured guardrail
256-
- **`X-Amzn-Bedrock-GuardrailVersion`** (required): The version number of your guardrail
257-
- **`X-Amzn-Bedrock-Trace`** (optional): Set to `ENABLED` to enable trace logging for debugging
258-
259-
**What Happens:**
260-
261-
- Requests are validated against your guardrail policies before reaching the model
262-
- Responses are filtered according to your content safety rules
263-
- Violations are blocked and return appropriate error responses
265+
!!! info "Detailed Documentation"
266+
For complete information about these headers, configuration options, and use cases, see:
264267

265-
!!! note "Unsupported Parameter"
266-
The `tagSuffix` parameter is not supported in this implementation.
268+
- [Bedrock Guardrails Configuration](operations_configuration.md#bedrock-guardrails)
269+
- [Service Tier and Performance Configuration](operations_configuration.md#bedrock-service-tier-and-performance-configuration)
267270

268271
### Provider-Specific Parameters
269272

docs/api_openai_embeddings.md

Lines changed: 39 additions & 0 deletions
@@ -60,6 +60,45 @@ Transform text into semantic vectors. Power your search, recommendations, and si
6060

6161
</div>
6262

63+
## Available Request Headers
64+
65+
This endpoint supports standard Bedrock headers for enhanced control over your requests. All headers are optional and can be combined as needed.
66+
67+
### Content Safety (Guardrails)
68+
69+
| Header | Purpose | Valid Values |
70+
|--------------------------------------|------------------------------------|---------------------------------------|
71+
| `X-Amzn-Bedrock-GuardrailIdentifier` | Guardrail ID for content filtering | Your guardrail identifier |
72+
| `X-Amzn-Bedrock-GuardrailVersion` | Guardrail version | Version number (e.g., `1`) |
73+
| `X-Amzn-Bedrock-Trace` | Guardrail trace level | `disabled`, `enabled`, `enabled_full` |
74+
75+
### Performance Optimization
76+
77+
| Header | Purpose | Valid Values |
78+
|--------------------------------------------|------------------------|-------------------------------|
79+
| `X-Amzn-Bedrock-Service-Tier` | Service tier selection | `priority`, `default`, `flex` |
80+
| `X-Amzn-Bedrock-PerformanceConfig-Latency` | Latency optimization | `standard`, `optimized` |
81+
82+
**Example with headers:**
83+
84+
```bash
85+
curl -X POST "$BASE/v1/embeddings" \
86+
-H "Authorization: Bearer $OPENAI_API_KEY" \
87+
-H "Content-Type: application/json" \
88+
-H "X-Amzn-Bedrock-Service-Tier: flex" \
89+
-H "X-Amzn-Bedrock-PerformanceConfig-Latency: standard" \
90+
-d '{
91+
"model": "amazon.nova-2-multimodal-embeddings-v1:0",
92+
"input": ["Batch text 1", "Batch text 2", "Batch text 3"]
93+
}'
94+
```
95+
96+
!!! info "Detailed Documentation"
97+
For complete information about these headers, configuration options, and use cases, see:
98+
99+
- [Bedrock Guardrails Configuration](operations_configuration.md#bedrock-guardrails)
100+
- [Service Tier and Performance Configuration](operations_configuration.md#bedrock-service-tier-and-performance-configuration)
101+
63102
## Advanced Features
64103

65104
### Provider-Specific Parameters

docs/api_openai_images_generations.md

Lines changed: 39 additions & 0 deletions
@@ -79,6 +79,45 @@ Generate images with AWS Bedrock image models like Stability AI and Amazon Nova
7979
!!! tip "Performance Optimization"
8080
For faster image downloads, especially for high-resolution images or globally distributed users, enable S3 Transfer Acceleration by setting `AWS_S3_ACCELERATE=true`. This uses CloudFront edge locations to accelerate file downloads, providing 50-500% faster speeds for users far from your S3 bucket region. See [S3 Transfer Acceleration configuration](operations_configuration.md#aws-s3-accelerate) for setup details.
8181

82+
## Available Request Headers
83+
84+
This endpoint supports standard Bedrock headers for enhanced control over your requests. All headers are optional and can be combined as needed.
85+
86+
### Content Safety (Guardrails)
87+
88+
| Header | Purpose | Valid Values |
89+
|--------------------------------------|------------------------------------|---------------------------------------|
90+
| `X-Amzn-Bedrock-GuardrailIdentifier` | Guardrail ID for content filtering | Your guardrail identifier |
91+
| `X-Amzn-Bedrock-GuardrailVersion` | Guardrail version | Version number (e.g., `1`) |
92+
| `X-Amzn-Bedrock-Trace` | Guardrail trace level | `disabled`, `enabled`, `enabled_full` |
93+
94+
### Performance Optimization
95+
96+
| Header | Purpose | Valid Values |
97+
|--------------------------------------------|------------------------|-------------------------------|
98+
| `X-Amzn-Bedrock-Service-Tier` | Service tier selection | `priority`, `default`, `flex` |
99+
| `X-Amzn-Bedrock-PerformanceConfig-Latency` | Latency optimization | `standard`, `optimized` |
100+
101+
**Example with headers:**
102+
103+
```bash
104+
curl -X POST "$BASE/v1/images/generations" \
105+
-H "Authorization: Bearer $OPENAI_API_KEY" \
106+
-H "Content-Type: application/json" \
107+
-H "X-Amzn-Bedrock-Service-Tier: priority" \
108+
-H "X-Amzn-Bedrock-PerformanceConfig-Latency: optimized" \
109+
-d '{
110+
"model": "amazon.nova-canvas-v1:0",
111+
"prompt": "A serene mountain landscape at sunset"
112+
}'
113+
```
114+
115+
!!! info "Detailed Documentation"
116+
For complete information about these headers, configuration options, and use cases, see:
117+
118+
- [Bedrock Guardrails Configuration](operations_configuration.md#bedrock-guardrails)
119+
- [Service Tier and Performance Configuration](operations_configuration.md#bedrock-service-tier-and-performance-configuration)
120+
82121
## Advanced Features
83122

84123
### Provider-Specific Parameters

docs/operations_configuration.md

Lines changed: 96 additions & 5 deletions
@@ -2537,11 +2537,11 @@ export AWS_BEDROCK_GUARDRAIL_TRACE=enabled
25372537

25382538
Override global guardrail settings for individual requests using HTTP headers:
25392539

2540-
| Header | Purpose |
2541-
|--------|---------|
2542-
| `X-Amzn-Bedrock-GuardrailIdentifier` | Guardrail ID |
2543-
| `X-Amzn-Bedrock-GuardrailVersion` | Guardrail version |
2544-
| `X-Amzn-Bedrock-Trace` | Trace level |
2540+
| Header | Purpose | Valid Values |
2541+
|--------------------------------------|-------------------|---------------------------------------|
2542+
| `X-Amzn-Bedrock-GuardrailIdentifier` | Guardrail ID | Your guardrail identifier |
2543+
| `X-Amzn-Bedrock-GuardrailVersion` | Guardrail version | Version number (e.g., `1`) |
2544+
| `X-Amzn-Bedrock-Trace` | Trace level | `disabled`, `enabled`, `enabled_full` |
25452545

25462546
```bash title="Example cURL Request"
25472547
curl -X POST https://api.example.com/v1/chat/completions \
@@ -2561,6 +2561,97 @@ The `amazon-bedrock-guardrailConfig` object in the request body is supported for
25612561

25622562
---
25632563

2564+
## Bedrock Service Tier and Performance Configuration
2565+
2566+
Amazon Bedrock service tiers and performance configurations allow you to optimize AI workload performance and cost trade-offs. Configure latency optimization and throughput priority for your inference requests.
2567+
2568+
!!! info "AWS Documentation"
2569+
For detailed information about service tiers, see:
2570+
2571+
- [Amazon Bedrock Service Tiers](https://aws.amazon.com/blogs/aws/new-amazon-bedrock-service-tiers-help-you-match-ai-workload-performance-with-cost/)
2572+
2573+
### Service Tiers
2574+
2575+
Service tiers help you match AI workload performance with cost by selecting the appropriate throughput and latency characteristics:
2576+
2577+
- **`priority`** - Highest priority processing with guaranteed capacity and fastest response times. Best for latency-sensitive applications.
2578+
- **`default`** - Standard processing with balanced performance and cost. Suitable for most production workloads.
2579+
- **`flex`** - Cost-optimized processing with flexible scheduling. Best for batch jobs and non-time-sensitive workloads.
2580+
2581+
### Performance Configuration
2582+
2583+
Performance configuration allows you to optimize for latency:
2584+
2585+
- **`standard`** - Standard latency profile with balanced performance
2586+
- **`optimized`** - Optimized for lowest possible latency
2587+
2588+
### Per-Request Configuration
2589+
2590+
Configure service tier and performance settings per request using HTTP headers. These headers are available on all Bedrock-based routes (Chat Completions, Embeddings, Images):
2591+
2592+
| Header | Purpose | Valid Values |
2593+
|--------------------------------------------|------------------------|-------------------------------|
2594+
| `X-Amzn-Bedrock-Service-Tier` | Service tier selection | `priority`, `default`, `flex` |
2595+
| `X-Amzn-Bedrock-PerformanceConfig-Latency` | Latency optimization | `standard`, `optimized` |
2596+
2597+
```bash title="Example: Chat Completions with Priority Tier and Optimized Latency"
2598+
curl -X POST https://api.example.com/v1/chat/completions \
2599+
-H "Authorization: Bearer sk-..." \
2600+
-H "X-Amzn-Bedrock-Service-Tier: priority" \
2601+
-H "X-Amzn-Bedrock-PerformanceConfig-Latency: optimized" \
2602+
-H "Content-Type: application/json" \
2603+
-d '{
2604+
"model": "anthropic.claude-sonnet-4-5-20250929-v1:0",
2605+
"messages": [{"role": "user", "content": "Hello!"}]
2606+
}'
2607+
```
2608+
2609+
```bash title="Example: Embeddings with Flex Tier for Batch Processing"
2610+
curl -X POST https://api.example.com/v1/embeddings \
2611+
-H "Authorization: Bearer sk-..." \
2612+
-H "X-Amzn-Bedrock-Service-Tier: flex" \
2613+
-H "Content-Type: application/json" \
2614+
-d '{
2615+
"model": "amazon.nova-2-multimodal-embeddings-v1:0",
2616+
"input": ["text 1", "text 2", "text 3"]
2617+
}'
2618+
```
2619+
2620+
```bash title="Example: Image Generation with Default Tier"
2621+
curl -X POST https://api.example.com/v1/images/generations \
2622+
-H "Authorization: Bearer sk-..." \
2623+
-H "X-Amzn-Bedrock-Service-Tier: default" \
2624+
-H "Content-Type: application/json" \
2625+
-d '{
2626+
"model": "amazon.nova-canvas-v1:0",
2627+
"prompt": "A serene mountain landscape"
2628+
}'
2629+
```
2630+
2631+
!!! tip "When to Use Each Tier"
2632+
**Priority Tier:**
2633+
2634+
- Real-time customer-facing applications
2635+
- Interactive chatbots and assistants
2636+
- Applications requiring guaranteed low latency
2637+
- Production workloads with strict SLAs
2638+
2639+
**Default Tier:**
2640+
2641+
- Standard production workloads
2642+
- General-purpose API usage
2643+
- Applications with moderate latency requirements
2644+
2645+
**Flex Tier:**
2646+
2647+
- Batch processing and bulk operations
2648+
- Offline content generation
2649+
- Data processing pipelines
2650+
- Non-time-sensitive workloads
2651+
- Cost-optimized inference at scale
2652+
2653+
---
2654+
25642655
## Audio and Text-to-Speech
25652656

25662657
#### `DEFAULT_TTS_MODEL` { #default-tts-model }

docs/roadmap.md

Lines changed: 12 additions & 4 deletions
@@ -132,10 +132,11 @@ Expands multimodal embedding capabilities, adds prompt caching support, and intr
132132

133133
### 💬 Chat Completions
134134

135-
| Provider | Endpoint/Feature | AWS Backend |
136-
|--------------------------------------------------------------------------------|---------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|
137-
| ![OpenAI](styles/logo_openai.svg){: style="height:20px;width:20px"} **OpenAI** | Prompt caching `/v1/chat/completions` `prompt_cache_key` | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} Amazon Bedrock - prompt caching |
138-
| ![OpenAI](styles/logo_openai.svg){: style="height:20px;width:20px"} **OpenAI** | `/v1/chat/completions` GPT5.1 API update (`reasoning_effort=none`) | |
135+
| Provider | Endpoint/Feature | AWS Backend |
136+
|--------------------------------------------------------------------------------|---------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
137+
| ![OpenAI](styles/logo_openai.svg){: style="height:20px;width:20px"} **OpenAI** | Prompt caching `/v1/chat/completions` `prompt_cache_key` | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} Amazon Bedrock - prompt caching |
138+
| ![OpenAI](styles/logo_openai.svg){: style="height:20px;width:20px"} **OpenAI** | `/v1/chat/completions` GPT5.1 API update (`reasoning_effort=none`) | |
139+
| ![OpenAI](styles/logo_openai.svg){: style="height:20px;width:20px"} **OpenAI** | Priority & flex service tiers `/v1/chat/completions` `service_tier` | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} [Amazon Bedrock - service tiers NEW](https://aws.amazon.com/fr/blogs/aws/new-amazon-bedrock-service-tiers-help-you-match-ai-workload-performance-with-cost/) |
139140

140141
### 🧠 Embeddings
141142

@@ -155,6 +156,13 @@ Expands multimodal embedding capabilities, adds prompt caching support, and intr
155156
| Server-side ARN mapping | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} Amazon Bedrock |
156157
| Client-side ARN passing (optional) | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} Amazon Bedrock |
157158

159+
### 📋 Headers for Chat Completions/Embeddings/Image Generation
160+
161+
| Provider | Endpoint/Feature | AWS Backend |
162+
|--------------------------------------------------------------------------------|-------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------|
163+
| ![Amazon](styles/logo_amazon.svg){: style="height:20px;width:20px"} **Amazon** | Service tier configuration header `X-Amzn-Bedrock-Service-Tier` | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} Amazon Bedrock |
164+
| ![Amazon](styles/logo_amazon.svg){: style="height:20px;width:20px"} **Amazon** | Performance latency configuration header `X-Amzn-Bedrock-PerformanceConfig-Latency` | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} Amazon Bedrock |
165+
158166
### Fixes
159167

160168
- `/v1/chat/completions`: Fix default value passed to the converse API for tools without parameters.

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ requires-python = ">=3.13"
77
dependencies = [
88
"fastapi",
99
"aioboto3>=15.1.0",
10-
"aiobotocore>=2.24.0",
10+
"botocore>=1.40.76",
1111
"pydantic>=2",
1212
"pydantic-settings>=2",
1313
"sse-starlette",
