stdapi-ai
diff --git a/docs/api_openai_chat_completions.md b/docs/api_openai_chat_completions.md
Lines changed: 24 additions & 21 deletions
diff --git a/docs/api_openai_embeddings.md b/docs/api_openai_embeddings.md
Lines changed: 39 additions & 0 deletions
diff --git a/docs/api_openai_images_generations.md b/docs/api_openai_images_generations.md
Lines changed: 39 additions & 0 deletions
diff --git a/docs/operations_configuration.md b/docs/operations_configuration.md
Lines changed: 96 additions & 5 deletions
diff --git a/docs/roadmap.md b/docs/roadmap.md
Lines changed: 12 additions & 4 deletions
diff --git a/pyproject.toml b/pyproject.toml
Lines changed: 1 addition & 1 deletion
@@ -79,7 +79,7 @@ This OpenAI-compatible endpoint provides access to AWS Bedrock foundation models
 | Output tokens                            |   :material-check-circle:{ .success }    | Billing unit                                                    |
 | Reasoning tokens                         |   :material-minus-circle:{ .partial }    | Estimated                                                       |
 | **Other**                                |                                          |                                                                 |
-| Service tiers                            |   :material-check-circle:{ .success }    | Mapped to Bedrock latency options                               |
+| Service tiers                            |   :material-check-circle:{ .success }    | Mapped to Bedrock service tiers and latency options             |
 | `store` / `metadata`                     | :material-close-circle:{ .unsupported }  | OpenAI-specific features                                        |
 | `safety_identifier` / `user`             |   :material-minus-circle:{ .partial }    | Logged                                                          |
 | Bedrock Guardrails                       | :material-plus-circle:{ .extra-feature } | Content safety policies                                         |
@@ -226,44 +226,47 @@ Simply reference your S3 images using the `s3://` URI scheme in `image_url` fiel
 - Performance - Optimized data transfer within AWS infrastructure
 - Large images - No size limitations of data URIs or base64 encoding
 
-### AWS Bedrock Guardrails
+## Available Request Headers
 
-Protect your applications with content filtering and safety policies using AWS Bedrock Guardrails. This implementation supports the same guardrails integration as AWS Bedrock's native OpenAI-compatible endpoint.
+This endpoint supports standard Bedrock headers for enhanced control over your requests. All headers are optional and can be combined as needed.
 
-!!! info "Documentation"
-    See [AWS Bedrock OpenAI Chat Completions API - Include a guardrail in a chat completion](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-chat-completions.html#inference-chat-completions-guardrails) for detailed configuration instructions.
+### Content Safety (Guardrails)
 
-**How to Use:**
+| Header                               | Purpose                            | Valid Values                          |
+|--------------------------------------|------------------------------------|---------------------------------------|
+| `X-Amzn-Bedrock-GuardrailIdentifier` | Guardrail ID for content filtering | Your guardrail identifier             |
+| `X-Amzn-Bedrock-GuardrailVersion`    | Guardrail version                  | Version number (e.g., `1`)            |
+| `X-Amzn-Bedrock-Trace`               | Guardrail trace level              | `disabled`, `enabled`, `enabled_full` |
+
+### Performance Optimization
+
+| Header                                     | Purpose                | Valid Values                  |
+|--------------------------------------------|------------------------|-------------------------------|
+| `X-Amzn-Bedrock-Service-Tier`              | Service tier selection | `priority`, `default`, `flex` |
+| `X-Amzn-Bedrock-PerformanceConfig-Latency` | Latency optimization   | `standard`, `optimized`       |
 
-Add guardrail headers to your chat completion requests to apply your configured safety policies:
+**Example with all headers:**
 
 ```bash
 curl -X POST "$BASE/v1/chat/completions" \
   -H "Authorization: Bearer $OPENAI_API_KEY" \
   -H "Content-Type: application/json" \
   -H "X-Amzn-Bedrock-GuardrailIdentifier: your-guardrail-id" \
   -H "X-Amzn-Bedrock-GuardrailVersion: 1" \
-  -H "X-Amzn-Bedrock-Trace: ENABLED" \
+  -H "X-Amzn-Bedrock-Trace: enabled" \
+  -H "X-Amzn-Bedrock-Service-Tier: priority" \
+  -H "X-Amzn-Bedrock-PerformanceConfig-Latency: optimized" \
   -d '{
     "model": "anthropic.claude-sonnet-4-5-20250929-v1:0",
     "messages": [{"role": "user", "content": "Hello!"}]
   }'
 ```
 
-**Headers:**
-
-- **`X-Amzn-Bedrock-GuardrailIdentifier`** (required): The ID of your configured guardrail
-- **`X-Amzn-Bedrock-GuardrailVersion`** (required): The version number of your guardrail
-- **`X-Amzn-Bedrock-Trace`** (optional): Set to `ENABLED` to enable trace logging for debugging
-
-**What Happens:**
-
-- Requests are validated against your guardrail policies before reaching the model
-- Responses are filtered according to your content safety rules
-- Violations are blocked and return appropriate error responses
+!!! info "Detailed Documentation"
+    For complete information about these headers, configuration options, and use cases, see:
 
-!!! note "Unsupported Parameter"
-    The `tagSuffix` parameter is not supported in this implementation.
+    - [Bedrock Guardrails Configuration](operations_configuration.md#bedrock-guardrails)
+    - [Service Tier and Performance Configuration](operations_configuration.md#bedrock-service-tier-and-performance-configuration)
 
 ### Provider-Specific Parameters
 
 
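The header tables in the hunk above lend themselves to a small client-side check before a request is sent. The following is a minimal Python sketch, assuming the "Valid Values" columns are exhaustive; the function name `build_bedrock_headers` and its parameters are hypothetical and are not part of stdapi-ai.

```python
"""Build optional Bedrock request headers, validating values client-side."""
from __future__ import annotations

# Allowed values copied from the header tables above (assumed exhaustive).
VALID_VALUES = {
    "X-Amzn-Bedrock-Trace": {"disabled", "enabled", "enabled_full"},
    "X-Amzn-Bedrock-Service-Tier": {"priority", "default", "flex"},
    "X-Amzn-Bedrock-PerformanceConfig-Latency": {"standard", "optimized"},
}


def build_bedrock_headers(
    guardrail_id: str | None = None,
    guardrail_version: str | None = None,
    trace: str | None = None,
    service_tier: str | None = None,
    latency: str | None = None,
) -> dict[str, str]:
    """Return only the headers that were set; raise ValueError on bad values."""
    candidates = {
        "X-Amzn-Bedrock-GuardrailIdentifier": guardrail_id,
        "X-Amzn-Bedrock-GuardrailVersion": guardrail_version,
        "X-Amzn-Bedrock-Trace": trace,
        "X-Amzn-Bedrock-Service-Tier": service_tier,
        "X-Amzn-Bedrock-PerformanceConfig-Latency": latency,
    }
    headers: dict[str, str] = {}
    for name, value in candidates.items():
        if value is None:
            continue  # every header is optional
        allowed = VALID_VALUES.get(name)
        # Guardrail ID and version are free-form, so only enumerated headers are checked.
        if allowed is not None and value not in allowed:
            raise ValueError(f"{name}: {value!r} not in {sorted(allowed)}")
        headers[name] = value
    return headers
```

The returned dict could be merged into the headers your HTTP client already sends, alongside `Authorization` and `Content-Type` as in the curl examples. Note that this sketch would reject the uppercase `ENABLED` trace value that the previous documentation used.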
@@ -60,6 +60,45 @@ Transform text into semantic vectors. Power your search, recommendations, and si
 
 </div>
 
+## Available Request Headers
+
+This endpoint supports standard Bedrock headers for enhanced control over your requests. All headers are optional and can be combined as needed.
+
+### Content Safety (Guardrails)
+
+| Header                               | Purpose                            | Valid Values                          |
+|--------------------------------------|------------------------------------|---------------------------------------|
+| `X-Amzn-Bedrock-GuardrailIdentifier` | Guardrail ID for content filtering | Your guardrail identifier             |
+| `X-Amzn-Bedrock-GuardrailVersion`    | Guardrail version                  | Version number (e.g., `1`)            |
+| `X-Amzn-Bedrock-Trace`               | Guardrail trace level              | `disabled`, `enabled`, `enabled_full` |
+
+### Performance Optimization
+
+| Header                                     | Purpose                | Valid Values                  |
+|--------------------------------------------|------------------------|-------------------------------|
+| `X-Amzn-Bedrock-Service-Tier`              | Service tier selection | `priority`, `default`, `flex` |
+| `X-Amzn-Bedrock-PerformanceConfig-Latency` | Latency optimization   | `standard`, `optimized`       |
+
+**Example with headers:**
+
+```bash
+curl -X POST "$BASE/v1/embeddings" \
+  -H "Authorization: Bearer $OPENAI_API_KEY" \
+  -H "Content-Type: application/json" \
+  -H "X-Amzn-Bedrock-Service-Tier: flex" \
+  -H "X-Amzn-Bedrock-PerformanceConfig-Latency: standard" \
+  -d '{
+    "model": "amazon.nova-2-multimodal-embeddings-v1:0",
+    "input": ["Batch text 1", "Batch text 2", "Batch text 3"]
+  }'
+```
+
+!!! info "Detailed Documentation"
+    For complete information about these headers, configuration options, and use cases, see:
+
+    - [Bedrock Guardrails Configuration](operations_configuration.md#bedrock-guardrails)
+    - [Service Tier and Performance Configuration](operations_configuration.md#bedrock-service-tier-and-performance-configuration)
+
 ## Advanced Features
 
 ### Provider-Specific Parameters
 
@@ -79,6 +79,45 @@ Generate images with AWS Bedrock image models like Stability AI and Amazon Nova
 !!! tip "Performance Optimization"
     For faster image downloads, especially for high-resolution images or globally distributed users, enable S3 Transfer Acceleration by setting `AWS_S3_ACCELERATE=true`. This uses CloudFront edge locations to accelerate file downloads, providing 50-500% faster speeds for users far from your S3 bucket region. See [S3 Transfer Acceleration configuration](operations_configuration.md#aws-s3-accelerate) for setup details.
 
+## Available Request Headers
+
+This endpoint supports standard Bedrock headers for enhanced control over your requests. All headers are optional and can be combined as needed.
+
+### Content Safety (Guardrails)
+
+| Header                               | Purpose                            | Valid Values                          |
+|--------------------------------------|------------------------------------|---------------------------------------|
+| `X-Amzn-Bedrock-GuardrailIdentifier` | Guardrail ID for content filtering | Your guardrail identifier             |
+| `X-Amzn-Bedrock-GuardrailVersion`    | Guardrail version                  | Version number (e.g., `1`)            |
+| `X-Amzn-Bedrock-Trace`               | Guardrail trace level              | `disabled`, `enabled`, `enabled_full` |
+
+### Performance Optimization
+
+| Header                                     | Purpose                | Valid Values                  |
+|--------------------------------------------|------------------------|-------------------------------|
+| `X-Amzn-Bedrock-Service-Tier`              | Service tier selection | `priority`, `default`, `flex` |
+| `X-Amzn-Bedrock-PerformanceConfig-Latency` | Latency optimization   | `standard`, `optimized`       |
+
+**Example with headers:**
+
+```bash
+curl -X POST "$BASE/v1/images/generations" \
+  -H "Authorization: Bearer $OPENAI_API_KEY" \
+  -H "Content-Type: application/json" \
+  -H "X-Amzn-Bedrock-Service-Tier: priority" \
+  -H "X-Amzn-Bedrock-PerformanceConfig-Latency: optimized" \
+  -d '{
+    "model": "amazon.nova-canvas-v1:0",
+    "prompt": "A serene mountain landscape at sunset"
+  }'
+```
+
+!!! info "Detailed Documentation"
+    For complete information about these headers, configuration options, and use cases, see:
+
+    - [Bedrock Guardrails Configuration](operations_configuration.md#bedrock-guardrails)
+    - [Service Tier and Performance Configuration](operations_configuration.md#bedrock-service-tier-and-performance-configuration)
+
 ## Advanced Features
 
 ### Provider-Specific Parameters
 
@@ -2537,11 +2537,11 @@ export AWS_BEDROCK_GUARDRAIL_TRACE=enabled
 
 Override global guardrail settings for individual requests using HTTP headers:
 
-| Header | Purpose |
-|--------|---------|
-| `X-Amzn-Bedrock-GuardrailIdentifier` | Guardrail ID |
-| `X-Amzn-Bedrock-GuardrailVersion` | Guardrail version |
-| `X-Amzn-Bedrock-Trace` | Trace level |
+| Header                               | Purpose           | Valid Values                          |
+|--------------------------------------|-------------------|---------------------------------------|
+| `X-Amzn-Bedrock-GuardrailIdentifier` | Guardrail ID      | Your guardrail identifier             |
+| `X-Amzn-Bedrock-GuardrailVersion`    | Guardrail version | Version number (e.g., `1`)            |
+| `X-Amzn-Bedrock-Trace`               | Trace level       | `disabled`, `enabled`, `enabled_full` |
 
 ```bash title="Example cURL Request"
 curl -X POST https://api.example.com/v1/chat/completions \
@@ -2561,6 +2561,97 @@ The `amazon-bedrock-guardrailConfig` object in the request body is supported for
 
 ---
 
+## Bedrock Service Tier and Performance Configuration
+
+Amazon Bedrock service tiers and performance configurations let you trade off inference performance against cost. Use them to set the processing priority and latency optimization of your inference requests.
+
+!!! info "AWS Documentation"
+    For detailed information about service tiers, see:
+
+    - [Amazon Bedrock Service Tiers](https://aws.amazon.com/blogs/aws/new-amazon-bedrock-service-tiers-help-you-match-ai-workload-performance-with-cost/)
+
+### Service Tiers
+
+Service tiers help you match AI workload performance with cost by selecting the appropriate throughput and latency characteristics:
+
+- **`priority`** - Highest priority processing with guaranteed capacity and fastest response times. Best for latency-sensitive applications.
+- **`default`** - Standard processing with balanced performance and cost. Suitable for most production workloads.
+- **`flex`** - Cost-optimized processing with flexible scheduling. Best for batch jobs and non-time-sensitive workloads.
+
+### Performance Configuration
+
+Performance configuration allows you to optimize for latency:
+
+- **`standard`** - Standard latency profile with balanced performance.
+- **`optimized`** - Optimized for lowest possible latency.
+
+### Per-Request Configuration
+
+Configure service tier and performance settings per request using HTTP headers. These headers are available on all Bedrock-based routes (Chat Completions, Embeddings, Images):
+
+| Header                                     | Purpose                | Valid Values                  |
+|--------------------------------------------|------------------------|-------------------------------|
+| `X-Amzn-Bedrock-Service-Tier`              | Service tier selection | `priority`, `default`, `flex` |
+| `X-Amzn-Bedrock-PerformanceConfig-Latency` | Latency optimization   | `standard`, `optimized`       |
+
+```bash title="Example: Chat Completions with Priority Tier and Optimized Latency"
+curl -X POST https://api.example.com/v1/chat/completions \
+  -H "Authorization: Bearer sk-..." \
+  -H "X-Amzn-Bedrock-Service-Tier: priority" \
+  -H "X-Amzn-Bedrock-PerformanceConfig-Latency: optimized" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "anthropic.claude-sonnet-4-5-20250929-v1:0",
+    "messages": [{"role": "user", "content": "Hello!"}]
+  }'
+```
+
+```bash title="Example: Embeddings with Flex Tier for Batch Processing"
+curl -X POST https://api.example.com/v1/embeddings \
+  -H "Authorization: Bearer sk-..." \
+  -H "X-Amzn-Bedrock-Service-Tier: flex" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "amazon.nova-2-multimodal-embeddings-v1:0",
+    "input": ["text 1", "text 2", "text 3"]
+  }'
+```
+
+```bash title="Example: Image Generation with Default Tier"
+curl -X POST https://api.example.com/v1/images/generations \
+  -H "Authorization: Bearer sk-..." \
+  -H "X-Amzn-Bedrock-Service-Tier: default" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "amazon.nova-canvas-v1:0",
+    "prompt": "A serene mountain landscape"
+  }'
+```
+
+!!! tip "When to Use Each Tier"
+    **Priority Tier:**
+
+    - Real-time customer-facing applications
+    - Interactive chatbots and assistants
+    - Applications requiring guaranteed low latency
+    - Production workloads with strict SLAs
+
+    **Default Tier:**
+
+    - Standard production workloads
+    - General-purpose API usage
+    - Applications with moderate latency requirements
+
+    **Flex Tier:**
+
+    - Batch processing and bulk operations
+    - Offline content generation
+    - Data processing pipelines
+    - Non-time-sensitive workloads
+    - Cost-optimized inference at scale
+
+---
+
 ## Audio and Text-to-Speech
 
 #### `DEFAULT_TTS_MODEL` { #default-tts-model }
 
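As a rough illustration of the "When to Use Each Tier" guidance in the hunk above, a caller might derive the header value from two workload traits. This is a sketch only: `choose_service_tier`, `tier_header`, and their parameters are hypothetical, and the decision rule is a simplification of the tip, not behavior implemented by stdapi-ai or Amazon Bedrock.

```python
"""Pick an X-Amzn-Bedrock-Service-Tier value from simple workload traits."""
from __future__ import annotations


def choose_service_tier(latency_sensitive: bool, batch: bool) -> str:
    """Map workload traits to one of the documented tiers (assumed rule)."""
    if latency_sensitive:
        # Real-time and interactive traffic takes precedence over batch hints.
        return "priority"
    if batch:
        # Offline or bulk work that tolerates flexible scheduling.
        return "flex"
    return "default"


def tier_header(latency_sensitive: bool = False, batch: bool = False) -> dict[str, str]:
    """Wrap the chosen tier in the header name documented above."""
    return {"X-Amzn-Bedrock-Service-Tier": choose_service_tier(latency_sensitive, batch)}
```

For example, `tier_header(batch=True)` yields the same `flex` header that the embeddings curl example sets by hand.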
@@ -132,10 +132,11 @@ Expands multimodal embedding capabilities, adds prompt caching support, and intr
 
 ### 💬 Chat Completions
 
-| Provider                                                                       | Endpoint/Feature                                                    | AWS Backend                                                                                                         |
-|--------------------------------------------------------------------------------|---------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------|
-| ![OpenAI](styles/logo_openai.svg){: style="height:20px;width:20px"} **OpenAI** | Prompt caching `/v1/chat/completions` `prompt_cache_key`            | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} Amazon Bedrock - prompt caching |
-| ![OpenAI](styles/logo_openai.svg){: style="height:20px;width:20px"} **OpenAI** | `/v1/chat/completions` GPT5.1 API update  (`reasoning_effort=none`) |                                                                                                                     |
+| Provider                                                                       | Endpoint/Feature                                                    | AWS Backend                                                                                                                                                                                                                                      |
+|--------------------------------------------------------------------------------|---------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| ![OpenAI](styles/logo_openai.svg){: style="height:20px;width:20px"} **OpenAI** | Prompt caching `/v1/chat/completions` `prompt_cache_key`            | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} Amazon Bedrock - prompt caching                                                                                                                              |
+| ![OpenAI](styles/logo_openai.svg){: style="height:20px;width:20px"} **OpenAI** | `/v1/chat/completions` GPT5.1 API update  (`reasoning_effort=none`) |                                                                                                                                                                                                                                                  |
+| ![OpenAI](styles/logo_openai.svg){: style="height:20px;width:20px"} **OpenAI** | Priority & flex service tiers `/v1/chat/completions` `service_tier` | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} [Amazon Bedrock - service tiers NEW](https://aws.amazon.com/blogs/aws/new-amazon-bedrock-service-tiers-help-you-match-ai-workload-performance-with-cost/)    |
 
 ### 🧠 Embeddings
 
@@ -155,6 +156,13 @@ Expands multimodal embedding capabilities, adds prompt caching support, and intr
 | Server-side ARN mapping            | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} Amazon Bedrock                                  |
 | Client-side ARN passing (optional) | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} Amazon Bedrock                                  |
 
+### 📋 Headers for Chat Completions/Embeddings/Image Generation
+
+| Provider                                                                       | Endpoint/Feature                                                                    | AWS Backend                                                                                        |
+|--------------------------------------------------------------------------------|-------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------|
+| ![Amazon](styles/logo_amazon.svg){: style="height:20px;width:20px"} **Amazon** | Service tier configuration header `X-Amzn-Bedrock-Service-Tier`                     | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} Amazon Bedrock |
+| ![Amazon](styles/logo_amazon.svg){: style="height:20px;width:20px"} **Amazon** | Performance latency configuration header `X-Amzn-Bedrock-PerformanceConfig-Latency` | ![Amazon Bedrock](styles/logo_amazon_bedrock.svg){: style="height:20px;width:20px"} Amazon Bedrock |
+
 ### Fixes
 
 - `/v1/chat/completions`: Fix default value passed to the converse API for tools without parameters.
 
@@ -7,7 +7,7 @@ requires-python = ">=3.13"
 dependencies = [
     "fastapi",
     "aioboto3>=15.1.0",
-    "aiobotocore>=2.24.0",
+    "botocore>=1.40.76",
     "pydantic>=2",
     "pydantic-settings>=2",
     "sse-starlette",