52 changes: 52 additions & 0 deletions docs/my-website/docs/embedding/supported_embedding.md
@@ -266,7 +266,59 @@ print(response)
| Titan Embeddings - G1 | `embedding(model="amazon.titan-embed-text-v1", input=input)` |
| Cohere Embeddings - English | `embedding(model="cohere.embed-english-v3", input=input)` |
| Cohere Embeddings - Multilingual | `embedding(model="cohere.embed-multilingual-v3", input=input)` |
| TwelveLabs Marengo (Async) | `embedding(model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0", input=input, input_type="text")` (see [Async Invoke Docs](../providers/bedrock_embedding#async-invoke-support)) |

## TwelveLabs Bedrock Embedding Models

TwelveLabs Marengo models support multimodal embeddings (text, image, video, audio) and require the `input_type` parameter to specify the input format.

### Usage

```python
from litellm import embedding
import os

# Set AWS credentials
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = "us-east-1"

# Text embedding
response = embedding(
    model="bedrock/us.twelvelabs.marengo-embed-2-7-v1:0",
    input=["Hello world from LiteLLM!"],
    input_type="text"  # Required parameter
)

# Image embedding (base64)
response = embedding(
    model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0",
    input=["data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQ..."],
    input_type="image",  # Required parameter
    output_s3_uri="s3://your-bucket/async-invoke-output/"
)

# Video embedding (S3 URL)
response = embedding(
    model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0",
    input=["s3://your-bucket/video.mp4"],
    input_type="video",  # Required parameter
    output_s3_uri="s3://your-bucket/async-invoke-output/"
)
```
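
A fourth input type, audio, follows the same async-invoke pattern as video. A minimal sketch, assuming an S3-hosted audio file and an output bucket you control:

```python
# Audio embedding (S3 URL; hypothetical paths, adjust to your bucket)
response = embedding(
    model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0",
    input=["s3://your-bucket/audio.mp3"],
    input_type="audio",  # Required parameter
    output_s3_uri="s3://your-bucket/async-invoke-output/"
)
```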

### Required Parameters

| Parameter | Description | Values |
|-----------|-------------|--------|
| `input_type` | Type of input content | `"text"`, `"image"`, `"video"`, `"audio"` |

### Supported Models

| Model Name | Function Call | Notes |
|------------|---------------|-------|
| TwelveLabs Marengo 2.7 (Sync) | `embedding(model="bedrock/us.twelvelabs.marengo-embed-2-7-v1:0", input=input, input_type="text")` | Text embeddings only |
| TwelveLabs Marengo 2.7 (Async) | `embedding(model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0", input=input, input_type="text/image/video/audio")` | All input types, requires `output_s3_uri` |

## Cohere Embedding Models
https://docs.cohere.com/reference/embed
176 changes: 176 additions & 0 deletions docs/my-website/docs/providers/bedrock_embedding.md
@@ -8,6 +8,182 @@
| Cohere | `bedrock/cohere.*` | [Cohere Embeddings](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-cohere-embed.html) |
| TwelveLabs | `bedrock/us.twelvelabs.*` | [TwelveLabs](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-twelvelabs.html) |
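
For example, the `bedrock/` prefix selects the Bedrock provider and the remainder is the Bedrock model ID. A minimal sketch for a Cohere embedding model, assuming AWS credentials are set via environment variables:

```python
from litellm import embedding

# The `bedrock/` prefix routes the call to AWS Bedrock; the rest is the model ID
response = embedding(
    model="bedrock/cohere.embed-english-v3",
    input=["good morning from litellm"],
)
print(response)
```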

## Async Invoke Support

LiteLLM supports AWS Bedrock's async-invoke feature for embedding models that require asynchronous processing. This is particularly useful for large media files (video, audio) or when embeddings should be processed in the background.

### Supported Models

| Provider | Async Invoke Route | Use Case |
|----------|-------------------|----------|
| TwelveLabs Marengo | `bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0` | Video, audio, image, and text embeddings |

### Required Parameters

When using async-invoke, you must provide:

| Parameter | Description | Required |
|-----------|-------------|----------|
| `output_s3_uri` | S3 URI where the embedding results will be stored | ✅ Yes |
| `input_type` | Type of input: `"text"`, `"image"`, `"video"`, or `"audio"` | ✅ Yes |
| `aws_region_name` | AWS region for the request | ✅ Yes |

### Usage

#### Basic Async Invoke

```python
from litellm import embedding

# Text embedding with async-invoke
response = embedding(
    model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0",
    input=["Hello world from LiteLLM async invoke!"],
    aws_region_name="us-east-1",
    input_type="text",
    output_s3_uri="s3://your-bucket/async-invoke-output/"
)

print(f"Job submitted! Invocation ARN: {response._hidden_params._invocation_arn}")
```

#### Video/Audio Embedding

```python
# Video embedding (requires async-invoke)
response = embedding(
    model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0",
    input=["s3://your-bucket/video.mp4"],  # S3 URL for video
    aws_region_name="us-east-1",
    input_type="video",
    output_s3_uri="s3://your-bucket/async-invoke-output/"
)

print(f"Video embedding job submitted! ARN: {response._hidden_params._invocation_arn}")
```
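
Audio inputs follow the same pattern; a sketch, assuming an S3-hosted audio file (hypothetical path):

```python
# Audio embedding (requires async-invoke)
response = embedding(
    model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0",
    input=["s3://your-bucket/audio.mp3"],  # S3 URL for audio
    aws_region_name="us-east-1",
    input_type="audio",
    output_s3_uri="s3://your-bucket/async-invoke-output/"
)

print(f"Audio embedding job submitted! ARN: {response._hidden_params._invocation_arn}")
```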

#### Image Embedding with Base64

```python
import base64

# Load and encode image
with open("image.jpg", "rb") as img_file:
    img_data = base64.b64encode(img_file.read()).decode('utf-8')
    img_base64 = f"data:image/jpeg;base64,{img_data}"

response = embedding(
    model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0",
    input=[img_base64],
    aws_region_name="us-east-1",
    input_type="image",
    output_s3_uri="s3://your-bucket/async-invoke-output/"
)
```

### Retrieving Job Information

#### Getting Job ID and Invocation ARN

The async-invoke response includes the invocation ARN in the hidden parameters:

```python
response = embedding(
    model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0",
    input=["Hello world"],
    aws_region_name="us-east-1",
    input_type="text",
    output_s3_uri="s3://your-bucket/async-invoke-output/"
)

# Access invocation ARN
invocation_arn = response._hidden_params._invocation_arn
print(f"Invocation ARN: {invocation_arn}")

# Extract job ID from ARN (last part after the last slash)
job_id = invocation_arn.split("/")[-1]
print(f"Job ID: {job_id}")
```

#### Checking Job Status

Use LiteLLM's `retrieve_batch` function to check if your job is still processing:

```python
from litellm import retrieve_batch

def check_async_job_status(invocation_arn, aws_region_name="us-east-1"):
    """Check the status of an async invoke job using LiteLLM's batch API"""
    try:
        response = retrieve_batch(
            batch_id=invocation_arn,
            custom_llm_provider="bedrock",
            aws_region_name=aws_region_name
        )
        return response
    except Exception as e:
        print(f"Error checking job status: {e}")
        return None

# Check status
status = check_async_job_status(invocation_arn, "us-east-1")
if status:
    print(f"Job Status: {status.status}")
    print(f"Output Location: {status.output_file_id}")
```

**Note:** The actual embedding results are stored in S3. The `output_file_id` from the batch status can be used to locate the results file in your S3 bucket.
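
A minimal sketch of downloading the results once the job has completed, assuming `output_file_id` is an `s3://` URI and the results object is JSON; the exact output layout is not guaranteed here, so check the prefix you passed as `output_s3_uri`:

```python
import json
import boto3

# Assumption: status.output_file_id looks like
# "s3://your-bucket/async-invoke-output/<job-id>/output.json"
output_uri = status.output_file_id
bucket, _, key = output_uri.removeprefix("s3://").partition("/")

s3 = boto3.client("s3")
obj = s3.get_object(Bucket=bucket, Key=key)
results = json.loads(obj["Body"].read())
print(results)
```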

### Error Handling

#### Common Errors

| Error | Cause | Solution |
|-------|-------|----------|
| `ValueError: output_s3_uri cannot be empty` | Missing S3 output URI | Provide a valid S3 URI |
| `ValueError: Input type 'video' requires async_invoke route` | Using video/audio without async-invoke | Use `bedrock/async_invoke/` model prefix |
| `ValueError: input_type is required` | Missing input type parameter | Specify `input_type` parameter |

#### Example Error Handling

```python
try:
    response = embedding(
        model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0",
        input=["Hello world"],
        aws_region_name="us-east-1",
        input_type="text",
        output_s3_uri="s3://your-bucket/output/"  # Required for async-invoke
    )
    print("Job submitted successfully!")

except ValueError as e:
    if "output_s3_uri cannot be empty" in str(e):
        print("Error: Please provide a valid S3 output URI")
    elif "requires async_invoke route" in str(e):
        print("Error: Use async_invoke model for video/audio inputs")
    else:
        print(f"Error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
```

### Best Practices

1. **Use async-invoke for large files**: Video and audio files are better processed asynchronously
2. **Use LiteLLM batch API**: Use `retrieve_batch()` instead of direct Bedrock API calls for status checking
3. **Monitor job status**: Check job status periodically using the batch API to know when results are ready (a minimal polling sketch follows this list)
4. **Handle errors gracefully**: Implement proper error handling for network issues and job failures
5. **Set appropriate timeouts**: Consider the processing time for large files
6. **Use S3 for large inputs**: For video/audio, use S3 URLs instead of base64 encoding
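
A minimal polling sketch built on `check_async_job_status()` from above; the terminal status strings are an assumption, so adjust them to the values your batch responses actually return:

```python
import time

def wait_for_async_job(invocation_arn, aws_region_name="us-east-1", interval=30, timeout=1800):
    """Poll the async job until it reaches a terminal state or times out."""
    start = time.time()
    while time.time() - start < timeout:
        status = check_async_job_status(invocation_arn, aws_region_name)
        # Assumed terminal states; verify against your actual batch responses
        if status and status.status in ("completed", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"Async job {invocation_arn} did not finish within {timeout}s")

# final_status = wait_for_async_job(invocation_arn)
```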

### Limitations

- Async-invoke is currently only supported for TwelveLabs Marengo models
- Results are stored in S3 and must be retrieved separately using the output file ID
- Job status checking requires using LiteLLM's `retrieve_batch()` function
- No built-in polling mechanism in LiteLLM (must implement your own status checking loop)

### API keys
These can be set as environment variables or passed as **params to litellm.embedding()**
```python
1 change: 1 addition & 0 deletions litellm/__init__.py
@@ -1161,6 +1161,7 @@ def add_known_models():
)
from .llms.cohere.chat.transformation import CohereChatConfig
from .llms.bedrock.embed.cohere_transformation import BedrockCohereEmbeddingConfig
from .llms.bedrock.embed.twelvelabs_marengo_transformation import TwelveLabsMarengoEmbeddingConfig
from .llms.openai.openai import OpenAIConfig, MistralEmbeddingConfig
from .llms.openai.image_variations.transformation import OpenAIImageVariationConfig
from .llms.deepinfra.chat.transformation import DeepInfraConfig