@Sameerlite (Collaborator) commented Sep 24, 2025

Title

Add AWS Bedrock Async-Invoke Support for Embedding Models

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🆕 New Feature

Changes

Summary

Adds support for AWS Bedrock's async-invoke feature for embedding models, enabling asynchronous processing of large media files (video, audio) and background embedding generation.

Key Features

  • Async-Invoke Support: New model prefix bedrock/async_invoke/ for TwelveLabs Marengo models
  • Job Tracking: Returns invocation ARN in _hidden_params for status monitoring
  • Batch API Integration: Uses retrieve_batch() for job status checking
  • Comprehensive Testing: 15+ test cases covering all scenarios
  • Full Documentation: Complete usage guide with examples
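As a rough illustration of how the bedrock/async_invoke/ prefix could be told apart from a regular Bedrock model string, here is a minimal sketch. The `parse_model` helper and the `ParsedModel` type are hypothetical names made up for this example; they are not taken from the PR and do not describe LiteLLM's actual routing code.

```python
from typing import NamedTuple

# Route segment named in the PR description ("bedrock/async_invoke/...").
ASYNC_INVOKE_ROUTE = "async_invoke"


class ParsedModel(NamedTuple):
    """Hypothetical container for the pieces of a model string."""
    provider: str
    is_async_invoke: bool
    model_id: str


def parse_model(model: str) -> ParsedModel:
    """Split 'provider/[async_invoke/]model_id' into its parts.

    Illustrative only: LiteLLM's real parsing may differ.
    """
    provider, _, rest = model.partition("/")
    route, sep, model_id = rest.partition("/")
    if sep and route == ASYNC_INVOKE_ROUTE:
        # The async route segment is present; strip it from the model id.
        return ParsedModel(provider, True, model_id)
    # No async route segment: everything after the provider is the model id.
    return ParsedModel(provider, False, rest)


parsed = parse_model("bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0")
print(parsed.provider, parsed.is_async_invoke, parsed.model_id)
```

Under this sketch, a model string without the async_invoke segment keeps everything after the provider as its model ID.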

Usage Examples

Basic Async-Invoke

from litellm import embedding

# Submit async job
response = embedding(
    model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0",
    input=["Hello world from LiteLLM async invoke!"],
    aws_region_name="us-east-1",
    output_s3_uri="s3://your-bucket/async-invoke-output/",

    # new litellm param
    input_type="text",
)

# Get job tracking info
invocation_arn = response._hidden_params._invocation_arn
print(f"Job submitted! ARN: {invocation_arn}")

Video/Audio Processing

# Process video file asynchronously
response = embedding(
    model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0",
    input=["s3://your-bucket/video.mp4"],
    aws_region_name="us-east-1",
    input_type="video",
    output_s3_uri="s3://your-bucket/async-invoke-output/"
)

Job Status Monitoring

from litellm import retrieve_batch

# Check job status
status = retrieve_batch(
    batch_id=invocation_arn,
    custom_llm_provider="bedrock",
    aws_region_name="us-east-1"
)

print(f"Status: {status.status}")
print(f"Output: {status.output_file_id}")
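Since the job runs in the background, a caller will usually want to check the status repeatedly until the job finishes. The following is a minimal polling sketch, not part of this PR: `wait_for_job` is a hypothetical helper, and the set of terminal status strings is an assumption, because the PR does not list the values that `status.status` can take for async-invoke jobs.

```python
import time
from typing import Callable

# ASSUMPTION: these terminal status names are illustrative; the real values
# returned for Bedrock async-invoke jobs are not listed in this PR.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled", "expired"})


def wait_for_job(
    fetch_status: Callable[[], str],
    poll_interval: float = 10.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() until it returns a terminal status.

    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        # Wait before asking again to avoid hammering the status endpoint.
        sleep(poll_interval)
```

With the call shown above, `fetch_status` could be `lambda: retrieve_batch(batch_id=invocation_arn, custom_llm_provider="bedrock", aws_region_name="us-east-1").status`. Passing the status lookup in as a callable keeps the loop independent of any particular client and easy to test with a fake.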

Files Changed

  • litellm/llms/bedrock/embed/embedding.py - Core async-invoke logic
  • litellm/llms/bedrock/embed/twelvelabs_marengo_transformation.py - TwelveLabs-specific handling
  • litellm/batches/main.py - Batch API integration for status checking
  • litellm/types/utils.py - Response object updates
  • tests/test_litellm/llms/bedrock/embed/test_bedrock_async_invoke_embedding.py - Comprehensive test suite
  • docs/providers/bedrock_embedding.md - Complete documentation

This feature enables efficient processing of large media files and background embedding generation, making LiteLLM more suitable for production workloads with heavy media processing requirements.

@Sameerlite changed the title from "Litellm x twelvelabs bedrock" to "(feat)Litellm x twelvelabs bedrock[Async Invoke Support]" on Sep 24, 2025
@krrishdholakia (Contributor)

@Sameerlite can you rebase onto main so we can see it with working tests?

@ishaan-jaff (Contributor) left a comment

reviewed

```python
from litellm import embedding

# Text embedding with async-invoke
```

Contributor

Please update the interface to this:

# Text embedding with async-invoke
response = embedding(
    model="bedrock/async_invoke/us.twelvelabs.marengo-embed-2-7-v1:0",
    input=["Hello world from LiteLLM async invoke!"],
    aws_region_name="us-east-1",
    output_s3_uri="s3://your-bucket/async-invoke-output/",

    # new litellm param
    input_type="text",
)

@Sameerlite (Collaborator, Author)

Updated the code as requested

@krrishdholakia (Contributor)

Bump on this? @Sameerlite

@ishaan-jaff (Contributor) left a comment

lgtm

@ishaan-jaff merged commit 544db8d into main on Oct 3, 2025
36 of 50 checks passed