Skip to content

AWS announced support for response streaming in Amazon API Gateway to significantly improve the responsiveness of your REST APIs by progressively streaming response payloads back to the client. With this new capability, you can use streamed responses to enhance user experience when building LLM-driven applications (such as AI agents and chatbots).

Notifications You must be signed in to change notification settings

francomano/rest-api-response-streaming

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 

Repository files navigation

REST API Streaming with Lambda Custom Runtime + Bedrock

A compact POC that streams HTTP responses from API Gateway (REST) using a custom Python Lambda runtime inside a container image. It supports:

  • Response streaming (chunked HTTP) from API Gateway
  • Optional chaining to a second Lambda via Function URL
  • Bedrock streaming via converse_stream

Architecture (current behavior)

  • API Gateway (REST) → Lambda 1 (custom runtime)
  • Lambda 1 streams back to API Gateway via Runtime API
  • Optional: Lambda 1 → Lambda 2 (Function URL) → streams back
  • Bedrock calls use bedrock-runtime.converse_stream

Prerequisites

  • AWS CLI configured
  • Docker running
  • Bedrock access in the region you use
  • Inference profiles for models that require them (e.g., Amazon Nova)

Deploy

chmod +x deploy_streaming.sh
./deploy_streaming.sh

The script will:

  1. Build & push container image to ECR
  2. Deploy CloudFormation (API Gateway + 2 Lambdas)
  3. Print endpoint and API key

Required Env Vars

The script exits if these are missing:

export BEDROCK_PROFILE_FIRST="arn-or-id"
export BEDROCK_PROFILE_SECOND="arn-or-id"

Optional overrides:

export AWS_REGION=us-east-1
export BEDROCK_REGION=us-east-1
export BEDROCK_MODEL_FIRST="amazon.nova-micro-v1:0"
export BEDROCK_MODEL_SECOND="amazon.nova-micro-v1:0"
export IMAGE_TAG=auto   # auto = timestamp

Test

curl -N -H "x-api-key: $API_KEY_VALUE" \
  "$BASE_URL/stream?prompt=Hello"

Route through the second Lambda:

curl -N -H "x-api-key: $API_KEY_VALUE" \
  "$BASE_URL/stream?use_second=true&prompt=Hello"

Notes

  • If Bedrock returns throttling/quota errors, the request is rejected before streaming starts.
  • If you see Internal server error, check Lambda logs.

About

AWS announced support for response streaming in Amazon API Gateway to significantly improve the responsiveness of your REST APIs by progressively streaming response payloads back to the client. With this new capability, you can use streamed responses to enhance user experience when building LLM-driven applications (such as AI agents and chatbots).

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages