A compact POC that streams HTTP responses from API Gateway (REST) using a custom Python Lambda runtime inside a container image. It supports:
- Response streaming (chunked HTTP) from API Gateway
- Optional chaining to a second Lambda via Function URL
- Bedrock streaming via
converse_stream
- API Gateway (REST) → Lambda 1 (custom runtime)
- Lambda 1 streams back to API Gateway via Runtime API
- Optional: Lambda 1 → Lambda 2 (Function URL) → streams back
- Bedrock calls use
bedrock-runtime.converse_stream
- AWS CLI configured
- Docker running
- Bedrock access in the region you use
- Inference profiles for models that require them (e.g., Amazon Nova)
chmod +x deploy_streaming.sh
./deploy_streaming.shThe script will:
- Build & push container image to ECR
- Deploy CloudFormation (API Gateway + 2 Lambdas)
- Print endpoint and API key
The script exits if these are missing:
export BEDROCK_PROFILE_FIRST="arn-or-id"
export BEDROCK_PROFILE_SECOND="arn-or-id"Optional overrides:
export AWS_REGION=us-east-1
export BEDROCK_REGION=us-east-1
export BEDROCK_MODEL_FIRST="amazon.nova-micro-v1:0"
export BEDROCK_MODEL_SECOND="amazon.nova-micro-v1:0"
export IMAGE_TAG=auto # auto = timestampcurl -N -H "x-api-key: $API_KEY_VALUE" \
"$BASE_URL/stream?prompt=Hello"Route through the second Lambda:
curl -N -H "x-api-key: $API_KEY_VALUE" \
"$BASE_URL/stream?use_second=true&prompt=Hello"- If Bedrock returns throttling/quota errors, the request is rejected before streaming starts.
- If you see
Internal server error, check Lambda logs.