A streamlined, ECS-based infrastructure for serving Kimi K2.5 and Claude models (Opus 4.6, Sonnet 4.5) via AWS Bedrock.
flowchart TB
OpenCode["OpenCode CLI"] -->|"Invoke"| Client["Client<br>(CLI/API)"]
subgraph Auth["Authentication"]
Client
OIDC["OIDC Provider"]
end
Client -->|"1. OAuth Login"| OIDC
OIDC -->|"2. JWT Token"| Client
subgraph ALB["ALB Routing"]
Listener["HTTPS Listener<br>(443)"]
APIKey["API Key Route<br>(X-API-Key Header)"]
JWT["JWT Validation<br>(Bearer Token)"]
end
Client -->|"3a. X-API-Key"| Listener
Client -->|"3b. Bearer Token"| Listener
Listener -->|"API Key Auth"| APIKey
Listener -->|"JWT Auth"| JWT
JWT -->|"Validate"| OIDC
subgraph Router["ECS Router (Python)"]
ECS["ECS Fargate<br>Service"]
CheckJWT{"JWT<br>Present?"}
CheckAPIKey{"API Key<br>Present?"}
AuthCheck["Validate API Key"]
Route["Model Router"]
end
APIKey -->|"Forward"| ECS
JWT -->|"Forward"| ECS
ECS -->|"Check Headers"| CheckJWT
CheckJWT -->|"No JWT"| CheckAPIKey
CheckJWT -->|"Has JWT"| Route
CheckAPIKey -->|"Has API Key"| AuthCheck
CheckAPIKey -->|"No API Key"| Route
AuthCheck -->|"Read/Write"| DynamoDB[("DynamoDB<br>API Keys")]
AuthCheck -->|"Valid Key"| Route
subgraph Bedrock["AWS Bedrock"]
Route -->|"Anthropic Models<br>→ Converse API"| Converse["Bedrock<br>Converse API"]
Route -->|"Kimi Models<br>→ Mantle"| Mantle["Bedrock<br>Mantle"]
end
style OpenCode fill:#e3f2fd
style Client fill:#e1f5ff
style OIDC fill:#ffebee
style Listener fill:#fff4e1
style ECS fill:#e8f5e9
style CheckJWT fill:#fff9c4
style CheckAPIKey fill:#fff9c4
style DynamoDB fill:#e0f2f1
style Converse fill:#f3e5f5
style Mantle fill:#f3e5f5
- ECS Fargate: Containerized router service (no EC2 management)
- Multi-Model Support: Kimi K2.5, Claude Opus 4.6, and Claude Sonnet 4.5 via Bedrock
- Dual Routing: Bedrock Converse API for Anthropic models, Bedrock Mantle for others
- Dual ALB Setup: JWT validation for API, OIDC for browser traffic
- API Key Authentication: Long-lived keys for CI/CD and automation (no browser required)
- Structured Logging: JSON logs for CloudWatch Insights
- Auto-Scaling: 1-3 tasks based on CPU utilization
opencode-stack/
├── src/
│ ├── main.ts # CDK app entry point
│ ├── stacks/
│ │ ├── network-stack.ts # VPC, subnets, networking
│ │ ├── shared-certificate-stack.ts # ACM certificates
│ │ ├── auth-stack.ts # OIDC auth (Cognito or external)
│ │ ├── api-stack.ts # ECS service, ALB, JWT
│ │ └── distribution-stack.ts # Distribution landing page
│ ├── constructs/
│ │ ├── vpc-endpoints-construct.ts # Bedrock VPC endpoints
│ │ └── alb-security-group-construct.ts # ALB security group
│ └── types/
│ └── index.ts # Shared TypeScript types
├── services/
│ ├── router/
│ │ ├── main.py # Bedrock proxy service
│ │ ├── Dockerfile
│ │ └── requirements.txt
│ └── distribution/
│ ├── lambda/
│ │ └── index.py # Landing page handler
│ └── assets/ # Client binaries
│ ├── opencode-auth-* # Auth CLI binaries
│ ├── install.sh
│ └── opencode.json
├── auth/
│ └── opencode-auth/ # Go-based auth CLI
│ ├── main.go
│ ├── auth/ # Authentication logic
│ ├── apikey/ # API key management client
│ └── proxy/ # Proxy server
└── scripts/
└── deploy.sh # Deployment script
- AWS CLI configured (see AWS Configuration)
- Node.js 18+
- AWS CDK CLI: npm install -g aws-cdk
- Docker or Finch (for building container images)
Configure AWS credentials using one of the following methods:
Option 1: AWS SSO (Recommended)
aws configure sso
aws sso login --profile your-profile
export AWS_PROFILE=your-profile
Option 2: Environment Variables
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_REGION=us-east-1
Option 3: IAM Role (EC2/ECS/Lambda)
Credentials are provided automatically by the runtime environment (the instance metadata service on EC2, the task role on ECS, the execution role on Lambda).
npm install
# Configure credentials (see AWS Configuration section above)
aws configure
# or
export AWS_PROFILE=your-profile
Use the deployment script for a fully automated deployment:
# Deploy everything
./scripts/deploy.sh
# Or deploy specific phases
./scripts/deploy.sh phase1 # Foundation
./scripts/deploy.sh phase4 # Router only
See scripts/README.md for detailed usage.
Note: The OIDC ALB client secret is created automatically when you run ./scripts/deploy.sh auth (Cognito mode). For external OIDC providers, ./scripts/setup.sh handles secret creation during the interactive setup.
If you prefer manual deployment:
# Build CDK
npm run build
# Bootstrap (first time only)
npx cdk bootstrap aws://YOUR_ACCOUNT_ID/us-east-1
# Deploy all stacks
npx cdk deploy --all
# Or deploy specific stacks (in order)
npx cdk deploy OpenCodeNetwork-dev # Phase 1: VPC
npx cdk deploy OpenCodeCertificate-dev # Phase 1: Certificate
npx cdk deploy OpenCodeAuth-dev # Phase 2: Auth
npx cdk deploy OpenCodeApi-dev # Phase 3: API/Router
npx cdk deploy OpenCodeDistribution-dev # Phase 4: Distribution
Using Docker (default):
cd services/router
docker build -t bedrock-router .
aws ecr get-login-password | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
docker tag bedrock-router:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/bedrock-router-dev:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/bedrock-router-dev:latest
Using Finch:
cd services/router
finch build -t bedrock-router .
aws ecr get-login-password | finch login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
finch tag bedrock-router:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/bedrock-router-dev:latest
finch push 123456789012.dkr.ecr.us-east-1.amazonaws.com/bedrock-router-dev:latest
| Variable | Default | Description |
|---|---|---|
| PORT | 8080 | HTTP server port |
| LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARN, ERROR) |
| BEDROCK_MANTLE_URL | https://bedrock-mantle.us-east-1.api.aws | Bedrock Mantle endpoint |
| BEDROCK_MODEL_MAP | - | JSON string for model mapping |
| API_KEYS_TABLE_NAME | - | DynamoDB table name for API key storage |
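As an illustration of how a service might consume the variables above, the following is a minimal sketch that reads them from the environment and applies the documented defaults. The names `RouterConfig` and `load_config` are hypothetical, and the assumption that BEDROCK_MODEL_MAP decodes to a JSON object of alias-to-model-ID pairs is a guess; the actual router in services/router/main.py may handle this differently.

```python
import json
import os
from dataclasses import dataclass, field
from typing import Dict, Mapping, Optional


@dataclass
class RouterConfig:
    """Hypothetical container for the documented settings and their defaults."""
    port: int = 8080
    log_level: str = "INFO"
    bedrock_mantle_url: str = "https://bedrock-mantle.us-east-1.api.aws"
    model_map: Dict[str, str] = field(default_factory=dict)
    api_keys_table_name: Optional[str] = None


def load_config(env: Optional[Mapping[str, str]] = None) -> RouterConfig:
    """Build a RouterConfig from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    defaults = RouterConfig()

    # Assumption: BEDROCK_MODEL_MAP holds a JSON object (alias -> model ID).
    raw_map = env.get("BEDROCK_MODEL_MAP", "")
    model_map = json.loads(raw_map) if raw_map else {}
    if not isinstance(model_map, dict):
        raise ValueError("BEDROCK_MODEL_MAP must be a JSON object")

    return RouterConfig(
        port=int(env.get("PORT", defaults.port)),
        log_level=env.get("LOG_LEVEL", defaults.log_level).upper(),
        bedrock_mantle_url=env.get("BEDROCK_MANTLE_URL", defaults.bedrock_mantle_url),
        model_map=model_map,
        api_keys_table_name=env.get("API_KEYS_TABLE_NAME") or None,
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to exercise in tests, for example `load_config({"PORT": "9000"})`.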
Set in cdk.json or via command line:
{
"context": {
"environment": "dev",
"hostedZoneId": "Z0123456789ABCDEFGHIJ",
"hostedZoneName": "example.com",
"certificateDomain": "*.oc.example.com",
"apiDomain": "oc.example.com",
"webDomain": "downloads.oc.example.com"
}
}

- GET /health - Basic health check (returns HTTP 200)
- GET /ready - Deep health check (validates Bedrock token generation)

- GET /v1/models - List available models
- POST /v1/chat/completions - Chat completions (streaming supported)

- POST /v1/api-keys - Create a new API key
- GET /v1/api-keys - List your API keys
- DELETE /v1/api-keys/{key_prefix} - Revoke an API key
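The difference between the /health and /ready checks listed above can be sketched as two small functions. This is only an illustration: the function names, the response bodies, and the use of HTTP 503 for a failed readiness check are assumptions, not the router's documented behavior.

```python
from typing import Callable, Tuple


def health() -> Tuple[int, dict]:
    """Shallow check: the process is up and able to answer."""
    return 200, {"status": "ok"}


def ready(generate_token: Callable[[], str]) -> Tuple[int, dict]:
    """Deep check: attempt token generation and report failure as 503 (assumed)."""
    try:
        token = generate_token()
    except Exception as exc:  # any failure is treated as "not ready"
        return 503, {"status": "not_ready", "error": type(exc).__name__}
    if not token:
        return 503, {"status": "not_ready", "error": "empty token"}
    return 200, {"status": "ready"}
```

Keeping the shallow check free of external calls means a load balancer health check would still pass during a transient Bedrock outage, while the deep check surfaces it.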
# Kimi K2.5
curl -X POST https://oc.example.com/v1/chat/completions \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "kimi-k25",
"messages": [{"role": "user", "content": "Hello!"}]
}'
# Claude Opus 4.6
curl -X POST https://oc.example.com/v1/chat/completions \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-opus",
"messages": [{"role": "user", "content": "Hello!"}]
}'
# Claude Sonnet 4.5
curl -X POST https://oc.example.com/v1/chat/completions \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet",
"messages": [{"role": "user", "content": "Hello!"}]
}'
# Using API Key (for CI/CD and automation)
curl -X POST https://oc.example.com/v1/chat/completions \
-H "X-API-Key: oc_YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-opus",
"messages": [{"role": "user", "content": "Hello!"}]
}'
| Model | Alias | Context Window | Features |
|---|---|---|---|
| Kimi K2.5 | kimi-k25 | 256K tokens | Text, Image, Tool calling |
| Claude Opus 4.6 | claude-opus | 200K tokens | Text, Image, Reasoning, Tool calling, Prompt caching |
| Claude Sonnet 4.5 | claude-sonnet | 200K tokens | Text, Image, Reasoning, Tool calling, Prompt caching |
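The same requests shown with curl can also be built from Python using only the standard library. The sketch below constructs the request without sending it; `build_chat_request` is a hypothetical helper, and `oc.example.com` is the placeholder domain used elsewhere in this README.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://oc.example.com"  # placeholder domain, replace with your apiDomain


def build_chat_request(
    model: str,
    prompt: str,
    *,
    jwt: Optional[str] = None,
    api_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build a POST /v1/chat/completions request with exactly one credential."""
    if (jwt is None) == (api_key is None):
        raise ValueError("provide exactly one of jwt or api_key")

    headers = {"Content-Type": "application/json"}
    if jwt is not None:
        headers["Authorization"] = f"Bearer {jwt}"
    else:
        headers["X-API-Key"] = api_key

    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions", data=body, headers=headers, method="POST"
    )
```

To send it, pass the result to `urllib.request.urlopen(...)` and decode the JSON response body.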
The router automatically detects the target model and routes to the appropriate backend:
- Anthropic models (Claude Opus/Sonnet): Use Bedrock Converse API via boto3 with automatic prompt caching
- Other models (Kimi): Use Bedrock Mantle proxy
Both routes return OpenAI-compatible responses.
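The routing decision described above might look roughly like the following. This is a sketch, not the router's actual logic: it assumes Anthropic aliases can be recognized by a `claude-` prefix, and the names `Backend` and `select_backend` are hypothetical.

```python
from enum import Enum


class Backend(Enum):
    CONVERSE = "bedrock-converse"  # Bedrock Converse API (Anthropic models)
    MANTLE = "bedrock-mantle"      # Bedrock Mantle proxy (everything else)


# Assumption: every Anthropic alias starts with this prefix.
ANTHROPIC_PREFIXES = ("claude-",)


def select_backend(model: str) -> Backend:
    """Pick a backend from the requested model alias."""
    name = model.strip().lower()
    if name.startswith(ANTHROPIC_PREFIXES):
        return Backend.CONVERSE
    return Backend.MANTLE
```

With the aliases from the model table, `select_backend("claude-opus")` selects the Converse path and `select_backend("kimi-k25")` selects Mantle.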
- Client obtains JWT from Cognito via opencode-auth login
- ALB validates JWT via JWKS endpoint
- Request forwarded to ECS router
- Router uses IAM role to generate Bedrock tokens
For detailed JWT validation documentation, see JWT Validation Guide.
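The last step of the flow above has the router generate Bedrock tokens, and this README cites a 1-hour TTL with automatic refresh for them. One way to cache such a token is sketched below. The `TokenCache` class, the five-minute refresh margin, and the `fetch` callable are all assumptions for illustration; how the router actually obtains and refreshes tokens is not shown here.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Cache a token and refetch it shortly before its TTL elapses.

    Not thread-safe; a real service would likely guard get() with a lock.
    """

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl_seconds: float = 3600.0,    # 1-hour TTL, as stated in this README
        refresh_margin: float = 300.0,  # assumed safety margin before expiry
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._refresh_margin = refresh_margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached token, refreshing it when close to expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._refresh_margin:
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token
```

Injecting the clock makes the refresh behavior testable without waiting for real time to pass.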
- User creates API key via opencode-auth apikey create (requires prior JWT auth)
- Client sends X-API-Key: oc_... header with requests
- ALB forwards to ECS router (no JWT validation for API key requests)
- Router validates key hash against DynamoDB, then routes normally
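The header checks the router performs (JWT present, otherwise API key present) can be sketched as a small classifier. This is a hedged illustration: it assumes the JWT arrives as an `Authorization: Bearer` header, as in the curl examples, and the names `AuthPath` and `classify_request` are hypothetical.

```python
from enum import Enum
from typing import Mapping


class AuthPath(Enum):
    JWT = "jwt"          # already validated by the ALB, route directly
    API_KEY = "api_key"  # must be validated against the key store first
    NONE = "none"        # neither credential header is present


def classify_request(headers: Mapping[str, str]) -> AuthPath:
    """Decide which authentication path a request should take."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}

    auth = normalized.get("authorization", "")
    if auth.lower().startswith("bearer ") and auth[7:].strip():
        return AuthPath.JWT
    if normalized.get("x-api-key", "").strip():
        return AuthPath.API_KEY
    return AuthPath.NONE
```

Checking for the JWT first mirrors the order in the architecture diagram, so a request carrying both headers is treated as JWT-authenticated.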
API keys are ideal for headless environments that can't perform browser-based OAuth flows. Keys are tied to user identity, expire after 1-365 days, and can be managed via the CLI.
# Create and save an API key
opencode-auth apikey create --description "CI pipeline" --expires-in-days 90 --save
# Restart proxy to use the new key
opencode-auth proxy restart
# All proxy traffic now authenticates with the API key automatically
- User authenticates via Cognito
- Cognito federates to your Identity Provider
- ALB manages OIDC session
- User can access distribution landing page
- Log Group: /ecs/bedrock-router-{environment}
- Format: Structured JSON
- Fields: timestamp, level, message, request_id, duration_ms
- ECS CPU/Memory utilization
- ALB request count/latency
- Bedrock token refresh events
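For the structured log fields listed above (timestamp, level, message, request_id, duration_ms), a formatter built on the standard `logging` module might look like this. It is a sketch under assumptions: the `JsonFormatter` and `build_logger` names are hypothetical, and the router's real log format may include other fields or a different timestamp style.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            # Optional fields, supplied through logging's `extra=` argument.
            "request_id": getattr(record, "request_id", None),
            "duration_ms": getattr(record, "duration_ms", None),
        }
        return json.dumps(entry)


def build_logger(name: str = "router") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching duplicate handlers
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `build_logger().info("request completed", extra={"request_id": "abc", "duration_ms": 12.5})` emits a single JSON line, which CloudWatch Logs Insights can then query by field name.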
cd services/router
pip install -r requirements.txt
python main.py
npm test
# Synthesize CloudFormation
npx cdk synth
# Diff changes
npx cdk diff
# Deploy specific stack
npx cdk deploy OpenCodeApi-dev
# Destroy stack
npx cdk destroy OpenCodeApi-dev
- VPC and networking (NetworkStack)
- ACM certificates (SharedCertificateStack)
- ECS cluster and ECR repository
- IAM roles and permissions
- Cognito user pool with OIDC IdP (CognitoStack)
- JWT ALB integration (ApiStack)
- ECS Fargate service with Bedrock proxy
- Target groups and listener rules
- Distribution service with Lambda
- Landing page for downloads
- OIDC ALB for browser auth
- S3 bucket for assets
- opencode-auth CLI (Go-based)
- Client configuration
- Multi-platform binaries
- Documentation
Current Status: All phases complete
Next: Testing and optimization
- Lambda-based share API with event-sourcing storage
- WebSocket real-time updates
- Standalone HTML viewer with dark mode
- CloudFormation templates for Lambda + API Gateway + S3 + DynamoDB
See docs/SHARE-FEATURE.md for full details.
- IAM Roles: ECS task role with minimal Bedrock permissions
- Token Refresh: 1-hour TTL with automatic refresh
- Network: Private subnets, no public IP on tasks
- Encryption: HTTPS/TLS for all communications
- Secrets: No hardcoded credentials, use IAM roles
- JWT Validation: ALB-level JWT validation via JWKS endpoint (details)
- API Keys: SHA-256 hashed storage, mandatory expiration, max 10 per user
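The API key properties above (SHA-256 hashed storage, mandatory expiration, at most 10 keys per user) can be illustrated with a small in-memory sketch. A plain dictionary stands in for the DynamoDB table, and the record layout, key length, and function names are all assumptions rather than the router's actual schema.

```python
import hashlib
import secrets
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

KEY_PREFIX = "oc_"
MAX_KEYS_PER_USER = 10


def hash_key(raw_key: str) -> str:
    """Only this SHA-256 digest is stored, never the raw key."""
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


def create_key(store: Dict[str, dict], user_id: str, expires_in_days: int,
               now: Optional[datetime] = None) -> str:
    """Create a key for user_id and return the raw value (shown to the user once)."""
    if not 1 <= expires_in_days <= 365:
        raise ValueError("expires_in_days must be between 1 and 365")
    now = now or datetime.now(timezone.utc)
    owned = [r for r in store.values() if r["user_id"] == user_id]
    if len(owned) >= MAX_KEYS_PER_USER:
        raise ValueError("key limit reached for this user")
    raw_key = KEY_PREFIX + secrets.token_urlsafe(32)
    store[hash_key(raw_key)] = {
        "user_id": user_id,
        "expires_at": now + timedelta(days=expires_in_days),
    }
    return raw_key


def validate_key(store: Dict[str, dict], raw_key: str,
                 now: Optional[datetime] = None) -> Optional[str]:
    """Return the owning user_id for a valid, unexpired key, otherwise None."""
    now = now or datetime.now(timezone.utc)
    record = store.get(hash_key(raw_key))
    if record is None or now >= record["expires_at"]:
        return None
    return record["user_id"]
```

Because lookups go through the digest, a leaked copy of the table does not reveal usable keys.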
- ECS Fargate: Pay per use; scales down to a minimum of 1 task (never to zero)
- ALB: Shared across services
- Bedrock: On-demand pricing, no reserved capacity
- Estimated Cost: ~$50-100/month for 50 developers
If you're using opencode, run /troubleshoot for interactive diagnosis or /jwt-debug for authentication issues.
Check CloudWatch logs:
aws logs tail /ecs/bedrock-router-dev --follow
Verify IAM permissions:
aws iam simulate-principal-policy \
--policy-source-arn TASK_ROLE_ARN \
--action-names bedrock:InvokeModel
Check /ready endpoint manually:
curl http://localhost:8080/ready
- Create feature branch
- Make changes
- Run tests: npm test
- Build: npm run build
- Submit PR
This library is licensed under the MIT-0 License. See the LICENSE file.
- Issues: Create a GitHub issue
Last Updated: 2026-02-17