A serverless, cost-efficient AWS architecture that exposes enterprise artifacts (coding guidelines, documentation) to IDE integrations through a semantic search API.
- Amazon S3:
- my-enterprise-artifacts-bucket: Raw artifacts (PDFs, Markdown)
- my-enterprise-vectors-bucket: Vector embeddings and indexes (S3 Vectors)
- AWS Lambda:
- Ingestion Lambda: Processes uploads, chunks text, generates embeddings
- Query Lambda: Handles API requests, performs vector search
- Application Load Balancer: HTTPS API endpoint with authentication
- Amazon Bedrock (Optional): Titan Embeddings fallback
- Amazon Cognito (Optional): API authentication
- S3 Vectors: 70-95% savings vs traditional vector DBs (~$0.023/GB/month)
- Open-source embeddings (sentence-transformers) to avoid Bedrock costs
- Low-memory Lambda functions (128-512MB)
- ALB instead of API Gateway (~$0.0225/hour plus LCU usage charges)
- S3 Intelligent-Tiering for storage optimization
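The two Lambda responsibilities listed above (chunking text on ingestion, ranking vectors on query) can be illustrated with a minimal, dependency-free sketch. This is not the project's `utils/text_processing.py` or `utils/s3_vectors.py`: the function names, the word-based chunking with overlap, and the brute-force cosine scan are all assumptions chosen for illustration. In the deployed system the nearest-neighbor search is expected to be delegated to S3 Vectors rather than computed in Python.

```python
import math
from typing import List, Sequence, Tuple


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into overlapping chunks of at most `chunk_size` words (hypothetical helper)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reaches the end of the text
    return chunks


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    indexed: Sequence[Tuple[str, Sequence[float]]],
    k: int = 5,
) -> List[Tuple[float, str]]:
    """Brute-force ranking: score every (chunk, vector) pair and keep the k best."""
    scored = [(cosine_similarity(query_vec, vec), chunk) for chunk, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Overlapping chunks reduce the chance that a guideline is split across a chunk boundary and missed at query time; the overlap size is a tuning parameter, not a value taken from this project.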
.
├── lambda/
│   ├── ingestion/
│   │   ├── handler.py          # Ingestion Lambda handler
│   │   ├── requirements.txt    # Dependencies
│   │   └── Dockerfile          # Container image (optional)
│   └── query/
│       ├── handler.py          # Query Lambda handler
│       ├── requirements.txt    # Dependencies
│       └── Dockerfile          # Container image (optional)
├── utils/
│   ├── s3_vectors.py           # S3 vector operations
│   ├── embedding.py            # Embedding generation
│   └── text_processing.py      # Text chunking utilities
├── infrastructure/
│   ├── app.py                  # AWS CDK main app
│   ├── stacks/
│   │   ├── storage_stack.py    # S3 buckets
│   │   ├── compute_stack.py    # Lambda functions
│   │   └── network_stack.py    # ALB, VPC
│   └── cdk.json                # CDK configuration
├── scripts/
│   ├── deploy.sh               # Deployment script
│   └── upload_artifact.py      # Sample upload script
└── requirements.txt            # Root dependencies
- AWS CLI configured with credentials
- Python 3.9+
- AWS CDK installed (npm install -g aws-cdk)
- Docker (for Lambda container images)
Install dependencies:
pip install -r requirements.txt
Deploy the infrastructure:
cd infrastructure
cdk bootstrap aws://ACCOUNT-ID/us-east-1
cdk deploy --all
Upload an artifact (from the repository root):
python scripts/upload_artifact.py --file path/to/coding-guidelines.pdf
Query the API:
curl -X POST https://YOUR-ALB-ENDPOINT/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the guidelines for error handling in Java?"}'
POST /query performs semantic search on enterprise artifacts.
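For IDE plugins or scripts that would rather call the endpoint from Python than from curl, the same request could look like the sketch below. It uses only the standard library. The helper names `build_query_request` and `search` are hypothetical, and the `top_k` field and `results` key are taken from the request and response examples in this README; authentication headers (for example a Cognito token) are omitted because the README does not specify them.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_query_request(endpoint: str, query: str, top_k: int = 5) -> urllib.request.Request:
    """Build the POST /query request without sending it (hypothetical helper)."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(endpoint: str, query: str, top_k: int = 5, timeout: float = 10.0) -> List[Dict[str, Any]]:
    """Send the query and return the list under the "results" key (empty if absent)."""
    request = build_query_request(endpoint, query, top_k)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body.get("results", [])


if __name__ == "__main__":
    # Replace the placeholder with the deployed ALB endpoint before running.
    for hit in search("https://YOUR-ALB-ENDPOINT", "What are the guidelines for error handling in Java?"):
        print(hit["score"], hit["document_id"])
```

Separating request construction from sending keeps the payload format testable without network access.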
Request:
{
  "query": "What are the guidelines for error handling in Java?",
  "top_k": 5
}
Response:
{
  "results": [
    {
      "document_id": "coding-guidelines.pdf",
      "chunk": "Error handling in Java should use try-catch blocks...",
      "score": 0.89,
      "metadata": {
        "page": 15,
        "s3_uri": "s3://my-enterprise-artifacts-bucket/coding-guidelines.pdf"
      }
    }
  ]
}
- CloudWatch Logs: Lambda execution logs
- CloudWatch Metrics: Lambda invocations, duration, errors
- S3 Metrics: Storage, request counts
- S3 Storage (100GB): ~$2.30
- S3 API Calls (1M): ~$0.40
- Lambda (1M invocations): ~$0.20 + compute
- ALB: ~$16.20 (720 hours)
- Total: ~$20-30/month for moderate usage
- IAM roles with least-privilege access
- ALB with HTTPS/TLS
- Optional: Amazon Cognito for authentication
- S3 bucket encryption at rest
- VPC isolation for Lambda functions
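The monthly cost figures listed earlier (S3 storage, S3 API calls, Lambda invocations, ALB hours) are straightforward arithmetic over the unit prices quoted in this README. The sketch below reproduces that arithmetic so the estimate can be rerun for other volumes. The function name and parameters are hypothetical, the rates are the README's approximate figures rather than authoritative AWS pricing, and Lambda compute time and ALB LCU charges are deliberately left out, which is why the fixed items sum to less than the quoted ~$20-30 total.

```python
from typing import Dict

# Approximate unit prices as quoted in this README (USD); check current AWS pricing.
S3_STORAGE_PER_GB = 0.023
S3_REQUESTS_PER_MILLION = 0.40
LAMBDA_PER_MILLION_INVOCATIONS = 0.20
ALB_PER_HOUR = 0.0225


def monthly_cost(
    storage_gb: float,
    s3_requests_millions: float,
    lambda_invocations_millions: float,
    alb_hours: float = 720,
) -> Dict[str, float]:
    """Return per-item and total monthly cost, excluding Lambda compute and ALB LCUs."""
    items = {
        "s3_storage": storage_gb * S3_STORAGE_PER_GB,
        "s3_requests": s3_requests_millions * S3_REQUESTS_PER_MILLION,
        "lambda_invocations": lambda_invocations_millions * LAMBDA_PER_MILLION_INVOCATIONS,
        "alb": alb_hours * ALB_PER_HOUR,
    }
    items["total"] = sum(items.values())
    return {name: round(value, 2) for name, value in items.items()}


if __name__ == "__main__":
    # The README's scenario: 100 GB stored, 1M S3 requests, 1M Lambda invocations.
    for name, value in monthly_cost(100, 1, 1).items():
        print(f"{name}: ${value:.2f}")
```

Note that the ALB's fixed hourly charge dominates at this scale, so the total changes little with storage or request volume until usage grows substantially.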
# Test ingestion locally (run from the repository root)
(cd lambda/ingestion && python -m pytest tests/)
# Test query locally (run from the repository root)
(cd lambda/query && python -m pytest tests/)
Upload to the S3 bucket; EventBridge triggers ingestion automatically:
aws s3 cp my-document.pdf s3://my-enterprise-artifacts-bucket/
- QUICKSTART.md - Get started in 15 minutes
- ARCHITECTURE.md - Detailed architecture documentation
- DIAGRAMS.md - Visual flow diagrams and architecture
- TROUBLESHOOTING.md - Common issues and solutions
- IDE_INTEGRATION.md - IDE integration examples
- PROJECT_SUMMARY.md - Complete project overview
See TROUBLESHOOTING.md for common issues.
MIT