Ask questions about your own documents using a fully serverless RAG (Retrieval-Augmented Generation) pipeline on AWS. Upload a PDF, JSON, DOCX, or other supported file, ask a question, get a grounded answer with cited sources — no hallucinations, no servers, no maintenance.
```
Document Upload
      │
      ▼
┌─────────┐  S3 Event   ┌────────────────────┐  Bedrock Titan  ┌────────────────────┐
│   S3    │ ───────────▶│  Ingestion Lambda  │ ───────────────▶│    OpenSearch      │
│ Bucket  │             │                    │  Embed chunks   │    Serverless      │
└─────────┘             │ • Extract text     │                 │  (Vector Store)    │
                        │ • Chunk text       │◀─────────────── │                    │
                        │ • Embed via Titan  │  Store vectors  └────────────────────┘
                        └────────────────────┘                        ▲
                                                                      │ KNN Search
User Question                                                         │
      │                                                               │
      ▼                                                               │
┌─────────────┐  POST /chat  ┌─────────────────────┐  Embed question  │
│ API Gateway │ ────────────▶│    Query Lambda     │ ─────────────────┘
└─────────────┘              │                     │
                             │ • Embed question    │  Top-K chunks
                             │ • KNN vector search │ ◀──────────────
                             │ • Build RAG prompt  │
                             │ • Generate answer   │
                             └─────────────────────┘
                                       │  Bedrock Nova
                                       ▼
                                 JSON Answer
                               + Sources cited
```
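OpenSearch Serverless does the actual KNN search, but the retrieval step is easy to picture as a brute-force cosine-similarity scan. A minimal sketch with a hypothetical in-memory index (real Titan embeddings are 1024-dimensional; the 3-d vectors here are toy data):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def knn_search(query_vec, index, top_k=5):
    """Return the top_k (score, chunk) pairs, best first.

    `index` is a list of (embedding, chunk_text) tuples -- a stand-in
    for the OpenSearch VECTORSEARCH collection.
    """
    scored = [(cosine_similarity(query_vec, vec), chunk) for vec, chunk in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Toy example: the query vector matches "chunk A" exactly and
# "chunk C" closely, so those come back first.
index = [([1.0, 0.0, 0.0], "chunk A"),
         ([0.0, 1.0, 0.0], "chunk B"),
         ([0.9, 0.1, 0.0], "chunk C")]
results = knn_search([1.0, 0.0, 0.0], index, top_k=2)
```

The production path is the same shape: embed the question via Titan, run the KNN query against the collection, keep the Top-K hits.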
| Service | Purpose |
|---|---|
| S3 | Document storage + Lambda trigger |
| Lambda | Ingestion (doc → vectors) and Query (search + generate) |
| Amazon Bedrock — Titan Embed v2 | Text embeddings (1024 dimensions) |
| Amazon Bedrock — Amazon Nova | Answer generation |
| OpenSearch Serverless (VECTORSEARCH) | KNN vector similarity search |
| API Gateway | REST API for chat interface |
| IAM | Least-privilege roles per Lambda |
| CloudFormation | Full infrastructure as code |
Supported file types: `.pdf`, `.txt`, `.md`, `.docx`, `.csv`, `.json`
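The ingestion Lambda is triggered by an S3 event and routes each object by extension before extracting text. A small sketch of that gatekeeping step (the event parsing mirrors the standard S3 put-event payload; the function names are illustrative, not the actual `handler.py` internals):

```python
import os

# Extensions the ingestion pipeline knows how to extract text from.
SUPPORTED = {".pdf", ".txt", ".md", ".docx", ".csv", ".json"}

def is_supported(key: str) -> bool:
    """Check an S3 object key against the supported extensions."""
    return os.path.splitext(key)[1].lower() in SUPPORTED

def key_from_s3_event(event: dict) -> str:
    """Pull the object key out of an S3 put-event payload."""
    return event["Records"][0]["s3"]["object"]["key"]

# Example S3 event as Lambda would receive it (trimmed to the key path):
event = {"Records": [{"s3": {"object": {"key": "reports/q3.PDF"}}}]}
key = key_from_s3_event(event)
# is_supported(key) is True; an "notes.xlsx" upload would be skipped.
```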
- AWS account with Bedrock model access enabled for:
  - `amazon.titan-embed-text-v2:0`
  - `amazon.nova-micro-v1:0` (or your chosen generation model)
- AWS CLI configured (`aws configure`)
- Python 3.11+ with `pip`
Windows (PowerShell):

```powershell
git clone https://github.com/YOUR_USERNAME/aws-rag-chatbot.git
cd aws-rag-chatbot
.\scripts\build_and_deploy.ps1 -StackName rag-chatbot -Region us-east-1
```

Linux / macOS:

```bash
git clone https://github.com/YOUR_USERNAME/aws-rag-chatbot.git
cd aws-rag-chatbot
chmod +x scripts/*.sh
./scripts/build_and_deploy.sh rag-chatbot us-east-1
```

Deployment takes ~5 minutes (OpenSearch Serverless collection creation is the slow step).
On completion, the script prints stack outputs, including the API endpoint and frontend URL.
Windows:

```powershell
.\scripts\upload_doc.ps1 -StackName rag-chatbot -Region us-east-1 -PdfPath C:\path\to\document.pdf
```

Linux / macOS:

```bash
./scripts/upload_doc.sh rag-chatbot us-east-1 /path/to/document.pdf
```

Wait ~30-60 seconds for ingestion to complete. Large files (1,000+ chunks) may take several minutes.
Windows:

```powershell
.\scripts\test_chat.ps1 -StackName rag-chatbot -Region us-east-1 -Question "What are the key findings?"
```

Linux / macOS:

```bash
./scripts/test_chat.sh rag-chatbot us-east-1 "What are the key findings?"
```

Or call the API directly:

```powershell
Invoke-RestMethod -Method Post `
  -Uri "https://<api-id>.execute-api.us-east-1.amazonaws.com/prod/chat" `
  -ContentType "application/json" `
  -Body '{"question": "Summarize the main topics."}'
```

Response format:
```json
{
  "answer": "The document covers...",
  "sources": ["your-document.pdf"],
  "chunks_used": 5
}
```

All parameters have sensible defaults. Override at deploy time:
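The Query Lambda's "Build RAG prompt" step stitches the Top-K chunks into the model prompt and carries the source filenames through to the response. A hedged sketch of that assembly (the prompt wording and chunk dict shape are illustrative assumptions, not the exact template in `lambda/query/handler.py`):

```python
def build_rag_prompt(question, chunks):
    """Assemble a grounded prompt from retrieved chunks.

    `chunks` is a list of {"text": ..., "source": ...} dicts, as might
    come back from the vector search.
    """
    context = "\n\n".join(
        f"[Source: {c['source']}]\n{c['text']}" for c in chunks
    )
    return (
        "Answer the question using ONLY the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

chunks = [
    {"text": "Revenue grew 12% in Q3.", "source": "report.pdf"},
    {"text": "Churn fell to 2%.", "source": "report.pdf"},
]
prompt = build_rag_prompt("What were the key findings?", chunks)

# The response mirrors the JSON shape above: deduplicated sources plus
# a count of the chunks that grounded the answer.
response = {"answer": "<model output>",
            "sources": sorted({c["source"] for c in chunks}),
            "chunks_used": len(chunks)}
```

Instructing the model to answer only from the supplied context is what keeps answers grounded in the uploaded documents.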
```bash
aws cloudformation deploy \
  --stack-name rag-chatbot \
  --template-file cloudformation/template.yaml \
  --parameter-overrides \
    DeploymentBucket=my-deploy-bucket \
    GenerationModel=amazon.nova-lite-v1:0 \
    TopK=8 \
    ChunkSize=400 \
  --capabilities CAPABILITY_NAMED_IAM
```

| Parameter | Default | Description |
|---|---|---|
| `EmbeddingModel` | `amazon.titan-embed-text-v2:0` | Bedrock embedding model |
| `GenerationModel` | `amazon.nova-micro-v1:0` | Bedrock generation model |
| `ChunkSize` | `500` | Words per chunk |
| `ChunkOverlap` | `50` | Words of overlap between chunks |
| `TopK` | `5` | Number of chunks retrieved per query |
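`ChunkSize` and `ChunkOverlap` are counted in words: each chunk shares its last `ChunkOverlap` words with the start of the next, so sentences straddling a boundary stay retrievable. A sliding-window chunker matching those semantics might look like this (a sketch, not the actual `lambda/ingestion/handler.py` code):

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into word-based chunks.

    Each window advances by chunk_size - overlap words, so consecutive
    chunks share `overlap` words of context.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last window already reached the end of the document
    return chunks

# A 1,200-word document with the defaults yields windows starting at
# words 0, 450, and 900.
doc = " ".join(f"w{i}" for i in range(1200))
parts = chunk_text(doc)
```

With the defaults this produces three chunks, each overlapping its neighbor by 50 words.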
```
aws-rag-chatbot/
├── cloudformation/
│   └── template.yaml            # Full IaC — all AWS resources
├── lambda/
│   ├── ingestion/
│   │   ├── handler.py           # Doc extract → chunk → embed → index
│   │   └── requirements.txt
│   ├── query/
│   │   ├── handler.py           # Question → embed → search → generate
│   │   └── requirements.txt
│   └── presign/
│       ├── handler.py           # Pre-signed S3 upload URLs
│       └── requirements.txt
├── frontend/
│   └── index.html               # Single-page web UI
├── scripts/
│   ├── build_and_deploy.sh/.ps1 # One-command deploy
│   ├── upload_doc.sh/.ps1       # Upload a document
│   └── test_chat.sh/.ps1        # Test the API
└── .env.example
```
For light usage (~100 questions/day, ~10 documents):
| Service | Estimated Monthly Cost |
|---|---|
| OpenSearch Serverless | ~$24 (0.5 OCU minimum) |
| Bedrock Titan Embeddings | ~$0.10 |
| Bedrock Nova Generation | ~$0.01 |
| Lambda | ~$0.00 (free tier) |
| API Gateway | ~$0.04 |
| S3 | ~$0.01 |
| Total | ~$25/month |
The OpenSearch Serverless OCU cost dominates. For dev/testing, tear down the stack when not in use.
```powershell
# Empty the documents bucket first, then delete the stack
$AccountId = (aws sts get-caller-identity --query Account --output text)
aws s3 rm "s3://rag-chatbot-docs-${AccountId}" --recursive
aws s3 rm "s3://rag-chatbot-frontend-${AccountId}" --recursive
aws cloudformation delete-stack --stack-name rag-chatbot --region us-east-1
# Also delete CloudWatch log groups if desired
```

**Ingestion Lambda times out**
- Large files with many chunks may exceed the default 900-second timeout
- Consider increasing `Timeout` in the CFN template or splitting large files
**"No documents indexed" response**

- Check the ingestion Lambda logs in CloudWatch
- Confirm the file extension is supported (`.pdf`, `.txt`, `.md`, `.docx`, `.csv`, `.json`)
- Allow sufficient time after upload before querying
**Access denied to Bedrock**

- Confirm model access is enabled in the Bedrock console for your region
- Check that the Lambda IAM role has `bedrock:InvokeModel` for the correct model ARN
**OpenSearch 403 errors**

- The data access policy may take 1-2 minutes to propagate after stack creation
- Verify the Lambda role ARNs match what's in the access policy
**Dimension mismatch on index writes**

- The ingestion Lambda detects a changed embedding dimension and recreates the index automatically on the next document upload
MIT