AWS Serverless RAG Document Chatbot

Ask questions about your own documents using a fully serverless RAG (Retrieval-Augmented Generation) pipeline on AWS. Upload a PDF, JSON, DOCX, or other supported file, ask a question, and get an answer grounded in your documents with cited sources — no servers to run, no infrastructure to maintain.

Architecture

Document Upload
    │
    ▼
┌─────────┐     S3 Event     ┌────────────────────┐    Bedrock Titan    ┌────────────────────┐
│   S3    │ ─────────────── ▶│  Ingestion Lambda  │ ──────────────────▶│  OpenSearch        │
│ Bucket  │                  │                    │    Embed chunks     │  Serverless        │
└─────────┘                  │  • Extract text    │                     │  (Vector Store)    │
                             │  • Chunk text      │◀──────────────────  │                    │
                             │  • Embed via Titan │    Store vectors     └────────────────────┘
                             └────────────────────┘                               ▲
                                                                                  │ KNN Search
User Question                                                                     │
    │                                                                             │
    ▼                                                                             │
┌─────────────┐   POST /chat  ┌─────────────────────┐   Embed question           │
│ API Gateway │ ─────────────▶│    Query Lambda     │ ──────────────────────────┘
└─────────────┘               │                     │
                              │  • Embed question   │   Top-K chunks
                              │  • KNN vector search│ ◀──────────────
                              │  • Build RAG prompt │
                              │  • Generate answer  │
                              └─────────────────────┘
                                        │ Bedrock Nova
                                        ▼
                                   JSON Answer
                                   + Sources cited
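
The query path above can be sketched in a few lines of Python. This is an illustrative sketch, not the deployed handler: the overall flow matches the diagram, but the function names and the `embedding`/`text`/`source` field names are assumptions about the index schema.

```python
def build_knn_query(vector: list[float], k: int = 5) -> dict:
    """OpenSearch k-NN query body: fetch the top-k chunks nearest the
    question embedding (vector field name 'embedding' is illustrative)."""
    return {
        "size": k,
        "query": {"knn": {"embedding": {"vector": vector, "k": k}}},
        "_source": ["text", "source"],
    }

def build_rag_prompt(question: str, chunks: list[dict]) -> str:
    """Assemble the grounded prompt: retrieved chunks as context, then the
    question, with an instruction to answer only from that context."""
    context = "\n\n".join(f"[{c['source']}]\n{c['text']}" for c in chunks)
    return (
        "Answer the question using only the context below, and cite the "
        "source file names you relied on.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

The embedding itself comes from Bedrock's Titan Embed v2 model, which takes a body of the form `{"inputText": "..."}` and returns a 1024-dimension vector under the `"embedding"` key.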

AWS Services Used

Service                                 Purpose
S3                                      Document storage + Lambda trigger
Lambda                                  Ingestion (doc → vectors) and Query (search + generate)
Amazon Bedrock — Titan Embed v2         Text embeddings (1024 dimensions)
Amazon Bedrock — Amazon Nova            Answer generation
OpenSearch Serverless (VECTORSEARCH)    KNN vector similarity search
API Gateway                             REST API for chat interface
IAM                                     Least-privilege roles per Lambda
CloudFormation                          Full infrastructure as code

Supported File Types

.pdf .txt .md .docx .csv .json
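
Inside the ingestion Lambda, gating on extension might look like the following sketch (the set matches the list above; the function name is hypothetical):

```python
from pathlib import Path

# Extensions the pipeline accepts (matches the list above).
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".md", ".docx", ".csv", ".json"}

def is_supported(s3_key: str) -> bool:
    """Return True if the uploaded object's extension can be extracted."""
    return Path(s3_key).suffix.lower() in SUPPORTED_EXTENSIONS
```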

Prerequisites

  • AWS account with Bedrock model access enabled for:
    • amazon.titan-embed-text-v2:0
    • amazon.nova-micro-v1:0 (or your chosen generation model)
  • AWS CLI configured (aws configure)
  • Python 3.11+ with pip

Deploy

Windows (PowerShell):

git clone https://github.com/YOUR_USERNAME/aws-rag-chatbot.git
cd aws-rag-chatbot

.\scripts\build_and_deploy.ps1 -StackName rag-chatbot -Region us-east-1

Linux / macOS:

git clone https://github.com/YOUR_USERNAME/aws-rag-chatbot.git
cd aws-rag-chatbot
chmod +x scripts/*.sh
./scripts/build_and_deploy.sh rag-chatbot us-east-1

Deployment takes ~5 minutes (OpenSearch Serverless collection creation is the slow step).

On completion, the script prints stack outputs including the API endpoint and frontend URL.

Usage

1. Upload a document

Windows:

.\scripts\upload_doc.ps1 -StackName rag-chatbot -Region us-east-1 -PdfPath C:\path\to\document.pdf

Linux / macOS:

./scripts/upload_doc.sh rag-chatbot us-east-1 /path/to/document.pdf

Wait ~30-60 seconds for ingestion to complete. Large files (1000+ chunks) may take several minutes.

2. Ask a question

Windows:

.\scripts\test_chat.ps1 -StackName rag-chatbot -Region us-east-1 -Question "What are the key findings?"

Linux / macOS:

./scripts/test_chat.sh rag-chatbot us-east-1 "What are the key findings?"

Or call the API directly:

Invoke-RestMethod -Method Post `
  -Uri "https://<api-id>.execute-api.us-east-1.amazonaws.com/prod/chat" `
  -ContentType "application/json" `
  -Body '{"question": "Summarize the main topics."}'
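
The same call from Python, using only the standard library (the endpoint URL below is a placeholder — substitute your stack's API output):

```python
import json
import urllib.request

def ask(api_url: str, question: str) -> dict:
    """POST a question to the /chat endpoint and return the parsed JSON body."""
    req = urllib.request.Request(
        api_url,
        data=json.dumps({"question": question}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# ask("https://<api-id>.execute-api.us-east-1.amazonaws.com/prod/chat",
#     "Summarize the main topics.")
```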

Response format:

{
  "answer": "The document covers...",
  "sources": ["your-document.pdf"],
  "chunks_used": 5
}

Configuration

All parameters have sensible defaults. Override at deploy time:

aws cloudformation deploy \
  --stack-name rag-chatbot \
  --template-file cloudformation/template.yaml \
  --parameter-overrides \
      DeploymentBucket=my-deploy-bucket \
      GenerationModel=amazon.nova-lite-v1:0 \
      TopK=8 \
      ChunkSize=400 \
  --capabilities CAPABILITY_NAMED_IAM

Parameter        Default                        Description
EmbeddingModel   amazon.titan-embed-text-v2:0   Bedrock embedding model
GenerationModel  amazon.nova-micro-v1:0         Bedrock generation model
ChunkSize        500                            Words per chunk
ChunkOverlap     50                             Overlapping words between consecutive chunks
TopK             5                              Number of chunks retrieved per query
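
ChunkSize and ChunkOverlap are word counts. A minimal sketch of that sliding-window chunking (illustrative, not the deployed handler):

```python
def chunk_words(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of `chunk_size` words, each sharing `overlap`
    words with its predecessor (mirrors the ChunkSize/ChunkOverlap defaults)."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]
```

With the defaults, a 1,100-word document produces chunks starting at words 0, 450, and 900.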

Project Structure

aws-rag-chatbot/
├── cloudformation/
│   └── template.yaml          # Full IaC — all AWS resources
├── lambda/
│   ├── ingestion/
│   │   ├── handler.py          # Doc extract → chunk → embed → index
│   │   └── requirements.txt
│   ├── query/
│   │   ├── handler.py          # Question → embed → search → generate
│   │   └── requirements.txt
│   └── presign/
│       ├── handler.py          # Pre-signed S3 upload URLs
│       └── requirements.txt
├── frontend/
│   └── index.html              # Single-page web UI
├── scripts/
│   ├── build_and_deploy.sh/.ps1  # One-command deploy
│   ├── upload_doc.sh/.ps1        # Upload a document
│   └── test_chat.sh/.ps1         # Test the API
└── .env.example

Cost Estimate

For light usage (~100 questions/day, ~10 documents):

Service                    Estimated Monthly Cost
OpenSearch Serverless      ~$24 (0.5 OCU minimum)
Bedrock Titan Embeddings   ~$0.10
Bedrock Nova Generation    ~$0.01
Lambda                     ~$0.00 (free tier)
API Gateway                ~$0.04
S3                         ~$0.01
Total                      ~$25/month

The OpenSearch Serverless OCU cost dominates. For dev/testing, tear down the stack when not in use.

Cleanup

# Empty the documents bucket first, then delete the stack
$AccountId = (aws sts get-caller-identity --query Account --output text)
aws s3 rm "s3://rag-chatbot-docs-${AccountId}" --recursive
aws s3 rm "s3://rag-chatbot-frontend-${AccountId}" --recursive
aws cloudformation delete-stack --stack-name rag-chatbot --region us-east-1
# Also delete CloudWatch log groups if desired

Troubleshooting

Ingestion Lambda times out

  • Large files with many chunks can exceed the configured timeout (Lambda's hard limit is 900 s / 15 minutes)
  • Increase Timeout in the CFN template (up to the 900 s cap) or split large files before uploading

"No documents indexed" response

  • Check ingestion Lambda logs in CloudWatch
  • Confirm the file extension is supported (.pdf, .txt, .md, .docx, .csv, .json)
  • Allow sufficient time after upload before querying

Access denied to Bedrock

  • Confirm model access is enabled in the Bedrock console for your region
  • Check the Lambda IAM role has bedrock:InvokeModel for the correct model ARN

OpenSearch 403 errors

  • The data access policy may take 1-2 minutes to propagate after stack creation
  • Verify the Lambda role ARNs match what's in the access policy

Dimension mismatch on index writes

  • The ingestion Lambda auto-detects and recreates the index if dimensions changed
  • This happens automatically on the next document upload

License

MIT
