"VoyaNode" is a state-of-the-art Retrieval-Augmented Generation (RAG) system designed to transform travel documents (PDFs) into an interactive, intelligent travel assistant. Built on AWS Serverless architecture, it leverages high-performance vector search and generative AI to provide factual, context-aware travel advice.
- Agentic RAG: Uses Anthropic Claude 3.5 Haiku to reason through travel documents and provide precise, context-aware answers.
- Asynchronous Processing: A dedicated SQS Worker handles document chunking and vector embedding in the background.
- Vector Search: Powered by Amazon OpenSearch Serverless (AOSS) with k-NN indexing.
- Production Ready: Includes systemd service configurations and Docker support for seamless deployment on EC2.
- Live Monitoring: Real-time indexing status and chunk-count tracking in the data management dashboard.
The system follows a modern RAG pipeline:
- Ingestion: PDF files are uploaded to Amazon S3.
- Messaging: S3 triggers an event to Amazon SQS.
- Processing: A Python Worker retrieves the file, performs intelligent chunking, and generates 1024-dimension embeddings using Amazon Titan v2.
- Storage: Vectors and metadata are stored in an OpenSearch Serverless k-NN index.
- Retrieval: When a user asks a question, the system finds relevant chunks and uses Claude 3.5 to synthesize a response.
- Flask (Python Web Framework)
- HTML5 & CSS3 (Modern Glassmorphism Design)
- JavaScript (Asynchronous AJAX requests)
- Python 3.12
- Gunicorn (WSGI HTTP Server)
- Nginx (Reverse Proxy & Static File Hosting)
- Claude 3.5 Haiku: Reasoning and response synthesis.
- Titan Embeddings v2: High-performance 1024-dim vector generation.
- Amazon OpenSearch Serverless: Vector engine for k-NN retrieval.
- Amazon S3: Scalable object storage for PDF documents.
- Amazon SQS: Message queuing for decoupled, async processing.
- Docker & Docker Compose: Container orchestration.
VoyaNode/
βββ app.py # Main Flask API: Manages UI, Chat, and RAG retrieval logic.
βββ worker.py # Background Processor: Listens to SQS and handles PDF indexing.
βββ config.py # Configuration Management: Loads and validates .env variables.
βββ requirements.txt # Python Dependencies.
βββ Dockerfile # Containerization for EC2 deployment.
βββ README.md # Project Documentation.
β
βββ .env.example # Template for secrets and environment variables.
βββ .gitignore # Git exclusion file (prevents uploading .env and sensitive keys).
βββ .dockerignore # Docker exclusion file (prevents unnecessary files in the image).
β
βββ systemd/ # Linux Service Configurations:
β βββ rag-api.service # Service to ensure the API runs 24/7 via Gunicorn.
β βββ rag-worker.service # Service to keep the asynchronous processor active.
β
βββ scripts/ # Automation & Maintenance:
β βββ create_infra.py # IaC Script: Automatically creates S3, SQS, and AOSS collections.
β βββ create_index.py # Database Init: Sets up the k-NN vector index.
β βββ upload_data.py # Data Ingestion: Batch uploads local PDFs to S3.
β βββ smoke_test.py # Health Check: Validates AWS connections before runtime.
β
βββ utils/ # Core Logic (The "Engine"):
β βββ s3_utils.py # Storage Management: Upload/Download/Delete logic for S3.
β βββ chunking.py # Document Processing: PDF cleaning and semantic chunking.
β βββ opensearch_utils.py # Vector DB Engine: k-NN search and indexing management.
β βββ bedrock_utils.py # GenAI Interface: Embedding generation and Claude 3.5 chat.
β
βββ templates/ # Frontend Components (HTML):
β βββ base.html # Site skeleton (Header/Footer).
β βββ index.html # Chat interface: Interaction with the Travel Advisor.
β βββ data.html # Data Dashboard: File status, indexing logs, and Wipe functionality.
β βββ about.html # Project info: Explaining RAG and System Architecture.
β
βββ static/ # Styles & Client-side Logic:
β βββ css/main.css # Visual styling: Modern Glassmorphism theme.
β βββ js/main.js # Browser logic: Real-time chat updates and AJAX calls.
β βββ images/ # UI Assets: Logos and background graphics.
β
βββ data/ # Local Storage:
βββ raw/ # Source PDFs for initial ingestion.
βββ processed/ # Temporary folder for files during processing.
Follow these steps to deploy "VoyaNode" on your local machine or an AWS EC2 instance.
- AWS Account: Access to Amazon Bedrock (Claude 3.5 Haiku & Titan v2 models must be enabled).
- Docker & Docker Compose: Installed and running.
- Python 3.12+: Required for running infrastructure setup scripts.
- AWS CLI: Configured with appropriate permissions (S3, SQS, AOSS, Bedrock).
# Clone the repository
git clone https://github.com/AdamMes/voyanode.git
cd VoyaNode
# Create environment file from template
cp .env.example .env
# Important: Open .env and fill in your AWS_REGION, S3_BUCKET, SQS_URL, and OS_HOST.Before running the application, use the provided automation scripts to provision your AWS environment:
- Provision Resources: Create the S3 bucket, SQS queue, and OpenSearch Collection.
Note: Wait until the OpenSearch Collection status is 'Active' in the AWS Console.
python scripts/create_infra.py
- Initialize Vector Index: Create the
voyanode-indexwith the required k-NN settings.python scripts/create_index.py
The entire stack (API, Worker, and Nginx) is containerized for easy deployment. Run the following command to start the system:
# Build and start containers in detached mode
docker compose up -d --buildThe UI will be accessible at http://localhost (or your EC2 Public IP).
For a robust production environment on AWS EC2, follow these additional optimizations:
If using a t3.medium instance, it is highly recommended to enable a Swap file to prevent OOM (Out Of Memory) crashes during PDF processing:
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstabTo ensure the services auto-restart on failure or reboot (when not using Docker), use the provided service files:
sudo cp systemd/*.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now rag-api.service rag-worker.serviceThe Python Worker occasionally faced OOM (Out Of Memory) crashes during intensive PDF embedding tasks.
- Optimization: Implemented a 2GB Linux Swap file to expand virtual memory, ensuring the Worker remains stable during high-load processing.
Initial requests to OpenSearch Serverless occasionally exceeded default 10-second limits.
- Optimization: Configured the OpenSearch Python client with a 30-second timeout to handle serverless scaling delays.
Resolved compatibility issues between development (Apple Silicon M-series) and production (Intel x86_64 EC2).
- Optimization: Used a multi-platform build strategy (--platform linux/amd64) for Docker images to ensure binary compatibility.
This project is licensed under the MIT License.
- Project Name: VoyaNode Travel Advisor
- Course: AI Engineer Certification
- Developer: Adam Meshulam


