Role: Lead Architect & Developer
Tech Stack: Java 21, Spring AI, PostgreSQL (PgVector), Docker, OpenAI API
This project is an AI-powered question answering system that retrieves precise information from unstructured enterprise documents. It targets two challenges common in industrial applications: limited context windows and hallucinated answers.
graph TD
User[Client] -->|Query| API[Spring Boot API]
subgraph "RAG Pipeline (Spring AI)"
API -->|Orchestrate| Service[RAG Service]
Service -->|1. Hybrid Search| Retriever[Retriever Strategy]
Service -->|2. Augment| Prompt[Prompt Template]
end
subgraph "Storage Layer"
Retriever -->|Keyword| ES[Elasticsearch]
Retriever -->|Semantic| Vector[PostgreSQL / PgVector]
end
subgraph "LLM Integration"
Prompt -->|Context+Query| Model[OpenAI / Bedrock]
Model -->|Response| User
end
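As a rough illustration of the two numbered steps in the diagram, the sketch below merges keyword and semantic hits with a weighted sum and then assembles the augmented prompt. It is not the project's actual code: `HybridRagSketch`, `Hit`, `keywordWeight`, the min-max normalization, and the prompt wording are all assumptions made for the example. The Elasticsearch and PgVector lookups are stood in for by pre-fetched lists, so no Spring AI or client API is called.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Illustrative only: names, weights, and the prompt wording are assumptions. */
final class HybridRagSketch {

    /** A retrieved chunk with the raw score reported by one retriever. */
    record Hit(String chunkId, String text, double score) {}

    /** Rescales scores to [0, 1] so BM25 and vector-similarity scores become comparable. */
    static Map<String, Double> minMaxNormalize(List<Hit> hits) {
        double min = hits.stream().mapToDouble(Hit::score).min().orElse(0.0);
        double max = hits.stream().mapToDouble(Hit::score).max().orElse(0.0);
        double range = max - min;
        Map<String, Double> normalized = new HashMap<>();
        for (Hit hit : hits) {
            // If all scores are equal, treat every hit as a full match.
            normalized.put(hit.chunkId(), range == 0.0 ? 1.0 : (hit.score() - min) / range);
        }
        return normalized;
    }

    /** Step 1: weighted sum of normalized keyword and semantic scores, best first. */
    static List<Hit> fuse(List<Hit> keywordHits, List<Hit> semanticHits,
                          double keywordWeight, int topK) {
        Map<String, Double> keyword = minMaxNormalize(keywordHits);
        Map<String, Double> semantic = minMaxNormalize(semanticHits);

        // Collect the text of every chunk seen by either retriever.
        Map<String, String> texts = new HashMap<>();
        keywordHits.forEach(h -> texts.put(h.chunkId(), h.text()));
        semanticHits.forEach(h -> texts.put(h.chunkId(), h.text()));

        List<Hit> fused = new ArrayList<>();
        for (Map.Entry<String, String> entry : texts.entrySet()) {
            // A chunk missing from one retriever contributes 0 for that side.
            double score = keywordWeight * keyword.getOrDefault(entry.getKey(), 0.0)
                    + (1.0 - keywordWeight) * semantic.getOrDefault(entry.getKey(), 0.0);
            fused.add(new Hit(entry.getKey(), entry.getValue(), score));
        }
        // Highest fused score first; chunk id breaks ties deterministically.
        fused.sort(Comparator.<Hit>comparingDouble(Hit::score).reversed()
                .thenComparing(Hit::chunkId));
        return fused.subList(0, Math.min(topK, fused.size()));
    }

    /** Step 2: place the fused chunks ahead of the question in a single prompt. */
    static String buildPrompt(String question, List<Hit> context) {
        String joined = context.stream()
                .map(Hit::text)
                .collect(Collectors.joining("\n---\n"));
        return "Answer using only the context below. If the answer is not there, say so.\n\n"
                + "Context:\n" + joined + "\n\nQuestion: " + question;
    }
}
```

In this sketch a `keywordWeight` near 1.0 favors exact term matches, while a value near 0.0 favors semantic similarity. Instructing the model to answer only from the supplied context is one common way to reduce hallucinations; the wording here is a placeholder.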
Hybrid Search Algorithm: Implemented a weighted retrieval strategy combining Elasticsearch (BM25) for keyword precision and PgVector for semantic understanding.
Scalable Pipeline: Architected a microservices-ready ingestion pipeline using Spring AI to orchestrate document chunking and embedding generation.
Context Optimization: Worked within context window limits by tuning chunk sizes to follow semantic boundaries rather than fixed lengths.
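The chunking idea in the last item can be sketched as follows. The source does not say which boundaries the project uses, so this example assumes blank-line paragraph breaks are the semantic boundary and measures size in characters rather than tokens. `ParagraphChunker` and `maxChars` are hypothetical names, and the code deliberately avoids any Spring AI splitter API.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: treats blank-line paragraph breaks as the "semantic boundary". */
final class ParagraphChunker {

    /**
     * Packs whole paragraphs into chunks of at most maxChars characters.
     * A single paragraph longer than maxChars is kept intact as its own chunk.
     */
    static List<String> chunk(String document, int maxChars) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        // Split on one or more blank lines (any line-break style).
        for (String paragraph : document.split("\\R\\s*\\R")) {
            String trimmed = paragraph.strip();
            if (trimmed.isEmpty()) {
                continue;
            }
            // +2 accounts for the blank line that would join this paragraph to the chunk.
            if (current.length() > 0
                    && current.length() + 2 + trimmed.length() > maxChars) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append("\n\n");
            }
            current.append(trimmed);
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }
}
```

Calling `ParagraphChunker.chunk(text, 1000)` would yield chunks that never cut a paragraph in half. A production pipeline would more likely budget in tokens and might add overlap between neighboring chunks; both are left out here to keep the sketch short.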