kousen/ragdemo

RAG Demo

Parallel implementations of Retrieval-Augmented Generation (RAG) in Java (Spring AI), Java (LangChain4j), and Python (LangChain) for teaching purposes.

All demos follow the same flow: Load PDF → Chunk → Embed → Store → Retrieve → Generate

Choose Your Implementation

|              | Java (Spring AI)              | Java (LangChain4j)   | Python             |
|--------------|-------------------------------|----------------------|--------------------|
| Framework    | Spring AI 1.0.3               | LangChain4j 1.10.0   | LangChain          |
| LLM          | OpenAI (gpt-5-mini)           | OpenAI (gpt-5-mini)  | OpenAI (gpt-5-mini)|
| Embeddings   | text-embedding-3-small        | text-embedding-3-small | text-embedding-3-small |
| Vector Store | SimpleVectorStore (in-memory) | InMemoryEmbeddingStore | InMemoryVectorStore |
| PDF Parser   | PagePdfDocumentReader         | Apache Tika          | PyPDFLoader        |
| IDE          | IntelliJ IDEA                 | IntelliJ IDEA        | PyCharm / VS Code  |

Prerequisites

  • OpenAI API Key — set as the environment variable OPENAI_API_KEY
  • Java 21+ (for the Java versions)
  • Python 3.10+ (for the Python version)
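All three demos read the API key from the environment, so a failure to set it only surfaces at the first API call. A minimal sanity check you could run first (illustrative only; the function name is an assumption, not part of the repo):

```python
import os

def require_api_key(env: dict) -> str:
    """Return the OpenAI key from the given environment mapping, failing fast if unset."""
    key = env.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running a demo")
    return key

# Example: require_api_key(os.environ) raises immediately on a missing key
# instead of failing later inside the first embedding request.
```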

Quick Start

Java (Spring AI)

cd springai
export OPENAI_API_KEY=sk-...
./gradlew bootRun

Java (LangChain4j)

cd langchain4j
export OPENAI_API_KEY=sk-...
./gradlew run

Python (LangChain)

cd python
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt
export OPENAI_API_KEY=sk-...
python -m ragdemo.main

Production Track (Supabase + pgvector)

This main branch focuses on a minimal, in-memory setup for teaching core RAG ideas quickly.

For a persistence-backed, production-style version across all three implementations (Spring AI, LangChain4j, and Python), use the supabase-pgvector branch:

git switch supabase-pgvector

That branch's README includes full setup and run instructions for Supabase/pgvector.

| Mode                     | Vector Storage                   | Setup Effort | Best For                                        |
|--------------------------|----------------------------------|--------------|-------------------------------------------------|
| Main branch              | In-memory stores                 | Low          | Learning the RAG pipeline mechanics             |
| supabase-pgvector branch | PostgreSQL + pgvector (Supabase) | Medium       | Persistence, realism, and deployment discussions |

Project Structure

ragdemo/
├── springai/                        # Spring AI implementation
│   ├── build.gradle.kts
│   └── src/main/java/edu/trincoll/ragdemo/
│       ├── RagDemoApplication.java      # Spring Boot entry point
│       ├── RagDemoRunner.java           # Interactive CLI
│       ├── config/RagConfig.java        # VectorStore + ChatClient beans
│       └── service/
│           ├── DocumentLoaderService.java   # PDF loading + chunking
│           └── RagService.java              # Q&A via ChatClient
│
├── langchain4j/                     # LangChain4j implementation (no Spring)
│   ├── build.gradle.kts
│   └── src/main/java/edu/trincoll/ragdemo/
│       ├── RagDemo.java                 # Plain Java CLI entry point
│       ├── RagService.java              # RAG pipeline + InMemoryEmbeddingStore
│       └── DocumentLoader.java          # PDF loading via Apache Tika
│
└── python/                          # Python/LangChain implementation
    └── src/ragdemo/
        ├── main.py              # CLI entry point
        ├── document_loader.py   # PDF loading + chunking
        ├── vector_store.py      # Embedding + storage
        └── rag_chain.py         # LCEL chain (retriever → prompt → LLM)

How RAG Works

┌─────────────────────────────────────────────────────────────────┐
│                        INDEXING PHASE                           │
├─────────────────────────────────────────────────────────────────┤
│  PDF  ──▶  Load  ──▶  Chunk  ──▶  Embed  ──▶  Vector Store      │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                        QUERY PHASE                              │
├─────────────────────────────────────────────────────────────────┤
│  Question ──▶ Embed ──▶ Search ──▶ Retrieve ──▶ Augment ──▶ LLM │
└─────────────────────────────────────────────────────────────────┘
  1. Load — Extract text from PDF documents
  2. Chunk — Split into smaller pieces (~800 tokens) for better retrieval
  3. Embed — Convert text chunks to vector embeddings
  4. Store — Save embeddings in vector store for similarity search
  5. Retrieve — Find chunks most similar to the user's question
  6. Generate — Send question + retrieved context to LLM for answer
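The six steps can be sketched end-to-end in plain Python, with a toy bag-of-words "embedding" standing in for text-embedding-3-small and the final LLM call omitted. This is a minimal illustration of the pipeline's shape, not the code of any of the three implementations:

```python
import math
from collections import Counter

def chunk(text: str, size: int = 40) -> list[str]:
    """Split text into fixed-size character chunks (the real demos use ~800 tokens)."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> Counter:
    """Toy word-count 'embedding'; a stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, store: list[tuple[Counter, str]], k: int = 1) -> list[str]:
    """Return the k stored chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

# Indexing phase: load -> chunk -> embed -> store
doc = "self-attention relates positions in a sequence. feed-forward layers follow."
store = [(embed(c), c) for c in chunk(doc)]

# Query phase: embed -> search -> retrieve; the retrieved chunks would then be
# prepended to the question as context for the LLM (the generate step).
context = retrieve("how does self-attention work", store)
```

The real implementations swap each toy piece for a framework component (e.g. InMemoryVectorStore for the `store` list), but the data flow is the same.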

Sample Questions

The demo includes the "Attention Is All You Need" paper. Try asking:

  • What is the Transformer architecture?
  • How does self-attention work?
  • What are the key contributions of this paper?
  • What is multi-head attention?

Adding More Documents

Drop additional PDF files into the documents/ folder:

  • Java (Spring AI): springai/src/main/resources/documents/
  • Java (LangChain4j): langchain4j/src/main/resources/documents/
  • Python: python/documents/

The application will automatically load all PDFs on startup.
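The "load all PDFs on startup" step amounts to globbing the documents folder and handing each path to the framework's parser. A sketch of the discovery step (illustrative; the function name is an assumption, and the actual parsing differs per framework):

```python
from pathlib import Path

def find_pdfs(folder: str) -> list[Path]:
    """Return every .pdf in the documents folder, sorted for a stable load order."""
    return sorted(Path(folder).glob("*.pdf"))

# Each demo would pass these paths to its parser:
# PagePdfDocumentReader (Spring AI), Apache Tika (LangChain4j), or PyPDFLoader (Python).
```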

Running Tests

Java (Spring AI)

cd springai
./gradlew test

Java (LangChain4j)

cd langchain4j
./gradlew test

Python

cd python
pip install -e ".[dev]"
pytest

License

MIT License — See LICENSE for details.

Author

Kenneth Kousen
