A powerful web application that uses LangChain RAG (Retrieval-Augmented Generation) to analyze PDF documents and provide AI-powered summaries.
This application demonstrates advanced LangChain RAG concepts by:
- Uploading PDF documents from your device
- Extracting text using PDF parsing
- Creating text chunks with intelligent splitting
- Generating embeddings with OpenAI
- Building vector stores for semantic search
- Generating comprehensive summaries using RAG
- Text Chunking: Splitting documents into manageable pieces
- Embeddings: Converting text to vector representations
- Vector Stores: Storing and searching semantic information
- Context Retrieval: Finding relevant information for AI responses
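The chunking idea above can be illustrated with a small sketch. This is not the app's splitter: `chunkText`, `chunkSize`, and `chunkOverlap` are hypothetical names, and the values are chosen only to keep the example readable. It cuts text into fixed-size pieces that share a few characters with their neighbour, so a sentence falling on a boundary is not lost entirely.

```typescript
// Simplified stand-in for a text splitter: fixed-size chunks with overlap.
// chunkSize and chunkOverlap are illustrative, not the app's real settings.
function chunkText(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be in [0, chunkSize)");
  }
  const chunks: string[] = [];
  const step = chunkSize - chunkOverlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return chunks;
}

// Each chunk repeats the last character of the previous one.
const pieces = chunkText("abcdefghij", 4, 1);
console.log(pieces);
```

A larger overlap preserves more context across boundaries at the cost of more chunks to embed.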
- RecursiveCharacterTextSplitter: Intelligent text chunking
- MemoryVectorStore: In-memory vector storage
- OpenAIEmbeddings: Text-to-vector conversion
- ChatOpenAI: Conversational AI model
- PromptTemplate: Structured prompt creation
- StringOutputParser: Response formatting
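To make the vector storage and search idea concrete, here is a minimal sketch of an in-memory store ranked by cosine similarity. It is a simplified stand-in, not LangChain's MemoryVectorStore: `TinyVectorStore`, `StoredChunk`, and the hand-written three-dimensional vectors are assumptions for illustration, whereas real embeddings would come from an embedding model and have far more dimensions.

```typescript
type StoredChunk = { text: string; vector: number[] };

// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector lengths differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical in-memory store: keeps vectors in an array and scans them all.
class TinyVectorStore {
  private entries: StoredChunk[] = [];

  add(text: string, vector: number[]): void {
    this.entries.push({ text, vector });
  }

  // Returns the k entries most similar to the query vector, best first.
  search(query: number[], k: number): { text: string; score: number }[] {
    return this.entries
      .map((e) => ({ text: e.text, score: cosineSimilarity(query, e.vector) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k);
  }
}

const store = new TinyVectorStore();
store.add("invoice totals", [1, 0, 0]);
store.add("shipping policy", [0, 1, 0]);
store.add("payment terms", [0.9, 0.1, 0]);
const hits = store.search([1, 0, 0], 2);
console.log(hits.map((h) => h.text));
```

A linear scan like this is fine for a single document; persistent or large collections usually call for a dedicated vector database.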
- Chain Composition: prompt.pipe(model).pipe(outputParser)
- Vector Similarity Search: Semantic document retrieval
- Context-Aware Generation: AI responses based on document content
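The chain composition pattern listed above can be sketched without any library. The `Step` class below is a hypothetical, minimal imitation of a pipeable unit, not LangChain's own class, and `fakeModel` stands in for a real chat model so the example runs offline.

```typescript
// Minimal pipeable step: wraps a function and can be chained with another step.
class Step<In, Out> {
  private readonly fn: (input: In) => Promise<Out> | Out;

  constructor(fn: (input: In) => Promise<Out> | Out) {
    this.fn = fn;
  }

  async invoke(input: In): Promise<Out> {
    return this.fn(input);
  }

  // The output of this step becomes the input of the next one.
  pipe<Next>(next: Step<Out, Next>): Step<In, Next> {
    return new Step<In, Next>(async (input) => next.invoke(await this.invoke(input)));
  }
}

// Hypothetical stages mirroring prompt -> model -> output parser.
const promptStep = new Step<{ context: string }, string>(
  ({ context }) => `Summarize the following text:\n${context}`
);
const fakeModel = new Step<string, { content: string }>(
  (promptText) => ({ content: `SUMMARY(${promptText.length} chars)` })
);
const outputParserStep = new Step<{ content: string }, string>((message) => message.content);

const chain = promptStep.pipe(fakeModel).pipe(outputParserStep);
chain.invoke({ context: "abc" }).then((summary) => console.log(summary));
```

Because every stage shares the same `invoke` shape, stages can be swapped or tested in isolation.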
- PDF Upload Interface: Drag-and-drop file selection
- Progress Indicators: Real-time processing feedback
- Summary Display: Formatted AI-generated summaries
- Error Handling: User-friendly error messages
- PDF Processing: Text extraction from PDF files
- Text Chunking: Intelligent document segmentation
- Embedding Generation: Vector representations
- RAG Pipeline: Context retrieval + AI generation
- Summary Creation: Comprehensive document analysis
- Node.js (version 16 or higher) - for local development
- Vercel account (free) - for deployment
- OpenAI API key - for embeddings and AI generation
Ready to deploy! Just follow the deployment steps below.
If you want to run locally first:
- Install Node.js: Visit nodejs.org and download the LTS version.
- Install Dependencies:
    npm install
- Start Development Server:
    npm run dev
- Open Browser: Go to http://localhost:3000 and start analyzing PDFs!
- PDF Upload: Drag-and-drop interface
- RAG Processing: Advanced LangChain pipeline
- Document Stats: Character count, chunk analysis
- Secure Processing: Server-side AI processing
- Real-time Feedback: Processing indicators
- Create GitHub Repository
    git init
    git add .
    git commit -m "Initial commit: PDF RAG Analyzer with LangChain"
    git branch -M main
    git remote add origin https://github.com/yourusername/your-repo-name.git
    git push -u origin main
- Deploy with Vercel
  - Go to vercel.com and sign up
  - Click "New Project" → Import your GitHub repository
  - Vercel will auto-detect your Vite + React setup
  - Click "Deploy" (no configuration needed!)
- Add Environment Variables: In your Vercel dashboard:
  - Go to Project Settings → Environment Variables
  - Add OPENAI_API_KEY with your actual OpenAI API key
  - Redeploy
- Secure & Live! Your app will be available at https://your-project-name.vercel.app
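Since the backend depends on OPENAI_API_KEY being set, it can help to fail fast with a clear message when the variable is missing. The helper below is a hypothetical sketch, not code from this project; `requireEnv` is an invented name, and it takes the environment as a plain object so it can be tried without a deployment.

```typescript
// Hypothetical helper: returns the value of a required variable or throws.
function requireEnv(env: Record<string, string | undefined>, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

// In a Node.js runtime the first argument would typically be process.env:
//   const apiKey = requireEnv(process.env, "OPENAI_API_KEY");
const exampleEnv: Record<string, string | undefined> = { OPENAI_API_KEY: "sk-placeholder" };
const apiKey = requireEnv(exampleEnv, "OPENAI_API_KEY");
console.log(apiKey.length > 0);
```

Checking once at startup avoids a confusing failure deep inside the first request.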
├── api/
│   └── pdf-rag.js              # LangChain RAG backend
├── src/
│   ├── main.tsx                # React entry point
│   ├── App.tsx                 # Main layout component
│   └── components/
│       └── PDFRAGComponent.tsx # PDF upload & summary interface
├── vercel.json                 # Deployment configuration
├── package.json                # Dependencies & scripts
└── README.md                   # This file
PDF Upload → Text Extraction → Text Chunking → Embeddings
Chunks → OpenAI Embeddings → Memory Vector Store
Vector Search → Context Retrieval → AI Generation → Summary
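The last stage of the flow, turning retrieved chunks into a prompt, might look roughly like the sketch below. All names here (`RankedChunk`, `buildContext`, `buildSummaryPrompt`, `maxChars`) are hypothetical, and the prompt wording is an assumption rather than the one used in api/pdf-rag.js. The idea is to keep the highest-scoring chunks that fit a size budget and place them in the prompt ahead of the instruction to summarize.

```typescript
type RankedChunk = { text: string; score: number };

// Picks chunks in descending score order, skipping any that would exceed the budget.
// The budget counts chunk characters only, not the separators between them.
function buildContext(chunks: RankedChunk[], maxChars: number): string {
  const ordered = [...chunks].sort((a, b) => b.score - a.score);
  const selected: string[] = [];
  let used = 0;
  for (const chunk of ordered) {
    if (used + chunk.text.length > maxChars) continue; // does not fit, try the next one
    selected.push(chunk.text);
    used += chunk.text.length;
  }
  return selected.join("\n---\n");
}

// Hypothetical prompt layout: instruction, retrieved context, then the answer slot.
function buildSummaryPrompt(context: string): string {
  return `Use only the context below to write a summary.\n\nContext:\n${context}\n\nSummary:`;
}

const retrieved: RankedChunk[] = [
  { text: "aaaa", score: 0.2 },
  { text: "bb", score: 0.9 },
  { text: "cccccc", score: 0.5 },
];
const contextText = buildContext(retrieved, 6);
console.log(buildSummaryPrompt(contextText));
```

Limiting the context size keeps the request within the model's input limit and keeps cost predictable.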
- RAG Implementation: Complete retrieval-augmented generation
- Vector Operations: Embeddings and similarity search
- Chain Composition: Building complex AI pipelines
- Document Processing: PDF parsing and text extraction
- Context Management: Intelligent information retrieval
- Interface Design: Complex data structure typing
- Async Operations: Promise handling with proper types
- Error Handling: Type-safe error management
- File Processing: Binary data handling
- File Upload: Drag-and-drop interfaces
- State Management: Complex application state
- User Feedback: Progress indicators and error states
- Component Design: Reusable, typed components
npm run dev # Start development server
npm run build # Build for production
npm run preview # Preview production build locally
npm run clean # Remove build files
- Add More Document Types: Support for Word, TXT, etc.
- Implement Chat Interface: Ask questions about uploaded documents
- Add Vector Persistence: Store embeddings in databases
- Multi-Document Analysis: Compare multiple documents
- Custom Chunking Strategies: Experiment with different splitting methods
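As a starting point for experimenting with chunking strategies, here is a simplified sketch of separator-aware splitting: it tries paragraph breaks first, then lines, sentences, and words, and only hard-cuts when nothing else fits. It is inspired by the recursive approach but is not the RecursiveCharacterTextSplitter implementation; `recursiveSplit`, `maxLen`, and the separator list are assumptions, and chunk overlap is left out for brevity.

```typescript
// Splits text into chunks of at most maxLen characters, preferring coarse separators.
function recursiveSplit(
  text: string,
  maxLen: number,
  separators: string[] = ["\n\n", "\n", ". ", " "]
): string[] {
  if (maxLen <= 0) throw new Error("maxLen must be positive");
  if (text.length <= maxLen) return text.length > 0 ? [text] : [];

  const [sep, ...rest] = separators;
  if (sep === undefined) {
    // No separators left: fall back to fixed-size cuts.
    const cut: string[] = [];
    for (let i = 0; i < text.length; i += maxLen) cut.push(text.slice(i, i + maxLen));
    return cut;
  }

  const chunks: string[] = [];
  let current = "";
  for (const part of text.split(sep)) {
    const candidate = current === "" ? part : current + sep + part;
    if (candidate.length <= maxLen) {
      current = candidate; // keep merging small parts into one chunk
      continue;
    }
    if (current !== "") chunks.push(current);
    if (part.length <= maxLen) {
      current = part;
    } else {
      // Part is still too long: retry it with the finer separators.
      chunks.push(...recursiveSplit(part, maxLen, rest));
      current = "";
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}

const demoChunks = recursiveSplit("Alpha beta.\n\nGamma delta epsilon zeta.", 12);
console.log(demoChunks);
```

Changing the separator order or `maxLen` is an easy way to see how chunk boundaries affect retrieval quality.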
This RAG implementation can be extended for:
- Document Analysis: Legal, medical, academic papers
- Knowledge Management: Corporate document repositories
- Research Assistance: Academic paper summarization
- Content Creation: Automated content generation from sources
Happy learning with LangChain RAG!