
scoutts2/typescript-langchain-chatbot

PDF RAG Analyzer

A web application that uses LangChain RAG (Retrieval-Augmented Generation) to analyze PDF documents and generate AI-powered summaries.

🎯 What This App Does

This application demonstrates advanced LangChain RAG concepts by:

  • Uploading PDF documents from your device
  • Extracting text using PDF parsing
  • Creating text chunks with intelligent splitting
  • Generating embeddings with OpenAI
  • Building vector stores for semantic search
  • Generating comprehensive summaries using RAG

🚀 Key LangChain Concepts You'll Learn

🧠 RAG (Retrieval-Augmented Generation)

  • Text Chunking: Splitting documents into manageable pieces
  • Embeddings: Converting text to vector representations
  • Vector Stores: Storing and searching semantic information
  • Context Retrieval: Finding relevant information for AI responses
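To make the embedding and retrieval ideas above concrete, here is a minimal sketch of similarity search over an in-memory store. It is not LangChain's MemoryVectorStore: the `Entry` type, the `topK` helper, and the hand-written 3-dimensional vectors are assumptions for illustration only. Real embeddings come from an embedding model and have far more dimensions.

```typescript
// Toy in-memory vector store entry (hypothetical type, for illustration only).
type Entry = { text: string; vector: number[] };

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0; // treat missing dimensions as 0
    dot += x * y;
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k entries whose vectors are most similar to the query vector.
function topK(store: Entry[], query: number[], k: number): Entry[] {
  return store
    .map((entry) => ({ entry, score: cosineSimilarity(entry.vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.entry);
}

// Hand-written vectors standing in for real embeddings.
const store: Entry[] = [
  { text: "Invoices are due in 30 days.", vector: [0.9, 0.1, 0.0] },
  { text: "The cat sat on the mat.", vector: [0.0, 0.2, 0.9] },
  { text: "Late payments incur a fee.", vector: [0.8, 0.3, 0.1] },
];

const hits = topK(store, [1, 0, 0], 2);
console.log(hits.map((hit) => hit.text));
```

With these toy vectors, the two payment-related sentences rank ahead of the unrelated one, which is the behavior "Context Retrieval" relies on.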

🔧 LangChain Components

  • RecursiveCharacterTextSplitter: Intelligent text chunking
  • MemoryVectorStore: In-memory vector storage
  • OpenAIEmbeddings: Text-to-vector conversion
  • ChatOpenAI: Conversational AI model
  • PromptTemplate: Structured prompt creation
  • StringOutputParser: Response formatting

⚡ Advanced LangChain Patterns

  • Chain Composition: prompt.pipe(model).pipe(outputParser)
  • Vector Similarity Search: Semantic document retrieval
  • Context-Aware Generation: AI responses based on document content
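The `prompt.pipe(model).pipe(outputParser)` pattern can be illustrated without any dependencies. The sketch below is a simplified stand-in, not LangChain's own Runnable implementation: the `Step` class, `summaryPrompt`, and `fakeModel` are hypothetical names, and the "model" just echoes its input so the example runs offline.

```typescript
// A tiny pipe-able step (hypothetical; it only mimics the shape of invoke/pipe).
class Step<In, Out> {
  private readonly fn: (input: In) => Promise<Out>;

  constructor(fn: (input: In) => Promise<Out>) {
    this.fn = fn;
  }

  invoke(input: In): Promise<Out> {
    return this.fn(input);
  }

  // pipe() returns a new Step that feeds this step's output into `next`.
  pipe<Next>(next: Step<Out, Next>): Step<In, Next> {
    return new Step<In, Next>(async (input: In) => next.invoke(await this.invoke(input)));
  }
}

// Fake components: a prompt builder, a model that echoes, and a parser that unwraps.
const summaryPrompt = new Step<{ context: string }, string>(
  async (input) => `Summarize the following text:\n${input.context}`,
);
const fakeModel = new Step<string, { content: string }>(
  async (text) => ({ content: `[fake summary] ${text}` }),
);
const outputParser = new Step<{ content: string }, string>(
  async (message) => message.content,
);

// Each pipe() call type-checks that one step's output matches the next step's input.
const chain = summaryPrompt.pipe(fakeModel).pipe(outputParser);

chain
  .invoke({ context: "RAG combines retrieval with generation." })
  .then((summary) => console.log(summary));
```

The point of the pattern is that the chain is itself a step, so it can be invoked or piped further like any single component.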

🏗️ Architecture

Frontend (React + TypeScript)

  • PDF Upload Interface: Drag-and-drop file selection
  • Progress Indicators: Real-time processing feedback
  • Summary Display: Formatted AI-generated summaries
  • Error Handling: User-friendly error messages

Backend (LangChain RAG)

  • PDF Processing: Text extraction from PDF files
  • Text Chunking: Intelligent document segmentation
  • Embedding Generation: Vector representations
  • RAG Pipeline: Context retrieval + AI generation
  • Summary Creation: Comprehensive document analysis
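As a rough illustration of the text chunking step, here is a fixed-size splitter with overlap. This is a simplified sketch, not what the app's backend necessarily does: RecursiveCharacterTextSplitter additionally tries to break on separators such as paragraphs and lines before falling back to raw characters. The `chunkText` function and its parameter values are assumptions.

```typescript
// Split text into chunks of at most `chunkSize` characters, where each chunk
// repeats the last `overlap` characters of the previous one.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be at least 0 and smaller than chunkSize");
  }
  const result: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    result.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return result;
}

const demoChunks = chunkText("abcdefghij", 4, 1);
console.log(demoChunks); // ["abcd", "defg", "ghij"]
```

Overlap exists so that a sentence cut at a chunk boundary still appears intact in at least one chunk, which tends to help retrieval quality.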

📋 Prerequisites

  • Node.js (version 16 or higher) - for local development
  • Vercel account (free) - for deployment
  • OpenAI API key - for embeddings and AI generation

🚀 Getting Started

Quick Start (Production)

Ready to deploy! Just follow the deployment steps below.

Local Development

If you want to run locally first:

  1. Install Node.js: Visit nodejs.org and download the LTS version.

  2. Install Dependencies

    npm install
  3. Start Development Server

    npm run dev
  4. Open Browser: Go to http://localhost:3000 and start analyzing PDFs!

Features

  • 📄 PDF Upload: Drag-and-drop interface
  • 🧠 RAG Processing: Advanced LangChain pipeline
  • 📊 Document Stats: Character count, chunk analysis
  • 🔒 Secure Processing: Server-side AI processing
  • ⚡ Real-time Feedback: Processing indicators

🚀 Deployment Instructions

Deploy to Vercel (Recommended)

  1. Create GitHub Repository

    git init
    git add .
    git commit -m "Initial commit: PDF RAG Analyzer with LangChain"
    git branch -M main
    git remote add origin https://github.com/yourusername/your-repo-name.git
    git push -u origin main
  2. Deploy with Vercel

    • Go to vercel.com and sign up
    • Click "New Project" → Import your GitHub repository
    • Vercel will auto-detect your Vite + React setup
    • Click "Deploy" (no configuration needed!)
  3. Add Environment Variables (in your Vercel dashboard):

    • Go to Project Settings → Environment Variables
    • Add OPENAI_API_KEY with your actual OpenAI API key
    • Redeploy so the new variable takes effect
  4. Secure & Live! Your app will be available at https://your-project-name.vercel.app
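Because the API key is supplied as an environment variable, server-side code can fail fast with a clear message when it is missing. The helper below is a hypothetical sketch (`requireEnv` is not part of this repository or of any library); it takes the environment as a parameter so it can be run and tested without a real key.

```typescript
// Read a required variable from an env-like object, or throw a descriptive error.
function requireEnv(env: Record<string, string | undefined>, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In a Node.js serverless function this would typically be called as:
//   const apiKey = requireEnv(process.env, "OPENAI_API_KEY");
// Here a placeholder object is used so the example runs anywhere.
const apiKey = requireEnv({ OPENAI_API_KEY: "sk-example" }, "OPENAI_API_KEY");
console.log(`Key loaded (${apiKey.length} characters)`);
```

Keeping the key on the server, as the deployment steps describe, means it is never shipped to the browser bundle.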

📁 Project Structure

├── api/
│   └── pdf-rag.js          # LangChain RAG backend
├── src/
│   ├── main.tsx           # React entry point
│   ├── App.tsx            # Main layout component
│   └── components/
│       └── PDFRAGComponent.tsx  # PDF upload & summary interface
├── vercel.json            # Deployment configuration
├── package.json           # Dependencies & scripts
└── README.md              # This file

🔬 How RAG Works

Step 1: Document Processing

PDF Upload → Text Extraction → Text Chunking → Embeddings

Step 2: Vector Storage

Chunks → OpenAI Embeddings → Memory Vector Store

Step 3: Summary Generation

Vector Search → Context Retrieval → AI Generation → Summary
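The hand-off from context retrieval to AI generation in Step 3 amounts to placing the retrieved chunks into a prompt. Here is a minimal sketch of that assembly, assuming a `{context}` placeholder. The `fillTemplate` helper, the template wording, and the sample chunks are all hypothetical; this is not LangChain's PromptTemplate.

```typescript
// Replace {placeholder} tokens in a template with values, failing on missing keys.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, key: string) => {
    const value = values[key];
    if (value === undefined) {
      throw new Error(`Missing value for placeholder {${key}}`);
    }
    return value;
  });
}

const template =
  "Use only the context below to write a summary.\n\nContext:\n{context}\n\nSummary:";

// Stand-ins for the chunks a vector search would return.
const retrievedChunks = ["Chunk A about pricing.", "Chunk B about delivery."];

const promptText = fillTemplate(template, {
  context: retrievedChunks.join("\n---\n"),
});
console.log(promptText);
```

The resulting string is what would be sent to the chat model, so the model's answer is grounded in the document's own text rather than only in its training data.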

🎓 Learning Outcomes

LangChain Mastery

  • ✅ RAG Implementation: Complete retrieval-augmented generation
  • ✅ Vector Operations: Embeddings and similarity search
  • ✅ Chain Composition: Building complex AI pipelines
  • ✅ Document Processing: PDF parsing and text extraction
  • ✅ Context Management: Intelligent information retrieval

TypeScript Skills

  • ✅ Interface Design: Complex data structure typing
  • ✅ Async Operations: Promise handling with proper types
  • ✅ Error Handling: Type-safe error management
  • ✅ File Processing: Binary data handling
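One common way to get type-safe error handling around async operations is a small result type plus narrowing of caught values. The sketch below is illustrative only: `Result`, `toMessage`, and `safely` are hypothetical names and are not taken from this repository.

```typescript
// A discriminated union: callers must check `ok` before touching `value` or `error`.
type Result<T> = { ok: true; value: T } | { ok: false; error: string };

// Caught values are `unknown`, so narrow before reading `.message`.
function toMessage(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

// Run an async task and convert a thrown error into a typed failure result.
async function safely<T>(task: () => Promise<T>): Promise<Result<T>> {
  try {
    return { ok: true, value: await task() };
  } catch (err: unknown) {
    return { ok: false, error: toMessage(err) };
  }
}

// Example: a failing task becomes a value the UI can render as an error state.
safely<string>(async () => {
  throw new Error("PDF could not be parsed");
}).then((result) => {
  if (result.ok) {
    console.log("Summary:", result.value);
  } else {
    console.log("Error:", result.error);
  }
});
```

Because the compiler tracks the `ok` flag, forgetting to handle the failure branch shows up as a type error instead of a runtime surprise.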

React Development

  • ✅ File Upload: Drag-and-drop interfaces
  • ✅ State Management: Complex application state
  • ✅ User Feedback: Progress indicators and error states
  • ✅ Component Design: Reusable, typed components

🔧 Available Scripts

npm run dev      # Start development server
npm run build    # Build for production
npm run preview  # Preview production build locally
npm run clean    # Remove build files

🚀 Next Steps for Learning

  • Add More Document Types: Support for Word, TXT, etc.
  • Implement Chat Interface: Ask questions about uploaded documents
  • Add Vector Persistence: Store embeddings in databases
  • Multi-Document Analysis: Compare multiple documents
  • Custom Chunking Strategies: Experiment with different splitting methods

📚 Resources

🎯 Real-World Applications

This RAG implementation can be extended for:

  • Document Analysis: Legal, medical, academic papers
  • Knowledge Management: Corporate document repositories
  • Research Assistance: Academic paper summarization
  • Content Creation: Automated content generation from sources

Happy learning with LangChain RAG! 🎓📄
