Upload any PDF document and ask questions about it using AI. Get instant answers backed by the content of your documents!
DocuChat is an intelligent document assistant that lets you have conversations with your PDF files. Instead of reading through hundreds of pages, simply upload your PDF and ask questions in plain English!
Example:
- Upload a research paper → Ask "What are the main findings?"
- Upload a manual → Ask "How do I reset the device?"
- Upload a contract → Ask "What are the payment terms?"
- Upload PDF Files - Supports any PDF document up to 10MB
- Ask Questions - Get instant answers from your documents
- Accurate Responses - AI answers based only on your document content
- Source Citations - See which parts of the document were used to answer
- Fast & Private - Everything runs on your computer
- No Cloud Required - Your documents stay on your machine
- You upload a PDF → DocuChat reads and understands it
- You ask a question → DocuChat searches for relevant information
- AI generates an answer → Based only on your document's content
- You get the answer → With references to where it found the information
Think of it as having a super-smart assistant who has read your entire document and can instantly answer any question about it!
- Node.js - The engine that runs DocuChat
  - Download from: https://nodejs.org/
  - Choose the "LTS" (Long Term Support) version
  - Version needed: 18 or higher
- Ollama - The AI brain that understands and answers questions
  - Download from: https://ollama.ai/
  - It's free and runs on your computer
- Operating System: Windows, Mac, or Linux
- RAM: At least 8GB (16GB recommended)
- Storage: 5GB free space
- Internet: Only needed for initial setup
- Go to https://nodejs.org/
- Click the big green button that says "LTS"
- Download and run the installer
- Follow the installation wizard (click "Next" through the steps)
- Verify installation:
  - Open Terminal (Mac/Linux) or Command Prompt (Windows)
  - Type: node --version
  - You should see something like v20.x.x
- Go to https://ollama.ai/
- Click "Download"
- Choose your operating system (Mac/Windows/Linux)
- Install the application
- Ollama will start automatically after installation
These are the AI "brains" DocuChat needs to work:
Open Terminal/Command Prompt and run:
# This downloads the model that understands document content
ollama pull nomic-embed-text
# This downloads the model that answers questions
ollama pull llama3.2:3b
Wait time: Each download takes 5-10 minutes depending on your internet speed.
Option A - If you have Git:
git clone https://github.com/your-repo/docuchat.git
cd docuchat
Option B - Download ZIP:
- Download the ZIP file from the repository
- Extract it to a folder (like Documents/docuchat)
- Open Terminal/Command Prompt
- Navigate to that folder:
cd path/to/docuchat
In the Terminal/Command Prompt (make sure you're in the docuchat folder):
npm install
This will take 2-3 minutes. You'll see lots of text scrolling - that's normal!
mkdir uploads
This creates a folder where uploaded PDFs are temporarily stored.
- Make sure Ollama is running (it usually runs in the background)
- Open Terminal/Command Prompt
- Navigate to the DocuChat folder:
  cd path/to/docuchat
- Start DocuChat:
  npm start
- Look for this message:
  DocuChat server running on http://localhost:3000
- Open your web browser and go to:
  http://localhost:3000
You should see the DocuChat interface!
- Click the "Choose File" or "Upload PDF" button
- Select a PDF from your computer (max 10MB)
- Click "Upload"
- Wait for processing (you'll see a progress indicator)
- Success! Your document is now ready for questions
Processing Time:
- Small PDF (< 10 pages): 10-30 seconds
- Medium PDF (10-50 pages): 30-90 seconds
- Large PDF (50+ pages): 1-3 minutes
- Type your question in the chat box
- Click "Send" or press Enter
- Wait for the answer (usually 5-15 seconds)
- Read the response with source references
Good Question Examples:
- "What is the main topic of this document?"
- "Can you summarize the key points?"
- "What does it say about [specific topic]?"
- "Who are the authors?"
- "What are the conclusions?"
Tips for Better Answers:
- Be specific in your questions
- Ask one thing at a time
- Use keywords from the document
- Don't ask about things not in the document
- Avoid overly broad questions
Each answer includes:
- The Answer - AI-generated response based on your document
- Sources - Snippets from the document that were used
- Page Numbers - Where the information was found
- Confidence - How relevant each source is
Document: Scientific study on climate change
Questions you can ask:
- "What was the research methodology?"
- "What are the main findings?"
- "What data was collected?"
- "What are the limitations of this study?"
Document: Smartphone user guide
Questions you can ask:
- "How do I take a screenshot?"
- "What are the technical specifications?"
- "How do I reset the device?"
- "What's the battery life?"
Document: Service agreement
Questions you can ask:
- "What are the payment terms?"
- "What is the cancellation policy?"
- "What are my obligations?"
- "What is the contract duration?"
Solution:
- Make sure Ollama is installed
- Check if Ollama is running (look for Ollama icon in system tray)
- Try restarting Ollama
- On Mac/Linux, run:
ollama serve
Solution: Run these commands again:
ollama pull nomic-embed-text
ollama pull llama3.2:3b
Solution: Something else is using port 3000. Either:
- Stop the other application
- Or change DocuChat's port:
PORT=3001 npm start
(In Windows Command Prompt, use: set PORT=3001&& npm start)
Then access at: http://localhost:3001
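The PORT override only works if the server reads its port from the environment. The sketch below shows one common way a Node.js entry point does this. It is an illustration, not DocuChat's actual code: `resolvePort` is a hypothetical helper, and the fallback of 3000 is assumed from the default address used in this guide.

```javascript
// Hypothetical helper (not taken from DocuChat's source): choose the port
// from the environment, falling back to 3000 when PORT is unset or invalid.
function resolvePort(env, fallback = 3000) {
  const parsed = Number.parseInt(env.PORT, 10);
  // Accept only valid TCP port numbers; anything else uses the fallback.
  const valid = Number.isInteger(parsed) && parsed > 0 && parsed < 65536;
  return valid ? parsed : fallback;
}

console.log(resolvePort({ PORT: "3001" })); // 3001
console.log(resolvePort({}));               // 3000
```

In a real server you would pass `process.env` and hand the result to `app.listen(...)`.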
Solution: The PDF is over 10MB. Try:
- Compress the PDF using online tools
- Split the PDF into smaller parts
- Or contact support to increase the limit
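If you script your uploads, you can check a file's size before sending it. This is a small sketch under one assumption: that the 10MB limit means 10 * 1024 * 1024 bytes (the exact threshold DocuChat enforces is not stated here). `isWithinUploadLimit` and `canUpload` are hypothetical helper names.

```javascript
const fs = require("node:fs");

// Assumption: "10MB" is interpreted as 10 * 1024 * 1024 bytes.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024;

// Pure check, so it can be reused for any size value.
function isWithinUploadLimit(sizeInBytes, limit = MAX_UPLOAD_BYTES) {
  return sizeInBytes <= limit;
}

// Hypothetical usage: check a file on disk before uploading it.
function canUpload(filePath) {
  return isWithinUploadLimit(fs.statSync(filePath).size);
}

console.log(isWithinUploadLimit(5 * 1024 * 1024));  // true
console.log(isWithinUploadLimit(12 * 1024 * 1024)); // false
```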
Possible Reasons:
- Question too vague - Be more specific
- Information not in document - AI can only use what's in the PDF
- Complex document - Try asking about smaller sections
- Poor PDF quality - Scanned images don't work well
Solutions:
- Rephrase your question
- Ask about specific sections
- Make sure PDF has selectable text (not scanned images)
Error: "node: command not found"
- Solution: Install Node.js (see Step 1)
Error: "npm: command not found"
- Solution: Reinstall Node.js (npm comes with it)
Error: Module not found
- Solution: Run npm install again
Good PDFs:
- Text-based PDFs (you can select and copy text)
- Well-formatted documents
- Clear structure with headings
Problematic PDFs:
- Scanned images (unless OCR processed)
- Password-protected files
- Corrupted files
- PDFs with lots of images and little text
Questions that work well:
- Factual questions ("What is...?", "Who...?", "When...?")
- Summary requests ("Summarize...", "What are the key points...?")
- Specific lookups ("What does it say about X?")
- Comparisons ("What's the difference between...?")
Questions that don't work well:
- Questions about things not in the document
- Requests for opinions or predictions
- Math calculations (unless explicitly in the document)
- Questions requiring external knowledge
- Everything runs locally on your computer
- No cloud uploads - PDFs never leave your machine
- No tracking - We don't collect any data
- Temporary storage - Files are deleted after processing
- Open source - You can review all code
- Temporary: Uploaded PDFs (deleted after processing)
- In Memory: Document chunks and AI embeddings (cleared when you close the app)
- Never Stored: Your questions, answers, or any personal data
- During Setup: Downloads AI models from Ollama
- During Use: Zero internet needed (everything is local)
1. Check Ollama:
- Look for Ollama icon in system tray (Windows/Mac)
- Or run: ollama list (should show your models)
2. Check DocuChat:
- Open http://localhost:3000/api/health
- Should see:
{"status":"ok","message":"DocuChat API is running"}
3. Check Document Count:
- Open http://localhost:3000/api/chat/stats
- Shows how many documents are loaded
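The same checks can be scripted from Node.js 18 or newer, which ships a global `fetch`. The sketch below assumes the health response shown above and uses Ollama's `GET /api/tags` endpoint on its default port 11434 to list installed models. The function names are hypothetical, and `fetchFn` is injectable so the logic can be tried without a running server.

```javascript
// Sketch: check the DocuChat health endpoint. Returns { up, reason }.
async function checkDocuChat(fetchFn = fetch, baseUrl = "http://localhost:3000") {
  try {
    const res = await fetchFn(`${baseUrl}/api/health`);
    if (!res.ok) return { up: false, reason: `HTTP ${res.status}` };
    const body = await res.json();
    return { up: body.status === "ok", reason: body.message };
  } catch (err) {
    // Connection refused usually means the server is not running.
    return { up: false, reason: err.message };
  }
}

// Ollama's GET /api/tags returns { models: [{ name, ... }] }.
// A model pulled without an explicit tag is listed as "name:latest".
function missingModels(tags, required = ["nomic-embed-text", "llama3.2:3b"]) {
  const installed = (tags.models || []).map((m) => m.name);
  return required.filter(
    (name) => !installed.some((i) => i === name || i === `${name}:latest`)
  );
}

// Run both checks; call main() with DocuChat and Ollama running.
async function main() {
  console.log("DocuChat:", await checkDocuChat());
  try {
    const res = await fetch("http://localhost:11434/api/tags");
    console.log("Missing models:", missingModels(await res.json()));
  } catch (err) {
    console.log("Ollama not reachable:", err.message);
  }
}
```

An empty "Missing models" list means both models from the setup steps are installed.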
If DocuChat is slow:
- Close other applications - AI needs RAM
- Use smaller PDFs - Break large documents into sections
- Restart DocuChat - Clears memory
- Upgrade RAM - 16GB+ recommended for large documents
You can test DocuChat using command-line tools:
Check if server is running:
curl http://localhost:3000/api/health
Upload a PDF:
curl -X POST http://localhost:3000/api/upload \
  -F "pdf=@/path/to/your/document.pdf"
Ask a question:
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"question": "What is this document about?"}'
Change AI Model:
Edit backend/services/chatService.js and change:
this.chatModel = "llama3.2:3b"; // Try other Ollama models
Change Chunk Size:
Edit backend/services/pdfProcessor.js and modify:
chunkSize: 800, // Make larger for more context
chunkOverlap: 200, // Adjust overlap
Change Number of Results:
Edit backend/services/chatService.js:
this.topK = 3; // Increase to retrieve more context
Backend:
- Node.js + Express.js
- LangChain (RAG framework)
- Ollama (AI models)
- pdf-parse (PDF processing)
AI Models:
- nomic-embed-text - Text embeddings (768 dimensions)
- llama3.2:3b - Language model for answers
Components:
- PDF Processor - LangChain PDFLoader + RecursiveCharacterTextSplitter
- Embedding Service - Converts text to vectors
- Vector Store - In-memory similarity search
- Chat Service - RAG pipeline with prompt templates
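To see what chunk size and overlap mean in practice, here is a simplified fixed-window chunker. It is not the LangChain implementation: RecursiveCharacterTextSplitter also tries to break on paragraph, line, and word boundaries, while this sketch cuts purely by character count. The defaults of 800 and 200 match the values shown in this guide; `chunkText` is a hypothetical name.

```javascript
// Simplified illustration of chunking with overlap (character windows only).
function chunkText(text, chunkSize = 800, chunkOverlap = 200) {
  if (chunkOverlap >= chunkSize) {
    throw new RangeError("chunkOverlap must be smaller than chunkSize");
  }
  const step = chunkSize - chunkOverlap; // how far each window advances
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}

// A 2000-character text yields three 800-character chunks that start at
// offsets 0, 600, and 1200, so neighbours share 200 characters.
const sample = "x".repeat(2000);
const chunks = chunkText(sample);
console.log(chunks.map((c) => c.length)); // [ 800, 800, 800 ]
```

Larger chunks give the model more context per match; larger overlap reduces the chance that a sentence is split across two chunks, at the cost of more chunks to embed.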
PDF Upload → Text Extraction → Chunking → Embedding → Vector Store
↓
User Question → Embedding → Similarity Search → Top K Chunks
↓
Chunks + Question → LLM → Answer
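The retrieval half of this flow can be illustrated with a toy example. The sketch below replaces the real embedding model with simple word counts over a small hand-picked vocabulary (an assumption made purely so the example runs without Ollama), then ranks chunks by cosine similarity and keeps the top K. All function names are hypothetical.

```javascript
// Toy stand-in for an embedding model: word counts over a shared vocabulary.
// The real pipeline uses nomic-embed-text vectors instead.
function embed(text, vocabulary) {
  const words = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  return vocabulary.map((term) => words.filter((w) => w === term).length);
}

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Score every chunk against the question and return the topK best matches.
function retrieve(question, docChunks, vocabulary, topK = 3) {
  const q = embed(question, vocabulary);
  return docChunks
    .map((text) => ({ text, score: cosineSimilarity(q, embed(text, vocabulary)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

const vocabulary = ["payment", "due", "cancel", "notice", "term"];
const docChunks = [
  "Payment is due within 30 days of invoice.",
  "Either party may cancel with 60 days notice.",
  "The term of this agreement is two years.",
];
const question = "When is payment due?";
const top = retrieve(question, docChunks, vocabulary, 1);
console.log(top[0].text); // Payment is due within 30 days of invoice.

// The retrieved chunks and the question are then combined into one prompt.
const context = top.map((c) => c.text).join("\n");
const prompt = `Answer using only this context:\n${context}\n\nQuestion: ${question}`;
```

The prompt wording here is illustrative only; DocuChat's actual prompt templates live in its chat service.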
See backend/API_DOCUMENTATION.md for complete API reference.
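For scripting against the API from Node.js 18 or newer, the two main calls might be wrapped as follows. This is a sketch based only on the endpoints, the "pdf" form field, and the JSON body shown in this guide's cURL commands; the response shapes are not described here, so the code simply returns the parsed JSON. The function names are hypothetical.

```javascript
const fs = require("node:fs/promises");
const path = require("node:path");

const BASE_URL = "http://localhost:3000"; // default address from this guide

// Build the request options for POST /api/chat.
function buildChatRequest(question) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ question }),
  };
}

// Upload a PDF as multipart form data under the "pdf" field.
async function uploadPdf(filePath, fetchFn = fetch) {
  const data = await fs.readFile(filePath);
  const form = new FormData();
  const blob = new Blob([data], { type: "application/pdf" });
  form.append("pdf", blob, path.basename(filePath));
  const res = await fetchFn(`${BASE_URL}/api/upload`, { method: "POST", body: form });
  return res.json(); // response shape is an unknown here; inspect it
}

// Ask a question about the uploaded documents.
async function ask(question, fetchFn = fetch) {
  const res = await fetchFn(`${BASE_URL}/api/chat`, buildChatRequest(question));
  return res.json();
}
```

A typical session would call `uploadPdf("/path/to/your/document.pdf")` once, then `ask("What is this document about?")` as often as needed.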
Q: Is DocuChat free? A: Yes! It's completely free and open-source.
Q: Do I need internet? A: Only for initial setup (downloading models). After that, it works offline.
Q: What PDF size is supported? A: Up to 10MB by default. Can be configured for larger files.
Q: Can I upload multiple PDFs? A: Yes! Upload them one at a time. They'll all be searchable together.
Q: Does it work with scanned PDFs? A: Only if the PDF has been OCR-processed (text is selectable).
Q: What language is it written in? A: JavaScript (Node.js for backend, vanilla JS for frontend).
Q: Can I use different AI models? A: Yes! Any Ollama-compatible model works.
Q: Is my data encrypted? A: Everything runs locally, so data never leaves your computer.
Q: Can I deploy this to a server? A: Yes! See deployment documentation (for technical users).
Q: Why does the first question take longer? A: The AI models need to "warm up" on first use.
Q: Can I ask follow-up questions? A: Currently, each question is independent. Conversation memory is a planned feature.
Q: What if my document is in another language? A: Ollama models support many languages! Upload and try.
Q: Can I chat with images in PDFs? A: Currently only text is processed. Image recognition is a future feature.
- Check Troubleshooting section above
- Read API Documentation in backend/API_DOCUMENTATION.md
- Check LangChain Integration guide in LANGCHAIN_INTEGRATION.md
- Search existing issues on GitHub
- Ask the community in Discussions
- Report a bug via GitHub Issues
When reporting a problem, include:
- Operating system (Windows/Mac/Linux)
- Node.js version (node --version)
- Ollama version (ollama --version)
- Error messages (copy the exact text)
- Steps to reproduce the problem
- Screenshots (if helpful)
- Conversation memory (remember previous questions)
- Support for Word documents (.docx)
- Support for Excel files (.xlsx)
- Multiple language support
- Dark mode
- Export chat history
- Image extraction from PDFs
- Table extraction and analysis
- Document comparison
- Batch processing
- API for developers
- Mobile app
- Persistent vector database (Chroma/Qdrant)
We'd love to hear your suggestions! Open a discussion on GitHub.
DocuChat is built with amazing open-source technologies:
- LangChain - RAG framework
- Ollama - Local AI models
- Express.js - Web server
- pdf-parse - PDF processing
- Node.js - Runtime environment
Special thanks to:
- The Ollama team for making AI accessible
- The LangChain community
- All our contributors
This project is licensed under the ISC License.
Use this checklist for first-time setup:
- Install Node.js from https://nodejs.org/
- Install Ollama from https://ollama.ai/
- Run ollama pull nomic-embed-text
- Run ollama pull llama3.2:3b
- Download/clone DocuChat
- Run npm install in DocuChat folder
- Run mkdir uploads to create upload folder
- Start DocuChat with npm start
- Open browser to http://localhost:3000
- Upload a test PDF
- Ask a question
- You're ready to go!
- API Documentation: backend/API_DOCUMENTATION.md
- cURL Examples: backend/CURL_EXAMPLES.md
- LangChain Integration: LANGCHAIN_INTEGRATION.md
Made with ❤️ for everyone who hates reading long documents
DocuChat - Chat with your PDFs, not through them!