A conversational RAG (Retrieval Augmented Generation) system that remembers your conversation history and provides contextual answers from documents.
Transform any document into an intelligent chatbot that:
- Remembers conversations: Maintains context across multiple questions
- Answers from documents: Retrieves relevant information to answer your questions
- Interactive chat: Real-time conversation with command support
- Persistent storage: Saves chat history using Chroma vector database
- Install dependencies:

  ```bash
  pnpm install
  ```

- Set up environment:

  ```bash
  cp env.example .env
  ```

- Add your API keys to `.env`:

  ```
  OPENROUTER_API_KEY=your_openrouter_key_here
  OPENAI_API_KEY=your_openai_key_here
  ```

- Start chatting:

  ```bash
  # Interactive chat mode
  pnpm start --interactive

  # Sample questions mode
  pnpm start

  # Streaming answers
  pnpm start --streaming
  ```
```
$ pnpm start --interactive

💬 You: What is task decomposition?
🤖 Assistant: Task decomposition is the process of breaking down complex tasks into smaller, manageable steps...

💬 You: Can you give me an example?
🤖 Assistant: Based on our previous discussion about task decomposition, here's an example...
```
- `/help` - Show available commands
- `/reset` - Start a new conversation
- `/history` - View conversation history
- `/threads` - Show all conversation threads
- `/switch <thread_id>` - Switch to a different conversation
- `/summary` - Generate a conversation summary
- `/exit` - Exit the chat
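As a rough sketch of how such a command loop can be structured (the actual `interactive-chat.js` may differ; `rag.ask` and the prompt strings here are assumptions):

```js
// Minimal sketch of a CLI command loop; not the project's actual interactive-chat.js
import readline from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";

export async function chatLoop(rag) {
  const rl = readline.createInterface({ input, output });
  for (;;) {
    const line = (await rl.question("💬 You: ")).trim();
    if (line === "/exit") break;
    if (line === "/help") {
      console.log("Commands: /help /reset /history /threads /switch <thread_id> /summary /exit");
      continue;
    }
    // Anything that is not a command is treated as a question for the RAG system.
    // `rag.ask` is a hypothetical method standing in for the real RAG call.
    const answer = await rag.ask(line);
    console.log(`🤖 Assistant: ${answer}\n`);
  }
  rl.close();
}
```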
The system uses a conversational StateGraph approach:

```
User Input → Query Reformulation → Document Retrieval → Contextual Response → Chat History Storage
```
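As an illustration of how this flow maps onto LangGraph, a minimal sketch is shown below; the node functions, state fields, and use of `MemorySaver` are assumptions for demonstration, not the literal contents of `src/rag.js`:

```js
import { StateGraph, Annotation, MemorySaver, START, END } from "@langchain/langgraph";

// Illustrative state shape; the real field names may differ
const ChatState = Annotation.Root({
  question: Annotation(),    // raw user input
  standalone: Annotation(),  // question reformulated using chat history
  context: Annotation(),     // retrieved documents
  answer: Annotation(),      // final contextual response
});

// Placeholder node functions; the real ones call the LLM and the Chroma store
const reformulateQuery = async (state) => ({ standalone: state.question });
const retrieveDocuments = async (state) => ({ context: [] });
const generateAnswer = async (state) => ({ answer: `Answer for: ${state.standalone}` });

export const graph = new StateGraph(ChatState)
  .addNode("reformulate", reformulateQuery)
  .addNode("retrieve", retrieveDocuments)
  .addNode("respond", generateAnswer)
  .addEdge(START, "reformulate")
  .addEdge("reformulate", "retrieve")
  .addEdge("retrieve", "respond")
  .addEdge("respond", END)
  // The checkpointer keeps per-thread state, which is what backs chat history
  .compile({ checkpointer: new MemorySaver() });
```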
Key Components:
- Chroma Vector Database: Stores documents and embeddings
- LangGraph StateGraph: Manages conversation flow
- Chat History Manager: Handles conversation persistence
- OpenRouter Integration: LLM and embedding models
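To show how these pieces connect, here is a hedged sketch using the stock LangChain `Chroma` and `OpenAIEmbeddings` classes; the project's own wrappers in `src/wrappers/` cover equivalent calls but may differ in detail:

```js
import { Chroma } from "@langchain/community/vectorstores/chroma";
import { OpenAIEmbeddings } from "@langchain/openai";

// Embedding model (text-embedding-3-small is the documented default below)
const embeddings = new OpenAIEmbeddings({ model: "text-embedding-3-small" });

// Chroma collection that stores document chunks; the collection name is an assumption
const vectorStore = new Chroma(embeddings, { collectionName: "rag-docs" });

// Similarity search used by the retrieval step of the graph
const docs = await vectorStore.similaritySearch("What is task decomposition?", 4);
console.log(docs.map((d) => d.pageContent.slice(0, 80)));
```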
```
rag-langchain/
├── src/
│   ├── wrappers/
│   │   ├── chroma-wrapper.js      # Chroma database wrapper
│   │   └── embeddings-openai.js   # OpenAI embeddings wrapper
│   ├── chat-history.js            # Conversation management
│   ├── interactive-chat.js        # CLI chat interface
│   ├── rag.js                     # Main RAG system
│   └── config.js                  # Configuration
├── index.js                       # Entry point
└── README.md
```
Edit `src/config.js` to customize:
- Models: Change LLM and embedding models
- Chroma Settings: Database configuration
- Chat History: Conversation persistence options
- Prompts: System prompts in Korean/English
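For orientation, a config module covering those options might look roughly like this (field names and defaults are assumptions, not the literal `src/config.js`):

```js
// Hypothetical shape of src/config.js — keys and defaults are assumptions
export default {
  models: {
    llm: process.env.LLM_MODEL || "moonshotai/kimi-k2:free",
    embedding: process.env.EMBEDDING_MODEL || "text-embedding-3-small",
  },
  chroma: {
    useLocalDb: process.env.CHROMA_USE_LOCAL_DB !== "false",
    host: process.env.CHROMA_HOST || "localhost",
    port: Number(process.env.CHROMA_PORT || 8000),
    collectionName: "rag-docs",
  },
  chatHistory: {
    persist: true,                 // SQLite-backed, with in-memory fallback
    dbPath: "./chat-history.db",
  },
  prompts: {
    system: "You are a helpful assistant. Answer using the retrieved context.",
  },
};
```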
- Load company documents, manuals, or knowledge bases
- Create an interactive assistant that answers questions
- Maintain conversation context for follow-up questions
- Upload FAQ documents and product manuals
- Provide contextual customer support
- Remember previous conversation for better assistance
- Load research papers or articles
- Ask complex questions with follow-ups
- Generate summaries of long conversations
- Store personal documents and notes
- Create a conversational interface to your knowledge
- Search and retrieve information naturally
```js
// Add PDF loader
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";

// Add to RAGSystem class
async loadPDF(filePath) {
  const loader = new PDFLoader(filePath);
  return await loader.load();
}
```
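Note that `PDFLoader` relies on the `pdf-parse` package (`pnpm add pdf-parse`). A hypothetical usage, where the file path and the downstream indexing call are assumptions:

```js
// Hypothetical usage: the path and rag.addDocuments helper are assumptions
const pdfDocs = await rag.loadPDF("./docs/manual.pdf");
await rag.addDocuments(pdfDocs); // index the pages into the vector store
```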
```js
// Add to interactive-chat.js
case '/export':
  await this.exportConversation();
  break;
```
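`exportConversation` itself is left to you; one minimal sketch writes the current thread's messages to a JSON file (the `chatHistory.getMessages` helper and field names are assumptions):

```js
// Add near the top of interactive-chat.js
import { writeFile } from "node:fs/promises";

// Add to the chat class; assumes this.chatHistory.getMessages(threadId) exists
async exportConversation() {
  const messages = await this.chatHistory.getMessages(this.threadId);
  const file = `conversation-${this.threadId}.json`;
  await writeFile(file, JSON.stringify(messages, null, 2));
  console.log(`Exported ${messages.length} messages to ${file}`);
}
```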
To use a remote Chroma server instead of the local database, set:

```bash
CHROMA_USE_LOCAL_DB=false
CHROMA_HOST=your-chroma-server.com
CHROMA_PORT=8000
```
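The Chroma wrapper can then point its client at that server instead of the bundled local database; a rough sketch of the selection logic (the wrapper's actual code may differ):

```js
import { ChromaClient } from "chromadb";

// Assumed selection logic: talk to a remote server when CHROMA_USE_LOCAL_DB=false
const useLocal = process.env.CHROMA_USE_LOCAL_DB !== "false";
const client = new ChromaClient(
  useLocal
    ? {} // default local settings
    : { path: `http://${process.env.CHROMA_HOST}:${process.env.CHROMA_PORT}` }
);
const collection = await client.getOrCreateCollection({ name: "rag-docs" });
```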
Default Models:
- LLM: `moonshotai/kimi-k2:free` (via OpenRouter)
- Embeddings: `text-embedding-3-small` (via OpenAI)
Change Models:

```bash
LLM_MODEL=anthropic/claude-3-haiku
EMBEDDING_MODEL=text-embedding-3-large
```
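Those names feed the LLM client; a hedged sketch of pointing the chat model at OpenRouter through its OpenAI-compatible endpoint (the project's wrapper may wire this differently):

```js
import { ChatOpenAI } from "@langchain/openai";

// LLM served through OpenRouter's OpenAI-compatible API
const llm = new ChatOpenAI({
  model: process.env.LLM_MODEL || "moonshotai/kimi-k2:free",
  apiKey: process.env.OPENROUTER_API_KEY,
  configuration: { baseURL: "https://openrouter.ai/api/v1" },
});
```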
"Better SQLite3 bindings not found"
- Conversation persistence will fallback to memory storage
- Install build tools:
pnpm add --dev node-gyp
"Chroma client initialization failed"
- Using memory-based vector store as fallback
- Works normally with in-memory storage
"API key issues"
- Ensure both OpenRouter and OpenAI keys are set
- Check API key validity and credits
- Thread Support: Multiple conversation threads
- Persistence: SQLite-based chat history storage
- Context Awareness: Query reformulation based on chat history
- Summarization: Auto-generate conversation summaries
- Chroma Integration: Production-ready vector storage
- Similarity Search: Efficient document retrieval
- Embeddings: High-quality text embeddings
- Fallback: Memory storage for development
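To make the persistence side concrete, a thread-scoped history table on top of `better-sqlite3` could look like the sketch below; the schema and helper names are assumptions, not the project's `chat-history.js`:

```js
import Database from "better-sqlite3";

// Assumed schema: one row per message, keyed by conversation thread
const db = new Database("chat-history.db");
db.exec(`CREATE TABLE IF NOT EXISTS messages (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  thread_id TEXT NOT NULL,
  role TEXT NOT NULL,       -- 'user' or 'assistant'
  content TEXT NOT NULL,
  created_at TEXT DEFAULT CURRENT_TIMESTAMP
)`);

export function saveMessage(threadId, role, content) {
  db.prepare("INSERT INTO messages (thread_id, role, content) VALUES (?, ?, ?)")
    .run(threadId, role, content);
}

export function getThreadMessages(threadId) {
  return db
    .prepare("SELECT role, content FROM messages WHERE thread_id = ? ORDER BY id")
    .all(threadId);
}
```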
- Start with interactive mode to experience the conversational flow
- Try follow-up questions to see context awareness in action
- Use `/help` to explore available commands
- Experiment with different document URLs in the config
- Check conversation history with the `/history` command
Ready to chat with your documents? 🚀

```bash
pnpm start --interactive
```