A powerful REST API chat application that allows you to have natural conversations with your documents using RAG (Retrieval Augmented Generation) and LangChain.
The application uses three main components:
-
DocumentProcessor: Handles document ingestion and processing
- Loads PDF, TXT, and other unstructured files using appropriate loaders
- Splits documents into chunks with configurable size and overlap
- Creates vector embeddings using OpenAI's embedding model
- Stores and manages embeddings in a FAISS vector store for efficient similarity search
-
ChatEngine: Manages the conversation with the AI
- Uses OpenAI's chat model for generating responses
- Maintains conversation history with system, human, and AI messages
- Formats messages with context when available
- Provides conversation reset functionality
-
RAGChatbot: Orchestrates the entire process
- Coordinates between DocumentProcessor and ChatEngine
- Handles document uploads and processing
- Manages chat sessions and message flow
- Provides independent control over chat and document states
- Implements the core RAG (Retrieval Augmented Generation) logic
- Python 3.8+
- OpenAI API key
-
Clone the repository:
git clone https://github.com/CodeSignal-Learn/course_talk-to-documents-with-langchain-and-python cd course_talk-to-documents-with-langchain-and-python
-
Install dependencies:
pip install -r requirements.txt
-
Set up your OpenAI API key:
export OPENAI_API_KEY='your-api-key-here'
-
Start the server:
cd app python main.py
The server will start on port 3000 (http://localhost:3000)
POST /upload
Content-Type: multipart/form-data
file: <document_file>
Upload a PDF or TXT file for processing.
Response:
{
"status": "success",
"message": "Document processed successfully"
}
POST /message
Content-Type: application/json
{
"message": "What is the main topic of the document?"
}
Send a message to chat with the uploaded documents.
Response:
{
"response": "Based on the document content, the main topic is..."
}
POST /reset/chat
Reset only the conversation history while keeping the document knowledge.
Response:
{
"status": "success",
"message": "Conversation history has been reset."
}
POST /reset/documents
Reset only the document knowledge while keeping the conversation history.
Response:
{
"status": "success",
"message": "Document knowledge has been reset."
}
POST /reset/all
Reset both conversation history and document knowledge.
Response:
{
"status": "success",
"message": "Both conversation history and document knowledge have been reset."
}