Revolutionize document interaction with our cutting-edge Document Chat Assistant! This intelligent application empowers users to upload, analyze, and explore multiple documents through an intuitive, AI-powered chat interface.
| Feature | Description | 🚀 Highlights |
|---|---|---|
| 🗂️ Multi-Document Upload | Upload PDF and TXT files seamlessly | Process multiple documents simultaneously |
| 🧠 Smart Document Processing | Advanced document chunking and embedding | Uses state-of-the-art NLP techniques |
| 💬 RAG-Powered Interaction | Context-aware response generation | Combines retrieval and language models |
| 💾 Persistent Document Storage | Efficient embedding management | Utilizes Chroma for quick information retrieval |
| 🤝 Interactive Chat Interface | Natural language document exploration | Ask complex questions, get precise answers |
| 🔄 Flexible Reset Options | Manage chat and database | Easy reset for new document sets |
- 🐍 Python 3.8+
- 📦 pip package manager
```bash
# Clone the repository
git clone https://github.com/yourusername/document-chat-assistant.git

# Navigate to the project directory
cd document-chat-assistant

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

# Install dependencies
pip install -r requirements.txt

# Launch the Streamlit application
streamlit run app.py
```
```mermaid
graph TD
    A[Upload Documents] --> B[Preprocess Documents]
    B --> C[Create Embeddings]
    C --> D[Store in Chroma]
    D --> E[User Query]
    E --> F[Retrieve Relevant Context]
    F --> G[Generate AI Response]
    G --> H[Display Answer]
```
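The "Retrieve Relevant Context" step in the diagram can be illustrated with a toy example: the query embedding is compared against each stored chunk embedding by cosine similarity, and the closest chunks become the context. This is a simplified stand-in for what Chroma does internally; the three-dimensional vectors, the chunk strings, and the `retrieve` helper below are illustrative, not part of the app.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, chunk_vecs, chunks, top_k=2):
    """Return the top_k chunks whose embeddings are closest to the query."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:top_k]]

# Toy 3-dimensional "embeddings" for three chunks
chunks = ["pricing details", "refund policy", "contact info"]
vecs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
query = [0.0, 0.9, 0.1]  # closest to the "refund policy" vector

print(retrieve(query, vecs, chunks, top_k=1))  # -> ['refund policy']
```

In the real app, Chroma performs this nearest-neighbor search over HuggingFace embeddings rather than hand-written vectors.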
File Uploader:

```python
uploaded_files = st.file_uploader(
    "Upload Documents",
    type=["pdf", "txt"],
    accept_multiple_files=True
)
```
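Note that `st.file_uploader` returns in-memory file objects, while document loaders such as `PyPDFLoader` expect a filesystem path. A common pattern is to persist each upload to a temporary file first; the sketch below demonstrates it with an `io.BytesIO` buffer standing in for a Streamlit upload (the helper name `save_upload_to_temp` is ours, not part of the app).

```python
import io
import tempfile
from pathlib import Path

def save_upload_to_temp(uploaded_file, suffix):
    """Write an uploaded file's bytes to a temp file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(uploaded_file.read())
        return tmp.name

# Simulate a Streamlit upload with an in-memory buffer
fake_upload = io.BytesIO(b"hello from a TXT upload")
path = save_upload_to_temp(fake_upload, ".txt")
print(Path(path).read_text())  # -> hello from a TXT upload
```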
Document Processing Workflow:

```python
import tempfile

from langchain_community.document_loaders import PyPDFLoader, TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

def process_documents(uploaded_files):
    """Load, chunk, and index the uploaded documents; return the chunks."""
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=500,
        chunk_overlap=50
    )
    all_chunks = []
    for file in uploaded_files:
        # Loaders expect a file path, but Streamlit uploads live in memory,
        # so persist each upload to a temporary file first
        suffix = ".pdf" if file.type == "application/pdf" else ".txt"
        with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
            tmp.write(file.getvalue())
            tmp_path = tmp.name
        # Use the appropriate loader based on file type
        if file.type == "application/pdf":
            loader = PyPDFLoader(tmp_path)
        else:
            loader = TextLoader(tmp_path)
        # Split the loaded documents into overlapping chunks
        all_chunks.extend(text_splitter.split_documents(loader.load()))
    # vectorstore is a Chroma instance created with HuggingFaceEmbeddings;
    # adding documents embeds the chunks and persists them
    vectorstore.add_documents(all_chunks)
    return all_chunks
```
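To see what `chunk_size` and `chunk_overlap` actually control, here is a simplified fixed-size splitter in plain Python. It is not the recursive, separator-aware algorithm LangChain's `RecursiveCharacterTextSplitter` uses, just a sketch of the sliding-window idea: each chunk starts `chunk_size - chunk_overlap` characters after the previous one.

```python
def split_text(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size chunks, each overlapping the previous one."""
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks

chunks = split_text("a" * 1200, chunk_size=500, chunk_overlap=50)
print(len(chunks))              # -> 3
print([len(c) for c in chunks])  # -> [500, 500, 300]
```

The overlap means the last 50 characters of one chunk reappear at the start of the next, so sentences that straddle a chunk boundary still show up intact in at least one chunk.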
```python
prompt_template = """
You are a helpful assistant. Answer the question based strictly on the provided context.
Think step by step and provide a detailed, accurate response.

Context:
{context}

Question: {question}

Helpful Answer:"""
```
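At query time, the retrieved chunks are joined into the `{context}` slot and the user's question fills `{question}`. A minimal sketch of assembling the final prompt with `str.format` (the refund-policy chunk strings are made up for illustration):

```python
prompt_template = """You are a helpful assistant. Answer the question based strictly on the provided context.
Think step by step and provide a detailed, accurate response.

Context:
{context}

Question: {question}

Helpful Answer:"""

retrieved_chunks = [
    "Refunds are issued within 14 days of purchase.",
    "Refunds are returned to the original payment method.",
]
prompt = prompt_template.format(
    context="\n\n".join(retrieved_chunks),
    question="How long do I have to request a refund?",
)
print(prompt)
```

The assembled string is what gets sent to the language model (ChatGroq in this app), which is why the answer stays grounded in the retrieved documents.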
- 🧠 AI/ML:
  - LangChain
  - HuggingFace Embeddings
  - ChatGroq
- 🌐 Web Framework: Streamlit
- 💾 Vector Database: Chroma
Interested in improving the Document Chat Assistant? We welcome contributions!

- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Distributed under the MIT License. See `LICENSE` for more information.
**Created with ❤️ by AI Enthusiasts**