A simple PDF chatbot: upload one or more PDF files and ask natural-language questions about their contents. The system retrieves the most relevant chunks from the documents using HuggingFace embeddings and FAISS, then generates answers with Google Gemini.
- Upload multiple PDF files.
- Split PDFs into chunks using LangChain’s `RecursiveCharacterTextSplitter`.
- Generate embeddings with HuggingFace (`all-MiniLM-L6-v2`).
- Store and query chunks using a FAISS vector database.
- Answer questions using Gemini Flash with a LangChain QA chain.
- Interactive frontend built with Streamlit.
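
To make the chunking step concrete, here is a minimal pure-Python sketch of the idea behind recursive character splitting: try coarse separators (paragraph breaks) first and fall back to finer ones only when a piece is still too long. This is an illustration, not LangChain's implementation; the function name `split_recursive` and its parameters are hypothetical, and it leaves out the chunk overlap that `RecursiveCharacterTextSplitter` supports.

```python
def split_recursive(text, chunk_size, separators=("\n\n", "\n", " ")):
    """Split text into chunks of at most chunk_size characters.

    Hypothetical helper for illustration only: it prefers coarse separators
    (paragraphs) and falls back to finer ones (lines, words) when needed.
    """
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separator left to try: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            # Keep merging neighbouring parts while the chunk still fits.
            current = candidate
            continue
        # The chunk is full: emit it and start over with this part.
        if current.strip():
            chunks.append(current)
        current = ""
        if len(part) <= chunk_size:
            current = part
        else:
            # This part alone is too long: retry with finer separators.
            chunks.extend(split_recursive(part, chunk_size, finer))
    if current.strip():
        chunks.append(current)
    return chunks


sample = "alpha beta gamma\n\ndelta epsilon\n\n" + "x" * 25
for chunk in split_recursive(sample, chunk_size=20):
    print(repr(chunk))
```

Splitting on paragraph and line boundaries first tends to keep related sentences in the same chunk, which usually helps retrieval quality.
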
- Python 3
- Streamlit – Web interface
- LangChain – Text splitting, QA chain
- HuggingFace Embeddings – Semantic search
- FAISS – Vector database
- Google Gemini API – LLM for answering questions
- PyPDF2 – Extract text from PDFs
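
The retrieve-then-answer flow these components implement can be sketched without any of the libraries. The example below is a toy stand-in under stated assumptions: word-count vectors replace the HuggingFace embeddings, a sorted list replaces the FAISS index, and the final prompt is only assembled, not sent to Gemini. The names `embed`, `cosine`, `top_k`, and `build_prompt` are hypothetical and do not come from the project.

```python
import math
from collections import Counter


def embed(text):
    # Toy embedding (assumption): word counts instead of a neural model.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    # Rank every chunk against the query; a real vector index avoids
    # recomputing embeddings on each query.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    # The retrieved chunks become the context the LLM is asked to answer from.
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


chunks = [
    "invoices are due within 30 days",
    "the warranty covers two years",
    "shipping takes five business days",
]
question = "when are invoices due"
print(build_prompt(question, top_k(question, chunks, k=1)))
```

In the actual stack, the embedding model and the FAISS index take the place of `embed` and the sort, and the assembled context is passed to Gemini through the LangChain QA chain.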