This is a Streamlit application that lets users upload a PDF document or text and ask questions about its content using an LLM with retrieval-augmented generation (RAG). The app uses LangChain, a Google Gemini model, and HuggingFaceEmbeddings for document retrieval and question answering.
The following libraries are required for this project:
- langchain-community >= 0.3.18
- langchain-huggingface >= 0.1.2
- langchain-openai >= 0.3.7
- langchain >= 0.3.19
- pypdf >= 5.3.0
- sentence-transformers >= 3.4.1
- streamlit >= 1.42.2
- scripts.secret (for storing your OpenAI API key)
- scripts.document_loader (for loading and splitting PDF documents)
You can install the necessary dependencies using pip:
pip install streamlit langchain faiss-cpu

A powerful open-source Streamlit web app that allows you to:
- Upload PDF or TXT files
- Paste large text manually
- Ask questions about the content
- Get AI-generated answers using Google Gemini (fast, efficient, context-aware)
https://drive.google.com/file/d/1d2l1iX5cZu4Zyf7Q6N3Gtz0uGp0mYEJu/view
- Supports both documents and text input
- Uses FAISS vector search for efficient retrieval
- Integrated with Google Gemini 1.5 Flash / 2.5 Flash
- Clean, minimal Streamlit UI
- Ready for cloud deployment (Streamlit, Render, or Google Cloud)
- Frontend/UI: Streamlit
- LLM: Google Gemini (via langchain-google-genai)
- Vector Store: FAISS
- Embeddings: Sentence Transformers
- Document Parsing: LangChain + PyPDF
- Language: Python 3.10+
- Document Upload: The user uploads a PDF document or text through the Streamlit interface. The file is saved temporarily on the server and then split into text chunks for easier retrieval.
- Text Splitting and Vector Database Creation: Once the document is uploaded:
  - The document is loaded and split into smaller chunks to optimize retrieval performance.
  - Embeddings (numerical representations of the document's contents) are generated using the OpenAIEmbeddings model.
  - A FAISS vector store is created from these embeddings, allowing efficient similarity search.
- Question-Answering System: The system uses retrieval-augmented generation (RAG) to answer user queries. When a user asks a question, it retrieves the most relevant chunks from the document using FAISS and passes them to GPT-4, which generates an answer grounded in those chunks.
- User Interface: The application presents:
  - A file uploader for the PDF file, or a text input for entering text directly.
  - A text box to ask questions about the document.
  - Responses generated by the AI based on the content of the document.
The required libraries are imported to handle document processing, vector database creation, and AI-powered responses:
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain.chains import create_retrieval_chain
from langchain_openai import ChatOpenAI
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from scripts.document_loader import load_document
import streamlit as st

The Streamlit app is initialized with a title and a file uploader for PDFs.
When a file is uploaded, the following steps are performed:
- The document is saved temporarily.
- The document is split into chunks using the load_document function.
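The two steps above can be illustrated with a standard-library sketch. The implementation of `load_document` is not shown in this README, so the helpers below (`save_temporarily`, `split_text`) and their default sizes are hypothetical stand-ins, not the app's actual code. They show one common approach: write the uploaded bytes to a temporary file, then cut the text into fixed-size character chunks that overlap so a sentence straddling a boundary is not lost.

```python
import tempfile
from pathlib import Path


def save_temporarily(data: bytes, suffix: str = ".txt") -> Path:
    """Write uploaded bytes to a temporary file and return its path.

    Hypothetical helper: the caller is responsible for deleting the file.
    """
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
        return Path(tmp.name)


def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character chunks of at most chunk_size that overlap.

    Assumed defaults; the real app may use different sizes or a
    sentence-aware splitter.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlap trades a little storage for recall: each chunk repeats the tail of the previous one, so a question about text near a boundary can still match a single chunk.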
Embeddings are generated using OpenAI's model (text-embedding-ada-002), and a FAISS vector store is created to hold these embeddings.
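To make the idea of embedding-based retrieval concrete, here is a deliberately simplified, library-free sketch. It does not use FAISS or the OpenAI embedding model: `embed` is a toy word-count vector and `top_k` is a brute-force scan, both hypothetical names chosen for illustration. The principle is the same as in the app, though: turn each chunk and the query into vectors, then return the chunks whose vectors are most similar to the query's.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (brute-force search)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

A real vector store replaces the brute-force scan with an index built for fast nearest-neighbour search, and a learned embedding model captures meaning rather than exact word overlap.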
The question-answering chain is created by combining the retriever (FAISS) and the document processing chain (StuffDocumentsChain). The model used for answering questions is GPT-4.
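The "stuff" strategy named above amounts to placing all retrieved chunks into a single prompt alongside the question. The sketch below shows that idea in plain Python; the template wording and the `build_prompt` helper are assumptions made for illustration, since the app's actual prompt is not shown here.

```python
# Hypothetical prompt wording; the real app's template may differ.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}"
)


def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Join the retrieved chunks into one context block and fill the template."""
    context = "\n\n".join(retrieved_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)
```

Because every retrieved chunk goes into one prompt, this approach is simple but bounded by the model's context window, which is one reason the document is split into small chunks first.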
The user is prompted to input a question, and the system responds with an answer based on the document's content.
To run the app, execute the following command in your terminal:
streamlit run app.py

This will start the Streamlit server and open the app in your browser.
git clone https://github.com/yourusername/QueryBot.git
cd QueryBot
pip install -r requirements.txt
GOOGLE_API_KEY=your_google_gemini_api_key_here
You can get your Gemini API key here: https://makersuite.google.com/app/apikey
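How the app reads the key is not shown in this README. One plausible approach, assuming the key is exposed as the `GOOGLE_API_KEY` environment variable, is to look it up at startup and fail early with a clear message. The `get_api_key` helper is hypothetical.

```python
import os


def get_api_key(env=os.environ) -> str:
    """Return the Gemini API key, or raise if it is missing or blank.

    Assumes the key is available as the GOOGLE_API_KEY environment variable.
    """
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it before starting the app.")
    return key
```

Failing at startup avoids a confusing error later, when the first question is sent to the model.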
python -m streamlit run app.py
Then open the URL shown in your terminal (usually http://localhost:8501).
If you liked this project or found it useful, feel free to connect with me on LinkedIn

