This repository contains a Streamlit application that allows users to chat with the content of an uploaded PDF file using OpenAI's GPT-4o. The application leverages LangChain for managing conversation flow and embedding PDFs into a vector database for efficient querying.
- PDF Upload: Upload multi-page PDFs to the application.
- Vector Embedding: Parse and embed PDF content into a vector store using OpenAI embeddings.
- Chat Interface: Interact with the uploaded PDF content through a chat interface.
- State-of-the-Art Technologies: Uses the latest versions of Streamlit, LangChain, and OpenAI APIs.
streamlit: For creating the web application interface.langchain: For managing the conversation flow and integrating with OpenAI.PyPDF2: For parsing PDF documents.FAISS: For efficient vector storage and similarity search.openai: For accessing OpenAI's GPT-4o API.
ChatPDF/
├── app.py
├── requirements.txt
├── install_requirements.sh
├── sample_webapp.png
├── sample_demo.gif-
Clone the repository:
git clone https://github.com/xmpuspus/ChatPDF.git cd ChatPDF -
Install the required packages:
bash install_requirements.sh
-
Run the Streamlit application:
streamlit run app.py
- OpenAI API Key: A text input for the user to enter their OpenAI API key.
- PDF Upload: A file uploader for users to upload their PDF documents.
- Chat Interface: Displays the chat history and allows users to input their questions.
-
PDF Upload and Parsing:
- When a user uploads a PDF, it's saved temporarily and parsed using
PyPDF2. - The extracted text from each page is concatenated into a single string.
- When a user uploads a PDF, it's saved temporarily and parsed using
-
Vector Embedding:
- The parsed PDF text is embedded into a vector store using
OpenAIEmbeddingsfrom LangChain. FAISSis used to store these embeddings and perform similarity searches efficiently.
- The parsed PDF text is embedded into a vector store using
-
Chat Handling:
- User inputs are taken from the chat interface.
- The most relevant text from the vector store is retrieved based on the user's query.
- This relevant text, combined with the user's query, is used as a prompt for OpenAI's GPT-4o to generate a response.
- The response is displayed in the chat interface.
- Upload PDF: Users upload their PDF documents through the sidebar.
- Enter API Key: Users input their OpenAI API key to enable AI responses.
- Chat Interaction: Users can ask questions related to the PDF content, and the AI will respond based on the parsed PDF text.
- Injection-Proof: The application ensures that user inputs cannot override system instructions by strictly defining the prompt template and conversation flow.
- Enhanced Parsing: Improve PDF parsing to handle complex documents better.
- Customization: Allow users to customize the prompt template and other settings.
- Multi-PDF Support: Enable handling of multiple PDFs for a broader range of queries.
Contributions are welcome! Please open an issue or submit a pull request for any improvements or features you would like to see.
This project is licensed under the MIT License. See the LICENSE file for more details.
