This project lets users upload PDF files and ask questions about them. It uses various libraries and APIs to extract text from the PDFs, split the text into chunks, create embeddings, and answer questions. It can also export the chat history to a PDF file.
- Upload PDFs and get answers about their contents.
- Utilize various libraries and APIs for text extraction and analysis.
- Export the entire chat history to a PDF file.
This project provides a web interface for analyzing PDF files. Here's how it works:
- Setup: Enter your OpenAI API key in the provided field.
- Upload: Add one or multiple PDF files using the file uploader.
- Processing: Click "Process PDFs" to extract text and create an indexed knowledge base.
- Query: After processing, ask questions about the content.
- Answers: Receive relevant information retrieved from the indexed knowledge base.
- Export: Save the entire chat conversation to a PDF file.
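The Query and Answers steps rely on finding the chunks most similar to a question. The sketch below illustrates that idea only; it is not the project's code. It assumes a toy word-count "embedding" in place of the real embedding model, and the names `embed`, `cosine`, and `top_chunks` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "Invoices are due within 30 days.",
    "The warranty covers two years.",
    "Shipping takes five business days.",
]
print(top_chunks("How long is the warranty?", chunks, k=1))
```

In a real pipeline the selected chunks would be passed to the language model as context for the answer; word counts are used here only so the example runs without an API key.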
- Install the necessary libraries:
  pip install -r requirements.txt
- Have an OpenAI API key. Sign up at OpenAI if you don’t have one.
- Clone the repository:
  git clone https://github.com/thaisaraujom/PDF-Insights.git
- Install the dependencies.
- Set your OpenAI API key as an environment variable or input it when prompted.
- Run the code:
  streamlit run app.py
The Streamlit application will open in your default web browser.
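The API key step offers two sources: an environment variable or the input field. A minimal sketch of how the two could be combined is shown below. It assumes the variable is named `OPENAI_API_KEY` (the conventional name, not confirmed for this app) and that typed input takes priority; `resolve_api_key` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(user_input: str = "",
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key typed by the user, else the one from the environment."""
    env = os.environ if env is None else env
    # Assumption: typed input wins over the environment variable.
    key = user_input.strip() or env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise ValueError("No OpenAI API key provided.")
    return key
```

Passing `env` explicitly keeps the function easy to test; in the app it would default to the process environment.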
- 📄 Supports PDF files only.
- 📜 Extracted text is broken down into smaller chunks to enhance performance.
- ❓ Ask questions using the provided text input field.
- 🔄 A spinner is displayed during extraction or questioning tasks.
- 🔒 Keep your OpenAI API key secure.
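The chunking mentioned in the notes above can be pictured with a short sketch. This is an illustration under assumptions, not the app's splitter: it cuts by character count with a fixed overlap, and `split_text`, `chunk_size`, and `overlap` are hypothetical names and defaults.

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears whole in one of the two chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


print(split_text("abcdefghij", chunk_size=4, overlap=1))
# ['abcd', 'defg', 'ghij']
```

Smaller chunks keep each embedding focused on one topic, while the overlap reduces the chance that an answer is split across two chunks.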
Explore and analyze different PDFs using this code!