PDF2AI

Overview

This application extracts text from a PDF document, stores it as embedding vectors in a vector database (FAISS), and uses OpenAI's ChatGPT model to provide an interactive question-answering system. Users query a PDF document, contextually relevant text is retrieved from the vector database, and ChatGPT turns that text into a more meaningful answer.

Features

PDF Text Extraction: Extract text from any given page of a PDF document.
Embedding Vectors: Store the extracted text as embedding vectors in a vector database.
ChatGPT-Powered Responses: Generate answers to questions based on the content of a specified page in the PDF document.
Web-Based Interface: The application is accessible through a web interface, allowing for easy interaction and use.

How It Works

PDF Selection and Page Reference: Users can specify a page number from a pre-defined PDF document.
Question Input: Users can input a question related to the content of the selected page.
Similarity Search: Text that is similar to the question is retrieved from the vector database.
Answer Generation: The application processes the extracted text from the PDF page and the user's question, leveraging ChatGPT to generate a relevant answer.
Response Display: The generated answer is displayed to the user, providing insights or information based on the PDF's content.
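
The retrieval and prompt-building steps above can be sketched roughly as follows. This is a minimal illustration, not the project's code: it swaps the real embedding model and FAISS index for a toy bag-of-words vector and a brute-force cosine search so that it runs with the standard library only. The names `embed`, `cosine`, `top_k`, and `build_prompt`, the prompt wording, and the sample chunks are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(question, chunks, k=2):
    """Brute-force search: return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Combine the retrieved text and the question into one prompt string."""
    context = "\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Illustrative sample strings, not text extracted from the actual PDF.
chunks = [
    "The Transformer relies entirely on attention mechanisms.",
    "Positional encodings are added to the input embeddings.",
    "Attention maps a query and key-value pairs to an output.",
]
question = "What does attention map a query to?"
best = top_k(question, chunks, k=1)
prompt = build_prompt(question, best)
print(prompt)
```

In the real application, the prompt would be sent to the configured ChatGPT model, and the search would run against the FAISS index rather than a Python list.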

Technology

FastAPI: Powers the backend of the application, handling web requests and server-side logic.
PDFMiner: Used for extracting text from PDF documents.
FAISS: Vector database to store the embedding vectors and perform similarity search.
OpenAI's ChatGPT: Provides the AI model for generating answers to user queries.
Uvicorn: Serves as the ASGI server for hosting the application.

Setup and Usage

Prerequisites

Before you start, ensure you have the following installed:

Python 3.6 or higher
Pip (Python package installer)

Installation

Clone the Repository: Clone the repository and change into its directory. Skip this step if you are setting the application up directly from provided files.

git clone [your-repository-link]
cd [repository-name]

Environment Setup: It's recommended to use a virtual environment for Python projects. This keeps dependencies required by different projects separate and organized.

python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

Install Dependencies: Install the required Python packages using pip.

pip install -r requirements.txt

Environment Variables: Set up the necessary environment variables. Create a .env file in the root directory of the project and add the following variables:

OPENAI_API_KEY=your_openai_api_key
CHATGPT_MODEL=model_name  # for example, "davinci"

Replace your_openai_api_key and model_name with your actual OpenAI API key and the model name you intend to use.
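
As a rough sketch of how these two variables might be read and validated at startup, the snippet below checks the process environment and fails early if either value is missing. It assumes the values from the .env file have already been loaded into the environment (for example by a loader such as python-dotenv; whether this project uses one is not stated here). The function name `load_settings` is hypothetical.

```python
import os


def load_settings(env=os.environ):
    """Read the two required settings and raise if either is missing or empty."""
    api_key = env.get("OPENAI_API_KEY", "").strip()
    model = env.get("CHATGPT_MODEL", "").strip()
    required = (("OPENAI_API_KEY", api_key), ("CHATGPT_MODEL", model))
    missing = [name for name, value in required if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {"api_key": api_key, "model": model}


# Example with an explicit mapping instead of the real environment.
settings = load_settings({"OPENAI_API_KEY": "your_openai_api_key", "CHATGPT_MODEL": "model_name"})
print(settings["model"])
```

Failing at startup gives a clearer error than a rejected request to the OpenAI service later on.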

Running the Application

Start the API Server: Run the following command to start the FastAPI server:
```
python main.py
```
Start the Streamlit Server: In a new terminal tab, run the following command to start the Streamlit frontend:
```
streamlit run app.py
```
The application opens in your browser at http://localhost:8501/

Interacting with the Application

Through the frontend, you can test the PDF text extraction and question-answering features.
Select a page number and input your question related to the content on that page.
Submit the request, and the application will display the generated answer based on the PDF's content.

Notes

Ensure that the PDF file (data/Attention_Is_All You_Need.pdf) is placed in the correct directory as specified in the code.
The API key and model name must be valid and active for the OpenAI service to work correctly.
