Instant QnA builder is a tool for quickly creating searchable QnA systems from PDF files. It uses OpenAI embedding models to generate search embeddings for your documents, so you can find the information you need by querying them.
- Install the project's dependencies:
  Windows:
  pip install -r requirements.txt
  Unix:
  python3 -m pip install -r requirements.txt
- Update constants.py with your OpenAI API token:
  token="<YOUR-OPENAI-API-TOKEN>"
- Place the PDFs that you want to search inside the /sources directory.
- Run the program:
  Windows:
  python main.py
  Unix:
  python3 main.py
  The program shows the estimated cost of embedding all of the files and asks for y/n confirmation. Choose y to proceed. By default the engine uses text-embedding-ada-002, which is inexpensive and performant. You can update the code to embed with other models, such as davinci.
- Once all of the files are fully processed and embedded, the program prompts you to enter a search query. If there are matching results, it returns the top 3 with their scores and source file names.
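The cost confirmation described in the steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `estimate_cost` and `confirm` are hypothetical, and the price per 1,000 tokens is a placeholder parameter that you would replace with the current rate for your chosen model.

```python
def estimate_cost(token_counts, price_per_1k_tokens):
    """Return the estimated cost of embedding files with the given token counts."""
    total_tokens = sum(token_counts)
    return total_tokens / 1000 * price_per_1k_tokens


def confirm(answer):
    """Interpret a y/n reply; anything other than y or yes counts as no."""
    return answer.strip().lower() in {"y", "yes"}


# Hypothetical token counts for three files and a placeholder price.
cost = estimate_cost([1200, 800, 3000], price_per_1k_tokens=0.0001)
print(f"Estimated cost: ${cost:.4f}")
```

In an interactive run, the reply to a prompt such as `input("Proceed? (y/n) ")` would be passed to `confirm` before any embedding request is sent.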
If you have PDF files from which you want to build a question and answer engine, this tool should be useful for you.
To begin, select the PDF file that you want to create a QnA system for and upload it to the tool.
This Python file reads all of the PDF files from /sources and then writes their text content to /ai_generated/dumps.
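As a rough sketch of this dump step, the loop below writes one text file per PDF. The function name `dump_texts` and the `.txt` output naming are assumptions, and the actual PDF parsing is left to a caller-supplied `extract_text` function, since the README does not say which PDF library the project uses.

```python
from pathlib import Path


def dump_texts(sources_dir, dumps_dir, extract_text):
    """Write the text of every PDF in sources_dir to a .txt file in dumps_dir.

    extract_text is a caller-supplied function (path -> str); it stands in
    for the project's real extraction code, which is not shown here.
    """
    sources = Path(sources_dir)
    dumps = Path(dumps_dir)
    dumps.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(sources.glob("*.pdf")):
        out_path = dumps / (pdf_path.stem + ".txt")
        out_path.write_text(extract_text(pdf_path), encoding="utf-8")
        written.append(out_path)
    return written
```

Passing the extractor in as a parameter keeps the file handling independent of whichever PDF parser you choose.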
Goes through all files in sources and collects the ones that have not been embedded yet or whose embeddings have expired.
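One way to picture this check is the sketch below. It assumes a hypothetical record, `embedded_at`, that maps each file name to the Unix timestamp of its last embedding, and a hypothetical `max_age_seconds` expiry window; the project's own bookkeeping may work differently.

```python
import time
from pathlib import Path


def files_needing_embedding(sources_dir, embedded_at, max_age_seconds, now=None):
    """Return source PDFs that were never embedded or whose embedding is too old.

    embedded_at maps file name -> Unix timestamp of the last embedding
    (a hypothetical record used only for this illustration).
    """
    now = time.time() if now is None else now
    pending = []
    for path in sorted(Path(sources_dir).glob("*.pdf")):
        last = embedded_at.get(path.name)
        # Never embedded, or embedded longer ago than the allowed age.
        if last is None or now - last > max_age_seconds:
            pending.append(path)
    return pending
```

Only the files returned here would need to be sent for embedding, which avoids paying again for documents that are already up to date.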
Once the file is uploaded, generate search embeddings for the contents of the PDF. This process may take a few minutes, depending on the size of the file.
Parses all of the text content within a PDF and groups it into coherent paragraphs no longer than 1000 tokens. The resulting dataset is then saved in CSV format, which gives the AI model a structured, readable input to process.
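The grouping and CSV export could look roughly like the sketch below. It makes two loud assumptions: `count_tokens` approximates tokens by counting whitespace-separated words rather than using a real tokenizer, and the CSV columns (`source`, `tokens`, `text`) are illustrative rather than the project's actual schema.

```python
import csv


def count_tokens(text):
    """Rough stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def group_paragraphs(paragraphs, max_tokens=1000):
    """Merge consecutive paragraphs into chunks of at most max_tokens tokens."""
    chunks, current, current_len = [], [], 0
    for para in paragraphs:
        n = count_tokens(para)
        if n == 0:
            continue  # skip empty paragraphs
        # Start a new chunk when adding this paragraph would exceed the limit.
        if current and current_len + n > max_tokens:
            chunks.append("\n".join(current))
            current, current_len = [], 0
        current.append(para)
        current_len += n
    if current:
        chunks.append("\n".join(current))
    return chunks


def save_chunks(chunks, csv_path, source_name):
    """Write one CSV row per chunk with its source file and token count."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["source", "tokens", "text"])
        for chunk in chunks:
            writer.writerow([source_name, count_tokens(chunk), chunk])
```

Note that in this sketch a single paragraph longer than `max_tokens` is kept as its own oversized chunk; a stricter version would split it further.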
This file creates the text embeddings using the OpenAI Ada model (you can switch to any other model) and also provides the search/query functions.
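To illustrate the search side, the sketch below ranks stored embeddings against a query embedding and returns the best three. It assumes cosine similarity as the score, which the README does not confirm, and it takes the vectors as plain lists so that no API call is needed; the names `cosine_similarity` and `search` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vector, records, top_k=3):
    """Return the top_k (score, source_file, text) tuples, best match first.

    records is a list of (source_file, text, embedding) tuples.
    """
    scored = [
        (cosine_similarity(query_vector, embedding), source, text)
        for source, text, embedding in records
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]
```

In practice the query vector would come from embedding the user's query with the same model that was used for the documents, so that the scores are comparable.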
You can now execute search queries to find the information you need. Enter your query in the search box and the tool will return any matching results from the PDF.
The main entry point, from which you run the project.