This project uses AI to extract information from PDF files and make it searchable. For factual retrieval, the system uses Retrieval-Augmented Generation (RAG): the extracted information is represented as a JSON structure that serves as the factual context for user queries.
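As a rough illustration of how a JSON structure can act as factual context for a query, the sketch below flattens a document into path/value facts, keeps the facts that share the most words with the question, and assembles a prompt from them. The function names (`flatten`, `retrieve`, `build_prompt`), the sample invoice data, and the keyword-overlap scoring are all assumptions made for this example; the project's actual retrieval logic is not described here and may work differently.

```python
import json
import re


def flatten(node, prefix=""):
    """Yield (path, value) pairs for every leaf in a nested JSON structure."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from flatten(value, f"{prefix}[{index}]")
    else:
        yield prefix, node


def tokenize(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(document, query, top_k=3):
    """Return up to top_k facts whose path or value shares words with the query."""
    query_words = tokenize(query)
    scored = []
    for path, value in flatten(document):
        overlap = len(query_words & tokenize(f"{path} {value}"))
        if overlap:
            scored.append((overlap, path, value))
    # Highest overlap first; ties keep document order because the sort is stable.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(path, value) for _, path, value in scored[:top_k]]


def build_prompt(document, query):
    """Combine the retrieved facts and the question into one prompt string."""
    context = json.dumps(dict(retrieve(document, query)), indent=2)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical extraction result, for illustration only.
invoice = {
    "invoice": {"number": "INV-001", "total": 1250.0, "currency": "EUR"},
    "customer": {"name": "Example Oy", "city": "Vaasa"},
}
print(build_prompt(invoice, "What is the invoice total?"))
```

The resulting string would then be sent to the language model. Keyword overlap is used here only to keep the example dependency-free; an embedding-based similarity search is a common alternative.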
- Clone the Git repository.
- Install Python 3.12 or later.
- Install the required packages: `pip install -r requirements.txt`
- Set your OpenAI API key in the `.env` file.
- Run the script: `python demo.py`
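To make the API key step more concrete, here is a minimal sketch of reading a key from a `.env` file using only the standard library. The variable name `OPENAI_API_KEY` is the one the official OpenAI Python library reads from the environment by default; whether this project uses that exact name, or a helper package such as python-dotenv, is an assumption. The helper names `load_env_file` and `require_api_key` are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def require_api_key(path=".env", name="OPENAI_API_KEY"):
    """Return the key from the environment or the .env file, or raise."""
    file_values = load_env_file(path) if Path(path).is_file() else {}
    key = os.environ.get(name) or file_values.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set in the environment or in {path}")
    return key


# Demo with a throwaway file and a placeholder value (not a real key).
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("OPENAI_API_KEY=sk-placeholder\n", encoding="utf-8")
    print(load_env_file(env_path))
```

Failing early with a clear error when the key is missing tends to be easier to debug than an authentication error raised later by the API client.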
- Upload a PDF file.
- The AI will extract the data from the PDF and generate a JSON file.
- The AI will also generate a JSON schema file.
- You can then interact with the document using natural language queries.
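To give a feel for what a JSON schema file describing extracted data might contain, the sketch below derives a minimal schema from a JSON value. In this project the schema is generated by the AI; the rule-based `infer_schema` function here is a hypothetical stand-in used only for illustration, and the sample data is invented.

```python
import json


def infer_schema(value):
    """Build a minimal JSON Schema fragment describing `value`."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if value is None:
        return {"type": "null"}
    if isinstance(value, list):
        schema = {"type": "array"}
        if value:
            # Simplification: assume every item looks like the first one.
            schema["items"] = infer_schema(value[0])
        return schema
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {k: infer_schema(v) for k, v in value.items()},
            "required": sorted(value),
        }
    raise TypeError(f"Not a JSON value: {type(value).__name__}")


# Hypothetical extraction result, for illustration only.
extracted = {
    "title": "Quarterly report",
    "pages": 12,
    "sections": [{"heading": "Summary", "page": 1}],
}
print(json.dumps(infer_schema(extracted), indent=2))
```

A schema like this can be used to validate later extractions or to tell the model which fields exist before it answers a query.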
This project uses the OpenAI GPT-4 model for information extraction and query processing, so API usage incurs token-based charges.
- Christoffer Björkskog - Initial work - melonkernel
- Christian Möller - chrmolnovia
- Lamin Jatta - Lamboyjat
See also the list of people who participated in this project.