PDFChat is a simple application that leverages the ability of an LLM (ChatGPT) to generate human-like text and answer questions based on context.
Steps followed:
Database Creation
- Read and extract the content of the given PDF file.
- Get embeddings of the extracted content using the OpenAI Ada model and store them in a Chroma database.
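The database creation steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it starts from text that has already been extracted from the PDF, `toy_embed` is a hypothetical stand-in for the call to the Ada embedding model, and a plain in-memory list takes the place of Chroma. The names `split_paragraphs` and `build_store` are likewise made up for this example.

```python
import hashlib
import math
import re

DIM = 64  # toy dimensionality; a real embedding model returns far larger vectors


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for the Ada call: hashed bag of words, L2-normalised."""
    vec = [0.0] * DIM
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def split_paragraphs(text: str) -> list[str]:
    """Split extracted PDF text on blank lines and drop empty chunks."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def build_store(text: str) -> list[dict]:
    """Embed every paragraph and keep it next to its vector (in memory, not Chroma)."""
    return [
        {"id": i, "text": p, "embedding": toy_embed(p)}
        for i, p in enumerate(split_paragraphs(text))
    ]


# Assumed sample input standing in for the text extracted from a PDF.
sample_text = (
    "Docker packages applications.\n\n"
    "Embeddings map text to vectors.\n\n\n"
    "Chroma stores vectors."
)
store = build_store(sample_text)
print(len(store), len(store[0]["embedding"]))
```

In the real application the embedding call and the storage layer would be replaced by the OpenAI and Chroma clients; the record layout (text plus vector) is the part this sketch is meant to show.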
Inference
- For the user query, get the query embedding using Ada.
- Perform a document search to extract the top 3 paragraphs that are semantically similar to the query.
- Pass the extracted documents as context, together with the user query, to ChatGPT to formulate the answer.
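The inference steps can be illustrated with a small sketch, under several assumptions: the stored vectors and the query vector are tiny hand-written examples rather than Ada embeddings, similarity is computed with cosine similarity (the source does not say which metric the search uses), and the prompt wording in `build_prompt` is invented for illustration. The final call to ChatGPT is left out.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], store: list[dict], k: int = 3) -> list[dict]:
    """Return the k stored paragraphs most similar to the query vector."""
    ranked = sorted(
        store,
        key=lambda rec: cosine(query_vec, rec["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, docs: list[dict]) -> str:
    """Join the retrieved paragraphs into a context block followed by the question."""
    context = "\n\n".join(d["text"] for d in docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Hand-written toy vectors; real ones would come from the embedding model.
store = [
    {"text": "Paragraph about pricing.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Paragraph about refunds.", "embedding": [0.7, 0.3, 0.0]},
    {"text": "Paragraph about shipping.", "embedding": [0.1, 0.9, 0.0]},
    {"text": "Paragraph about careers.", "embedding": [0.0, 0.0, 1.0]},
]
query_vec = [1.0, 0.0, 0.0]  # pretend embedding of the user question

docs = top_k(query_vec, store)
prompt = build_prompt("How much does it cost?", docs)
print(prompt)
```

The resulting prompt string is what would be sent to the chat model; keeping retrieval and prompt assembly as separate functions makes it easy to change the number of retrieved paragraphs without touching the prompt format.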
To run the application, you need the following prerequisites:
- Docker installed on your system. Follow the official Docker installation instructions for your platform.
- An OpenAI API key, which you can obtain from your OpenAI account. Add your API key to the .env file.
Run the commands below to build and start the application:
- Build the Docker image
docker build --tag gpt-gateway --file Dockerfile .
- Run the Docker image
docker run -it --env-file .env --name gpt-gateway gpt-gateway