This repo implements a locally hosted chatbot focused on question answering over your own PDFs. It is a small app built following the LangChain documentation.
- Install backend dependencies: `pip install -r requirements.txt`
- Rename `.env.example` to `.env`.
- Enter your OpenAI API key in the `.env` file: `OPENAI_API_KEY="sk-***"`
- Run the app with `streamlit run main.py`.
- Open `localhost:8501` (Streamlit's default port) in your browser.
The application reads the PDF and splits the text into smaller chunks that can then be fed into an LLM. It uses OpenAI embeddings to create vector representations of the chunks, finds the chunks that are semantically similar to the question the user asked, and feeds those chunks to the LLM to generate a response.
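As a rough sketch of that pipeline, assuming the pre-0.1 LangChain API, PyPDF2 for text extraction, and a local FAISS index (`example.pdf`, `k=4`, and the chunk sizes are illustrative, not taken from the repo):

```python
from PyPDF2 import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Read the PDF and concatenate the text of every page.
text = "".join(page.extract_text() or "" for page in PdfReader("example.pdf").pages)

# Split the text into overlapping chunks small enough to fit into an LLM prompt.
chunks = CharacterTextSplitter(
    separator="\n", chunk_size=1000, chunk_overlap=200
).split_text(text)

# Embed each chunk with OpenAI embeddings and index the vectors in FAISS.
store = FAISS.from_texts(chunks, OpenAIEmbeddings())

# Retrieve the chunks most semantically similar to a question.
docs = store.similarity_search("What is this document about?", k=4)
```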
The application uses Streamlit for the GUI and LangChain to interact with the LLM.
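A minimal sketch of how those pieces might be wired together in Streamlit; the widget labels are illustrative, and `load_qa_chain` with the `stuff` chain type is one straightforward way to pass the retrieved chunks to the LLM (the repo's actual wiring may differ):

```python
import streamlit as st
from PyPDF2 import PdfReader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains.question_answering import load_qa_chain
from langchain.llms import OpenAI

st.title("Ask your PDF")
pdf = st.file_uploader("Upload a PDF", type="pdf")
question = st.text_input("Ask a question about your PDF")

if pdf is not None and question:
    # Extract, chunk, embed, and index the uploaded PDF (as sketched above).
    text = "".join(page.extract_text() or "" for page in PdfReader(pdf).pages)
    chunks = CharacterTextSplitter(
        separator="\n", chunk_size=1000, chunk_overlap=200
    ).split_text(text)
    store = FAISS.from_texts(chunks, OpenAIEmbeddings())

    # "Stuff" the most similar chunks into a single prompt and show the answer.
    docs = store.similarity_search(question)
    chain = load_qa_chain(OpenAI(), chain_type="stuff")
    st.write(chain.run(input_documents=docs, question=question))
```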
There are two components:
- embeddings
- question answering
Source: Benny Cheung
Question answering has the following steps, all handled by `OpenAIFunctionsAgent` (see the sketch after this list):
- Given the user's input question, determine what a standalone question would be (using GPT-3.5).
- Given that standalone question, look up the relevant document chunks.
- Pass the standalone question and the relevant document chunks to GPT to generate and stream the final answer.
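The repo routes these steps through `OpenAIFunctionsAgent`; purely as an illustration, the same three steps can be expressed with LangChain's `ConversationalRetrievalChain`, a different construct that condenses the chat history into a standalone question, retrieves chunks, and generates the answer (the index path and question are hypothetical):

```python
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Assumes a FAISS index saved earlier with store.save_local("index").
store = FAISS.load_local("index", OpenAIEmbeddings())

# The chain condenses (question + chat history) into a standalone question,
# retrieves the most similar chunks, and asks the chat model for the answer.
qa = ConversationalRetrievalChain.from_llm(
    ChatOpenAI(model_name="gpt-3.5-turbo"),
    retriever=store.as_retriever(),
)

result = qa({"question": "What does the introduction cover?", "chat_history": []})
print(result["answer"])
```

Streaming the final answer, as the last step describes, can be enabled in this setup by passing `streaming=True` to `ChatOpenAI` along with a callback handler.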
TODO: Docker deployment
This repository is for educational purposes only and is not intended to receive further contributions. It serves as support material for the YouTube tutorial that shows how to build the project.