# **An OpenAI Document Querying AI Tool - Streamlit App:**
## **Utilising Large Language Models (LLMs), Chroma, OpenAI, LangChain and Streamlit.**

This notebook details a Python project that uses OpenAI Large Language Models (LLMs) to answer users' questions about a long PDF document. It is built with the LangChain, Chroma vector store, OpenAI API and Streamlit libraries.

This notebook contains only the code required to run the Streamlit app from a Google Colab notebook, which should use the CPU runtime.

Ensure that the 25-page cycleGuard insurance PDF document has been uploaded to the notebook working directory before running the cells.
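A minimal pre-flight check for this step could look like the sketch below. It assumes the file name used later in the app code; the helper `pdf_is_ready` is a hypothetical name introduced here for illustration and is not part of the original notebook.

```python
from pathlib import Path

# File name taken from the app code later in this notebook.
PDF_NAME = "cycleGuard Policy Wording 2021-03.pdf"


def pdf_is_ready(pdf_name=PDF_NAME, directory="."):
    """Return True if the PDF exists in `directory` and is not empty.

    Hypothetical helper for illustration only.
    """
    pdf_path = Path(directory) / pdf_name
    return pdf_path.is_file() and pdf_path.stat().st_size > 0


if not pdf_is_ready():
    print(f"Upload '{PDF_NAME}' to the working directory before continuing.")
```

Running this in its own cell before the app cell gives an early, readable warning instead of a loader error inside the Streamlit logs.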

In [None]:
!pip install langchain langchain-openai streamlit langchain-community pypdf chromadb
!npm install localtunnel
import urllib.request  # urlopen lives in the urllib.request submodule, which must be imported explicitly

In [2]:
%%writefile openAI_streamlit_app.py

import streamlit as st
import os
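# Chroma and PyPDFLoader are provided by the langchain-community package installed above
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders import PyPDFLoader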
from langchain_openai import OpenAI, OpenAIEmbeddings
from langchain.chains import RetrievalQA

# Set OpenAI private API Key
os.environ['OPENAI_API_KEY'] = 'your_private_OpenAI_API_key'

# OpenAI text embedding model:
text_embedding_model = OpenAIEmbeddings(model='text-embedding-ada-002')

# Load the PDF document:
input_document = PyPDFLoader('cycleGuard Policy Wording 2021-03.pdf') # Ensure this PDF document file has already been loaded into Colab working directory!
# Split pages from the PDF
pages = input_document.load_and_split()
# Load documents into chroma embedding database:
vector_store = Chroma.from_documents(pages, text_embedding_model, collection_name='cycle_insurance')

# OpenAI LLM
LLM = OpenAI(model='gpt-3.5-turbo-instruct', temperature=0.2)
retriever = vector_store.as_retriever(search_kwargs={"k": 3})
retrieval_QA_chain = RetrievalQA.from_chain_type(
    llm=LLM,
    chain_type="stuff",
    retriever=retriever,
    input_key = 'question')

#-----------------Streamlit App Functionality----------------------#
st.title('Using OpenAI LLMs to Answer Queries on an Insurance Document') # App title
user_input = st.text_input('Enter your query here:') # User input box
if user_input: # If user enters a query via the app interface, pass the query to OpenAI LLM
    openAI_response = retrieval_QA_chain.invoke({"question": user_input})
    st.write(openAI_response["result"]) # Display the LLM response


Writing openAI_streamlit_app.py

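To make the retrieval step in the app more concrete, here is a toy sketch of what "retrieve the k=3 most similar chunks, then stuff them into one prompt" means. It is not LangChain's or Chroma's implementation: the two-dimensional vectors, the chunk texts, the use of cosine similarity (Chroma's actual distance metric depends on how the collection is configured) and the prompt wording are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vector, chunk_vectors, chunks, k=3):
    """Return the k chunks whose vectors are most similar to the query vector."""
    scored = sorted(
        zip(chunks, chunk_vectors),
        key=lambda pair: cosine_similarity(query_vector, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]


def stuff_prompt(question, retrieved_chunks):
    """Concatenate all retrieved chunks into a single prompt (the 'stuff' idea).

    The wording here is illustrative, not LangChain's actual template.
    """
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


# Made-up placeholder chunks and hand-written vectors, for illustration only.
chunks = ["chunk A", "chunk B", "chunk C", "chunk D"]
chunk_vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.5, 0.5]]
query_vector = [1.0, 0.0]

retrieved = top_k_chunks(query_vector, chunk_vectors, chunks, k=3)
prompt = stuff_prompt("What does the policy cover?", retrieved)
print(retrieved)
print(prompt)
```

In the real app the vectors come from the OpenAI embedding model and the similarity search is performed by Chroma; the sketch only shows why `k` controls how much context reaches the LLM.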

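The app file above sets the API key with a placeholder string. One possible alternative, sketched below, is to read the key from an environment variable that was set beforehand and fail early if it is missing. The helper name `require_api_key` is hypothetical and not part of the original notebook.

```python
import os

# Placeholder string used in the app code; treated here as "not set".
PLACEHOLDER_KEY = "your_private_OpenAI_API_key"


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from `env` (defaults to os.environ) or raise if unusable.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key or key == PLACEHOLDER_KEY:
        raise RuntimeError(f"Set the {name} environment variable to a real key.")
    return key


# Demonstration with a plain dict standing in for the environment.
print(require_api_key({"OPENAI_API_KEY": "sk-example"}))
```

Calling `require_api_key()` with no arguments at the top of the app would check the real environment, which keeps the key out of the file written by `%%writefile`.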
In [3]:
print("Password for localtunnel:", urllib.request.urlopen('https://ipv4.icanhazip.com').read().decode('utf8').strip("\n"))
!streamlit run openAI_streamlit_app.py &>/content/logs.txt & npx localtunnel --port 8501 & curl ipv4.icanhazip.com

Password for localtunnel: 35.230.14.96
35.230.14.96
[K[?25hnpx: installed 22 in 3.649s
your url is: https://forty-phones-sip.loca.lt

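As the output above shows, the localtunnel password is the public IPv4 address of the Colab machine. The sketch below wraps that lookup in a small function with a timeout and a sanity check on the response. The function name `tunnel_password` and its injectable `fetch` parameter are assumptions added for illustration; only the URL comes from the notebook.

```python
import ipaddress
import urllib.request


def tunnel_password(fetch=None, url="https://ipv4.icanhazip.com"):
    """Return the public IPv4 address used as the localtunnel password.

    `fetch` is a hypothetical hook (a callable taking a URL and returning text)
    so the function can be exercised without network access.
    """
    if fetch is None:
        def fetch(target):
            with urllib.request.urlopen(target, timeout=10) as response:
                return response.read().decode("utf8")
    text = fetch(url).strip()
    ipaddress.IPv4Address(text)  # raises ValueError if the reply is not an IPv4 address
    return text


# Demonstration with a stubbed response instead of a real network call.
print("Password for localtunnel:", tunnel_password(fetch=lambda _url: "35.230.14.96\n"))
```

In the notebook itself, calling `tunnel_password()` with no arguments would perform the real request.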