## Simplistic LLM Knowledge Augmentation Model

Set the API key as an environment variable

In [28]:
import os
# Read the API key from a gitignored file
with open("API_KEY.txt", "r") as file:
    api_key = file.read().strip()
# Set environment variable
os.environ["OPENAI_API_KEY"] = api_key
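As a hedged alternative to always reading the file, a small helper can prefer a key that is already set in the environment and fall back to the file otherwise. The name `load_api_key` and its default arguments are assumptions for illustration and are not part of this notebook.

```python
import os
from pathlib import Path


def load_api_key(path="API_KEY.txt", env_var="OPENAI_API_KEY"):
    """Return the key from the environment if set, else from a local file.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    """
    # Prefer a key that is already exported in the environment
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    # Fall back to the gitignored key file
    key_file = Path(path)
    if not key_file.is_file():
        raise FileNotFoundError(f"{env_var} is not set and {path} was not found")
    key = key_file.read_text().strip()
    if not key:
        raise ValueError(f"{path} is empty")
    return key
```

It could then be used as `os.environ["OPENAI_API_KEY"] = load_api_key()`.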

In [25]:
from llama_index import SimpleDirectoryReader, LLMPredictor, ServiceContext, PromptHelper, VectorStoreIndex
from langchain.chat_models import ChatOpenAI
# Auxiliary functions
import model_assist as ma

Configure model attributes

In [26]:
def configure_model(max_input_size=4096,
                    num_outputs=512,
                    max_chunk_overlap=0.5,
                    chunk_size_limit=600):
    """Build a vector index over the files in the data directory."""
    # Configure the prompt specifications (context size, output length, chunking)
    prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap, chunk_size_limit=chunk_size_limit)
    # Configure the LLM and its temperature
    llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0.4, model_name="gpt-3.5-turbo", max_tokens=num_outputs))
    # Bundle both into a service context
    service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
    # Read in the documents from the data directory
    documents = SimpleDirectoryReader('data').load_data()
    # Build the index
    index = VectorStoreIndex.from_documents(documents, service_context=service_context)

    return index
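To make the chunk size and overlap settings more concrete, here is a minimal sketch of overlapping chunking over a list of tokens. It is not LlamaIndex's splitter: `chunk_tokens` is a hypothetical function, and treating the overlap as a fraction of the chunk size is an assumption suggested by the `.5` default above; the library may interpret that parameter differently depending on its version.

```python
def chunk_tokens(tokens, chunk_size=600, overlap_ratio=0.5):
    """Split tokens into windows of chunk_size that share a fraction of tokens.

    Hypothetical illustration only, not the library's implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in [0, 1)")
    overlap = int(chunk_size * overlap_ratio)
    # Each new chunk starts this many tokens after the previous one
    step = max(1, chunk_size - overlap)
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a chunk has reached the end of the input
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

With ten tokens, a chunk size of 4, and an overlap ratio of 0.5, consecutive chunks share two tokens each.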

Instantiate the query engine from the index

In [29]:
index = configure_model()
query_engine = index.as_query_engine()

Ask the query engine a question

In [41]:
response = query_engine.query("What is a universal quantifier? Does this relate to the University of Madison-Wisconsin?")
response.response

'A universal quantifier is a logical symbol that represents a statement that is true for every element in a given set. It is often denoted by the symbol (∀). For example, the statement "For every student at UW-Madison, they have access to the library" can be represented using a universal quantifier.\n\nHowever, based on the provided context information, there is no direct mention of a universal quantifier in relation to the University of Wisconsin-Madison. The context primarily focuses on the university\'s overview, academic excellence, popular programs and schools, research opportunities, and student life.'
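Conceptually, a vector index answers a query by comparing an embedding of the question with embeddings of the stored chunks and passing the closest chunks to the LLM as context. The toy sketch below shows only that ranking step, with made-up 3-dimensional vectors standing in for real embeddings. The function names and sample data are hypothetical, and this is not LlamaIndex's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_vecs, k=2):
    """Return the k chunk texts whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text)
              for text, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Made-up vectors for illustration only
chunks = {
    "campus overview": [1.0, 0.0, 0.0],
    "student life": [0.0, 1.0, 0.0],
    "research opportunities": [0.9, 0.1, 0.0],
}
best = top_k_chunks([1.0, 0.0, 0.0], chunks, k=2)
```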

JSON document loader assistant

In [36]:
directory = "data"
json_string = ma.json_loader(directory)
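The `model_assist` module is not shown in this notebook. One plausible shape for `json_loader`, sketched here purely as an assumption, is to read every `.json` file in the directory and return the combined contents as a single JSON string. Under that assumption, a directory with no JSON files would yield `"[]"`.

```python
import json
from pathlib import Path


def json_loader(directory):
    """Load all .json files in a directory and return them as one JSON string.

    Hypothetical stand-in for ma.json_loader; the real helper may differ.
    """
    records = []
    # Sort for a deterministic order across runs
    for path in sorted(Path(directory).glob("*.json")):
        with path.open("r", encoding="utf-8") as fh:
            records.append(json.load(fh))
    return json.dumps(records)
```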

Wrap the JSON in a text prompt; this will be uploaded as the `documents` part

In [40]:
prompt = ""
preface = """
You are an academic advisor for the University of Wisconsin-Madison, here to answer students' questions and help them choose classes for the Data Science major. 
The classes are in a hierarchical JSON format.
"""
class_data = f"Here is the class data: {json_string}"
corrections = """
If a student does not mention anything related to course data, briefly answer their question, but mention that they are off-topic and the purpose of this conversation is for course recommendations. 
If there is insufficient information to give a student a good recommendation, ask the student for more information or preferences.
"""
final_prompt = prompt + preface + class_data + corrections
final_prompt


"\nYou are an academic advisor for the University of Wisconsin-Madison, here to answer students' questions and help them choose classes for the Data Science major. \nThe classes are in a hierarchical JSON format.\n[]\nIf a student does not mention anything related to course data, briefly answer their question, but mention that they are off-topic and the purpose of this conversation is for course recommendations. \nIf there is insufficient information to give a student a good recommendation, ask the student for more information or preferences.\n"
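The prompt assembly can also be wrapped in a function that refuses to build a prompt around empty class data, since the echoed prompt above contains only `[]`. This is a minimal sketch: `build_prompt` is a hypothetical name, and the guard against empty data is an added assumption rather than something the notebook does.

```python
def build_prompt(json_string, preface, corrections):
    """Join the preface, class data, and corrections into one prompt string.

    Hypothetical helper; raises if the class data is empty.
    """
    # An empty list or object means no course data was loaded
    if json_string.strip() in ("", "[]", "{}"):
        raise ValueError("class data is empty; check the data directory")
    class_data = f"Here is the class data: {json_string}\n"
    return preface + class_data + corrections
```

Calling it with the `preface`, `json_string`, and `corrections` values defined above would produce the same kind of string as `final_prompt`, or raise if no JSON was loaded.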