# CREATE A CHATBOT WITH OPENAI API

First, set up access to the OpenAI API:
- Log in to the OpenAI platform.
- Add your billing information.
- Set a usage limit; this is important to keep costs under control.
- Create a secret key for calling the OpenAI API. The key is shown only once, so copy it and save it for future use.
- Save the key in a file named mykey.py in the same folder as this notebook.
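
The key file can be very small. Below is a minimal sketch of mykey.py, assuming only that it exposes a variable named `openai_key` (the name the later cells read). The key string is a placeholder, not a real key, and the `key_is_set` helper is hypothetical, added just to catch a forgotten placeholder.

```python
# mykey.py: keep this file out of version control.
# Placeholder value: replace it with the secret key copied from the OpenAI platform.
openai_key = "sk-REPLACE-WITH-YOUR-KEY"


def key_is_set(key: str) -> bool:
    """Hypothetical helper: True once the placeholder has been replaced."""
    return bool(key) and "REPLACE-WITH-YOUR-KEY" not in key


print(key_is_set(openai_key))
```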
*********
Install packages:
+ %pip install llama-index
+ %pip install docx2txt
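
Before moving on, it can help to confirm that the packages are visible to the notebook kernel. The sketch below uses only the standard library; `llama_index` and `docx2txt` are the import names assumed from the cells in this notebook, and `missing_modules` is a hypothetical helper name.

```python
import importlib.util

# Import names used in this notebook (llama_index is what the later cells import).
REQUIRED_MODULES = ["llama_index", "docx2txt"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing:", missing_modules(REQUIRED_MODULES))
```

An empty list means both packages can be imported; any name printed still needs to be installed.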


In [4]:
#Optional: this cell checks whether TensorFlow can see a GPU on your laptop. If you have no GPU, skip it and run the cell below.

import tensorflow as tf
devices = tf.config.list_physical_devices()
print("\nDevices: ", devices)
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    details = tf.config.experimental.get_device_details(gpus[0])
    print("GPU details: ", details)



Devices:  [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
GPU details:  {'device_name': 'METAL'}


## Start from here

In [24]:

import os 
import mykey

#Set the OPENAI_API_KEY environment variable from the key saved in mykey.py
os.environ["OPENAI_API_KEY"] = mykey.openai_key

cur_dir = os.getcwd() 

print(cur_dir)


/Users/hnguyen1/Library/CloudStorage/OneDrive-MichiganStateUniversity/PD/LLM/ShawYoutube/OpenAI
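
The cells below expect a `data` folder inside the working directory printed above. As a rough sketch, the folder can be checked before any indexing is attempted; `list_data_files` is a hypothetical helper, and `pathlib` is used here instead of string concatenation purely as a matter of taste.

```python
from pathlib import Path


def list_data_files(base_dir):
    """Return the sorted names of the files in <base_dir>/data, or [] if the folder is missing."""
    data_dir = Path(base_dir) / "data"
    if not data_dir.is_dir():
        return []
    return sorted(p.name for p in data_dir.iterdir() if p.is_file())


print(list_data_files(Path.cwd()))
```

An empty list means there is nothing for the reader to load yet.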


# You only need to run one of the two blocks below. They do the same thing, written in two different ways.

In [3]:
# BLOCK A:
# Read the data (all files in the folder ./data/) with the SimpleDirectoryReader from llama_index.core.
# After reading the files, index the data so that relevant passages can be retrieved for the GPT model.
# Save the index for future use.

from llama_index.core import SimpleDirectoryReader, GPTVectorStoreIndex

reader = SimpleDirectoryReader(
    input_dir=cur_dir + "/data/"
)

documents = reader.load_data()

#After reading the files, create an index for them using GPTVectorStoreIndex.
#This step supports the retrieval process.
#The variable is named index because the chatbot cells below use that name.

index = GPTVectorStoreIndex.from_documents(documents)

#Save the index to disk for future use
index.storage_context.persist(persist_dir=cur_dir + "/index/")

#!!!!!!!!!!!!!!!!#

#The next time you run the chatbot, you can load the saved index instead of indexing the documents again
from llama_index.core import StorageContext, load_index_from_storage

storage = StorageContext.from_defaults(persist_dir=cur_dir + "/index/")

index = load_index_from_storage(storage)




In [None]:
#BLOCK B
#This block does basically the same thing as Block A, written differently.
#The code is from the LlamaIndex documentation.

import os.path
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext, load_index_from_storage

#Now we check whether the saved index already exists:
#----If not, we have to read the documents, index them, and save the index. The saved files are JSON files.
#----If yes, we only need to load the index from storage.

index_dir = cur_dir +"/index/"

if not os.path.exists(index_dir):
    #create the folder for the index
    os.makedirs(index_dir)
    #read the documents from the data folder
    docs = SimpleDirectoryReader(input_dir=cur_dir + "/data/").load_data()
    #embed the documents and build the vector index (by default LlamaIndex uses an OpenAI embedding model, not word2vec)
    index = VectorStoreIndex.from_documents(docs)
    #store it for later use
    index.storage_context.persist(persist_dir=index_dir)
else:
    #the index is already saved, so load it
    storage_context = StorageContext.from_defaults(persist_dir=index_dir)
    index = load_index_from_storage(storage_context)
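
The if/else in Block B is a "build once, reuse afterwards" pattern. The sketch below shows the same control flow with the standard library only, so it can run without an API key. It is an analogy, not LlamaIndex code: `load_or_build`, `build_toy_index`, and the toy data are all hypothetical stand-ins for the real indexing step.

```python
import json
import os
import tempfile


def load_or_build(cache_path, build_fn):
    """Load a JSON cache if it exists; otherwise build the value, save it, and return it."""
    if os.path.exists(cache_path):
        with open(cache_path, "r", encoding="utf-8") as f:
            return json.load(f), "loaded"
    value = build_fn()
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(value, f)
    return value, "built"


# Stand-in for the expensive indexing step (hypothetical toy data).
def build_toy_index():
    return {"calculus": ["syllabus.docx"]}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_index.json")
    first = load_or_build(path, build_toy_index)   # nothing saved yet, so it builds
    second = load_or_build(path, build_toy_index)  # the file now exists, so it loads

print(first[1], second[1])
```

The first call pays the cost of building; every later call only reads from disk, which is why the saved index avoids repeating the indexing step on each run.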


## Now we are going to build our chatbot Lucy

In [20]:
import openai
import json

class MyChatbot:
    def __init__(self, api_key, index):
        openai.api_key = api_key
        self.index = index
        self.chat_history = []
    
    def gen_response(self, question):
        #Build a text prompt from the last 10 messages of the chat history plus the new question.
        #Note: this prompt is not sent anywhere yet; the query below uses only the question.
        prompt = "\n".join([f"{message['role']}:{message['content']}" 
                            for message in self.chat_history[-10:]
                          ])
        
        prompt += f"\n User: {question}"
        
        #Query the index once, through self.index rather than the global variable index
        query_engine = self.index.as_query_engine()
        res = query_engine.query(question)
        
        message = {"role": "helper", "content": res.response}
        
        #save in chat history
        self.chat_history.append({"role": "user", "content": question})
        self.chat_history.append(message)
        
        return message
    
    def load_chat_history(self, fname):
        try:
            with open(fname, 'r') as f:
                self.chat_history = json.load(f)
        except FileNotFoundError:
            pass
    
    def save_chat_history(self, fname):
        with open(fname, 'w') as f:
            json.dump(self.chat_history, f)
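
The history handling in gen_response keeps only the ten most recent messages when it builds its text prompt. A minimal standalone sketch of that sliding window follows; `build_prompt` and the sample history are hypothetical and only illustrate the slicing, without calling any API.

```python
def build_prompt(history, question, max_messages=10):
    """Join the last `max_messages` history entries and the new question into one text prompt."""
    lines = [f"{m['role']}: {m['content']}" for m in history[-max_messages:]]
    lines.append(f"user: {question}")
    return "\n".join(lines)


# Hypothetical history: 12 messages, so the two oldest fall outside the window.
history = [{"role": "user", "content": f"message {i}"} for i in range(12)]
prompt = build_prompt(history, "what is next?")

print(prompt.splitlines()[0])    # the oldest message that is still kept
print(len(prompt.splitlines()))  # 10 history lines plus 1 question line
```

Capping the window keeps the prompt from growing without bound as the saved chat history gets longer.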


## My Lucy chatbot is working now 

In [21]:
#The OpenAI key (mykey.openai_key) is saved in the mykey.py file
#index is the vector index built with LlamaIndex in the cells above

bot = MyChatbot(mykey.openai_key, index=index)
bot.load_chat_history("chat_history.json")

while True:
    question = input("You: ")
    #Check for an exit word before querying, so that "bye" is not sent to the index
    if question.strip().lower() in ["bye", "goodbye", "done", "thanks"]:
        print("Lucy: Goodbye!")
        bot.save_chat_history("chat_history.json")
        break
    response = bot.gen_response(question)
    print(f"Lucy: {response['content']}")


You: tell me what time is the class math1750-030
Lucy: The class Math1750-030 time is not specified in the provided context information.
You: hanh nguyen is the instructor of what class
Lucy: Calculus 1
You: tell me where is hanh nguyen office
Lucy: I'm sorry, but based on the provided context information, there is no mention of Hanh Nguyen's office location.
You: hanh nguyen is the instructor of calculus 1 math 1750, where is her office
Lucy: The office location for Hanh Nguyen, the instructor of Calculus 1 (MATH 1750), is not provided in the given context information.
You: bye
Lucy: Goodbye!
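
The chat loop ends when the user types one of a few exit words. Checking the input with a small function before calling gen_response keeps those words from being sent to the query engine. The sketch below is one possible way to write that check; `EXIT_WORDS` and `is_exit` are hypothetical names.

```python
EXIT_WORDS = {"bye", "goodbye", "done", "thanks"}


def is_exit(text):
    """Return True if the input, ignoring case and surrounding spaces, is an exit word."""
    return text.strip().lower() in EXIT_WORDS


print(is_exit("  Bye "), is_exit("bye for now"))
```

Only an exact match counts, so a longer sentence that merely contains "bye" is still treated as a question.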
