<a href="https://colab.research.google.com/github/mfierbaugh/asp-ai-lab/blob/main/openai_rag.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# OpenAI LAB - Retrieval Augmented Generation

This lab combines the previous two labs to show how we can retrieve information from a local data store and use it to augment the model's input. We then explicitly instruct the OpenAI Chat Completions API, through the prompt, to answer the question based on the retrieved data. This pattern is known as RAG (Retrieval Augmented Generation).
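
Before involving any libraries, the retrieve-then-augment flow can be sketched in plain Python. This is only a toy illustration: the two sample documents, the word-overlap scoring, and the helper names `retrieve` and `augment` are invented for this sketch and are not part of the lab code, which uses embeddings and a vector database instead. No API call is made here.

```python
# Toy illustration of retrieve-then-augment; the data and helpers are made up.
documents = [
    "The lab router has four uplink ports.",
    "The office printer is on the second floor.",
]

def retrieve(question, docs):
    """Return the document that shares the most words with the question."""
    q_words = set(question.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def augment(question, context):
    """Place the retrieved text ahead of the question to form the prompt."""
    return f"Context: {context}\n\nQuestion: {question}"

question = "How many uplink ports does the router have?"
context = retrieve(question, documents)
toy_prompt = augment(question, context)
print(toy_prompt)
```

In the lab itself, the retrieval step is replaced by a similarity search over embedded PDF chunks, and the augmented prompt is sent to the Chat Completions API.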

## Setting up the runtime environment

Clone the GitHub repository so that the files we need are available in the local runtime.

In [None]:
!git clone https://github.com/mfierbaugh/asp-ai-lab.git

Install the required Python libraries.

In [None]:
!pip install langchain langchain-community langchain-chroma langchain-core langchain-experimental langchain-openai pypdf sentence_transformers openai

## Import Python Libraries

Import the required libraries for our lab.

In [7]:
from langchain_community.document_loaders import DirectoryLoader
from langchain_community.document_loaders import PyPDFLoader
from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain.chains import RetrievalQA
from langchain_chroma import Chroma
from openai import OpenAI
from google.colab import userdata

Set the local variables for our lab. The API key is read from the Colab secret named `OPENAI_API_KEY`.

In [16]:
file_dir = 'asp-ai-lab/files'
model = "gpt-3.5-turbo"
openai_api_key = userdata.get('OPENAI_API_KEY')

## Functions

### Load the Local Vector Database (Chroma)

In [17]:
def load_vectordb():
    # load the documents from the directory
    loader = DirectoryLoader(file_dir, use_multithreading=True, loader_cls=PyPDFLoader)
    documents = loader.load()
    # create the embedding model; the semantic splitter also uses it to find chunk boundaries
    embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
    # split the documents into chunks
    text_splitter = SemanticChunker(embeddings=embeddings)
    docs = text_splitter.split_documents(documents)

    # store the documents and embeddings in the database
    db = Chroma.from_documents(docs, embeddings)
    return db

### Query the local vector database (Chroma)

In [18]:
def query_vectordb(query, db):
    # query the database, keeping only chunks whose relevance score is at least 0.5
    retriever = db.as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.5})
    docs = retriever.invoke(query)
    return docs
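
Conceptually, the `similarity_score_threshold` search keeps only the chunks whose relevance score reaches the threshold. The sketch below mimics that filtering step in plain Python. The helper `filter_by_score` and the sample scores are hypothetical and invented for illustration; this is not LangChain's internal code.

```python
def filter_by_score(scored_chunks, score_threshold=0.5):
    """Keep chunks whose relevance score meets the threshold, best match first."""
    kept = [(text, score) for text, score in scored_chunks if score >= score_threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical (chunk, relevance score) pairs, invented for illustration.
scored_chunks = [("chunk A", 0.82), ("chunk B", 0.41), ("chunk C", 0.57)]
print(filter_by_score(scored_chunks))

# A stricter threshold can filter out every chunk and leave an empty list.
print(filter_by_score(scored_chunks, score_threshold=0.9))
```

The second call shows why the result of `query_vectordb` may be an empty list: if no chunk scores high enough, nothing is returned. To see the actual scores for a query, `db.similarity_search_with_relevance_scores(query)` should return the documents together with their scores.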

### Create the Prompt for the User Question

In [19]:
def create_prompt(query, context):
    prompt = f"""
    Use the following pieces of context to answer the question at the end.
    If you do not know the answer, please think rationally and answer from your own knowledge base.

    {context}

    Question: {query}
    """
    return prompt
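
`create_prompt` accepts any string as `context`, so more than one retrieved chunk can be passed in. As a minimal sketch, the hypothetical helper `build_context` below joins the text of the first few documents. It is not part of the original lab, and `SimpleNamespace` only stands in for LangChain `Document` objects, which expose a `page_content` attribute.

```python
from types import SimpleNamespace

def build_context(docs, max_chunks=3):
    """Join the text of the first max_chunks documents, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs[:max_chunks])

# Stand-ins for LangChain Document objects (assumption: only page_content is needed).
sample_docs = [
    SimpleNamespace(page_content="First chunk."),
    SimpleNamespace(page_content="Second chunk."),
]
print(build_context(sample_docs))
```

Passing several chunks gives the model more material to answer from, at the cost of a longer prompt and more tokens per request.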

### OpenAI Chat Completions API


In [20]:
def chat_openai(user_question, client, model):
    """
    This function sends a chat message to the OpenAI API and returns the content of the response.
    It takes three parameters: the chat prompt, the OpenAI client, and the model to use for the chat.
    """
    prompt = """
    {user_question}

    Analyze the user's question and provide an answer based upon the context of the question.
    """.format(
        user_question=user_question
    )

    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "You are a helpful network engineering expert:",
            },
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content

## Run the application

In [None]:
db = load_vectordb()
query = 'How many 800G interfaces does the PTX10002-36QDD have?'
docs = query_vectordb(query, db)
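# use the best-matching chunk as context; stop with a clear error if nothing passed the threshold
if not docs:
    raise ValueError("No documents met the similarity threshold for this query.")
query_result = docs[0].page_content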
prompt = create_prompt(query, query_result)
client = OpenAI(api_key=openai_api_key)
response = chat_openai(prompt, client, model)

print(response)