<a href="https://colab.research.google.com/github/mfierbaugh/asp-ai-lab/blob/main/openai_rag.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This lab combines the previous two labs. It shows how to retrieve information from a local document store and use it to augment the model's input. The prompt then explicitly instructs the OpenAI Chat Completions API to answer the question based on the retrieved data. This is an example of retrieval-augmented generation (RAG).
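
As a rough illustration of the retrieve, augment, generate flow, the sketch below replaces the vector database with a toy keyword-overlap retriever and stops just before the model call. The names `toy_retrieve` and `toy_augment` and the sample passages are hypothetical placeholders, not part of this lab's code or data.

```python
import re

# Hypothetical in-memory "document store"; these passages are invented placeholders.
PASSAGES = [
    "The example switch has four uplink ports.",
    "Office hours are Monday through Friday.",
    "The example switch supports two power supplies.",
]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def toy_retrieve(question, passages, top_k=1):
    """Retrieve: rank passages by how many words they share with the question."""
    q_words = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q_words & tokenize(p)), reverse=True)
    return ranked[:top_k]

def toy_augment(question, retrieved):
    """Augment: place the retrieved passages in front of the question."""
    context = "\n".join(retrieved)
    return f"Context:\n{context}\n\nQuestion: {question}"

question = "How many uplink ports does the example switch have?"
retrieved = toy_retrieve(question, PASSAGES)
augmented_prompt = toy_augment(question, retrieved)
print(augmented_prompt)  # Generate: this string is what would be sent to the model
```

The rest of the lab follows the same shape, with semantic embeddings in place of keyword overlap and a real chat model in place of the final `print`.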

In [1]:
!git clone https://github.com/mfierbaugh/asp-ai-lab.git

Cloning into 'asp-ai-lab'...
remote: Enumerating objects: 49, done.
remote: Counting objects: 100% (49/49), done.
remote: Compressing objects: 100% (45/45), done.
remote: Total 49 (delta 21), reused 16 (delta 3), pack-reused 0
Receiving objects: 100% (49/49), 13.55 MiB | 13.18 MiB/s, done.
Resolving deltas: 100% (21/21), done.


In [6]:
!pip install langchain langchain-community langchain-chroma langchain-core langchain-experimental langchain-openai pypdf sentence_transformers openai

Collecting langchain-openai
  Downloading langchain_openai-0.1.20-py3-none-any.whl.metadata (2.6 kB)
Collecting tiktoken<1,>=0.7 (from langchain-openai)
  Downloading tiktoken-0.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.6 kB)
Downloading langchain_openai-0.1.20-py3-none-any.whl (48 kB)
Downloading tiktoken-0.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
Installing collected packages: tiktoken, langchain-openai
Successfully installed langchain-openai-0.1.20 tiktoken-0.7.0


In [7]:
from langchain_community.document_loaders import DirectoryLoader
from langchain_community.document_loaders import PyPDFLoader
from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai.embeddings import OpenAIEmbeddings
from langchain.chains import RetrievalQA
from langchain_chroma import Chroma
from openai import OpenAI
from google.colab import userdata

In [16]:
file_dir = 'asp-ai-lab/files'
model = "gpt-3.5-turbo"
openai_api_key = userdata.get('OPENAI_API_KEY')

In [17]:
def load_vectordb():
    """Load the PDFs in file_dir, split them into chunks, and index them in Chroma."""
    # load every PDF in the directory, one Document per page
    loader = DirectoryLoader(file_dir, use_multithreading=True, loader_cls=PyPDFLoader)
    documents = loader.load()

    # split the documents into semantically coherent chunks;
    # the same embedding model is used for splitting and for indexing
    embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
    text_splitter = SemanticChunker(embeddings=embeddings)
    docs = text_splitter.split_documents(documents)

    # store the chunks and their embeddings in the vector database
    db = Chroma.from_documents(docs, embeddings)
    return db

In [18]:
def query_vectordb(query, db):
    # query the database
    retriever = db.as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.5})
    docs = retriever.invoke(query)
    return docs
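
To make the `score_threshold` idea concrete, here is a minimal sketch of threshold filtering using plain cosine similarity on toy two-dimensional vectors. It is a conceptual illustration only: the relevance score LangChain actually computes depends on the vector store and its distance metric, so cosine similarity is an assumption here, and `cosine_similarity`, `filter_by_threshold`, and the toy vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def filter_by_threshold(query_vec, chunk_vecs, threshold=0.5):
    """Keep chunks scoring at or above the threshold, best match first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in chunk_vecs.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Toy embeddings (assumed values): chunk_a points the same way as the query,
# chunk_b is 45 degrees away, chunk_c is orthogonal.
toy_query = [1.0, 0.0]
toy_chunks = {"chunk_a": [1.0, 0.0], "chunk_b": [1.0, 1.0], "chunk_c": [0.0, 1.0]}

hits = filter_by_threshold(toy_query, toy_chunks)
print([name for name, _ in hits])
```

With a threshold of 0.5, the orthogonal chunk is dropped. A query that matches nothing well yields an empty list, which is worth keeping in mind when indexing into the retriever's results.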

In [19]:
def create_prompt(query, context):
    prompt = f"""
    Use the following pieces of context to answer the question at the end.
    If you do not know the answer, please think rationally and answer from your own knowledge base.

    {context}

    Question: {query}
    """
    return prompt
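
The `context` argument can hold more than one retrieved chunk. The sketch below shows one possible way to join several chunks into a single context string and to cope with an empty retrieval result. `build_context` and `FakeDoc` are hypothetical helpers introduced for illustration; `FakeDoc` only mimics the `page_content` attribute of a retrieved document.

```python
from dataclasses import dataclass

@dataclass
class FakeDoc:
    """Hypothetical stand-in for a retrieved document; only page_content is used."""
    page_content: str

def build_context(docs, max_chunks=3, separator="\n\n---\n\n"):
    """Join the text of up to max_chunks documents; return '' if there are none."""
    chunks = [d.page_content.strip() for d in docs[:max_chunks]]
    # skip chunks that are empty after stripping whitespace
    return separator.join(c for c in chunks if c)

sample_docs = [
    FakeDoc("First placeholder chunk. "),
    FakeDoc(""),
    FakeDoc("Second placeholder chunk."),
]
context = build_context(sample_docs)
empty_context = build_context([])
print(context)
```

The resulting string could be passed as the `context` argument of `create_prompt`. Capping the number of chunks keeps the prompt within the model's context window.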

In [20]:
def chat_openai(user_question, client, model):
    """
    This function sends a chat message to the OpenAI API and returns the content of the response.
    It takes three parameters: the chat prompt, the OpenAI client, and the model to use for the chat.
    """
    prompt = """
    {user_question}

    Analyze the user's question and provide an answer based upon the context of the question.
    """.format(
        user_question=user_question
    )

    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "You are a helpful network engineering expert:",
            },
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content
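
In this lab the retrieved context is wrapped twice: once by `create_prompt` and again by the template inside `chat_openai`. As a possible alternative, the sketch below builds the message list in one step and puts the grounding instruction in the system message. `build_messages` is a hypothetical helper, and its wording is stricter than the lab's prompt (it tells the model not to fall back on its own knowledge), which is an assumption rather than the lab's behavior.

```python
def build_messages(context, question):
    """Return a Chat Completions style message list with context and question kept separate."""
    system = (
        "You are a helpful network engineering expert. "
        "Answer using only the supplied context. "
        "If the context does not contain the answer, say so."
    )
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

# Placeholder strings stand in for the retrieved text and the user's question.
messages = build_messages("Placeholder context text.", "Placeholder question?")
print(messages[1]["content"])
```

The returned list has the same shape as the `messages` argument passed to `client.chat.completions.create` above.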

In [21]:
db = load_vectordb()
query = 'How many 800G interfaces does the PTX10002-36QDD have?'
docs = query_vectordb(query, db)
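# use the top-ranked chunk as context; the score threshold can filter out
# every chunk, so fall back to an empty context instead of raising IndexError
query_result = docs[0].page_content if docs else ""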
prompt = create_prompt(query, query_result)
client = OpenAI(api_key=openai_api_key)
response = chat_openai(prompt, client, model)

print(response)

Based on the information provided, the PTX10002-36QDD supports 72x 400GigE ports without the need for breakout cables or patch panels. Since each 400G port is made up of 2x 400G paired connectors, each 800G interface consists of 2x 400G ports. Therefore, the PTX10002-36QDD has a total of 36 800G interfaces (72 divided by 2).
