## Naive Q&A agent using LangChain


In this exercise, we will implement a Q&A agent from scratch using the LangChain framework.
The agent's knowledge comes from the t6 documentation, and all of it is passed to the agent via the prompt.

After completing this hands-on exercise, we will have some understanding of how to build a simple Q&A agent using LangChain, and of the advantages and limitations of this approach.

![rag.png](./public/rag.png)

In [None]:
import os
from dotenv import load_dotenv

load_dotenv(".env")

# these variables are required to initialize Langchain AzureChatOpenAI instance
required_env_vars = [
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_MODEL",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
]

for var in required_env_vars:
    if os.environ.get(var) is None:
        raise Exception(f"Missing `{var}` environment variable")


In [None]:
from langchain.chat_models import AzureChatOpenAI

api_key = os.environ.get("AZURE_OPENAI_API_KEY", "")
api_version = os.environ.get("AZURE_OPENAI_API_VERSION", "2023-03-15-preview")
azure_endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT", "https://public-api.grabgpt.managed.catwalk-k8s.stg-myteksi.com")
deployment_name = os.environ.get("AZURE_OPENAI_DEPLOYMENT_NAME", "gpt-4-turbo")
model = os.environ.get("AZURE_OPENAI_MODEL", "gpt-4-turbo")

llm = AzureChatOpenAI(
    api_key=api_key,
    api_version=api_version,
    azure_endpoint=azure_endpoint,
    deployment_name=deployment_name,
    temperature=0,
)

In [None]:
# !git clone git@gitlab.myteksi.net:sentry/t6/t6.git ./tmp/t6 && mkdir -p knowledge/t6 && rsync -avm --include='*.rst' --remove-source-files -f 'hide,! */' "tmp/t6/doc" "knowledge/t6" && rm -rf tmp

In [None]:
from typing import List
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.docstore.document import Document

BASE_PATH = "./knowledge/t6/doc/grabdocs"
if not os.path.exists(BASE_PATH):
    raise ValueError(f"Directory {BASE_PATH} does not exist")

loader = DirectoryLoader(path=BASE_PATH, loader_cls=TextLoader, glob="**/*.rst", exclude=["index.rst"])
documents: List[Document] = loader.load()

In [None]:
# print("\n".join([doc.metadata['source'] for doc in documents]))

In [None]:
print(len(documents))

In [None]:
# documents and len(documents) and print(documents[0].page_content, end="\n<========== END \n")
# documents and len(documents) and print(documents[1].page_content, end="\n<========== END \n")

In [None]:
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

from langchain_core.runnables import RunnableSequence

ADVISOR_PROMPT = """
You are a helpful advisor, collaborating with other agents. \
Don't assume anything you don't know. Use the context below to answer the user question. \
Think carefully about the question and provide the best answer you can. \
If you are unable to fully answer, that's OK, another agent with different tools will help where you left off. \
If you or any of the other agents have the final answer or deliverable, \
prefix your response with FINAL ANSWER so the team knows to stop.
The context is: {context}.\n
"""
# Concatenate every loaded document into one large context string
context = "\n".join([doc.page_content for doc in documents])

chain_prompt = ChatPromptTemplate.from_messages(
    messages=[
        ("system", ADVISOR_PROMPT),
        ("user", "{input}"),
    ]
)

# This chain receives a dictionary with the keys "input" and "context", and returns a dictionary with the key "answer"
naive_chain = RunnableSequence(
    chain_prompt,
    llm.with_config(run_name="llm"),
    {"answer": StrOutputParser().with_config(run_name="parser")},
).with_config(run_name="naive_chain")

In [None]:
# Length of the rendered prompt in characters (not tokens)
len(chain_prompt.invoke({"input": "", "context": context}).to_string())

In [None]:
from openai import RateLimitError

user_query = (
    "I have error `Fail to push image` while running cop_image:envoy-base step in "
    "pre stage while setting up t6 fabric pipeline, how to resolve it?"
)

try:
    output = naive_chain.invoke({"input": user_query, "context": context})
    print(output["answer"])
except RateLimitError as e:
    print("Rate limit exceeded")
    print(str(e))


The answer does not seem to be correct. There are two main reasons for this:
- The model is a black box to us, and the context we pass is very large, so we cannot be sure that the model picked up the part of the context that is relevant to the question (context-window limit).
- Hallucination: the model can generate an answer that looks genuine but is actually wrong.
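
To make the context-window concern more concrete, the size of the prompt can be compared against a token budget before calling the model. The sketch below is only a rough estimate: the ratio of about 4 characters per token is a common rule of thumb for English text, not an exact tokenizer, and `estimate_tokens`, `fits_in_context`, and the 128,000-token limit are hypothetical names and values that should be replaced with the real limit of the deployed model.

```python
# Rough pre-flight check: does the prompt plausibly fit in the context window?
# Assumptions (not from the notebook): ~4 characters per token, and a
# hypothetical 128,000-token limit. Check the real limit of your deployment.
CHARS_PER_TOKEN = 4
CONTEXT_WINDOW_TOKENS = 128_000


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate based on character count (rounded up)."""
    return -(-len(text) // chars_per_token)


def fits_in_context(
    text: str,
    limit: int = CONTEXT_WINDOW_TOKENS,
    reserve_for_answer: int = 1_000,
) -> bool:
    """Check whether the estimated prompt size leaves room for the answer."""
    return estimate_tokens(text) + reserve_for_answer <= limit


sample_prompt = "word " * 1000  # stand-in for the rendered prompt string
print(estimate_tokens(sample_prompt), fits_in_context(sample_prompt))
```

In this notebook, the string returned by `chain_prompt.invoke({...}).to_string()` could be passed to `fits_in_context` in place of `sample_prompt`.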

Now let's try this naive LLM chain with a longer context, as in a real-world scenario, and see how it performs.

For this example, we will double the length of the context passed to `chain_prompt`.

In [None]:
len(chain_prompt.invoke({"input": "", "context": context + context}).to_string())

In [None]:
from openai import RateLimitError

user_query = (
    "I have error `Fail to push image` while running cop_image:envoy-base step in "
    "pre stage while setting up t6 fabric pipeline, how to resolve it?"
)

try:
    output = naive_chain.invoke({"input": user_query, "context": context + context})
    print(output["answer"])
except RateLimitError as e:
    print("Rate limit exceeded")
    print(str(e))

## How to work around this issue?

We will try to narrow the context down to the information most relevant to the question.

In this case, we add a `reference` field that specifies the path of the file that may contain the information relevant to the question.

It is a naive RAG model 🙂
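
Stripped of the framework, the idea is simply to read one file and attach its text to the input dictionary under `context`. The plain-Python sketch below illustrates that step. `add_context_from_reference` is a hypothetical helper written for illustration, and the temporary `.rst` file stands in for a real knowledge-base document. The chain in this notebook achieves the same effect with `TextLoader` and `RunnablePassthrough.assign`.

```python
from pathlib import Path
import tempfile


def add_context_from_reference(inputs: dict) -> dict:
    """Return a copy of `inputs` with a `context` key read from the `reference` file."""
    text = Path(inputs["reference"]).read_text(encoding="utf-8")
    # Keep the original keys and add the file content as `context`.
    return {**inputs, "context": text}


# Demo with a temporary file standing in for a real knowledge-base document.
with tempfile.TemporaryDirectory() as tmp_dir:
    doc_path = Path(tmp_dir) / "example.rst"
    doc_path.write_text("Example\n=======\nSome documentation text.\n", encoding="utf-8")
    enriched = add_context_from_reference(
        {"input": "What is this?", "reference": str(doc_path)}
    )

print(sorted(enriched))
```

The original `input` and `reference` keys are preserved, so the downstream prompt can still read `{input}` alongside the new `{context}`.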

In [None]:
from langchain_core.runnables import RunnablePassthrough

# The input now has one more attribute, `reference`. This chain reads the content of that file and passes it to the next chain as `context`
naive_retriever_chain = RunnableSequence(
    RunnablePassthrough.assign(
        context=RunnableSequence(
            (lambda x: TextLoader(file_path=x["reference"]).load()),
            (lambda docs: "\n".join([doc.page_content for doc in docs])),
        )
    ),
    naive_chain,
)

In [None]:
user_query = (
    "I have error `Fail to push image` while running cop_image:envoy-base step in "
    "pre stage while setting up t6 fabric pipeline, how to resolve it?"
)

try:
    output = naive_retriever_chain.invoke(
        {
            "input": user_query,
            "reference": "./knowledge/t6/doc/grabdocs/automation/gitlab_ci_automation.rst",
        }
    )

    print(output["answer"])
except RateLimitError as e:
    print("Rate limit exceeded")
    print(str(e))

The answer is quite good now (at least, it answers the question correctly), but we spent extra effort narrowing down the context by hand (human retrieval 🙂), and it is not always possible to narrow the context down to the most relevant information.

This approach is not scalable and not efficient in real-world scenarios, especially when the knowledge base is very large.
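
As a rough illustration of how the manual step could be automated, the sketch below ranks documents by how many distinct words they share with the question. This is a deliberately simple keyword-overlap heuristic chosen for illustration, not necessarily the retrieval approach used later. `tokenize`, `rank_by_overlap`, and the two toy documents are made up for this example.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def rank_by_overlap(query: str, docs: dict[str, str]) -> list[tuple[str, int]]:
    """Rank document names by how many distinct query words they contain."""
    query_words = tokenize(query)
    scores = {name: len(query_words & tokenize(body)) for name, body in docs.items()}
    # Highest score first; ties are broken alphabetically by name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up miniature knowledge base, for illustration only.
toy_docs = {
    "ci.rst": "The pipeline may fail to push image when registry credentials are missing.",
    "intro.rst": "This page gives an overview of the project.",
}
ranking = rank_by_overlap("Fail to push image in the pipeline", toy_docs)
print(ranking[0][0])
```

The top-ranked name could then be used as the `reference` value, although plain word overlap ignores synonyms and word importance, so it is easy to fool.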


In summary, the naive Q&A agent has some good points:
- It's easy to implement
- Sufficient for simple Q&A tasks
- With a small knowledge base, it can perform well

But it has some limitations:
- Not scalable
- Not efficient in real-world scenarios
- Can't handle a large knowledge base
- Token limit

In the next notebook, we will try another `retrieval` approach to get the information most relevant to the question.