# Pre-requisites
- WSL
- Miniconda3 

# Setup environment
- Create the conda environment: `conda create -n langchain python=3.11`
- In VS Code, select the newly created `langchain` environment as the Python interpreter (kernel) for this notebook


Install the LangChain and OpenAI packages:

In [20]:
! pip install langchain langchain-openai langchain-community langchain-core openai python-dotenv chromadb pypdf duckduckgo-search


# Init variables

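Set the Azure OpenAI settings in the `.env` file (the API key is provided by the training team): `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_API_ENDPOINT`, `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_EMBEDDING_DEPLOYMENT`, and `AZURE_OPENAI_CHAT_DEPLOYMENT`.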

In [21]:
import os

from dotenv import load_dotenv

# Load settings from .env; override=True lets .env values replace variables already set in the shell.
load_dotenv(override=True)
AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY")
AZURE_OPENAI_API_ENDPOINT = os.getenv("AZURE_OPENAI_API_ENDPOINT")
AZURE_OPENAI_API_VERSION = os.getenv("AZURE_OPENAI_API_VERSION")
AZURE_OPENAI_EMBEDDING_DEPLOYMENT = os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT")
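
Because `os.getenv` returns `None` for a variable that is not set, a typo in `.env` tends to surface only later, as an authentication or deployment error. The following is a minimal sketch of an early check, assuming the five variable names used in this notebook. `find_missing_vars` is a hypothetical helper written for illustration, not part of any library.

```python
import os

# Names read by the notebook cells; AZURE_OPENAI_CHAT_DEPLOYMENT is used later for the chat model.
REQUIRED_VARS = [
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_ENDPOINT",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT",
    "AZURE_OPENAI_CHAT_DEPLOYMENT",
]


def find_missing_vars(names, env=None):
    """Return the names that are unset or empty in the mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = find_missing_vars(REQUIRED_VARS)
if missing:
    print("Missing settings in .env:", ", ".join(missing))
else:
    print("All required settings are present.")
```

Running this right after `load_dotenv` makes a misconfigured `.env` visible before any request is sent.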

# Overview
The BonBon FAQ.pdf file contains frequently asked questions and answers for a customer support scenario. The topics cover troubleshooting of IT issues such as networking, software, and hardware. Your task is to build a chatbot with LangChain that can answer user questions based on this file.

## Assignment 1: Document Indexing (mandatory)

- Index the content of BonBon FAQ.pdf in a local Chroma vector DB, where the chatbot can look up the information it needs to answer questions.
- Use an embedding model such as Azure OpenAI text-embedding-3-small to create the vectors. Any other open-source embedding model is also acceptable if it works.

In [22]:
from langchain_community.document_loaders import PyPDFLoader
pdf_path = "data/BonBon FAQ.pdf"
loader = PyPDFLoader(pdf_path)
documents = loader.load()

In [23]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50
)
docs = text_splitter.split_documents(documents)
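
To make the `chunk_size` and `chunk_overlap` settings concrete, here is a simplified sketch of a fixed-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that class prefers to break on separators such as paragraphs and sentences); it only illustrates how consecutive chunks share their boundary characters. `split_with_overlap` is a hypothetical helper.

```python
def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Cut text into windows of chunk_size characters; neighbours share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reached the end of the text
        start += step
    return chunks


sample = "".join(str(i % 10) for i in range(1200))
chunks = split_with_overlap(sample)
print([len(c) for c in chunks])            # [500, 500, 300]
print(chunks[0][-50:] == chunks[1][:50])   # True
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.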


In [24]:
from langchain_openai import AzureOpenAIEmbeddings
embeddings = AzureOpenAIEmbeddings(
    openai_api_key=AZURE_OPENAI_API_KEY,
    azure_endpoint=AZURE_OPENAI_API_ENDPOINT,
    openai_api_version=AZURE_OPENAI_API_VERSION,
    deployment=AZURE_OPENAI_EMBEDDING_DEPLOYMENT,
)

In [25]:
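from langchain_community.vectorstores import Chroma

chroma_dir = "./chroma_db"

# Embed the chunks and store them in a local, persistent Chroma collection.
vectorstore = Chroma.from_documents(docs, embeddings, persist_directory=chroma_dir)
# With chromadb 0.4 and later, data is written to persist_directory automatically,
# so this call only matters for older versions.
vectorstore.persist()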

print(f"✅ Indexing complete. Chroma DB saved at: {chroma_dir}")

✅ Indexing complete. Chroma DB saved at: ./chroma_db


## Assignment 2: Building Chatbot (mandatory)
- Build a chatbot for a customer support scenario using the Conversational ReAct agent supported in LangChain.
- The chatbot must be able to answer the FAQs in the sample BonBon FAQ.pdf file.
- The chatbot should use the Azure OpenAI GPT-4o LLM as its reasoning engine.
- The chatbot should be context-aware, meaning that it can take earlier turns of the conversation into account.
- The agent is equipped with the following tools:
  - Internet Search: lets the chatbot look up additional information through DuckDuckGo internet search
  - Knowledge Base Search: lets the chatbot look up information in the private knowledge base
- If the user asks about topics covered in the BonBon FAQ.pdf file, such as internet connection, printer, or malware issues, the chatbot must use the private knowledge base. Otherwise, it should search the internet to answer the question.
- The chatbot's answer should mention the source file and the page the answer comes from, for example "BonBon FAQ.pdf (page 2)".

In [26]:
from langchain_openai import AzureChatOpenAI

load_dotenv()
llm = AzureChatOpenAI(
    openai_api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    azure_endpoint=os.getenv("AZURE_OPENAI_API_ENDPOINT"),
    openai_api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
    deployment_name=os.getenv("AZURE_OPENAI_CHAT_DEPLOYMENT"),
)

In [27]:
vectorstore = Chroma(
    persist_directory="./chroma_db",
    embedding_function=embeddings,
)
# The number of results (k) must be passed through search_kwargs.
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})
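
Conceptually, the retriever above embeds the query and returns the stored chunks whose vectors are closest to it. The sketch below shows that idea with hand-written three-dimensional toy vectors and cosine similarity. Both are assumptions chosen for illustration: real embeddings have far more dimensions, and the distance metric Chroma actually uses depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=3):
    """indexed is a list of (text, vector) pairs; return the k most similar texts, best first."""
    scored = sorted(indexed, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in scored[:k]]


# Toy index: the vectors are made up, not produced by an embedding model.
toy_index = [
    ("reset the router", [0.9, 0.1, 0.0]),
    ("replace printer toner", [0.1, 0.9, 0.0]),
    ("scan for malware", [0.0, 0.2, 0.9]),
    ("check the Ethernet cable", [0.8, 0.2, 0.1]),
]
print(top_k([1.0, 0.0, 0.0], toy_index, k=2))
# ['reset the router', 'check the Ethernet cable']
```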

In [28]:
from langchain.chains import RetrievalQA

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True,
)

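def knowledge_base_search(query: str) -> str:
    """Answer a question from the indexed FAQ and append the source page."""
    result = qa_chain.invoke({"query": query})
    answer = result["result"]

    sources = result.get("source_documents", [])
    if sources:
        page = sources[0].metadata.get("page")
        # PyPDFLoader page indices are zero-based, so add 1 for the printed page number.
        page_label = page + 1 if isinstance(page, int) else "unknown"
        answer += f"\n(Source: BonBon FAQ.pdf, page {page_label})"

    return answer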
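
The function above cites only the first retrieved chunk. A possible extension is sketched below, under the assumption that each chunk's metadata carries a `source` path and a zero-based integer `page` (as `PyPDFLoader` is understood to produce). `format_sources` is a hypothetical helper that works on plain dictionaries, so it can be tried without the vector store.

```python
def format_sources(metadatas, default_file="BonBon FAQ.pdf"):
    """Build a citation string from metadata dicts; duplicates are dropped, order is kept."""
    seen = []
    for meta in metadatas:
        # Keep only the file name: "data/BonBon FAQ.pdf" -> "BonBon FAQ.pdf".
        file_name = str(meta.get("source", default_file)).replace("\\", "/").split("/")[-1]
        page = meta.get("page")
        # Assumption: page is a zero-based index, so add 1 for the human-readable number.
        label = f"{file_name} (page {page + 1})" if isinstance(page, int) else file_name
        if label not in seen:
            seen.append(label)
    return "Sources: " + "; ".join(seen) if seen else ""


print(format_sources([
    {"source": "data/BonBon FAQ.pdf", "page": 1},
    {"source": "data/BonBon FAQ.pdf", "page": 1},
    {"source": "data/BonBon FAQ.pdf", "page": 4},
]))
# Sources: BonBon FAQ.pdf (page 2); BonBon FAQ.pdf (page 5)
```

Inside the tool, this could be called as `format_sources(doc.metadata for doc in sources)`.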

In [29]:
from langchain_community.tools import DuckDuckGoSearchResults
internet_tool = DuckDuckGoSearchResults()

In [30]:
from langchain.agents import Tool

tools = [
    Tool(
        name="Knowledge Base Search",
        func=knowledge_base_search,
        description="Use this for BonBon-related IT support topics (network, printer, malware, etc.)"
    ),
    Tool(
        name="Internet Search",
        func=internet_tool.run,
        description="Use this to find info unrelated to BonBon internal topics"
    ),
]


In [31]:
from langchain.agents import initialize_agent
from langchain.agents.agent_types import AgentType

agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
    verbose=True,
)
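
With `verbose=True`, the agent prints its intermediate steps as `Thought:`, `Action:`, and `Action Input:` lines. The sketch below illustrates, in plain Python, how text in that format can be parsed and dispatched to a tool by name. It is not LangChain's own parser: `parse_agent_step` and `run_step` are hypothetical helpers, the stub tools return canned strings, and the `AI:` prefix for final answers is an assumption about the prompt format.

```python
import re

# Matches an "Action:" line followed by an "Action Input:" line.
ACTION_RE = re.compile(r"Action:\s*(.*?)\s*\n\s*Action Input:\s*(.*)", re.DOTALL)


def parse_agent_step(llm_text, final_prefix="AI:"):
    """Classify model output as a tool call or a final answer."""
    match = ACTION_RE.search(llm_text)
    if match:
        tool_name = match.group(1).strip()
        tool_input = match.group(2).strip().strip('"')
        return {"type": "tool", "tool": tool_name, "input": tool_input}
    if final_prefix in llm_text:
        return {"type": "final", "output": llm_text.split(final_prefix, 1)[1].strip()}
    raise ValueError(f"Could not parse model output: {llm_text!r}")


def run_step(llm_text, tools):
    """Run the named tool on its input, or return the final answer unchanged."""
    step = parse_agent_step(llm_text)
    if step["type"] == "final":
        return step["output"]
    return tools[step["tool"]](step["input"])


# Stub tools standing in for the real knowledge base and internet search.
demo_tools = {
    "Knowledge Base Search": lambda q: f"[KB result for: {q}]",
    "Internet Search": lambda q: f"[web result for: {q}]",
}
text = (
    "Thought: Do I need to use a tool? Yes\n"
    "Action: Knowledge Base Search\n"
    'Action Input: "internet connection troubleshooting"\n'
)
print(run_step(text, demo_tools))
# [KB result for: internet connection troubleshooting]
```

This is why the tool `name` and `description` matter: the model chooses a tool by writing its name, guided only by the descriptions it was given.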


In [32]:
test_queries = [
    "My internet connection is not working. Can you help me troubleshoot it?",
    "How do I connect to Any Corp’s Corporate Wi-Fi network?",
]

chat_history = []

for query in test_queries:
    print(f"\n🧑 You: {query}")
    inputs = {"input": query, "chat_history": chat_history}
    response = agent.invoke(inputs)
    print("🤖", response["output"])
    chat_history.append((query, response["output"]))



🧑 You: My internet connection is not working. Can you help me troubleshoot it?


> Entering new AgentExecutor chain...

Thought: Do I need to use a tool? Yes
Action: Knowledge Base Search
Action Input: "internet connection troubleshooting"

Observation: Please follow these steps to troubleshoot your internet connection:

1) **Check Physical Connections:**
   - Ensure all cables (Ethernet, modem, router, etc.) are securely connected.
   - Power cycle your modem and router: unplug them from the power source, wait 30 seconds, then plug them back in.

2) **Verify Wi-Fi Settings:**
   - For wireless connections, ensure Wi-Fi is turned on on your device.
   - Check if your device is connected to the correct network.

3) **Restart Your Device:**
   - Restart the device you are trying to connect to the internet.

4) **Test Another Device:**
   - Try connecting another device to the internet to determine if the issue is device-specific.

5) **Check for Service Outages:**
   - Confirm with your internet provider if there are service outages in your area.

6) **Run Network Troub

## Assignment 3: Build a new assistant based on BonBon source code (optional)
Objectives:
- Run the BonBon code and index the sample BonBon FAQ.pdf file in Azure Cognitive Search.
- Explore the code and implement a new assistant that behaves the same way as the chatbot above.
- Explore other features, such as RBAC and the admin portal.

Contact the training team if you need the BonBon source code.