# Pre-requisites
- WSL
- Miniconda3 

# Setup environment
- Create a conda environment: `conda create -n langchain python=3.11`
- Select the newly created "langchain" environment as the Python interpreter (notebook kernel) in VS Code


Install the LangChain and OpenAI packages

In [2]:
! pip install python-dotenv



In [None]:

! pip install langchain langchain-community chromadb openai tiktoken pypdf langchain_openai langchain-chroma duckduckgo-search

# Init variables

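Set `OPENAI_API_KEY` in the `.env` file to the key provided by the training team. The cells in this notebook also read `AZURE_OPENAI_API_VERSION`, `OPENAI_EMBEDDING_MODEL`, `AZURE_OPENAI_EMBEDDING_DEPLOYMENT`, and `AZURE_OPENAI_CHAT_DEPLOYMENT` from the same file.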

In [4]:
import os
from dotenv import load_dotenv

# Load environment variables from the .env file
load_dotenv()

AZURE_OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
AZURE_OPENAI_API_VERSION = os.getenv("AZURE_OPENAI_API_VERSION")
AZURE_OPENAI_EMBEDDING_MODEL = os.getenv("OPENAI_EMBEDDING_MODEL")
AZURE_OPENAI_DEPLOYMENT_NAME = os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT")
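
Before running the later cells, it can help to confirm that every variable they read is actually set. The following is a minimal sketch, assuming the variable names used in this notebook; the helper `missing_env_vars` is a hypothetical name and is not part of python-dotenv.

```python
import os

# Names passed to os.getenv() and used by the cells in this notebook
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT",
    "AZURE_OPENAI_CHAT_DEPLOYMENT",
]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing in .env:", ", ".join(missing))
else:
    print("All required variables are set.")
```

Running this right after `load_dotenv()` surfaces a missing key early, rather than as an authentication error in a later cell.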

# Overview
The BonBon FAQ.pdf file contains frequently asked questions and answers for a customer support scenario. The topics cover troubleshooting of IT issues such as networking, software, and hardware. Your task is to build a chatbot with LangChain that can answer user questions based on this file.

## Assignment 1: Document Indexing (mandatory)

- The content of BonBon FAQ.pdf should be indexed into a local Chroma vector DB, where the chatbot can look up the information it needs to answer questions.
- Use an embedding model such as Azure OpenAI text-embedding-ada-002 to create the vectors; any open-source embedding model is also acceptable if it works.

In [13]:
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import AzureOpenAIEmbeddings
from langchain_chroma import Chroma


embedding = AzureOpenAIEmbeddings(
    azure_deployment=os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT"),
    api_key=os.getenv("OPENAI_API_KEY"),
    azure_endpoint="https://langchain-training-openai.openai.azure.com",
    api_version=os.getenv("AZURE_OPENAI_API_VERSION")
)

loader = PyPDFLoader("data/BonBon FAQ.pdf")
pages = loader.load()

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(pages)

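# Store a human-readable citation; PyPDFLoader page numbers are zero-based
for chunk in chunks:
    page = chunk.metadata.get("page")
    page_label = page + 1 if isinstance(page, int) else "unknown"
    chunk.metadata["source"] = f"BonBon FAQ.pdf (page {page_label})"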

vectordb = Chroma.from_documents(chunks, embedding, persist_directory="./chroma_db")
print("Built Chroma vector DB and loaded pdf file successfully!!!")


Built Chroma vector DB and loaded pdf file successfully!!!
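
To see what `chunk_size` and `chunk_overlap` mean, here is a simplified sketch of fixed-size chunking with overlap. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and line breaks); the function `sliding_chunks` is a hypothetical illustration only.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


demo = sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2)
print(demo)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap means that a sentence cut at a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.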


## Assignment 2: Building Chatbot (mandatory)
- Build a chatbot for a customer support scenario using the Conversational ReAct agent supported in LangChain.
- The chatbot should answer the FAQs in the sample BonBon FAQ.pdf file.
- The chatbot should use the Azure OpenAI GPT-3.5 LLM as its reasoning engine.
- The chatbot should be context-aware, meaning it can hold a multi-turn conversation with users.
- The agent is equipped with the following tools:
  - Internet Search: lets the chatbot find additional information using DuckDuckGo internet search
  - Knowledge Base Search: lets the chatbot look up information in the private knowledge base
- If the user asks about topics covered in the BonBon FAQ.pdf file, such as internet connection, printer, or malware issues, the chatbot must use the private knowledge base; otherwise it should search the internet to answer the question.
- The chatbot's answer should cite the source file and the page the answer comes from, for example "BonBon FAQ.pdf (page 2)".
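
The citation requirement in the last bullet can be sketched as a small helper. This assumes page numbers are stored zero-based in a `page` metadata field, as PyPDFLoader does; `format_citation` is a hypothetical name used only for illustration.

```python
def format_citation(metadata: dict, filename: str = "BonBon FAQ.pdf") -> str:
    """Build a citation such as 'BonBon FAQ.pdf (page 2)' from chunk metadata."""
    page = metadata.get("page")
    # Convert the zero-based index to a human-readable page number
    label = page + 1 if isinstance(page, int) else "unknown"
    return f"{filename} (page {label})"


print(format_citation({"page": 1}))  # BonBon FAQ.pdf (page 2)
print(format_citation({}))           # BonBon FAQ.pdf (page unknown)
```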

In [None]:
from langchain_openai import AzureChatOpenAI
from langchain.agents import Tool, initialize_agent, AgentType
from langchain_community.tools import DuckDuckGoSearchRun
from langchain.chains import RetrievalQA
from langchain.memory import ConversationBufferMemory

llm = AzureChatOpenAI(
    azure_deployment=os.getenv("AZURE_OPENAI_CHAT_DEPLOYMENT"),
    api_key=os.getenv("OPENAI_API_KEY"),
    azure_endpoint="https://langchain-training-openai.openai.azure.com",
    api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
    temperature=0.2
)

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

retriever = vectordb.as_retriever()

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True,
    verbose=False,
)

# Tool 1: Search the knowledge base and append a citation
def search_kb_with_citation(query: str) -> str:
    result = qa_chain.invoke({"query": query})
    docs = result.get("source_documents", [])
    if not docs:
        return "I don't know."

    # The "source" metadata set during indexing already includes the page number
    source = docs[0].metadata.get("source", "BonBon FAQ.pdf")
    return f"{result['result']}\n\n(Source: {source})"

# Tool 2: Internet search
internet_search = DuckDuckGoSearchRun()

tools = [
    Tool(
        name="Knowledge Base",
        func=search_kb_with_citation,
        description="Use this to answer questions about internet connection, software, password, malware, printer, etc. from BonBon FAQ."
    ),
    Tool(
        name="Web Search",
        func=internet_search.run,
        description="Use this if the question is outside BonBon FAQ scope, like current events or general facts."
    )
]

agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
    memory=memory,
    verbose=True
)

def chat():
    print("💬 Welcome to BonBon Support Chatbot! (type 'exit' to quit)\n")
    while True:
        question = input("User: ")
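        if question.strip().lower() == "exit":
            break
        # invoke() replaces the deprecated run(); memory supplies chat_history
        response = agent.invoke({"input": question})["output"]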
        print(f"\n🤖 Bot: {response}\n")

chat()



💬 Welcome to BonBon Support Chatbot! (type 'exit' to quit)



> Entering new AgentExecutor chain...
Thought: Do I need to use a tool? No
AI: Hello! How can I assist you today?

> Finished chain.

🤖 Bot: Hello! How can I assist you today?



> Entering new AgentExecutor chain...
Thought: Do I need to use a tool? Yes
Action: Knowledge Base
Action Input: internet connection not working
Observation: Please follow these steps to troubleshoot your internet connection issue:

1) Check physical connections:
- Ensure all cables (Ethernet, modem, router) are securely connected.
- Power cycle your modem and router by unplugging them for 30 seconds and then plugging them back in.

2) Verify Wi-Fi settings:
- Make sure Wi-Fi on your device is turned on.
- Check if you are connected to the correct Wi-Fi network.
- Try disconnecting and reconnecting to the Wi-Fi network.

3) Test connectivity on other devices:
- Check if other dev
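
The trace above follows the conversational ReAct text format: the model either names a tool through `Action:` and `Action Input:` lines, or replies directly after an `AI:` prefix. A simplified sketch of how such output could be parsed follows. It is an illustration only, not LangChain's actual output parser, and `parse_react_output` is a hypothetical name.

```python
import re


def parse_react_output(text: str) -> dict:
    """Classify one LLM turn as a tool call or a final answer."""
    match = re.search(r"Action: (.*?)\nAction Input: (.*)", text, re.DOTALL)
    if match:
        return {
            "type": "tool",
            "tool": match.group(1).strip(),
            "tool_input": match.group(2).strip(),
        }
    if "AI:" in text:
        # Everything after the last "AI:" prefix is the reply to the user
        return {"type": "final", "output": text.split("AI:")[-1].strip()}
    raise ValueError(f"Could not parse LLM output: {text!r}")


step = parse_react_output(
    "Thought: Do I need to use a tool? Yes\n"
    "Action: Knowledge Base\n"
    "Action Input: internet connection not working"
)
print(step["tool"], "<-", step["tool_input"])
```

The tool name extracted here is what the agent matches against the `name` of each `Tool`, which is why the names and descriptions in the `tools` list matter for routing.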

## Assignment 3: Build a new assistant based on BonBon source code (optional)
Objectives:
- Run the code and index the sample BonBon FAQ.pdf file into Azure Cognitive Search
- Explore the code and implement a new assistant that behaves the same as the chatbot above
- Explore other features, such as RBAC and the features of the admin portal

Please contact the training team if you need the BonBon source code.