# Prerequisites
- WSL
- Miniconda3 

# Set up the environment
- Create the conda env: `conda create -n langchain python=3.11`
- In VS Code, select the newly created "langchain" env as the kernel for this notebook


Install the LangChain and OpenAI packages, along with the supporting libraries.

In [None]:
! pip install langchain
! pip install openai==0.28
! pip install tiktoken
! pip install chromadb
! pip install pypdf

# Init variables

In the `.env` file, set `OPENAI_API_KEY` to the value provided by the training team. The configuration cell below also reads `AZURE_OPENAI_ENDPOINT` from the same file.
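
A quick check that the expected values are actually present can save a confusing authentication error later. This is a minimal sketch, assuming the two variable names read in the configuration cell of this notebook (`OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT`); the helper name `missing_settings` is hypothetical and not part of any library. Run it after `load_dotenv()` so the `.env` values are already in the environment.

```python
import os

# Variable names taken from the configuration cell in this notebook.
REQUIRED_SETTINGS = ("OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT")


def missing_settings(env=None, required=REQUIRED_SETTINGS):
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not (env.get(name) or "").strip()]


missing = missing_settings()
if missing:
    print("Missing settings in .env:", ", ".join(missing))
```

Passing a plain dictionary as `env` makes the helper easy to try out without touching the real environment.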

In [None]:
%pip install llama-index-embeddings-azure-openai
%pip install llama-index-llms-azure-openai
! pip install llama-index

In [1]:
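import os

import openai
from dotenv import load_dotenv

# Read OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT from the .env file
load_dotenv()

# Module-level configuration used by the openai 0.28 SDK
openai.api_type = "azure"
openai.api_version = "2023-07-01-preview"
openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT")
openai.api_key = os.getenv("OPENAI_API_KEY")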

# Overview
The BonBon FAQ.pdf file contains frequently asked questions and answers for a customer support scenario. The topics cover troubleshooting of IT issues such as networking, software, and hardware. Your task is to build a chatbot with LangChain that can answer users' questions.

## Assignment 1: Document Indexing (mandatory)

- Index the content of BonBon FAQ.pdf in a local Chroma vector DB, from which the chatbot can look up the information it needs to answer questions.
- Use an embedding model such as Azure OpenAI text-embedding-ada-002 to create the vectors; any open-source embedding model is also acceptable if it works.

In [3]:

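from langchain.llms import AzureOpenAI

# Azure OpenAI deployment names used throughout the notebook
EMBEDDING_MODEL_ID = "text-embedding-ada-002"
MODEL_ID = "gpt-35-turbo"

# temperature=0 keeps the answers as repeatable as possible
llm = AzureOpenAI(deployment_name=MODEL_ID, temperature=0)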

In [4]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader('./data/BonBon FAQ.pdf')
loader

<langchain_community.document_loaders.pdf.PyPDFLoader at 0x7f08ed35e350>

In [5]:
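from langchain_text_splitters import CharacterTextSplitter

# Load the PDF and merge the resulting document texts into one string.
# Note: merging discards the per-document metadata (source file, page number).
documents = loader.load_and_split()
_text = "\n".join(document.page_content for document in documents)

# Chunk size is measured in tokens: 80% of 4,096
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=int(4096 * 0.8), chunk_overlap=20
)

texts = text_splitter.split_text(_text)
print(texts[0])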

General guidelines for categorising requests as assessing Priority.  
 
Categorize the incident accurately based on predefined categories.  
1.       Password and Account Management:  
Examples:  
• Password Resets: Assisting users who have forgotten their passwords or need to reset them 
due to security reasons.  
• Account Creations: Creating new user accounts for employees or clients, granting access to 
various systems and services.  
• Username Recovery: Helping users retrieve their forgotten usernames or login IDs.  
2.       Software and Application Support  
Examples:  
• Providing guidance and troubleshooting assistance during the installation of software 
applications on users' devices.  
• Software Installation  
• Application Errors: Resolving issues related to errors or crashes that occur while using 
specific software applications.  
• Configuration Assistance: Helping users configure software settings according to their 
requirements or fixing misconfigurations.  
3.    

In [6]:
from typing import List

from langchain.chains import RetrievalQA
from langchain.vectorstores import Chroma

# EMBEDDING_MODEL_ID and MODEL_ID are defined in an earlier cell.


class AzureOpenAIEmbeddings:
    """Minimal embedding adapter backed by the openai 0.28 SDK."""

    def embed_documents(self, texts: List[str]):
        # One API call per text; each call returns a single embedding vector
        return [self.embed_query(text) for text in texts]

    def embed_query(self, query: str):
        response = openai.Embedding.create(input=query, deployment_id=EMBEDDING_MODEL_ID)
        return response["data"][0]["embedding"]

db = Chroma.from_texts(texts, AzureOpenAIEmbeddings())
retriever = db.as_retriever()

chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever)

In [9]:
question = "Can you help me set up a secure password for my accounts?"
answer = chain({"query": question})["result"]
print(answer)

Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1


 
1) Length:  
• Make your password at least 12 characters long. Longer passwords are generally stronger.  
2) Complexity:   
• Use a mix of characters, including uppercase letters, lowercase letters, numbers, and 
special symbols (e.g., !, @, #, $, %, etc.). The more character types you use, the stronger 
your password.  
3) Avoid Common Words:  
• Avoid using easily guessable words, such as "password," "123456," or common dictionary 
words. Hackers use dictionary attacks to guess passwords.  
4) Avoid Personal Information:  
• Don't use personal information like your name, birthdate, or family members' names. This 
information is often publicly available on social media.  
5) No Sequential or Repeated Characters:  
• Avoid sequences like "123456" or "abcdef," and don't repeat characters like "aaaa."  
6) Passphrases:  
• Consider using a passphrase, which is a series of random words or a memorable phrase. 
For example, "PurpleTiger$DancesUnderMoonlight!"  
7) Avoid Easily Guessable P
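
The warning `Number of requested results 4 is greater than number of elements in index 1` in the output above shows that the index holds a single chunk, so the retriever can only ever return that one chunk. One likely cause (an assumption, not verified against this PDF) is the large `chunk_size` of `int(4096 * 0.8)` tokens. The sketch below uses plain Python, with whitespace-separated words standing in for tokens, to illustrate how a smaller window with overlap yields several chunks. `split_with_overlap` is a hypothetical helper, not a LangChain API.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split `text` into chunks of at most `chunk_size` words.

    Consecutive chunks share `chunk_overlap` words. Words are a rough
    stand-in for tokens here; a real splitter would count tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
print(split_with_overlap(sample, chunk_size=4, chunk_overlap=1))
```

With several smaller chunks in the index, the retriever's default request for 4 results can return the passages closest to the question instead of the whole document.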

## Assignment 2: Building Chatbot (mandatory)
- Build a chatbot for the customer support scenario using the Conversational ReAct agent supported in LangChain.
- The chatbot should answer users' questions based on the FAQs in the sample BonBon FAQ.pdf file.
- The chatbot should use the Azure OpenAI GPT-3.5 LLM as the reasoning engine.
- The chatbot should be context-aware, meaning it can carry on a multi-turn conversation with users.
- The agent is equipped with the following tools:
  - Internet Search: lets the chatbot look up additional information using DuckDuckGo internet search
  - Knowledge Base Search: lets the chatbot look up information in the private knowledge base
- If the user asks about topics covered in the BonBon FAQ.pdf file, such as internet connection, printer, or malware issues, the chatbot must use the private knowledge base; otherwise it should search the internet to answer the question.
- The chatbot's answer should mention the source file and the page the answer comes from, for example "BonBon FAQ.pdf (page 2)".
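
The citation requirement in the last bullet can be met by carrying each chunk's metadata through to the final answer. This is a minimal sketch, assuming every retrieved chunk exposes a metadata dictionary with a `source` path and a zero-based `page` index (the shape `PyPDFLoader` is understood to attach; verify against your installed version). The helper names `format_source` and `append_sources` are hypothetical.

```python
import os


def format_source(metadata):
    """Render one metadata dict as 'file.pdf (page N)' with a 1-based page."""
    name = os.path.basename(metadata.get("source", "unknown source"))
    page = metadata.get("page")
    if page is None:
        return name
    # Assumption: the stored page index is zero-based, so add 1 for display
    return f"{name} (page {page + 1})"


def append_sources(answer, metadatas):
    """Append a de-duplicated, order-preserving source list to an answer."""
    seen = []
    for metadata in metadatas:
        label = format_source(metadata)
        if label not in seen:
            seen.append(label)
    if not seen:
        return answer
    return f"{answer}\n\nSources: {', '.join(seen)}"


print(format_source({"source": "./data/BonBon FAQ.pdf", "page": 1}))
```

Note that this only works if the indexing step keeps per-chunk metadata; joining all page texts into a single string before splitting loses it.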

In [17]:
%pip install --upgrade --quiet  duckduckgo-search

Note: you may need to restart the kernel to use updated packages.


In [20]:
from langchain.agents import initialize_agent
from langchain_community.tools import DuckDuckGoSearchRun, Tool

duck_go_search = DuckDuckGoSearchRun()

# The agent chooses a tool by reading its description, so each must be distinct.
tools = [
    Tool(
        name="Knowledge Base Search",
        func=chain.run,
        description=(
            "useful for when you need to answer IT support questions covered by "
            "the BonBon FAQ, such as internet connection, printer, or malware issues"
        ),
    ),
    Tool(
        name="DuckDuckGo Search",
        func=duck_go_search.run,
        description=(
            "useful for when you need to answer questions about current events "
            "or topics not covered by the BonBon FAQ"
        ),
    ),
]

agent = initialize_agent(
    tools,
    llm,
    agent="zero-shot-react-description",
    verbose=True
)

In [21]:
agent.run("My internet connection is not working. Can you help me troubleshoot it?")

> Entering new AgentExecutor chain...
 I'm not sure how to troubleshoot internet connection issues
Action: Knowledge Base Search
Action Input: "troubleshoot internet connection"

Number of requested results 4 is greater than number of elements in index 1, updating n_results = 1



Observation: 
1) Check physical connections:  
• Ensure that all cables (Ethernet, modem, router, etc.) are securely connected.  
• Power cycle your modem and router by unplugging them from the power source, waiting for 30 
seconds, and then plugging them back in.  
 
2) Verify Wi -Fi settings (for wireless connections):  
• Make sure the Wi -Fi on your device is turned on.  
• Check if you are connected to the correct Wi -Fi network.  
• Try disconnecting and reconnecting to the Wi -Fi network.  
 
3) Test connectivity on other devices:  
• Check if other devices (e.g., smartphones, tablets, other computers) can connect to the internet. 
This helps determine if the issue is specific to your device or a broader network problem.  
 
4) Restart your device:  
• Restart your computer or device to refresh network settings.  
 
5) Disable/enable network adapters:  
• For Windows: Go to the Control Panel > Network and Internet > Network and Sharing Center. 
Click on "Change ad

'Agent stopped due to iteration limit or time limit.'