# Putting it all together

So far we have done the following in the prior Notebooks:

- **Notebook 01**: We loaded the Azure Search Engine with enriched PDFs in index: "cogsrch-index-files"
- **Notebook 02**: We loaded more information into the Search Engine, this time using a CSV file with 52k rows/articles in index: "cogsrch-index-csv"
- **Notebook 03**: We added AzureOpenAI GPT models to enhance the production of the answer by using Utility Chains of LLMs
- **Notebook 04**: We added memory to our system in order to power a conversational Chat Bot
- **Notebook 05**: We introduced Agents and Tools in order to solve a more complex task: asking questions of Tabular datasets
- **Notebook 06**: We used a Utility Chain in order to talk to a SQL Database directly

We are missing one more thing: **How do we glue all these features together into a very smart GPT Smart Search Engine Chat Bot?**

We want a virtual assistant for our company that can take a question, decide which tool to use, and then get the answer. The goal is that, regardless of the source of the information (Search Engine, Bing Search, SQL Database, CSV File, JSON File, etc.), the Assistant answers the question correctly using the right tool.

In this Notebook we are going to create that "brain" Agent, which will understand the question and use the right tool to get the answer from the right source.

Let's go.

In [1]:
import os
from langchain.chat_models import AzureChatOpenAI
from langchain.utilities import BingSearchAPIWrapper
from langchain.agents import AgentType, Tool, initialize_agent, tool
from langchain.memory import ConversationBufferMemory
from langchain.tools import BaseTool

from IPython.display import Markdown, HTML, display  

def printmd(string):
    display(Markdown(string))

#custom libraries that we will use later in the app
from app.utils import DocSearchWrapper, CSVTabularWrapper, SQLDbWrapper, ChatGPTWrapper

# Don't mess with this unless you really know what you are doing
AZURE_SEARCH_API_VERSION = '2021-04-30-Preview'
AZURE_OPENAI_API_VERSION = "2023-03-15-preview"
BING_SEARCH_URL = "https://api.bing.microsoft.com/v7.0/search"

MODEL_DEPLOYMENT_NAME = "gpt-35-turbo"

# Change these below with your own services credentials
AZURE_SEARCH_ENDPOINT = "Enter your Azure Cognitive Search Endpoint ..."
AZURE_SEARCH_KEY = "Enter your Azure Cognitive Search Key ..."
AZURE_OPENAI_ENDPOINT = "Enter your Azure OpenAI Endpoint ..."
AZURE_OPENAI_API_KEY = "Enter your Azure OpenAI Key ..."
BING_SUBSCRIPTION_KEY = "Enter your Azure Bing Search Subscription Key"

In [2]:
os.environ["OPENAI_API_BASE"] = os.environ["AZURE_OPENAI_ENDPOINT"] = AZURE_OPENAI_ENDPOINT
os.environ["OPENAI_API_KEY"] = os.environ["AZURE_OPENAI_API_KEY"] = AZURE_OPENAI_API_KEY
os.environ["OPENAI_API_VERSION"] = os.environ["AZURE_OPENAI_API_VERSION"] = AZURE_OPENAI_API_VERSION
os.environ["AZURE_SEARCH_KEY"] = AZURE_SEARCH_KEY
os.environ["AZURE_SEARCH_ENDPOINT"] = AZURE_SEARCH_ENDPOINT
os.environ["BING_SEARCH_URL"] = BING_SEARCH_URL
os.environ["BING_SUBSCRIPTION_KEY"] = BING_SUBSCRIPTION_KEY
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["LANGCHAIN_HANDLER"] = "langchain"
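
Before moving on, it can help to confirm that none of the placeholder strings were left unchanged. The following is a minimal sketch, assuming every placeholder starts with "Enter your" as in the cell above; `missing_settings` and `REQUIRED_VARS` are hypothetical names and are not part of the repo.

```python
import os

# Environment variables set in the cell above that hold user-specific credentials.
REQUIRED_VARS = [
    "AZURE_SEARCH_ENDPOINT",
    "AZURE_SEARCH_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "BING_SUBSCRIPTION_KEY",
]

def missing_settings(env=os.environ, names=REQUIRED_VARS):
    """Return the names that are unset, empty, or still hold a placeholder."""
    missing = []
    for name in names:
        value = env.get(name, "")
        # Assumption: untouched placeholders begin with "Enter your".
        if not value or value.startswith("Enter your"):
            missing.append(name)
    return missing
```

Calling `missing_settings()` after the cell above returns an empty list once every credential has been filled in.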

### Get the Tools - Doc Search, CSV Agent, SQL Agent and Web Search

In the file app/utils.py we create a wrapper Class for each of the Functionalities that we developed in prior Notebooks:

In [3]:
# DocSearchWrapper is our Custom Tool Class created for Azure Cognitive Search + OpenAI
indexes = ["cogsrch-index-files", "cogsrch-index-csv"]
doc_search = DocSearchWrapper(indexes=indexes,k=5, deployment_name=MODEL_DEPLOYMENT_NAME)

In [4]:
# BingSearchAPIWrapper is a langchain Tool class to use the Bing Search API
# We will use this to add capabilities to our Engine to search for current information
# https://www.microsoft.com/en-us/bing/apis/bing-web-search-api
www_search = BingSearchAPIWrapper(k=5)

In [5]:
## CSVTabularWrapper is a custom Tool class created to Q&A over CSV files
file_url = "./data/all-states-history.csv"
csv_search = CSVTabularWrapper(path=file_url, deployment_name=MODEL_DEPLOYMENT_NAME)

In [6]:
## SQLDbWrapper is a custom Tool class created to Q&A over a MS SQL Database
server = '<YOUR_SERVER_NAME>.database.windows.net'
database = '<YOUR_DATABASE_NAME>'
username = '<YOUR_DB_USERNAME>'
password = '<YOUR_DB_PASSWORD>'

sql_search = SQLDbWrapper(server=server, database=database,
                          username=username, password=password, k=30,
                          deployment_name=MODEL_DEPLOYMENT_NAME)

In [7]:
# ChatGPTWrapper is a custom Tool Class to use the pre-trained knowledge of the OpenAI Model (ChatGPT Clone)
chatgpt = ChatGPTWrapper(deployment_name=MODEL_DEPLOYMENT_NAME)
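
The wrappers above are all used through a `run` method that takes a question string and returns an answer string, which is the only interface the agent needs from a tool. Here is a minimal sketch of that shape using a hypothetical `KeywordLookupWrapper`. The real classes live in app/utils.py and are not reproduced here, so treat this as an illustration of the interface only.

```python
class KeywordLookupWrapper:
    """Hypothetical stand-in for a tool wrapper: question in, answer string out."""

    def __init__(self, facts):
        # Maps a lowercase keyword to a canned answer.
        self.facts = facts

    def run(self, query: str) -> str:
        lowered = query.lower()
        for keyword, answer in self.facts.items():
            if keyword in lowered:
                return answer
        # Mirrors the fallback string that doc_search prints in this Notebook.
        return "No Results Found"
```

Any object with a compatible `run` method can be handed to the agent in the same way as the wrappers created above.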

### Test the Tools

In [8]:
# Test the Documents Search Tool with a question we know it doesn't have the knowledge for
printmd(doc_search.run("what is the weather today in Dallas?"))

No Results Found

In [9]:
# Test the Document Search Tool with a question that we know it has the answer for
printmd(doc_search.run("what are covid risk factors?"))

Custom token limit for gpt-35-turbo : 3000
Combined docs tokens count: 2500
Chain Type selected: stuff


The risk factors for COVID-19 severity and progression include old age, male gender, high body mass index, and underlying comorbidities such as hypertension, diabetes, obesity, chronic lung diseases, heart, liver and kidney diseases, tumors, clinically apparent immunodeficiencies, local immunodeficiencies, such as early type I interferon secretion capacity, and pregnancy. Other possible complications include acute kidney injury, coagulation disorders, thromboembolism, and a cytokine storm. Laboratory indicators of COVID-19 include lymphopenia, eosinopenia, and elevated levels of white blood cell counts, liver enzymes, lactate dehydrogenase, creatine kinase, C-reactive protein, procalcitonin, proinflammatory cytokines such as interleukin (IL)-6, IL-1beta, Krebs von den Lungen-6 (KL-6), and ferritin. Socioeconomic status, diet, lifestyle, geographical differences, ethnicity, exposed viral load, day of initiation of treatment, and quality of health care have also been reported to influence individual outcomes. The sources for this information are: https://demodatasetsp.blob.core.windows.net/litcovid/train.csv
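
The log lines printed before the answer suggest that the wrapper counts the tokens of the retrieved documents and picks a chain type depending on whether they fit under the model's limit. The following is a minimal sketch of that decision. The `map_reduce` fallback and the non-strict comparison are assumptions, since the output above only shows the `stuff` case, and `choose_chain_type` is a hypothetical name.

```python
def choose_chain_type(docs_tokens: int, token_limit: int = 3000) -> str:
    """Pick a chain type from the combined token count of the documents.

    Assumption: documents that fit within the limit are "stuffed" into a
    single prompt; otherwise a map_reduce chain summarizes them in pieces.
    """
    return "stuff" if docs_tokens <= token_limit else "map_reduce"
```

With the numbers logged above, `choose_chain_type(2500)` selects `stuff`, which matches the printed "Chain Type selected" line.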

In [10]:
# Test the Bing Search Tool
printmd(www_search.run("What is the temperature today in Dallas?"))

<b>Dallas</b>, TX <b>Weather</b> Forecast, with current conditions, wind, air quality, and what to expect for the next 3 days. Go Back <b>Dallas</b>, TX 65°F Location Chevron down LocationNewsVideos Use Current... <b>Weather</b> <b>Today</b> <b>in Dallas</b>, TX 79° Feels Like 6:41 am 8:08 pm High / Low 81° / 56° Wind 10 mph Humidity 31% Dew Point 46° Pressure 29.89 in UV Index 9 of 10 Visibility 10 mi Moon Phase Waxing... Hourly <b>Weather</b> Forecast for <b>Dallas</b>, TX - <b>The Weather</b> Channel | <b>Weather</b>.com <b>Today</b> tenDay largeDeviceBannerText Don&#39;t Miss Severe <b>Weather</b> Takes Aim At Texas, Gulf Coast, Florida New COVID... <b>Dallas</b>, TX <b>Weather</b> Conditions | <b>Weather</b> Underground Sensor Network Maps &amp; Radar star Popular Cities San Francisco, CA warning56 °F Partly Cloudy Manhattan, NY warning52 °F Rain Schiller Park,...

In [11]:
# Test the SQL Search Tool
sql_search.run("How many people in total died california in july 2020?")

> Entering new SQLDatabaseChain chain...
How many people in total died california in july 2020?
SQLQuery: SELECT SUM(death) AS total_deaths
FROM covidtracking
WHERE state = 'CA' AND date LIKE '2020-07%'
SQLResult: [(229362.0,)]
Answer: There were 229362 people who died in California in July 2020.
Explanation: 
The query selects the sum of the 'death' column from the 'covidtracking' table where the state is 'CA' and the date is in July 2020. The result is a single value representing the total number of deaths in California in July 2020.
> Finished chain.


"There were 229362 people who died in California in July 2020.\nExplanation: \nThe query selects the sum of the 'death' column from the 'covidtracking' table where the state is 'CA' and the date is in July 2020. The result is a single value representing the total number of deaths in California in July 2020."

In [12]:
# Test the ChatGPT Tool
printmd(chatgpt.run("How do I cook a chocolate cake?"))

To cook a chocolate cake, first preheat your oven to 350°F (175°C). Then, mix together 2 cups of sugar, 1 3/4 cups of all-purpose flour, 3/4 cup of unsweetened cocoa powder, 1 1/2 teaspoons of baking powder, 1 1/2 teaspoons of baking soda, and 1 teaspoon of salt in a large bowl. In another bowl, beat together 2 eggs, 1 cup of milk, 1/2 cup of vegetable oil, and 2 teaspoons of vanilla extract. Gradually mix the wet ingredients into the dry ingredients until well combined. Then, add 1 cup of boiling water and mix until smooth. Pour the batter into a greased 9x13 inch baking pan and bake for 35-40 minutes or until a toothpick inserted in the center comes out clean. Let the cake cool before serving. Enjoy! 

### Define the experts/tools for our main "brain" agent

In [33]:
tools = [
    Tool(
        name = "Bing Search",
        func=www_search.run,
        description='useful **ONLY** when you need to answer questions that begin with "Hey Bing"'
    ),
    Tool(
        name = "ChatGPT Search",
        func=chatgpt.run,
        description='useful **ONLY** when you need to answer questions that begin with "Hey ChatGPT"',
        return_direct=True
    ),
    Tool(
        name = "Doc Search",
        func=doc_search.run,
        description='useful for when you need to answer questions that **DO NOT** begin with "Hey Bing" or "Hey ChatGPT"',
        return_direct=True
    ),
    Tool(
        name = "SQL Tabular Search",
        func=sql_search.run,
        description="useful for when you need to answer questions about number of cases, deaths, hospitalizations, tests, people in ICU, people in Ventilator, in the United States related to Covid-19",
        return_direct=True
    ),
]

**Note**: Since the CSV file and the SQL Database contain exactly the same data, we only use the SQLDbWrapper Tool here.
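
The tool descriptions above are instructions for the LLM, which makes the routing decision itself. For intuition only, the prefix rule they describe can be written as plain Python. `route` is a hypothetical function, and the real agent is not guaranteed to match it. In particular, the agent picks "SQL Tabular Search" from the topic of the question, which this sketch ignores.

```python
def route(question: str) -> str:
    """Return the tool name implied by the "Hey Bing" / "Hey ChatGPT" prefixes."""
    q = question.strip().lower()
    if q.startswith("hey bing"):
        return "Bing Search"
    if q.startswith("hey chatgpt"):
        return "ChatGPT Search"
    # Everything else falls through to the document search tool.
    return "Doc Search"
```

Note also that `return_direct=True` makes the agent hand a tool's output straight back to the user instead of feeding it into another reasoning step. "Bing Search" does not set it, so the agent rewrites the raw search snippets into an answer.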

### Create a shared memory for the "brain" agent

In [34]:
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
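
Conceptually, a buffer memory appends every human and AI message to a list and hands the whole list back to the agent on each turn under the configured key. The following toy illustration shows that idea. `ToyBufferMemory` is hypothetical and is not LangChain's implementation.

```python
class ToyBufferMemory:
    """Toy conversation buffer: stores every turn and returns the full history."""

    def __init__(self, memory_key="chat_history"):
        self.memory_key = memory_key
        self.buffer = []  # list of (role, content) tuples, oldest first

    def save_turn(self, human: str, ai: str) -> None:
        self.buffer.append(("human", human))
        self.buffer.append(("ai", ai))

    def load(self) -> dict:
        # Return a copy so callers cannot mutate the stored history.
        return {self.memory_key: list(self.buffer)}
```

Because nothing is ever dropped, the history grows with every turn, so long conversations eventually compete with the documents for the model's token budget.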

### Initialize the brain agent

In [35]:
llm = AzureChatOpenAI(deployment_name=MODEL_DEPLOYMENT_NAME, temperature=0.5, max_tokens=500)
agent_chain = initialize_agent(tools, llm, agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION, 
                               verbose=True, memory=memory)
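
In the verbose traces that follow, the LLM emits a small JSON object with an `action` and an `action_input`, and the executor either returns the final answer or calls the named tool. Here is a simplified sketch of that dispatch step. `dispatch` is a hypothetical helper, and it skips details the real executor handles, such as output wrapped in Markdown fences, the thought/observation loop, and `return_direct`.

```python
import json

def dispatch(llm_output: str, tools: dict) -> str:
    """Parse the LLM's JSON step and either answer directly or call a tool.

    `tools` maps a tool name to a callable that takes and returns a string.
    """
    step = json.loads(llm_output)
    action = step["action"]
    action_input = step["action_input"]
    if action == "Final Answer":
        return action_input
    return tools[action](action_input)
```

For example, with `tools = {"Bing Search": www_search.run}`, a step whose action is "Bing Search" would call `www_search.run` with the `action_input` string.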

### Let's talk to our GPT Smart Search Engine chat bot now

In [36]:
agent_chain.run(input="hi, how are you doing today Virtual Assistant?")

> Entering new AgentExecutor chain...
{
    "action": "Final Answer",
    "action_input": "I'm just a machine, so I don't have feelings, but I'm here to assist you with whatever you need. How can I help you today?"
}

> Finished chain.


"I'm just a machine, so I don't have feelings, but I'm here to assist you with whatever you need. How can I help you today?"

In [37]:
agent_chain.run(input="what is your name?")

> Entering new AgentExecutor chain...
{
    "action": "Final Answer",
    "action_input": "My name is Assistant."
}

> Finished chain.


'My name is Assistant.'

In [38]:
agent_chain.run(input="Hey ChatGPT, what is the weather right now in 75240?")

> Entering new AgentExecutor chain...
{
    "action": "ChatGPT Search",
    "action_input": "what is the weather right now in 75240?"
}
Observation: The current weather in 75240 is not available. Please check a weather website or app for the latest updates. 

> Finished chain.


'The current weather in 75240 is not available. Please check a weather website or app for the latest updates. '

In [39]:
agent_chain.run(input="Hey Bing, what is the weather right now in 75240?")

> Entering new AgentExecutor chain...
{
    "action": "Bing Search",
    "action_input": "weather 75240"
}
Observation: Current <b>Weather</b>. 11:41 AM. 61° F. RealFeel® 72°. RealFeel Shade™ 61°. Air Quality Fair. Wind ESE 4 mph. Wind Gusts 9 mph. Mostly cloudy More Details. Hourly <b>Weather</b> Forecast for Dallas, TX - The <b>Weather</b> Channel | <b>Weather</b>.com Today tenDay largeDeviceBannerText Don&#39;t Miss Severe <b>Weather</b> Takes Aim At Texas, Gulf Coast, Florida New COVID... Dallas <b>Weather</b> Forecasts. <b>Weather Underground</b> provides local &amp; long-range <b>weather</b> forecasts, weatherreports, maps &amp; tropical <b>weather</b> conditions for the Dallas area. Richardson, TX 10-Day <b>Weather</b> Forecast - The <b>Weather</b> Channel | <b>Weather</b>.com 10 Day <b>Weather</b> - Richardson, TX As of 9:36 pm CDT Tonight --/ 54° 1% Sun 30 | Night 54° 1% NE 9 mph Partly cloudy....
Thought: {
    "action":

'The current weather in 75240 is 61°F and mostly cloudy. Wind speed is ESE 4 mph with gusts up to 9 mph. Air quality is fair. You can find more details on The Weather Channel or Weather.com.'

In [41]:
agent_chain.run(input="What medicine helps with Covid?")

> Entering new AgentExecutor chain...
```json
{
    "action": "Doc Search",
    "action_input": "COVID-19 treatment options"
}
```
Custom token limit for gpt-35-turbo : 3000
Combined docs tokens count: 1049
Chain Type selected: stuff

Observation: There are currently no definitely proven effective therapies for COVID-19, and supportive care remains the mainstay of therapy while ongoing clinical trials are being conducted. Various conventional as well as potentially advanced in vitro diagnostic approaches (IVD) for the diagnosis of COVID-19 have been explored, but mixed results have been observed so far when utilising these approaches for the treatment of COVID-19 infection. Several clinical trials have been conducted or are currently ongoing to evaluate the efficacy of repurposed drugs that have been approved or are in development for other conditions. Antiviral agents and immune modulating treatments are currently being trialled, and understa

'There are currently no definitely proven effective therapies for COVID-19, and supportive care remains the mainstay of therapy while ongoing clinical trials are being conducted. Various conventional as well as potentially advanced in vitro diagnostic approaches (IVD) for the diagnosis of COVID-19 have been explored, but mixed results have been observed so far when utilising these approaches for the treatment of COVID-19 infection. Several clinical trials have been conducted or are currently ongoing to evaluate the efficacy of repurposed drugs that have been approved or are in development for other conditions. Antiviral agents and immune modulating treatments are currently being trialled, and understanding immune evasion strategies of SARS-CoV2 and the resulting delayed massive immune response will result in the identification of biomarkers that predict outcomes as well as phenotype and disease stage specific treatments that will likely include both antiviral and immune modulating agen

In [42]:
agent_chain.run(input="How do I cook a Burrito?")

> Entering new AgentExecutor chain...
{
    "action": "Doc Search",
    "action_input": "How to cook a Burrito"
}
Observation: No Results Found

> Finished chain.


'No Results Found'

In [43]:
agent_chain.run(input="How many people died of covid in Texas in 2020?")

> Entering new AgentExecutor chain...
{
    "action": "SQL Tabular Search",
    "action_input": "SELECT SUM(Deaths) AS Total_Deaths FROM covid_19_usa_summary WHERE State='Texas' AND YEAR(Date)=2020"
}

> Entering new SQLDatabaseChain chain...
SELECT SUM(Deaths) AS Total_Deaths FROM covid_19_usa_summary WHERE State='Texas' AND YEAR(Date)=2020
SQLQuery: SELECT SUM([death]) AS Total_Deaths FROM covidtracking WHERE state='TX' AND YEAR(CONVERT(date, date))=2020
SQLResult: [(2841253.0,)]
Answer: Total_Deaths in Texas in 2020 was 2841253.
Explanation: 
The query selects the sum of the 'death' column from the covidtracking table where the state is 'TX' and the year of the date is 2020. The result is a single value representing the total number of deaths in Texas in 2020.
> Finished chain.

Observation: Total_Deaths in Texas in 2020 was 2841253.
Explanation: 
The query selects the sum of the

"Total_Deaths in Texas in 2020 was 2841253.\nExplanation: \nThe query selects the sum of the 'death' column from the covidtracking table where the state is 'TX' and the year of the date is 2020. The result is a single value representing the total number of deaths in Texas in 2020."

In [44]:
agent_chain.memory.buffer

[HumanMessage(content='hi, how are you doing today Virtual Assistant?', additional_kwargs={}),
 AIMessage(content="I'm just a machine, so I don't have feelings, but I'm here to assist you with whatever you need. How can I help you today?", additional_kwargs={}),
 HumanMessage(content='what is your name?', additional_kwargs={}),
 AIMessage(content='My name is Assistant.', additional_kwargs={}),
 HumanMessage(content='Hey ChatGPT what is the weather right now in 75240?', additional_kwargs={}),
 AIMessage(content='The current weather in 75240 is not available. Please check a weather website or app for the latest updates. ', additional_kwargs={}),
 HumanMessage(content='Hey Bing, what is the weather right now in 75240?', additional_kwargs={}),
 AIMessage(content='The current weather in 75240 is 61°F and mostly cloudy. Wind speed is ESE 4 mph with gusts up to 9 mph. Air quality is fair. You can find more details on The Weather Channel or Weather.com.', additional_kwargs={}),
 HumanMessage(c

# Summary

Great! We just built the GPT Smart Search Engine!
In this Notebook we created the brain: the decision-making Agent that decides which Tool to use to answer the user's question. This was the missing piece needed for a smart chat bot.

We can have many tools to accomplish different tasks, including connecting to APIs, dealing with File Systems, and even using Humans as Tools. For more information, see the [LangChain tools documentation](https://python.langchain.com/en/latest/modules/agents/tools.html).
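
As a small illustration of how little a new tool needs, any function that takes a string and returns a string can be registered the same way as the tools in this Notebook. This sketch uses a hypothetical `utc_now` helper. The `Tool(...)` registration is shown only as a comment so that the snippet runs without LangChain installed, and the name and description in it are made up for the example.

```python
from datetime import datetime, timezone

def utc_now(_query: str = "") -> str:
    """Hypothetical tool function: ignores the query and returns the current UTC time."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")

# Registration would follow the same pattern used for the tools list earlier:
# Tool(name="Clock", func=utc_now,
#      description="useful for when you need the current date and time in UTC")
```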

# NEXT
Go back to your forked GitHub repo and deploy the web application by following app/README.md