## Set up variables

In [1]:
import os
import openai
import urllib
import requests
import random
import json
from collections import OrderedDict
from IPython.display import display, HTML, Markdown
from typing import List
from operator import itemgetter

# LangChain Imports needed
from langchain_openai import AzureChatOpenAI
from langchain_openai import AzureOpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.retrievers import BaseRetriever
from langchain_core.callbacks import CallbackManagerForRetrieverRun
#from langchain_core.documents import Document
from langchain_core.runnables import ConfigurableField


# Our own libraries needed
#from common.prompts import DOCSEARCH_PROMPT
#from common.utils import get_search_results

from dotenv import load_dotenv
load_dotenv("credentials.env")


True

In [2]:
QUESTION = "what is azure ml for"

In [3]:
# Set the ENV variables that Langchain needs to connect to Azure OpenAI
os.environ["OPENAI_API_VERSION"] = os.environ["AZURE_OPENAI_API_VERSION"]

## A gentle intro to chaining LLMs and prompt engineering

Chains refer to sequences of calls - whether to an LLM, a tool, or a data preprocessing step.

Azure OpenAI is one LLM provider that you can use, but there are others such as Cohere, Hugging Face, etc.

Chains can be simple (i.e. Generic) or specialized (i.e. Utility).

* Generic — A single LLM is the simplest chain. It takes an input prompt and the name of the LLM and then uses the LLM for text generation (i.e. output for the prompt).

Here’s an example:

In [4]:
COMPLETION_TOKENS = 2000
llm = AzureChatOpenAI(deployment_name="gpt-4-0125-Preview", 
                      temperature=0.5, 
                      max_tokens=COMPLETION_TOKENS)

In [5]:
output_parser = StrOutputParser()
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an assistant that gives thorough responses to users."),
    ("user", "{input}. Give your response in {language}")
])

The `|` symbol is similar to a Unix pipe operator: it chains the components together, feeding the output of one component as input to the next.

In [6]:
chain = prompt | llm | output_parser
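To make the pipe idea concrete, here is a toy illustration in plain Python of how `|` can compose steps. It is a sketch of the concept only, not LangChain's implementation; the `Step` class and the stand-in functions (`fill_prompt`, `fake_llm`, `parse`) are hypothetical names invented for this example.

```python
class Step:
    """Wraps a one-argument function so steps can be chained with `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` builds a new step that runs `a` first, then feeds its output to `b`.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for the prompt, the model, and the output parser.
fill_prompt = Step(lambda d: f"{d['input']}. Give your response in {d['language']}")
fake_llm = Step(lambda text: {"content": text.upper()})  # pretends to be a model
parse = Step(lambda message: message["content"])

toy_chain = fill_prompt | fake_llm | parse
result = toy_chain.invoke({"input": "what is azure ml for", "language": "Spanish"})
print(result)
```

Each `|` returns a new `Step`, so a chain of any length is still a single object with one `invoke` method.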

In [7]:
chain.invoke({"input": QUESTION, "language": "Spanish"})

'Azure ML (Machine Learning) es una plataforma de computación en la nube ofrecida por Microsoft que permite a los desarrolladores, científicos de datos e investigadores construir, entrenar y desplegar modelos de machine learning de manera rápida y eficiente. Esta plataforma proporciona herramientas y servicios que facilitan el proceso de desarrollo de soluciones de inteligencia artificial, desde la preparación de datos hasta la implementación de modelos en producción.\n\nAlgunos de los usos principales de Azure ML incluyen:\n\n1. **Desarrollo de modelos de aprendizaje automático**: Permite a los usuarios crear, entrenar y validar modelos utilizando diversos algoritmos y técnicas de aprendizaje supervisado, no supervisado y de refuerzo.\n\n2. **Automatización y optimización de modelos**: Ofrece servicios como la búsqueda automática de hiperparámetros y la selección de modelos, lo que ayuda a optimizar el rendimiento de los modelos de manera eficiente.\n\n3. **Integración y despliegue**:

Great! Now you know how to create a simple prompt and use a chain to answer a general question using ChatGPT's knowledge.

Note that generic chains are rarely used as standalone chains. More often they serve as building blocks for Utility chains (as we will see next). Also note that we are NOT using our documents or the results from Azure Search yet, only the knowledge ChatGPT gained from the data it was trained on.

**The second type of chain is Utility:**

* Utility — These are specialized chains, comprised of many building blocks to help solve a specific task. For example, LangChain supports some end-to-end chains (such as `create_retrieval_chain` for QnA Doc retrieval, Summarization, etc).

In this workshop we will build our own specialized chain to dig deeper and solve our use case: enhancing the results of Azure AI Search.

So our only job now is to make sure that the results from the Azure AI Search queries fit within the LLM's context window, and then let the model do its magic.
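One simple way to keep search results within the context window is to add chunks in ranked order until a token budget is used up. The sketch below assumes roughly 4 characters per token, which is only a rough heuristic; a real tokenizer would be more accurate. `fit_to_budget` and the placeholder chunks are hypothetical, not part of the workshop code.

```python
def fit_to_budget(chunks, max_tokens, chars_per_token=4):
    """Keep chunks, in order, until the estimated token budget is used up."""
    kept, used = [], 0
    for chunk in chunks:
        # Rough estimate: ceil(len(chunk) / chars_per_token).
        cost = -(-len(chunk) // chars_per_token)
        if used + cost > max_tokens:
            break
        kept.append(chunk)
        used += cost
    return kept


# Placeholder chunks: 40 characters each, so about 10 estimated tokens each.
search_results = ["a" * 40, "b" * 40, "c" * 40]
context = "\n\n".join(fit_to_budget(search_results, max_tokens=25))
print(len(context))
```

Stopping at the first chunk that does not fit preserves the ranking order of the search results, at the cost of possibly leaving some budget unused.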

Now let's create a Prompt Template that will ground the response only in the given context.

In [8]:
template = """Answer the question based **ONLY** on the following context:
{context}

Question: {question}
If the question is different from the context, just tell "sorry,I can not assist with this query"
"""
prompt = ChatPromptTemplate.from_template(template)

In [28]:
prompt

ChatPromptTemplate(input_variables=['context', 'question'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template='Answer the question based **ONLY** on the following context:\n{context}\n\nQuestion: {question}\nIf the question is different from the context, just tell "sorry,I can not assist with this query"\n'))])
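To see what the model actually receives, the placeholders can be filled by hand. The sketch below uses plain `str.format` as an approximation of the template substitution; the context and question values are made up for illustration.

```python
template = """Answer the question based **ONLY** on the following context:
{context}

Question: {question}
If the question is different from the context, just tell "sorry,I can not assist with this query"
"""

# Illustrative values; in the workshop the context would come from search results.
filled = template.format(
    context="Azure Machine Learning is an end-to-end ML platform.",
    question="What is Azure ML?",
)
print(filled)
```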

In [9]:
%%time 
# Creation of our custom chain
chain = prompt | llm | output_parser

chain.invoke({"question": "Tell me about Bupa", "context": "Azure Machine Learning is an E2E ML platform."})

CPU times: user 20.8 ms, sys: 1.61 ms, total: 22.4 ms
Wall time: 585 ms


'Sorry, I cannot assist with this query.'

# Internet and Website Search Using the Bing API

In [10]:
import os
import requests
from typing import Dict, List, Optional, Type
import asyncio
from concurrent.futures import ThreadPoolExecutor
from bs4 import BeautifulSoup


from langchain import hub
from langchain.callbacks.manager import AsyncCallbackManagerForToolRun, CallbackManagerForToolRun
from langchain.pydantic_v1 import BaseModel, Field
from langchain.tools import BaseTool, StructuredTool, tool
from langchain_openai import AzureChatOpenAI
from langchain.agents import AgentExecutor, Tool, create_openai_tools_agent
from langchain.callbacks.manager import CallbackManager
from langchain.agents import initialize_agent, AgentType
from langchain.utilities import BingSearchAPIWrapper

from common.callbacks import StdOutCallbackHandler
from common.prompts import BINGSEARCH_PROMPT

from IPython.display import Markdown, HTML, display  

def printmd(string):
    display(Markdown(string.replace("$","USD ")))

from dotenv import load_dotenv
load_dotenv("credentials.env")

True

In [11]:
# Set the ENV variables that Langchain needs to connect to Azure OpenAI
os.environ["OPENAI_API_VERSION"] = os.environ["AZURE_OPENAI_API_VERSION"]

In [13]:
cb_handler = StdOutCallbackHandler()
cb_manager = CallbackManager(handlers=[cb_handler])

COMPLETION_TOKENS = 2000

llm = AzureChatOpenAI(deployment_name="gpt-4-0125-Preview", 
                      temperature=0.5, max_tokens=COMPLETION_TOKENS, 
                      streaming=True, callback_manager=cb_manager)

In [16]:
class SearchInput(BaseModel):
    query: str = Field(description="should be a search query")

class MyBingSearch(BaseTool):
    """Tool for a Bing Search Wrapper"""
    
    name = "Searcher"
    description = "useful to search the internet.\n"
    args_schema: Type[BaseModel] = SearchInput

    k: int = 5
    
    def _run(self, query: str, run_manager: Optional[CallbackManagerForToolRun] = None) -> List[Dict]:
        # BingSearchAPIWrapper.results returns a list of result dicts, not a string
        bing = BingSearchAPIWrapper(k=self.k)
        return bing.results(query, num_results=self.k)

    async def _arun(self, query: str, run_manager: Optional[AsyncCallbackManagerForToolRun] = None) -> List[Dict]:
        bing = BingSearchAPIWrapper(k=self.k)
        # Run the blocking HTTP call in the loop's default thread pool so the event loop stays responsive
        loop = asyncio.get_running_loop()
        results = await loop.run_in_executor(None, bing.results, query, self.k)
        return results
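The async method above offloads a blocking call to a worker thread. As a standalone sketch of that pattern, the example below uses `asyncio.to_thread` (Python 3.9+) with a hypothetical `blocking_search` function standing in for the real HTTP call, and runs two searches concurrently.

```python
import asyncio
import time


def blocking_search(query: str, k: int) -> list:
    """Hypothetical stand-in for a blocking HTTP search call."""
    time.sleep(0.1)  # simulate network latency
    return [f"{query} result {i}" for i in range(k)]


async def search_async(query: str, k: int = 5) -> list:
    # Run the blocking function in a worker thread so the event loop is not blocked.
    return await asyncio.to_thread(blocking_search, query, k)


async def main() -> list:
    # Both searches are in flight at the same time instead of back to back.
    return await asyncio.gather(
        search_async("nurse jobs Dallas", 2),
        search_async("nurse salary Dallas", 2),
    )


results = asyncio.run(main())
print(results)
```

Inside a notebook, where an event loop is already running, `await main()` would be used in place of `asyncio.run(main())`.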

In [17]:
def parse_html(content) -> str:
    soup = BeautifulSoup(content, 'html.parser')
    # get_text() keeps only the text nodes; markup such as link URLs is dropped
    text_content = soup.get_text()
    return text_content

def fetch_web_page(url: str) -> str:
    HEADERS = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:90.0) Gecko/20100101 Firefox/90.0'}
    # A timeout keeps the agent from hanging on an unresponsive site
    response = requests.get(url, headers=HEADERS, timeout=30)
    return parse_html(response.content)
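To make the text-extraction step concrete, here is a dependency-free sketch built on the standard library's `html.parser`. It collects visible text and explicitly skips the contents of `<script>` and `<style>` tags. It illustrates the idea only and is not how BeautifulSoup works internally; `VisibleTextParser` and `visible_text` are hypothetical names.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped tag.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def visible_text(html: str) -> str:
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><p>Hello</p><script>var x = 1;</script><p>world</p></body></html>"
)
print(visible_text(sample))
```

Dropping script and style content matters here because every extra character in the page text consumes part of the LLM's context window.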

In [18]:
web_fetch_tool = Tool.from_function(
    func=fetch_web_page,
    name="WebFetcher",
    description="useful to fetch the content of a url"
)

In [19]:
# tools = [MyBingSearch(k=5), web_fetch_tool] # With GPT-4 you can add the web_fetch_tool

tools = [MyBingSearch(k=5)] # With GPT-3.5 

prompt = BINGSEARCH_PROMPT

# Construct the OpenAI Tools agent
agent = create_openai_tools_agent(llm, tools, prompt)

# Create an agent executor by passing in the agent and tools
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=False, 
                               return_intermediate_steps=True)

In [20]:
agent_executor.tools

[MyBingSearch()]

In [21]:
# QUESTION = "Create a list with the main facts on What is happening with the oil supply in the world right now?"
# QUESTION = "How much is 50 USD in Euros and is it enough for an average hotel in Madrid?"
# QUESTION = "My son needs to build a pinewood car for a pinewood derby, how do I build such a car?"
# QUESTION = "I'm planning a vacation to Greece, tell me budget for a family of 4, in Summer, for 7 days including travel, lodging and food costs"
# QUESTION = "Who won the 2023 superbowl and who was the MVP?"
QUESTION = """
Compare the number of job openings (provide the exact number) and the average salary within 15 miles of Dallas, TX, for these occupations:

- ADN Registered Nurse
- Occupational therapist assistant
- Dental Hygienist
- Graphic Designer


Create a table with your findings. Place the sources in each cell.
"""

In [23]:
async for chunk in agent_executor.astream({"question": QUESTION}):
    # Agent Action
    if "actions" in chunk:
        for action in chunk["actions"]:
            print(f"Calling Tool: `{action.tool}` with input `{action.tool_input}`")
    # Observation
    elif "steps" in chunk:
        # Uncomment to print the information retrieved from the tool
        # for step in chunk["steps"]:
        #     print(f"Tool Result: `{step.observation}`")
        continue
    # Final result
    elif "output" in chunk:
        # No need to print the final output again since we would be streaming it as it is produced
        # print(f'Final Output: {chunk["output"]}') 
        continue
    else:
        raise ValueError()
    print("---")

Calling Tool: `Searcher` with input `{'query': 'ADN Registered Nurse job openings in Dallas, TX'}`
---
Calling Tool: `Searcher` with input `{'query': 'ADN Registered Nurse average salary within 15 miles of Dallas, TX'}`
---
Calling Tool: `Searcher` with input `{'query': 'Occupational therapist assistant job openings in Dallas, TX'}`
---
Calling Tool: `Searcher` with input `{'query': 'Occupational therapist assistant average salary within 15 miles of Dallas, TX'}`
---
Calling Tool: `Searcher` with input `{'query': 'Dental Hygienist job openings in Dallas, TX'}`
---
Calling Tool: `Searcher` with input `{'query': 'Dental Hygienist average salary within 15 miles of Dallas, TX'}`
---
Calling Tool: `Searcher` with input `{'query': 'Graphic Designer job openings in Dallas, TX'}`
---
Calling Tool: `Searcher` with input `{'query': 'Graphic Designer average salary within 15 miles of Dallas, TX'}`
---


HTTPError: 429 Client Error: Too Many Requests for url: https://api.bing.microsoft.com/v7.0/search?q=ADN+Registered+Nurse+average+salary+within+15+miles+of+Dallas%2C+TX&count=5&textDecorations=True&textFormat=HTML
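The agent fired eight searches in quick succession and hit the API's rate limit. A common mitigation for 429 responses is to retry with exponential backoff. The sketch below shows the pattern in isolation: `RateLimitError`, `call_with_backoff`, and `flaky_search` are hypothetical names, and the delays are illustrative values rather than limits documented for the Bing API.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 error."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(); on RateLimitError wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))


# Simulated search that is rate limited twice before succeeding.
attempts = {"count": 0}


def flaky_search():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return ["result"]


delays = []  # record the waits instead of actually sleeping
outcome = call_with_backoff(flaky_search, sleep=delays.append)
print(outcome, len(delays))
```

In the tool above, a wrapper like this could be placed around the search call inside `_run`, so the agent sees a slower but successful response instead of an exception.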

#### Without showing the intermediate steps, just the final answer

In [40]:
QUESTION = "How much is 50 USD in Euros and is it enough for an average hotel in Madrid?"

try:
    response = agent_executor.invoke({"question":QUESTION})
except Exception as e:
    # Keep the same shape as a successful result so response["output"] still works
    response = {"output": str(e)}

In [41]:

printmd(response["output"])

50 USD is equivalent to approximately 46.68 Euros at the current exchange rate [[1]](https://www.xe.com/en/currencyconverter/convert/?Amount=50&From=USD&To=EUR).

Regarding the cost of an average hotel in Madrid, the prices can vary significantly. The average price of accommodation in hotels and similar lodging establishments in Madrid amounted to 169 euros in February 2024 [[2]](https://www.statista.com/statistics/614059/overnight-accommodation-costs-madrid-city/). However, for budget hotels, prices can range from approximately 30 to 70 euros per night, offering basic amenities like free Wi-Fi and breakfast [[3]](https://luxurytraveldiva.com/how-much-does-a-hotel-cost-in-madrid-spain/).

Given this information, 50 USD (approximately 46.68 Euros) would not be enough for an average hotel in Madrid, especially considering the average price of 169 euros for standard accommodations. However, it might contribute towards a night in a budget hotel, depending on the specific rates and availability during your intended travel period.

## QnA to specific websites

There are several use cases where we want the smart bot to answer questions about a specific company's public website. There are two approaches we can take:

1. Create a crawler script that runs regularly, finds every page on the website, and pushes the documents to Azure AI Search (formerly Azure Cognitive Search).
2. Since Bing has likely already indexed the public website, we can utilize Bing search targeted specifically to that site, rather than attempting to index the site ourselves and duplicate the work already done by Bing's crawler.
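The second approach relies on the `site:` search operator, which restricts results to a single domain. A minimal sketch of building such a query is shown below; `site_query` is a hypothetical helper, not part of the workshop code, and it needs Python 3.9+ for `str.removeprefix`.

```python
def site_query(domain: str, terms: str) -> str:
    """Prefix a query with the `site:` operator to restrict results to one domain."""
    # Strip the URL scheme and any trailing slash so only the host remains.
    cleaned = domain.strip().removeprefix("https://").removeprefix("http://").rstrip("/")
    return f"site:{cleaned} {terms.strip()}"


print(site_query("https://target.com/", "Keurig coffee machine price"))
```

In the workshop the agent writes these queries itself, so no helper is needed; the sketch only shows the query shape it is expected to produce.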

Below are some sample questions related to specific sites. Take a look:

In [42]:
# QUESTION = "information on how to kill wasps in homedepot.com"
QUESTION = "in target.com, find the price of a Nespresso coffee machine and of a Keurig coffee machine"
# QUESTION = "in microsoft.com, find out what is the latest news on quantum computing"
# QUESTION = "give me a list of the main points of the latest investor report from mondelezinternational.com"

In [43]:
async for chunk in agent_executor.astream({"question": QUESTION}):
    # Agent Action
    if "actions" in chunk:
        for action in chunk["actions"]:
            print(f"Calling Tool: `{action.tool}` with input `{action.tool_input}`")
    # Observation
    elif "steps" in chunk:
        # Uncomment to print the information retrieved from the tool
        # for step in chunk["steps"]:
        #     print(f"Tool Result: `{step.observation}`")
        continue
    # Final result
    elif "output" in chunk:
        # No need to print the final output again since we would be streaming it as it is produced
        # print(f'Final Output: {chunk["output"]}') 
        continue
    else:
        raise ValueError()
    print("---")

Calling Tool: `Searcher` with input `{'query': 'site:target.com Nespresso coffee machine price'}`
---
Calling Tool: `Searcher` with input `{'query': 'site:target.com Keurig coffee machine price'}`
---
Based on the search results from Target's website, here are the prices for Nespresso and Keurig coffee machines:

### Nespresso Coffee Machines:
- **Nespresso Vertuo Next Coffee Maker and Espresso Machine by De'Longhi, Gray** is available but the price is not listed in the snippet. For the most accurate price, please visit the [product page](https://www.target.com/p/nespresso-vertuo-next-coffee-maker-and-espresso-machine-by-delonghi-gray/-/A-79332989).
- **Nespresso VertuoPlus Coffee Maker and Espresso Machine by De'Longhi, Black Matte** is also available with detailed information on the website but no price mentioned in the snippet. Check the [product page](https://www.target.com/p/nespresso-vertuoplus-coffee-maker-and-espresso-machine-by-delonghi-black-matte/-/A-54498718) for the latest