## Set up variables

In [15]:
import os
import openai
import urllib
import requests
import random
import json
from collections import OrderedDict
from IPython.display import display, HTML, Markdown
from typing import List
from operator import itemgetter

# LangChain Imports needed
from langchain_openai import AzureChatOpenAI
from langchain_openai import AzureOpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.retrievers import BaseRetriever
from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain_core.documents import Document
from langchain_core.runnables import ConfigurableField


# Our own libraries needed
from common.prompts import DOCSEARCH_PROMPT
from common.utils import get_search_results

from dotenv import load_dotenv
load_dotenv("credentials2.env")


True

In [16]:
QUESTION = "what is azure ml for"

In [17]:
# Set the ENV variables that LangChain needs to connect to Azure OpenAI
os.environ["OPENAI_API_VERSION"] = os.environ["AZURE_OPENAI_API_VERSION"]

## A gentle intro to chaining LLMs and prompt engineering

A chain is a sequence of calls, whether to an LLM, a tool, or a data preprocessing step.

Azure OpenAI is one LLM provider you can use, but there are others, such as Cohere and Hugging Face.

Chains can be simple (generic) or specialized (utility).

* Generic: A single LLM call is the simplest chain. It takes an input prompt and an LLM, and uses the LLM for text generation (that is, to produce the output for the prompt).

Here’s an example:

In [18]:
COMPLETION_TOKENS = 2000
llm = AzureChatOpenAI(deployment_name=os.environ["GPT4_DEPLOYMENT_NAME"], 
                      temperature=0.5, 
                      max_tokens=COMPLETION_TOKENS)

In [19]:
output_parser = StrOutputParser()
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an assistant that gives thorough responses to users."),
    ("user", "{input}. Give your response in {language}")
])

The `|` symbol is similar to the Unix pipe operator: it chains the components together, feeding the output of one component as input into the next.
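To make the piping idea concrete, here is a minimal sketch in plain Python of how a `|` operator can compose steps. It is not LangChain's implementation: the `Step` class and the three toy steps are hypothetical stand-ins, and the "LLM" just uppercases its input.

```python
class Step:
    """Hypothetical stand-in for a chain component (not a LangChain class)."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Three toy steps that mirror prompt -> llm -> output parser
fill_prompt = Step(lambda d: f"{d['input']}. Give your response in {d['language']}")
fake_llm = Step(lambda text: {"content": text.upper()})  # pretend model call
parse_output = Step(lambda message: message["content"])

toy_chain = fill_prompt | fake_llm | parse_output
result = toy_chain.invoke({"input": "what is azure ml for", "language": "Spanish"})
print(result)
```

Each `|` returns a new composed step, so the whole pipeline is itself something you can `invoke`, which is the same shape as the real chain in the next cell.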

In [20]:
chain = prompt | llm | output_parser

In [21]:
chain.invoke({"input": QUESTION, "language": "Spanish"})

'Azure ML, que es una abreviatura de Azure Machine Learning, es una plataforma de servicios en la nube de Microsoft que se utiliza para la creación, implementación y gestión de soluciones de aprendizaje automático. Está diseñado para ayudar a los desarrolladores y científicos de datos a crear, entrenar, probar, implementar y administrar modelos de aprendizaje automático en la nube.\n\nAzure ML proporciona un entorno de trabajo que ofrece una amplia gama de herramientas y servicios. Estos permiten a los usuarios y las organizaciones desarrollar modelos de aprendizaje automático en gran escala, lo que facilita la creación de aplicaciones y servicios inteligentes.\n\nAdemás, Azure ML también proporciona características que permiten la colaboración y la gestión de flujos de trabajo, lo que facilita la implementación y el mantenimiento de soluciones de aprendizaje automático. En resumen, Azure ML es una plataforma completa para el desarrollo y la implementación de soluciones de aprendizaje 

Great! Now you know how to create a simple prompt and use a chain to answer a general question using the LLM's own knowledge.

Note that we rarely use generic chains on their own. More often they serve as building blocks for utility chains (as we will see next). Also note that we are NOT using our documents or the Azure AI Search results yet, only what the model learned from its training data.

**The second type of chain is Utility:**

* Utility: These are specialized chains, composed of many building blocks, that help solve a specific task. For example, LangChain supports some end-to-end chains (such as `create_retrieval_chain` for QnA document retrieval, summarization, etc.).

In this workshop we will build our own chain to dig deeper and solve our use case: enhancing the results of Azure AI Search.

So our only job now is to make sure that the results from the Azure AI Search queries fit within the LLM's context window, and then let it do its magic.
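One simple way to keep search results within the context limit is to add chunks in ranked order until a token budget is spent. The sketch below is an assumption-laden illustration, not the workshop's method: `fit_to_budget` is a hypothetical helper, and the 4-characters-per-token estimate is a rough heuristic (a real tokenizer would be more accurate).

```python
def fit_to_budget(chunks, max_tokens, chars_per_token=4):
    """Keep chunks in ranked order until the estimated token budget is used up."""
    kept, used = [], 0
    for chunk in chunks:
        # Rough estimate only; swap in a real tokenizer for accurate counts
        estimated = len(chunk) // chars_per_token + 1
        if used + estimated > max_tokens:
            break
        kept.append(chunk)
        used += estimated
    return kept


# Three fake search chunks of 400 characters each (about 101 estimated tokens)
ranked_chunks = ["a" * 400, "b" * 400, "c" * 400]
selected = fit_to_budget(ranked_chunks, max_tokens=250)
print(len(selected))
```

Stopping at the first chunk that does not fit preserves the ranking order, at the cost of possibly leaving some budget unused.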

Now let's create a Prompt Template that will ground the response only in the given context.

In [22]:
template = """Answer the question thoroughly, based **ONLY** on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

In [23]:
prompt

ChatPromptTemplate(input_variables=['context', 'question'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template='Answer the question thoroughly, based **ONLY** on the following context:\n{context}\n\nQuestion: {question}\n'))])
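To see roughly what text the model receives once the placeholders are filled, the same template string can be rendered with plain `str.format`. This is only an illustration of the substitution; `ChatPromptTemplate` additionally wraps the result in a chat message.

```python
template = """Answer the question thoroughly, based **ONLY** on the following context:
{context}

Question: {question}
"""

# Fill the two placeholders the same way the chain's input dict would
rendered = template.format(
    context="Rich Seelinger, CEO and Chief Claims Officer of Enstar US.",
    question="who is Rich Seelinger?",
)
print(rendered)
```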

In [25]:
%%time 
# Creation of our custom chain
chain = prompt | llm | output_parser

chain.invoke({"question": "who is Rich Seelinger?", "context": "Rich Seelinger, CEO and Chief Claims Officer of Enstar US."})

CPU times: user 17.8 ms, sys: 0 ns, total: 17.8 ms
Wall time: 1.28 s


'Rich Seelinger is the CEO and Chief Claims Officer of Enstar US.'

In [1]:
import os
import requests
from typing import Dict, List, Optional, Type
import asyncio
from concurrent.futures import ThreadPoolExecutor
from bs4 import BeautifulSoup


from langchain import hub
from langchain.callbacks.manager import AsyncCallbackManagerForToolRun, CallbackManagerForToolRun
from langchain.pydantic_v1 import BaseModel, Field
from langchain.tools import BaseTool, StructuredTool, tool
from langchain_openai import AzureChatOpenAI
from langchain.agents import AgentExecutor, Tool, create_openai_tools_agent
from langchain.callbacks.manager import CallbackManager
from langchain.agents import initialize_agent, AgentType
from langchain.utilities import BingSearchAPIWrapper

from common.callbacks import StdOutCallbackHandler
from common.prompts import BINGSEARCH_PROMPT

from IPython.display import Markdown, HTML, display  

def printmd(string):
    display(Markdown(string.replace("$","USD ")))

from dotenv import load_dotenv
load_dotenv("credentials2.env")

True

In [2]:
# Set the ENV variables that LangChain needs to connect to Azure OpenAI
os.environ["OPENAI_API_VERSION"] = os.environ["AZURE_OPENAI_API_VERSION"]

In [3]:
cb_handler = StdOutCallbackHandler()
cb_manager = CallbackManager(handlers=[cb_handler])

COMPLETION_TOKENS = 2000

llm = AzureChatOpenAI(deployment_name=os.environ["GPT4_DEPLOYMENT_NAME"], 
                      temperature=0.5, max_tokens=COMPLETION_TOKENS, 
                      streaming=True, callback_manager=cb_manager)

In [4]:
class SearchInput(BaseModel):
    query: str = Field(description="should be a search query")

class MyBingSearch(BaseTool):
    """Tool for a Bing Search Wrapper"""
    
    name = "Searcher"
    description = "useful to search the internet.\n"
    args_schema: Type[BaseModel] = SearchInput

    k: int = 5
    
    def _run(self, query: str, run_manager: Optional[CallbackManagerForToolRun] = None) -> str:
        bing = BingSearchAPIWrapper(k=self.k)
        return bing.results(query,num_results=self.k)
            
    async def _arun(self, query: str, run_manager: Optional[AsyncCallbackManagerForToolRun] = None) -> str:
        bing = BingSearchAPIWrapper(k=self.k)
        loop = asyncio.get_event_loop()
        results = await loop.run_in_executor(ThreadPoolExecutor(), bing.results, query, self.k)
        return results
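The `_arun` method above hands the blocking Bing call to a thread so the event loop stays free for other work. A minimal, self-contained sketch of that pattern follows; `blocking_search` and `search_async` are hypothetical stand-ins, not part of LangChain or the Bing wrapper.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def blocking_search(query, k):
    # Hypothetical stand-in for a blocking HTTP call to a search API
    return [f"{query} result {i}" for i in range(1, k + 1)]


async def search_async(query, k=3):
    loop = asyncio.get_running_loop()
    # The context manager shuts the pool down once the call has finished
    with ThreadPoolExecutor(max_workers=1) as pool:
        return await loop.run_in_executor(pool, blocking_search, query, k)


results = asyncio.run(search_async("azure ml", k=2))
print(results)
```

In a notebook cell, where an event loop is already running, you would write `await search_async(...)` instead of calling `asyncio.run`.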

In [5]:
def parse_html(content) -> str:
    # Drop the HTML tags and keep the text content (link URLs are not preserved)
    soup = BeautifulSoup(content, 'html.parser')
    return soup.get_text()

def fetch_web_page(url: str) -> str:
    # Browser-like User-Agent, since some sites reject the default one
    HEADERS = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:90.0) Gecko/20100101 Firefox/90.0'}
    # A timeout keeps the agent from hanging on an unresponsive site
    response = requests.get(url, headers=HEADERS, timeout=30)
    return parse_html(response.content)

In [6]:
web_fetch_tool = Tool.from_function(
    func=fetch_web_page,
    name="WebFetcher",
    description="useful to fetch the content of a url"
)

In [7]:
# tools = [MyBingSearch(k=5), web_fetch_tool] # With GPT-4 you can add the web_fetch_tool

tools = [MyBingSearch(k=5)] # With GPT-3.5 

prompt = BINGSEARCH_PROMPT

# Construct the OpenAI Tools agent
agent = create_openai_tools_agent(llm, tools, prompt)

# Create an agent executor by passing in the agent and tools
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=False, 
                               return_intermediate_steps=True)

In [8]:
agent_executor.tools

[MyBingSearch()]

In [9]:
# QUESTION = "Create a list with the main facts on What is happening with the oil supply in the world right now?"
# QUESTION = "How much is 50 USD in Euros and is it enough for an average hotel in Madrid?"
# QUESTION = "My son needs to build a pinewood car for a pinewood derby, how do I build such a car?"
# QUESTION = "I'm planning a vacation to Greece, tell me budget for a family of 4, in Summer, for 7 days including travel, lodging and food costs"
# QUESTION = "Who won the 2023 superbowl and who was the MVP?"
QUESTION = """
Compare the number of job openings (provide the exact number) and the average salary within 15 miles of Dallas, TX, for these occupations:

- ADN Registered Nurse
- Occupational therapist assistant
- Dental Hygienist
- Graphic Designer

Create a table with your findings. Place the sources in each cell.
"""

In [None]:
async for chunk in agent_executor.astream({"question": QUESTION}):
    # Agent Action
    if "actions" in chunk:
        for action in chunk["actions"]:
            print(f"Calling Tool: `{action.tool}` with input `{action.tool_input}`")
    # Observation
    elif "steps" in chunk:
        # Uncomment if you need to see the information retrieved from the tool
        # for step in chunk["steps"]:
        #     print(f"Tool Result: `{step.observation}`")
        continue
    # Final result
    elif "output" in chunk:
        # No need to print the final output again since we would be streaming it as it is produced
        # print(f'Final Output: {chunk["output"]}') 
        continue
    else:
        raise ValueError()
    print("---")
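To experiment with the streaming loop without calling the agent, the sketch below replays the same three kinds of chunk from a fake async generator. `fake_agent_stream` is a hypothetical stand-in for `agent_executor.astream(...)`, and the chunk contents are made up; only the `actions` / `steps` / `output` keys and the `tool`, `tool_input`, and `observation` attributes mirror the cell above.

```python
import asyncio
from types import SimpleNamespace


async def fake_agent_stream():
    """Hypothetical stand-in for agent_executor.astream(); data is made up."""
    yield {"actions": [SimpleNamespace(tool="Searcher",
                                       tool_input={"query": "dental hygienist jobs Dallas"})]}
    yield {"steps": [SimpleNamespace(observation="...search results...")]}
    yield {"output": "final answer"}


async def collect_events():
    events = []
    async for chunk in fake_agent_stream():
        if "actions" in chunk:        # the agent decided to call a tool
            for action in chunk["actions"]:
                events.append(("action", action.tool))
        elif "steps" in chunk:        # a tool returned an observation
            events.append(("observation", len(chunk["steps"])))
        elif "output" in chunk:       # the final answer
            events.append(("output", chunk["output"]))
        else:
            raise ValueError(f"Unexpected chunk keys: {list(chunk)}")
    return events


events = asyncio.run(collect_events())
print(events)
```

As with the real loop, use `await collect_events()` rather than `asyncio.run` inside a notebook.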

# Summary
##### By using OpenAI, the answers to user questions are much better than the raw results from Azure AI Search alone. In summary:
- Using Azure AI Search, we run a multi-index hybrid search that identifies the top document chunks from each index.
- Azure OpenAI then uses these chunks as context, comprehends the content, and uses it to deliver the best possible answers.
- The best of both worlds!
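The two steps above can be sketched in plain Python as "merge the top hits from every index, then join them into one context string". This is an illustrative assumption, not the workshop's `get_search_results` helper: `merge_top_chunks`, the index names, and the scores are all hypothetical, and the sketch assumes scores are comparable across indexes.

```python
def merge_top_chunks(results_by_index, k=3):
    """Merge per-index hits and keep the k highest-scoring chunks overall."""
    merged = [
        {"index": index_name, **hit}
        for index_name, hits in results_by_index.items()
        for hit in hits
    ]
    merged.sort(key=lambda hit: hit["score"], reverse=True)
    return merged[:k]


# Made-up hits from two hypothetical indexes
results_by_index = {
    "index-a": [{"chunk": "A1", "score": 0.91}, {"chunk": "A2", "score": 0.40}],
    "index-b": [{"chunk": "B1", "score": 0.75}, {"chunk": "B2", "score": 0.62}],
}

top = merge_top_chunks(results_by_index, k=3)
context = "\n\n".join(hit["chunk"] for hit in top)  # goes into the {context} slot
print([hit["chunk"] for hit in top])
```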

##### Important observations on this notebook:

1) Answers with GPT-3.5 are lower quality but fast
2) Answers with GPT-3.5 sometimes failed to provide citations in the right format
3) Answers with GPT-4 are great quality but slower
4) Answers with GPT-4 always provide good and diverse citations in the right format
5) Models like Mistral Large or Cohere Command R+ provide, so far, a similar experience to GPT-4
6) Streaming the answers greatly improves the user experience!

# NEXT
In the next notebook, we will see how to handle complex and large documents separately, also using vector search.