### env setup

In [52]:
from dotenv import load_dotenv
load_dotenv()

import os
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
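
`load_dotenv()` fails silently when a key is missing from `.env`, so a quick check up front can save a confusing error later. A minimal sketch, assuming the conventional variable names that the OpenAI, Tavily and LangSmith clients read; `missing_keys` is a hypothetical helper, not part of any library.

```python
import os

# Assumed names: the conventional variables for OpenAI, Tavily and LangSmith.
REQUIRED_KEYS = ("OPENAI_API_KEY", "TAVILY_API_KEY", "LANGCHAIN_API_KEY")


def missing_keys(env=os.environ, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


missing = missing_keys()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```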

### search tools

In [53]:
from langchain_community.tools.tavily_search import TavilySearchResults

In [55]:
search = TavilySearchResults(max_results=5)

In [58]:
results = search.invoke('Who is the prime minister of Singapore?')

In [62]:
results[0]

{'url': 'https://www.pmo.gov.sg/The-Cabinet/Mr-LEE-Hsien-Loong',
 'content': "Biodata\nCareer\n2004 -\nPrime Minister\nChairman, Research, Innovation and Enterprise Council (RIEC)\nChairman, Government of Singapore Investment Corporation (2011)\n2001 - 2007\nMinister for Finance\n1998 - 2004\nChairman, Monetary Authority of Singapore\n1990 - 2004\nDeputy Prime Minister\n1987 - 1992\nMinister for Trade and Industry\n1987 - 1990\nSecond Minister for Defence\n1984 - 1987\nMinister of State for Ministry of Trade and Industry &\nMinistry of Defence\n2004 -\nSecretary-General, People's Action Party\n1992 - 2004\nFirst Assistant Secretary-General, People's Action Party\n1989 - 1992\nSecond Assistant Secretary-General, People's Action Party\n1986 - 1989\nMember, Central Executive Committee of the People's Action Party\n1984 -\nMember of Parliament\n(First elected in 1984, and re-elected in 1988, 1991, 1997, 2001, 2006 and 2011)\n1982 - 1984\nChief of Staff of the General Staff\n1983 - 1984\nDi

- langchain is good for this kind of convenient abstraction: it's easy to experiment with integrations like search APIs.
- but you can easily end up in a deep black hole if you try to trace the abstraction. see the source code [here](https://api.python.langchain.com/en/latest/_modules/langchain_community/tools/tavily_search/tool.html#TavilySearchResults) and [here](https://api.python.langchain.com/en/latest/_modules/langchain_core/tools.html#BaseTool).

### retriever

In [63]:
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = WebBaseLoader("https://learn.microsoft.com/en-us/azure/search/semantic-search-overview")
docs = loader.load()
documents = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)
vector = FAISS.from_documents(documents, OpenAIEmbeddings())
retriever = vector.as_retriever()
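
To make `chunk_size=1000, chunk_overlap=200` concrete, here is a rough sketch of overlapping chunking in plain Python. It is a fixed sliding window over characters, not the recursive separator-aware algorithm that `RecursiveCharacterTextSplitter` uses, and `sliding_window_chunks` is a hypothetical name.

```python
def sliding_window_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list:
    """Split `text` into windows of `chunk_size` characters that overlap by `chunk_overlap`."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


small = sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=2)
print(small)
```

Each chunk repeats the tail of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.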

In [67]:
retriever.invoke("limitations of semantic ranking?")

[Document(page_content="What semantic ranking can't do is rerun the query over the entire corpus to find semantically relevant results. Semantic ranking reranks the existing result set, consisting of the top 50 results as scored by the default ranking algorithm. Furthermore, semantic ranking can't create new information or strings. Captions and answers are extracted verbatim from your content so if the results don't include answer-like text, the language models won't produce one.\nAlthough semantic ranking isn't beneficial in every scenario, certain content can benefit significantly from its capabilities. The language models in semantic ranking work best on searchable content that is information-rich and structured as prose. A knowledge base, online documentation, or documents that contain descriptive content see the most gains from semantic ranking capabilities.", metadata={'source': 'https://learn.microsoft.com/en-us/azure/search/semantic-search-overview', 'title': 'Semantic ranking 

### tool binding

In [72]:
from langchain.tools.retriever import create_retriever_tool

retriever_tool = create_retriever_tool(
    retriever,
    "azure_ai_search",
    "Search for information about Azure Semantic Search. For any questions about Azure Semantic Search, you must use this tool!",
)

tools = [search, retriever_tool]

In [73]:
search

TavilySearchResults()

In [74]:
retriever_tool

Tool(name='azure_ai_search', description='Search for information about Azure Semantic Search. For any questions about Azure Semantic Search, you must use this tool!', args_schema=<class 'langchain_core.tools.RetrieverInput'>, func=functools.partial(<function _get_relevant_documents at 0x11756cea0>, retriever=VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x137e2f3e0>), document_prompt=PromptTemplate(input_variables=['page_content'], template='{page_content}'), document_separator='\n\n'), coroutine=functools.partial(<function _aget_relevant_documents at 0x11756d3a0>, retriever=VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x137e2f3e0>), document_prompt=PromptTemplate(input_variables=['page_content'], template='{page_content}'), document_separator='\n\n'))

- abstractions are good, but you quickly realise that you don't have a clear view of the inputs and outputs each of these tools expects, which makes them hard to control and debug down the line.
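
For contrast, a tool can be written so that its inputs and outputs are all visible in one place. This is a sketch under assumptions: `SimpleTool`, `fake_search` and the `web_search` name are hypothetical stand-ins, and the schema layout follows the OpenAI function-calling format.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool: everything the model sees is spelled out."""
    name: str
    description: str
    parameters: Dict[str, Any]  # JSON Schema describing the arguments
    func: Callable[..., Any]

    def to_openai_schema(self) -> Dict[str, Any]:
        """Return the dict shape the OpenAI chat API expects in `tools`."""
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
        }


def fake_search(query: str) -> List[Dict[str, str]]:
    """Placeholder mimicking the url/content shape of the search results."""
    return [{"url": "https://example.com", "content": f"stub result for {query}"}]


search_tool = SimpleTool(
    name="web_search",
    description="Search the web for a query.",
    parameters={
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
    func=fake_search,
)
print(search_tool.to_openai_schema())
```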

In [77]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-3.5-turbo-0125")

In [78]:
from langchain_core.messages import HumanMessage
# is the HumanMessage wrapper really necessary? invoke() also accepts a plain string,
# so this is an example of an abstraction that makes you go hmm...

response = model.invoke([HumanMessage(content="hi!")])
response.content

'Hello! How can I assist you today?'

In [80]:
model_with_tools = model.bind_tools(tools)

In [None]:
from pprint import pprint

In [90]:
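def invoke_model_with_tools(message: str, model_with_tools=model_with_tools) -> None:
    """Send one user message to the tool-bound model and print what comes back."""
    response = model_with_tools.invoke([HumanMessage(content=message)])

    # The text reply and the requested tool calls live in separate fields.
    print(f"ContentString: {response.content}")
    print(f"ToolCalls: {response.tool_calls}")
    # Dump the full message to see everything else the wrapper attaches.
    pprint(response.dict())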

In [91]:
invoke_model_with_tools('hi')

ContentString: Hello! How can I assist you today?
ToolCalls: []
{'additional_kwargs': {},
 'content': 'Hello! How can I assist you today?',
 'example': False,
 'id': 'run-028c9bae-24b8-412b-ba2e-9c0f038346db-0',
 'invalid_tool_calls': [],
 'name': None,
 'response_metadata': {'finish_reason': 'stop',
                       'logprobs': None,
                       'model_name': 'gpt-3.5-turbo-0125',
                       'system_fingerprint': None,
                       'token_usage': {'completion_tokens': 10,
                                       'prompt_tokens': 130,
                                       'total_tokens': 140}},
 'tool_calls': [],
 'type': 'ai'}


In [92]:
invoke_model_with_tools("what's the weather in singapore")

ContentString: 
ToolCalls: [{'name': 'tavily_search_results_json', 'args': {'query': 'weather in Singapore'}, 'id': 'call_JbtqL7RXXVA6XutzIOcvST27'}]
{'additional_kwargs': {'tool_calls': [{'function': {'arguments': '{"query":"weather '
                                                                 'in '
                                                                 'Singapore"}',
                                                    'name': 'tavily_search_results_json'},
                                       'id': 'call_JbtqL7RXXVA6XutzIOcvST27',
                                       'type': 'function'}]},
 'content': '',
 'example': False,
 'id': 'run-e4054405-1f74-4a7e-b3c1-9dba94091e66-0',
 'invalid_tool_calls': [],
 'name': None,
 'response_metadata': {'finish_reason': 'tool_calls',
                       'logprobs': None,
                       'model_name': 'gpt-3.5-turbo-0125',
                       'system_fingerprint': None,
                       'token_usage': {'complet
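
Note that the model only *requests* the call; nothing has run the search yet. A minimal sketch of the missing step, assuming tool calls arrive as `name`/`args`/`id` dicts like the `ToolCalls` output above. `run_tool_calls` and `fake_search` are hypothetical helpers, and the tool-message layout follows the OpenAI chat format.

```python
import json
from typing import Any, Callable, Dict, List


def run_tool_calls(
    tool_calls: List[Dict[str, Any]],
    registry: Dict[str, Callable[..., Any]],
) -> List[Dict[str, str]]:
    """Execute each requested call and return one tool message per call."""
    messages = []
    for call in tool_calls:
        func = registry.get(call["name"])
        if func is None:
            content = f"Unknown tool: {call['name']}"
        else:
            content = json.dumps(func(**call["args"]))
        messages.append({"role": "tool", "tool_call_id": call["id"], "content": content})
    return messages


def fake_search(query: str) -> List[Dict[str, str]]:
    """Placeholder for the real search tool."""
    return [{"url": "https://example.com", "content": f"stub result for {query}"}]


calls = [
    {
        "name": "tavily_search_results_json",
        "args": {"query": "weather in Singapore"},
        "id": "call_JbtqL7RXXVA6XutzIOcvST27",
    }
]
tool_messages = run_tool_calls(calls, {"tavily_search_results_json": fake_search})
print(tool_messages)
```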

#### what this would look like using the API directly (taken from the OpenAI docs)
```python
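import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def get_current_weather(location, unit=None):
    """Hypothetical placeholder so the example runs: returns a hard-coded reading as JSON."""
    return json.dumps({"location": location, "temperature": "unknown", "unit": unit or "celsius"})


def run_conversation():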
    # Step 1: send the conversation and available functions to the model
    messages = [{"role": "user", "content": "What's the weather like in San Francisco, Tokyo, and Paris?"}]
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                    },
                    "required": ["location"],
                },
            },
        }
    ]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
        tool_choice="auto",  # auto is default, but we'll be explicit
    )
    response_message = response.choices[0].message
    tool_calls = response_message.tool_calls
    # Step 2: check if the model wanted to call a function
    if tool_calls:
        # Step 3: call the function
        # Note: the JSON response may not always be valid; be sure to handle errors
        available_functions = {
            "get_current_weather": get_current_weather,
        }  # only one function in this example, but you can have multiple
        messages.append(response_message)  # extend conversation with assistant's reply
        # Step 4: send the info for each function call and function response to the model
        for tool_call in tool_calls:
            function_name = tool_call.function.name
            function_to_call = available_functions[function_name]
            function_args = json.loads(tool_call.function.arguments)
            function_response = function_to_call(
                location=function_args.get("location"),
                unit=function_args.get("unit"),
            )
            messages.append(
                {
                    "tool_call_id": tool_call.id,
                    "role": "tool",
                    "name": function_name,
                    "content": function_response,
                }
            )  # extend conversation with function response
        second_response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
        )  # get a new response from the model where it can see the function response
        return second_response
print(run_conversation())
```
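
The comment in Step 3 warns that the arguments may not be valid JSON, but the example calls `json.loads` unguarded. A small sketch of one way to handle that; `parse_tool_arguments` is a hypothetical helper, not part of the OpenAI SDK.

```python
import json
from typing import Any, Dict, Optional, Tuple


def parse_tool_arguments(raw: str) -> Tuple[Optional[Dict[str, Any]], Optional[str]]:
    """Return (args, None) on success, or (None, error_message) on failure."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON in tool arguments: {exc.msg}"
    if not isinstance(args, dict):
        return None, "tool arguments must be a JSON object"
    return args, None


print(parse_tool_arguments('{"location": "Tokyo"}'))
print(parse_tool_arguments('{"location": '))
```

One option on failure is to send the error string back as the tool message content, so the model gets a chance to retry with corrected arguments.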

In [93]:
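from langgraph.prebuilt import chat_agent_executor


def invoke_agent(message: str) -> list:
    """Run the prebuilt tool-calling agent on one user message and return the message history."""
    # The executor is rebuilt on every call; fine for a notebook experiment.
    agent_executor = chat_agent_executor.create_tool_calling_executor(model, tools)

    response = agent_executor.invoke({"messages": [HumanMessage(content=message)]})

    return response["messages"]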


okay, let's go down the rabbit [hole](https://github.com/langchain-ai/langgraph/blob/main/langgraph/prebuilt/chat_agent_executor.py). now we enter graph-framework territory: see `create_tool_calling_executor`, whose source code is excerpted at the bottom of the notebook.

In [96]:
invoke_agent('hi')

[HumanMessage(content='hi', id='192cbcd1-fc45-4b22-9a58-b2755ad91b19'),
 AIMessage(content='Hello! How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 10, 'prompt_tokens': 130, 'total_tokens': 140}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-de35f43c-5673-4b49-9c6e-d4491b63139b-0')]

In [97]:
invoke_agent('what are the limitations of azure semantic search')

[HumanMessage(content='what are the limitations of azure semantic search', id='20006eca-4758-44e8-a422-7eff2b66550e'),
 AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_utMNjad8h4hd04ycdqPLH4E5', 'function': {'arguments': '{"query":"limitations of Azure Semantic Search"}', 'name': 'azure_ai_search'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 19, 'prompt_tokens': 137, 'total_tokens': 156}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-073da0cb-db98-45d8-8318-262759e09f5b-0', tool_calls=[{'name': 'azure_ai_search', 'args': {'query': 'limitations of Azure Semantic Search'}, 'id': 'call_utMNjad8h4hd04ycdqPLH4E5'}]),
 ToolMessage(content="The underlying technology is from Bing and Microsoft Research, and integrated into the Azure AI Search infrastructure as an add-on feature. For more information about the research and AI investments backing semantic ranking, 

In [98]:
invoke_agent('what is the weather in singapore?')

[HumanMessage(content='what is the weather in singapore?', id='171d567b-ea42-4237-966a-040e449a399a'),
 AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_Z6kHj6O5OJ2p301JCDQ1F7IZ', 'function': {'arguments': '{"query":"weather in Singapore"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 20, 'prompt_tokens': 136, 'total_tokens': 156}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-556e493f-5d41-44a2-adf0-e221adf05e31-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'weather in Singapore'}, 'id': 'call_Z6kHj6O5OJ2p301JCDQ1F7IZ'}]),
 AIMessage(content='The current weather in Singapore is as follows:\n- Temperature: 32.0°C (89.6°F)\n- Condition: Partly cloudy\n- Wind: 5.6 mph (9.0 km/h) from SSE\n- Humidity: 71%\n- Feels like: 43.0°C (109.3°F)\n- Visibility: 10.0 km\n- UV Index: 6.0\n\nFor more detaile

```python
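    # Excerpt from the body of `create_tool_calling_executor`, hence the indentation and the bare `return`.
    # Define a new graph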
    workflow = StateGraph(AgentState)

    # Define the two nodes we will cycle between
    workflow.add_node("agent", RunnableLambda(call_model, acall_model))
    workflow.add_node("tools", ToolNode(tools))

    # Set the entrypoint as `agent`
    # This means that this node is the first one called
    workflow.set_entry_point("agent")

    # We now add a conditional edge
    workflow.add_conditional_edges(
        # First, we define the start node. We use `agent`.
        # This means these are the edges taken after the `agent` node is called.
        "agent",
        # Next, we pass in the function that will determine which node is called next.
        should_continue,
        # Finally we pass in a mapping.
        # The keys are strings, and the values are other nodes.
        # END is a special node marking that the graph should finish.
        # What will happen is we will call `should_continue`, and then the output of that
        # will be matched against the keys in this mapping.
        # Based on which one it matches, that node will then be called.
        {
            # If `tools`, then we call the tool node.
            "continue": "tools",
            # Otherwise we finish.
            "end": END,
        },
    )

    # We now add a normal edge from `tools` to `agent`.
    # This means that after `tools` is called, `agent` node is called next.
    workflow.add_edge("tools", "agent")

    # Finally, we compile it!
    # This compiles it into a LangChain Runnable,
    # meaning you can use it as you would any other runnable
    return workflow.compile(
        checkpointer=checkpointer,
        interrupt_before=interrupt_before,
        interrupt_after=interrupt_after,
        debug=debug,
    )
```
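
Stripped of the graph vocabulary, the two nodes and the conditional edge amount to a loop. A plain-Python sketch of that control flow, assuming `should_continue` simply checks whether the last message requested tool calls; `run_agent_loop`, `fake_model`, `fake_tools` and `max_steps` are hypothetical and not part of langgraph.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def should_continue(messages: List[Message]) -> str:
    """Return "continue" if the last message asks for tools, else "end"."""
    return "continue" if messages[-1].get("tool_calls") else "end"


def run_agent_loop(
    call_model: Callable[[List[Message]], Message],
    call_tools: Callable[[Message], List[Message]],
    messages: List[Message],
    max_steps: int = 10,
) -> List[Message]:
    """Alternate between the "agent" and "tools" steps until the model stops asking for tools."""
    messages = list(messages)
    for _ in range(max_steps):
        messages.append(call_model(messages))      # "agent" node
        if should_continue(messages) == "end":     # conditional edge
            return messages
        messages.extend(call_tools(messages[-1]))  # "tools" node, then back to "agent"
    raise RuntimeError("agent did not finish within max_steps")


def fake_model(messages: List[Message]) -> Message:
    """Placeholder model: asks for one tool call, then answers once a tool result exists."""
    if any(m["role"] == "tool" for m in messages):
        return {"role": "assistant", "content": "done", "tool_calls": []}
    call = {"name": "echo", "args": {"text": "hi"}, "id": "call_1"}
    return {"role": "assistant", "content": "", "tool_calls": [call]}


def fake_tools(ai_message: Message) -> List[Message]:
    """Placeholder tool node: echoes the `text` argument of each call."""
    return [
        {"role": "tool", "tool_call_id": c["id"], "content": c["args"]["text"]}
        for c in ai_message["tool_calls"]
    ]


history = run_agent_loop(fake_model, fake_tools, [{"role": "user", "content": "hi"}])
print([m["role"] for m in history])
```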

See also [LangSmith](https://smith.langchain.com), which is where the traces enabled by `LANGCHAIN_TRACING_V2` above end up. In my opinion, it's just a toy.