# Exploring Tools in LangChain

## Install OpenAI and LangChain dependencies

In [1]:
from warnings import filterwarnings
filterwarnings('ignore')

In [2]:
!pip install langchain==0.3.14
!pip install langchain-openai==0.3.0
!pip install langchain-community==0.3.14

Collecting langchain-openai==0.3.0
  Downloading langchain_openai-0.3.0-py3-none-any.whl.metadata (2.7 kB)
Collecting tiktoken<1,>=0.7 (from langchain-openai==0.3.0)
  Downloading tiktoken-0.8.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.6 kB)
Downloading langchain_openai-0.3.0-py3-none-any.whl (54 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.2/54.2 kB 3.4 MB/s eta 0:00:00
Downloading tiktoken-0.8.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 29.3 MB/s eta 0:00:00
Installing collected packages: tiktoken, langchain-openai
Successfully installed langchain-openai-0.3.0 tiktoken-0.8.0
Collecting langchain-community==0.3.14
  Downloading langchain_community-0.3.14-py3-none-any.whl.metadata (2.9 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain-community==0.3.14)
  Downloading dat

## Install Data Extraction APIs

In [3]:
# to create custom tools
!pip install wikipedia==1.4.0
!pip install markitdown
# to highlight json
!pip install rich

Collecting wikipedia==1.4.0
  Downloading wikipedia-1.4.0.tar.gz (27 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: wikipedia
  Building wheel for wikipedia (setup.py) ... done
  Created wheel for wikipedia: filename=wikipedia-1.4.0-py3-none-any.whl size=11679 sha256=e1ac9f0adbb200cd532ff6143e4cc96b6ca90cb760d4e6a5595e8bfb603b5acf
  Stored in directory: /root/.cache/pip/wheels/8f/ab/cb/45ccc40522d3a1c41e1d2ad53b8f33a62f394011ec38cd71c6
Successfully built wikipedia
Installing collected packages: wikipedia
Successfully installed wikipedia-1.4.0
Collecting markitdown
  Downloading markitdown-0.0.1a3-py3-none-any.whl.metadata (4.8 kB)
Collecting mammoth (from markitdown)
  Downloading mammoth-1.9.0-py2.py3-none-any.whl.metadata (24 kB)
Collecting markdownify (from markitdown)
  Downloading markdownify-0.14.1-py3-none-any.whl.metadata (8.5 kB)
Collecting pathvalidate (from markitdown)
  Downloading pathvalidate-3.2.3-py3-none-any

## Enter OpenAI API Key

In [4]:
from getpass import getpass

OPENAI_KEY = getpass('Enter OpenAI API Key: ')

Enter OpenAI API Key: ··········


## Enter Tavily Search API Key

Get a free API key from [here](https://tavily.com/#api)

In [5]:
TAVILY_API_KEY = getpass('Enter Tavily Search API Key: ')

Enter Tavily Search API Key: ··········


## Enter WeatherAPI API Key

Get a free API key from [here](https://www.weatherapi.com/signup.aspx)

In [6]:
WEATHER_API_KEY = getpass('Enter WeatherAPI API Key: ')

Enter WeatherAPI API Key: ··········


## Setup Environment Variables

In [7]:
import os

os.environ['OPENAI_API_KEY'] = OPENAI_KEY
os.environ['TAVILY_API_KEY'] = TAVILY_API_KEY

## Exploring Built-in Tools

### Exploring the Wikipedia Tool

This tool wraps the Wikipedia API so that you can search Wikipedia pages for information.

In [8]:
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

wiki_api_wrapper = WikipediaAPIWrapper(top_k_results=3,
                                       doc_content_chars_max=8000)
wiki_tool = WikipediaQueryRun(api_wrapper=wiki_api_wrapper)

In [9]:
wiki_tool.description

'A wrapper around Wikipedia. Useful for when you need to answer general questions about people, places, companies, facts, historical events, or other subjects. Input should be a search query.'

In [10]:
wiki_tool.args

{'query': {'description': 'query to look up on wikipedia',
  'title': 'Query',
  'type': 'string'}}

In [11]:
print(wiki_tool.invoke({"query": "Microsoft"}))

Page: Microsoft
Summary: Microsoft Corporation is an American multinational technology conglomerate headquartered in Redmond, Washington. Founded in 1975, the company became highly influential in the rise of personal computers through software like Windows, and the company has since expanded to Internet services, cloud computing, video gaming and other fields. Microsoft is the largest software maker, one of the most valuable public U.S. companies, and one of the most valuable brands globally.
Microsoft was founded by Bill Gates and Paul Allen to develop and sell BASIC interpreters for the Altair 8800. It rose to dominate the personal computer operating system market with MS-DOS in the mid-1980s, followed by Windows. During the 41 years from 1980 to 2021 Microsoft released 9 versions of MS-DOS with a median frequency of 2 years, and 13 versions of Microsoft Windows with a median frequency of 3 years. The company's 1986 initial public offering (IPO) and subsequent rise in its share price

In [12]:
print(wiki_tool.invoke({"query": "AI"}))

Page: Artificial intelligence
Summary: Artificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines, particularly computer systems. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. Such machines may be called AIs.
High-profile applications of AI include advanced web search engines (e.g., Google Search); recommendation systems (used by YouTube, Amazon, and Netflix); virtual assistants (e.g., Google Assistant, Siri, and Alexa); autonomous vehicles (e.g., Waymo); generative and creative tools (e.g., ChatGPT and AI art); and superhuman play and analysis in strategy games (e.g., chess and Go). However, many AI applications are not perceived as AI: "A lot of cutting edge AI has filtered into general applications, often without being called AI because once something

You can also customize the default tool by giving it your own name and description, as follows:

In [13]:
from langchain.agents import Tool

wiki_tool_init = Tool(name="Wikipedia",
                      func=wiki_api_wrapper.run,
                      description="useful when you need a detailed answer about general knowledge")

In [14]:
wiki_tool_init.description

'useful when you need a detailed answer about general knowledge'

In [15]:
wiki_tool_init.args

{'tool_input': {'type': 'string'}}

In [16]:
print(wiki_tool_init.invoke({"tool_input": "AI"}))

Page: Artificial intelligence
Summary: Artificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines, particularly computer systems. It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. Such machines may be called AIs.
High-profile applications of AI include advanced web search engines (e.g., Google Search); recommendation systems (used by YouTube, Amazon, and Netflix); virtual assistants (e.g., Google Assistant, Siri, and Alexa); autonomous vehicles (e.g., Waymo); generative and creative tools (e.g., ChatGPT and AI art); and superhuman play and analysis in strategy games (e.g., chess and Go). However, many AI applications are not perceived as AI: "A lot of cutting edge AI has filtered into general applications, often without being called AI because once something

### Exploring the Tavily Search Tool

The Tavily Search API is a search engine built for LLMs and RAG pipelines, aimed at efficient, fast, and persistent search results.

In [17]:
from langchain_community.tools.tavily_search import TavilySearchResults

tavily_tool = TavilySearchResults(max_results=8,
                                search_depth='advanced',
                                include_raw_content=True)

In [18]:
tavily_tool.args

{'query': {'description': 'search query to look up',
  'title': 'Query',
  'type': 'string'}}

In [19]:
tavily_tool.description

'A search engine optimized for comprehensive, accurate, and trusted results. Useful for when you need to answer questions about current events. Input should be a search query.'

In [20]:
results = tavily_tool.invoke("Tell me about Microsoft")
results

[{'url': 'https://news.microsoft.com/facts-about-microsoft/',
  'content': 'Corporate Address\nImportant Dates\nBoard of Directors\nMicrosoft Business Organization\nOperation Centers\nMicrosoft Subsidiaries\nFinancial Data\nEmployment Information\nReal Estate Portfolio\nAbout Microsoft\nCorporate Address\nMicrosoft Corporation\nOne Microsoft Way\nRedmond, WA 98052-7329\nUSA\nTel: (425) 882-8080\nFax: (425) 706-7329\nhttp://www.microsoft.com\nImportant Dates\nAs of Sept. 30, 2023\nBoard of Directors\nAs of Dec. 21, 2023\nMicrosoft Business Organization\n Last updated: Dec. 21, 2023\nOperation Centers\nAs of June 30, 2022\nMicrosoft Subsidiaries\nAs of Sept. 30, 2023\nCurrent employment headcount\nAs of June 30, 2023\nAdditional information and details about the demographics of our workforce can be found on the Microsoft Global Diversity & Inclusion site.\n Global\nFacts About Microsoft\nMicrosoft enables digital transformation for the era of an intelligent cloud and an intelligent edge

## Build your own tools in LangChain

Tools are interfaces that an agent, chain, or LLM can use to interact with the world. They combine a few things:

- The name of the tool
- A description of what the tool is
- JSON schema of what the inputs to the tool are
- The function to call
- Whether the result of a tool should be returned directly to the user

Having all of this in one place is what makes action-taking systems possible. The name, description, and JSON schema are used to prompt the LLM so that it knows how to specify which action to take, and calling the function is what carries out that action.
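
To make that mapping concrete, here is a minimal, library-free sketch. The `ToolSpec` class, the `TOOLS` registry, and the `dispatch` helper are hypothetical names used only for illustration and are not part of LangChain; the `action` dict stands in for what an LLM might emit after being prompted with the tool metadata.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ToolSpec:
    """Hypothetical stand-in for a tool: metadata plus the function to call."""
    name: str
    description: str
    args_schema: dict              # JSON-schema-like description of the inputs
    func: Callable[..., Any]
    return_direct: bool = False

    def prompt_entry(self) -> str:
        """Render the metadata an LLM would see when choosing an action."""
        return f"{self.name}: {self.description} args={json.dumps(self.args_schema)}"


def add(a: float, b: float) -> float:
    return a + b


# Registry keyed by tool name, so an action can be routed to its function.
TOOLS = {
    "add": ToolSpec(
        name="add",
        description="Add two numbers.",
        args_schema={"a": {"type": "number"}, "b": {"type": "number"}},
        func=add,
    )
}


def dispatch(action: dict) -> Any:
    """Look up the requested tool by name and call it with the given arguments."""
    spec = TOOLS[action["tool"]]
    return spec.func(**action["args"])


# An action in the shape an LLM might produce after reading the prompt entries.
action = {"tool": "add", "args": {"a": 2, "b": 3}}
result = dispatch(action)
print(TOOLS["add"].prompt_entry())
print(result)
```

The metadata is only ever read by the model, while the function is only ever called by your code; keeping the two together in one object is what lets a framework connect them.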

### Building a Simple Math Tool

We will start by building a simple tool that does some basic math.

In [21]:
from langchain_core.tools import tool

@tool
def multiply(a, b):
    """Multiply two numbers."""
    return a * b


# Let's inspect some of the attributes associated with the tool.
print(multiply.name)
print(multiply.description)
print(multiply.args)

multiply
Multiply two numbers.
{'a': {'title': 'A'}, 'b': {'title': 'B'}}


In [22]:
type(multiply)

In [23]:
multiply.invoke({"a": 2, "b": 3})

6

In [24]:
multiply.invoke({"a": 2.1, "b": 3.2})

6.720000000000001

In [25]:
multiply.invoke({"a": 2, "b": 'abc'})

'abcabc'
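
The string result is not a LangChain quirk. Because the function has no type hints, the inferred schema carries no types, so the arguments reach the function unchanged and Python's `*` operator applies its usual rules: an `int` times a `str` repeats the string. A plain-Python sketch of the same behaviour (the name `untyped_multiply` is used only for illustration):

```python
def untyped_multiply(a, b):
    """Same body as the tool above, with no type checks."""
    return a * b


int_result = untyped_multiply(2, 3)
str_result = untyped_multiply(2, "abc")
print(int_result)
print(str_result)
```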

Let's now build a tool that enforces data types.

In [26]:
from pydantic import BaseModel, Field
from langchain_core.tools import StructuredTool

class CalculatorInput(BaseModel):
    a: float = Field(description="first number")
    b: float = Field(description="second number")


def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b

# we could also use the @tool decorator from before
multiply = StructuredTool.from_function(
    func=multiply,
    name="multiply",
    description="use to multiply numbers",
    args_schema=CalculatorInput,
    return_direct=True
    )

# Let's inspect some of the attributes associated with the tool.
print(multiply.name)
print(multiply.description)
print(multiply.args)

multiply
use to multiply numbers
{'a': {'description': 'first number', 'title': 'A', 'type': 'number'}, 'b': {'description': 'second number', 'title': 'B', 'type': 'number'}}


In [27]:
multiply.invoke({"a": 2, "b": 3})

6.0

In [28]:
# this code will error out as abc is not a floating point number
multiply.invoke({"a": 2, "b": 'abc'})

ValidationError: 1 validation error for CalculatorInput
b
  Input should be a valid number, unable to parse string as a number [type=float_parsing, input_value='abc', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/float_parsing
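
The error is raised by the schema check, which runs before the function body. As a rough, standard-library-only illustration of that idea (this is not pydantic's actual implementation, and `coerce_float` and `checked_multiply` are hypothetical helpers), the inputs can be converted to `float` first and rejected when the conversion fails:

```python
def coerce_float(name: str, value) -> float:
    """Convert value to float, or raise a ValueError that names the field."""
    try:
        return float(value)
    except (TypeError, ValueError):
        raise ValueError(f"{name}: input should be a valid number, got {value!r}") from None


def checked_multiply(a, b) -> float:
    """Validate both inputs before multiplying."""
    return coerce_float("a", a) * coerce_float("b", b)


ok = checked_multiply(2, 3)

try:
    checked_multiply(2, "abc")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

As in the notebook, integer inputs come back as a float because they are converted before the multiplication happens.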

### Build a Web Search & Information Extraction Tool

In [29]:
tavily_tool = TavilySearchResults(max_results=5,
                                  search_depth='advanced',
                                  include_raw_content=True)

result = tavily_tool.invoke("Tell me about Microsoft's Q4 2024 earning call report")
result

[{'url': 'https://www.gurufocus.com/news/2486529/microsoft-corp-msft-q4-2024-earnings-call-transcript-highlights-strong-cloud-growth-and-record-revenue',
  'content': 'Release Date: July 30, 2024. For the complete transcript of the earnings call, please refer to the full earnings call transcript. Positive Points . Microsoft Corp (MSFT, Financial) reported annual revenue of over $245 billion, up 15% year over year. Microsoft Cloud revenue surpassed $135 billion, up 23% year over year.'},
 {'url': 'https://www.marketbeat.com/earnings/reports/2024-7-30-microsoft-co-stock/',
  'content': 'In our largest quarter of the year, we again delivered double-digit top and bottom-line growth with continued share gains across many of our businesses and record commitments to our Microsoft Cloud platform. Office 365 Commercial revenue increased 13% and 14% in constant currency with ARPU growth primarily from E5 momentum as well as Copilot for Microsoft 365. Azure and other cloud services revenue grew 2

In [30]:
result[0]['url']

'https://www.gurufocus.com/news/2486529/microsoft-corp-msft-q4-2024-earnings-call-transcript-highlights-strong-cloud-growth-and-record-revenue'

In [31]:
from markitdown import MarkItDown

md = MarkItDown()
doc_content = md.convert(result[0]['url'])



HTTPError: 403 Client Error: Forbidden for url: https://www.gurufocus.com/news/2486529/microsoft-corp-msft-q4-2024-earnings-call-transcript-highlights-strong-cloud-growth-and-record-revenue

In [33]:
doc_content = md.convert(result[1]['url'])
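
Some sites reject automated requests, as the 403 above shows, so picking a fixed result index is brittle. One possible pattern is to try each result in order and keep the first page that converts. The sketch below is self-contained: `convert_first_available` is a hypothetical helper, and `fake_convert` stands in for `md.convert` so that the example runs without network access.

```python
from typing import Any, Callable, Iterable, Optional, Tuple


def convert_first_available(
    urls: Iterable[str], convert: Callable[[str], Any]
) -> Tuple[Optional[str], Any]:
    """Return (url, converted) for the first URL that converts, else (None, None)."""
    for url in urls:
        try:
            return url, convert(url)
        except Exception as exc:  # e.g. an HTTP 403 raised by the converter
            print(f"Skipping {url}: {exc}")
    return None, None


def fake_convert(url: str) -> str:
    """Stand-in converter: pretends that 'blocked' hosts refuse scraping."""
    if "blocked" in url:
        raise RuntimeError("403 Client Error: Forbidden")
    return f"content of {url}"


url, content = convert_first_available(
    ["https://blocked.example/a", "https://open.example/b"], fake_convert
)
print(url)
print(content)
```

With the objects from this notebook, the call would look like `convert_first_available([r['url'] for r in result], md.convert)`.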

In [34]:
print(doc_content.title.strip())

Microsoft  (NASDAQ:MSFT) Q4 2024 Earnings Report on 7/30/2024


In [35]:
print(doc_content.text_content)


[Skip to main content](#main)

[![MarketBeat home page](/images/master/MarketBeat-logo-r-white.svg?v=2019)](https://www.marketbeat.com "MarketBeat")

* [Research Tools](/all-access/)
  + [All Access Research Tools](/all-access/)
    - [Live News Feed](/all-access/live-news/)
    - [Momentum Alerts](/manage/momentum-alerts/)
    - [Idea Engine](/all-access/idea-engine/)
    - [Export Data (CSV)](/all-access/export-data/)
    - [See All Research Tools](/all-access/)
  + [My MarketBeat](/manage/watchlists/)
    - [My Portfolio](/manage/watchlists/)
    - [My Newsletter](/manage/watchlists/#newsletter)
    - [My Account](/manage/)
  + [Calculators](/calculators/)
    - [Dividend Calculator](/dividends/calculator/)
    - [Dividend Yield Calculator](/dividends/yield-calculator/)
    - [Market Cap Calculator](/calculators/market-cap-calculator/)
    - [Options Profit Calculator](/calculators/options-profit-calculator/)
    - [Stock Average Calculator](/calculators/stock-average-calculator/)


In [36]:
from markitdown import MarkItDown
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.tools import tool
from tqdm import tqdm

tavily_tool = TavilySearchResults(max_results=5,
                                  search_depth='advanced',
                                  include_answer=False,
                                  include_raw_content=True)
md = MarkItDown()

@tool
def search_web_extract_info(query: str) -> list:
    """Search the web for a query and extract useful information from the search links."""
    results = tavily_tool.invoke(query)
    docs = []
    for result in tqdm(results):
        # Extract all text content from the URL; skip pages that cannot be fetched or converted
        try:
            extracted_info = md.convert(result['url'])
            # title can be None for some pages, so fall back to an empty string
            text_title = (extracted_info.title or '').strip()
            text_content = extracted_info.text_content.strip()
            docs.append(text_title + '\n' + text_content)
        except Exception:
            print('Extraction blocked for url: ', result['url'])

    return docs

In [37]:
docs = search_web_extract_info.invoke({'query': 'OpenAI GPT-4o'})

 40%|████      | 2/5 [00:00<00:00, 13.78it/s]

Extraction blocked for url:  https://openai.com/index/hello-gpt-4o/
Extraction blocked for url:  https://openai.com/index/gpt-4o-and-more-tools-to-chatgpt-free/


100%|██████████| 5/5 [00:01<00:00,  4.50it/s]

Extraction blocked for url:  https://openai.com/index/gpt-4/





In [38]:
from IPython.display import display, Markdown

display(Markdown(docs[0]))

Introduction to GPT-4o and GPT-4o mini | OpenAI Cookbook
# Introduction to GPT-4o and GPT-4o mini

![Juston Forte](/_next/image?url=https%3A%2F%2Favatars.githubusercontent.com%2Fu%2F96567547%3Fs%3D400%26u%3D08b9757200906ab12e3989b561cff6c4b95a12cb%26v%3D4&w=64&q=75)![Verified](/_next/static/media/openai-logomark.e026557a.svg)Juston Forte(OpenAI)Jul 18, 2024[Open in Github](https://github.com/openai/openai-cookbook/blob/main/examples/gpt4o/introduction_to_gpt4o.ipynb)

---

GPT-4o ("o" for "omni") and GPT-4o mini are natively multimodal models designed to handle a combination of text, audio, and video inputs, and can generate outputs in text, audio, and image formats. GPT-4o mini is the lightweight version of GPT-4o.

### [Background](#background)

Before GPT-4o, users could interact with ChatGPT using Voice Mode, which operated with three separate models. GPT-4o integrates these capabilities into a single model that's trained across text, vision, and audio. This unified approach ensures that all inputs — whether text, visual, or auditory — are processed cohesively by the same neural network.

GPT-4o mini is the next iteration of this omni model family, available in a smaller and cheaper version. This model offers higher accuracy than GPT-3.5 Turbo while being just as fast and supporting multimodal inputs and outputs.

### [Current API Capabilities](#current-api-capabilities)

Currently, the `gpt-4o-mini` model supports `{text, image}` inputs with `{text}` outputs, the same modalities as `gpt-4-turbo`. As a preview, we will also be using the `gpt-4o-audio-preview` model to showcase transcription through the GPT-4o model.

## [Getting Started](#getting-started)

### [Install OpenAI SDK for Python](#install-openai-sdk-for-python)

```
%pip install --upgrade openai
```
### [Configure the OpenAI client and submit a test request](#configure-the-openai-client-and-submit-a-test-request)

To set up the client, we need to create an API key to use with our requests. Skip these steps if you already have an API key.

You can get an API key by following these steps:

1. [Create a new project](https://help.openai.com/en/articles/9186755-managing-your-work-in-the-api-platform-with-projects)
2. [Generate an API key in your project](https://platform.openai.com/api-keys)
3. (RECOMMENDED, BUT NOT REQUIRED) [Setup your API key for all projects as an env var](https://platform.openai.com/docs/quickstart/step-2-set-up-your-api-key)

Once this is set up, let's start with a simple {text} input to the model for our first request. We'll use both `system` and `user` messages, and we'll receive a response from the `assistant` role.

```
from openai import OpenAI
import os

## Set the API key and model name
MODEL="gpt-4o-mini"
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "<your OpenAI API key if not set as an env var>"))
```
```
completion = client.chat.completions.create(
  model=MODEL,
  messages=[
    {"role": "system", "content": "You are a helpful assistant. Help me with my math homework!"}, # <-- This is the system message that provides context to the model
    {"role": "user", "content": "Hello! Could you solve 2+2?"}  # <-- This is the user message for which the model will generate a response
  ]
)

print("Assistant: " + completion.choices[0].message.content)
```
```
Assistant: Of course! \( 2 + 2 = 4 \).

```
## [Image Processing](#image-processing)

GPT-4o mini can directly process images and take intelligent actions based on the image. We can provide images in two formats:

1. Base64 Encoded
2. URL

Let's first view the image we'll use, then try sending this image as both Base64 and as a URL link to the API

```
from IPython.display import Image, display, Audio, Markdown
import base64

IMAGE_PATH = "data/triangle.png"

# Preview image for context
display(Image(IMAGE_PATH))
```
![image generated by notebook](data:image/png;base64...)
#### [Base64 Image Processing](#base64-image-processing)

```
# Open the image file and encode it as a base64 string
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

base64_image = encode_image(IMAGE_PATH)

response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful assistant that responds in Markdown. Help me with my math homework!"},
        {"role": "user", "content": [
            {"type": "text", "text": "What's the area of the triangle?"},
            {"type": "image_url", "image_url": {
                "url": f"data:image/png;base64,{base64_image}"}
            }
        ]}
    ],
    temperature=0.0,
)

print(response.choices[0].message.content)
```
```
To find the area of the triangle, you can use the formula:

\[
\text{Area} = \frac{1}{2} \times \text{base} \times \text{height}
\]

In the triangle you provided:

- The base is \(9\) (the length at the bottom).
- The height is \(5\) (the vertical line from the top vertex to the base).

Now, plug in the values:

\[
\text{Area} = \frac{1}{2} \times 9 \times 5
\]

Calculating this:

\[
\text{Area} = \frac{1}{2} \times 45 = 22.5
\]

Thus, the area of the triangle is **22.5 square units**.

```
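
The Base64 request above wraps the encoded image in a `data:` URL before placing it in the `image_url` field. As a small standalone illustration of that step, the sketch below uses only the standard library; `to_data_url` is a hypothetical helper and the bytes are a placeholder, not a real image.

```python
import base64


def to_data_url(raw: bytes, mime_type: str = "image/png") -> str:
    """Encode raw bytes as a Base64 data URL."""
    encoded = base64.b64encode(raw).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


placeholder_bytes = b"not a real image"
data_url = to_data_url(placeholder_bytes)

# Decoding the payload after the comma recovers the original bytes.
payload = data_url.split(",", 1)[1]
round_trip = base64.b64decode(payload)
print(data_url.split(",", 1)[0])
print(round_trip == placeholder_bytes)
```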
#### [URL Image Processing](#url-image-processing)

```
response = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful assistant that responds in Markdown. Help me with my math homework!"},
        {"role": "user", "content": [
            {"type": "text", "text": "What's the area of the triangle?"},
            {"type": "image_url", "image_url": {
                "url": "https://upload.wikimedia.org/wikipedia/commons/e/e2/The_Algebra_of_Mohammed_Ben_Musa_-_page_82b.png"}
            }
        ]}
    ],
    temperature=0.0,
)

print(response.choices[0].message.content)
```
```
To find the area of the triangle, you can use the formula:

\[
\text{Area} = \frac{1}{2} \times \text{base} \times \text{height}
\]

In the triangle you provided:

- The base is \(9\) (the length at the bottom).
- The height is \(5\) (the vertical line from the top vertex to the base).

Now, plug in the values:

\[
\text{Area} = \frac{1}{2} \times 9 \times 5
\]

Calculating this gives:

\[
\text{Area} = \frac{1}{2} \times 45 = 22.5
\]

Thus, the area of the triangle is **22.5 square units**.

```
## [Video Processing](#video-processing)

While it's not possible to directly send a video to the API, GPT-4o can understand videos if you sample frames and then provide them as images.

Since GPT-4o mini in the API does not yet support audio-in (as of July 2024), we'll use a combination of GPT-4o mini and Whisper to process both the audio and the visuals of a provided video, and showcase two use cases:

1. Summarization
2. Question and Answering
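
Sampling frames at a fixed time interval comes down to simple index arithmetic: the frame rate multiplied by the interval gives the number of frames to skip between samples. A minimal sketch, assuming the frame count and frame rate are already known (`sampled_frame_indices` is a hypothetical helper and the clip values are made up):

```python
def sampled_frame_indices(total_frames: int, fps: float, seconds_per_frame: float) -> list:
    """Return the frame indices kept when sampling one frame every `seconds_per_frame` seconds."""
    step = max(1, int(fps * seconds_per_frame))  # guard against a zero step
    return list(range(0, total_frames, step))


# Hypothetical clip: 10 seconds at 30 fps, sampled every 2 seconds.
indices = sampled_frame_indices(total_frames=300, fps=30.0, seconds_per_frame=2)
print(indices)
```

A larger `seconds_per_frame` means fewer images per request, which lowers cost at the expense of temporal detail.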
### [Setup for Video Processing](#setup-for-video-processing)

We'll use two python packages for video processing - opencv-python and moviepy.

These require [ffmpeg](https://ffmpeg.org/about.html), so make sure to install this beforehand. Depending on your OS, you may need to run `brew install ffmpeg` or `sudo apt install ffmpeg`

```
%pip install opencv-python
%pip install moviepy
```
### [Process the video into two components: frames and audio](#process-the-video-into-two-components-frames-and-audio)

```
import cv2
from moviepy import VideoFileClip
import time
import base64

# We'll be using the OpenAI DevDay Keynote Recap video. You can review the video here: https://www.youtube.com/watch?v=h02ti0Bl6zk
VIDEO_PATH = "data/keynote_recap.mp4"
```
```
def process_video(video_path, seconds_per_frame=2):
    base64Frames = []
    base_video_path, _ = os.path.splitext(video_path)

    video = cv2.VideoCapture(video_path)
    total_frames = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
    fps = video.get(cv2.CAP_PROP_FPS)
    frames_to_skip = int(fps * seconds_per_frame)
    curr_frame=0

    # Loop through the video and extract frames at specified sampling rate
    while curr_frame < total_frames - 1:
        video.set(cv2.CAP_PROP_POS_FRAMES, curr_frame)
        success, frame = video.read()
        if not success:
            break
        _, buffer = cv2.imencode(".jpg", frame)
        base64Frames.append(base64.b64encode(buffer).decode("utf-8"))
        curr_frame += frames_to_skip
    video.release()

    # Extract audio from video
    audio_path = f"{base_video_path}.mp3"
    clip = VideoFileClip(video_path)
    clip.audio.write_audiofile(audio_path, bitrate="32k")
    clip.audio.close()
    clip.close()

    print(f"Extracted {len(base64Frames)} frames")
    print(f"Extracted audio to {audio_path}")
    return base64Frames, audio_path

# Extract 1 frame per second. You can adjust the `seconds_per_frame` parameter to change the sampling rate
base64Frames, audio_path = process_video(VIDEO_PATH, seconds_per_frame=1)

```
```
MoviePy - Writing audio in data/keynote_recap.mp3
MoviePy - Done.
Extracted 218 frames
Extracted audio to data/keynote_recap.mp3
```
```
## Display the frames and audio for context
display_handle = display(None, display_id=True)
for img in base64Frames:
    display_handle.update(Image(data=base64.b64decode(img.encode("utf-8")), width=600))
    time.sleep(0.025)

Audio(audio_path)
```
![image generated by notebook](data:image/png;base64...)

### [Example 1: Summarization](#example-1-summarization)

Now that we have both the video frames and the audio, let's run a few different tests to generate a video summary to compare the results of using the models with different modalities. We should expect to see that the summary generated with context from both visual and audio inputs will be the most accurate, as the model is able to use the entire context from the video.

1. Visual Summary
2. Audio Summary
3. Visual + Audio Summary

#### [Visual Summary](#visual-summary)

The visual summary is generated by sending the model only the frames from the video. With just the frames, the model is likely to capture the visual aspects, but will miss any details discussed by the speaker.

```
response = client.chat.completions.create(
    model=MODEL,
    messages=[
    {"role": "system", "content": "You are generating a video summary. Please provide a summary of the video. Respond in Markdown."},
    {"role": "user", "content": [
        "These are the frames from the video.",
        *map(lambda x: {"type": "image_url",
                        "image_url": {"url": f'data:image/jpg;base64,{x}', "detail": "low"}}, base64Frames)
        ],
    }
    ],
    temperature=0,
)
print(response.choices[0].message.content)
```
```
# OpenAI Dev Day Summary

## Overview
The video captures highlights from OpenAI's Dev Day, showcasing new advancements and features in AI technology, particularly focusing on the latest developments in the GPT-4 model and its applications.

## Key Highlights

### Event Introduction
- The event is branded as "OpenAI Dev Day," setting the stage for discussions on AI advancements.

### Keynote Recap
- The keynote features a recap of significant updates and innovations in AI, particularly around the GPT-4 model.

### New Features
- **GPT-4 Turbo**: Introduction of a faster and more efficient version of GPT-4, emphasizing improved performance and reduced costs.
- **DALL-E 3**: Updates on the image generation model, showcasing its capabilities and integration with other tools.
- **Custom Models**: Introduction of features allowing users to create tailored AI models for specific tasks.

### Technical Innovations
- **Function Calling**: Demonstration of how the model can handle complex instructions and execute functions based on user queries.
- **JSON Mode**: A new feature that allows for structured data handling, enhancing the model's ability to process and respond to requests.

### User Experience Enhancements
- **Threading and Retrieval**: New functionalities that improve how users can interact with the model, making it easier to manage conversations and retrieve information.
- **Code Interpreter**: Introduction of a tool that allows the model to execute code, expanding its utility for developers.

### Community Engagement
- The event emphasizes community involvement, encouraging developers to explore and utilize the new features in their applications.

### Conclusion
- The event wraps up with a call to action for developers to engage with the new tools and features, fostering innovation in AI applications.

## Closing Remarks
The OpenAI Dev Day serves as a platform for showcasing the latest advancements in AI technology, encouraging developers to leverage these innovations for enhanced applications and user experiences.

```

The results are as expected - the model is able to capture the high level aspects of the video visuals, but misses the details provided in the speech.

#### [Audio Summary](#audio-summary)

The audio summary is generated by sending the model the audio transcript. With just the audio, the model is likely to bias towards the audio content, and will miss the context provided by the presentations and visuals.

`{audio}` input for GPT-4o is currently in preview, but will be incorporated into the base model in the near future. Because of this, we will use the `gpt-4o-audio-preview` model to process the audio.

```
#transcribe the audio
with open(audio_path, 'rb') as audio_file:
    audio_content = base64.b64encode(audio_file.read()).decode('utf-8')

response = client.chat.completions.create(
            model='gpt-4o-audio-preview',
            modalities=["text"],
            messages=[
                    {   "role": "system",
                        "content":"You are generating a transcript. Create a transcript of the provided audio."
                    },
                    {
                        "role": "user",
                        "content": [
                            {
                                "type": "text",
                                "text": "this is the audio."
                            },
                            {
                                "type": "input_audio",
                                "input_audio": {
                                    "data": audio_content,
                                    "format": "mp3"
                                }
                            }
                        ]
                    },
                ],
            temperature=0,
        )

# Extract and return the transcription
transcription = response.choices[0].message.content
print (transcription)
```

Looking good. Now let's summarize this and format in markdown.

```
#summarize the transcript
response = client.chat.completions.create(
            model=MODEL,
            modalities=["text"],
            messages=[
                {"role": "system", "content": "You are generating a transcript summary. Create a summary of the provided transcription. Respond in Markdown."},
                {"role": "user", "content": f"Summarize this text: {transcription}"},
            ],
            temperature=0,
        )
transcription_summary = response.choices[0].message.content
print(transcription_summary)
```
```
# OpenAI Dev Day Summary

On the inaugural OpenAI Dev Day, several significant updates and features were announced:

- **Launch of GPT-4 Turbo**: This new model supports up to 128,000 tokens of context and is designed to follow instructions more effectively.

- **JSON Mode**: A new feature that ensures the model responds with valid JSON.

- **Function Calling**: Users can now call multiple functions simultaneously, enhancing the model's capabilities.

- **Retrieval Feature**: This allows models to access external knowledge from documents or databases, improving their contextual understanding.

- **Knowledge Base**: GPT-4 Turbo has knowledge up to April 2023, with plans for ongoing improvements.

- **Dolly 3 and New Models**: The introduction of Dolly 3, GPT-4 Turbo with Vision, and a new Text-to-Speech model, all available via the API.

- **Custom Models Program**: A new initiative where researchers collaborate with companies to create tailored models for specific use cases.

- **Increased Rate Limits**: Established GPT-4 customers will see a doubling of tokens per minute, with options to request further changes in API settings.

- **Cost Efficiency**: GPT-4 Turbo is significantly cheaper than its predecessor, with a 3x reduction for prompt tokens and 2x for completion tokens.

- **Introduction of GPTs**: Tailored versions of ChatGPT designed for specific purposes, allowing users to create and share private or public GPTs easily, even without coding skills.

- **Upcoming GPT Store**: A platform for users to share their GPT creations.

- **Assistance API**: Features persistent threads, built-in retrieval, a code interpreter, and improved function calling to streamline user interactions.

The event concluded with excitement about the future of AI technology and an invitation for attendees to return next year to see further advancements.

```

The audio summary is biased towards the content discussed during the speech, but comes out with much less structure than the video summary.

#### [Audio + Visual Summary](#audio--visual-summary)

The Audio + Visual summary is generated by sending the model both the visual and the audio from the video at once. When sending both of these, the model is expected to better summarize since it can perceive the entire video at once.

```
## Generate a summary with visual and audio
response = client.chat.completions.create(
    model=MODEL,
    messages=[
    {"role": "system", "content":"""You are generating a video summary. Create a summary of the provided video and its transcript. Respond in Markdown"""},
    {"role": "user", "content": [
        "These are the frames from the video.",
        *map(lambda x: {"type": "image_url",
                        "image_url": {"url": f'data:image/jpg;base64,{x}', "detail": "low"}}, base64Frames),
        {"type": "text", "text": f"The audio transcription is: {transcription}"}
        ],
    }
],
    temperature=0,
)
print(response.choices[0].message.content)
```
```
# OpenAI Dev Day Summary

## Overview
The first-ever OpenAI Dev Day introduced several exciting updates and features, primarily focusing on the launch of **GPT-4 Turbo**. This new model enhances capabilities and expands the potential for developers and users alike.

## Key Announcements

### 1. **GPT-4 Turbo**
- **Token Support**: Supports up to **128,000 tokens** of context.
- **JSON Mode**: A new feature that ensures responses are in valid JSON format.
- **Function Calling**: Improved ability to call multiple functions simultaneously and better adherence to instructions.

### 2. **Knowledge Retrieval**
- **Enhanced Knowledge Access**: Users can now integrate external documents or databases, allowing models to access updated information beyond their training cut-off (April 2023).

### 3. **DALL-E 3 and Other Models**
- Launch of **DALL-E 3**, **GPT-4 Turbo with Vision**, and a new **Text-to-Speech model** in the API.

### 4. **Custom Models Program**
- Introduction of a program where OpenAI researchers collaborate with companies to create tailored models for specific use cases.

### 5. **Rate Limits and Pricing**
- **Increased Rate Limits**: Doubling tokens per minute for established GPT-4 customers.
- **Cost Efficiency**: GPT-4 Turbo is **3x cheaper** for prompt tokens and **2x cheaper** for completion tokens compared to GPT-4.

### 6. **Introduction of GPTs**
- **Tailored Versions**: GPTs are customized versions of ChatGPT designed for specific tasks, combining instructions, expanded knowledge, and actions.
- **User-Friendly Creation**: Users can create GPTs through conversation, making it accessible even for those without coding skills.
- **GPT Store**: A new platform for sharing and discovering GPTs, launching later this month.

### 7. **Assistance API Enhancements**
- Features include persistent threads, built-in retrieval, a code interpreter, and improved function calling.

## Conclusion
The event highlighted OpenAI's commitment to enhancing AI capabilities and accessibility for developers. The advancements presented are expected to empower users to create innovative applications and solutions. OpenAI looks forward to future developments and encourages ongoing engagement with the community.

Thank you for attending!

```

After combining both the video and audio, we're able to get a much more detailed and comprehensive summary for the event which uses information from both the visual and audio elements from the video.

### [Example 2: Question and Answering](#example-2-question-and-answering)

For the Q&A, we'll use the same concept as before to ask questions of our processed video while running the same 3 tests to demonstrate the benefit of combining input modalities:

1. Visual Q&A
2. Audio Q&A
3. Visual + Audio Q&A
```
QUESTION = "Question: Why did Sam Altman have an example about raising windows and turning the radio on?"
```
```
qa_visual_response = client.chat.completions.create(
    model=MODEL,
    messages=[
    {"role": "system", "content": "Use the video to answer the provided question. Respond in Markdown."},
    {"role": "user", "content": [
        "These are the frames from the video.",
        *map(lambda x: {"type": "image_url", "image_url": {"url": f'data:image/jpg;base64,{x}', "detail": "low"}}, base64Frames),
        QUESTION
        ],
    }
    ],
    temperature=0,
)
print("Visual QA:\n" + qa_visual_response.choices[0].message.content)
```
```
Visual QA:
Sam Altman used the example of raising windows and turning the radio on to illustrate the concept of function calling in AI. This example demonstrates how AI can interpret natural language commands and translate them into specific function calls, making interactions more intuitive and user-friendly. By showing a relatable scenario, he highlighted the advancements in AI's ability to understand and execute complex tasks based on simple instructions.

```
```
qa_audio_response = client.chat.completions.create(
    model=MODEL,
    messages=[
    {"role": "system", "content":"""Use the transcription to answer the provided question. Respond in Markdown."""},
    {"role": "user", "content": f"The audio transcription is: {transcription}. \n\n {QUESTION}"},
    ],
    temperature=0,
)
print("Audio QA:\n" + qa_audio_response.choices[0].message.content)
```
```
Audio QA:
The transcription provided does not include any mention of Sam Altman discussing raising windows or turning the radio on. Therefore, I cannot provide an answer to that specific question based on the given text. If you have more context or another transcription that includes that example, please share it, and I would be happy to help!

```
```
qa_both_response = client.chat.completions.create(
    model=MODEL,
    messages=[
    {"role": "system", "content":"""Use the video and transcription to answer the provided question."""},
    {"role": "user", "content": [
        "These are the frames from the video.",
        *map(lambda x: {"type": "image_url",
                        "image_url": {"url": f'data:image/jpg;base64,{x}', "detail": "low"}}, base64Frames),
                        {"type": "text", "text": f"The audio transcription is: {transcription}"},
        QUESTION
        ],
    }
    ],
    temperature=0,
)
print("Both QA:\n" + qa_both_response.choices[0].message.content)
```
```
Both QA:
Sam Altman used the example of raising windows and turning the radio on to illustrate the new function calling feature in GPT-4 Turbo. This example demonstrates how the model can interpret natural language commands and translate them into specific function calls, making it easier for users to interact with the model in a more intuitive way. It highlights the model's ability to understand context and perform multiple actions based on user instructions.

```

Comparing the three answers, the most accurate one comes from using both the audio and the visuals from the video. Sam Altman did not mention raising windows or turning the radio on during the keynote; he referred to the model's improved ability to execute multiple functions in a single request while the examples were shown behind him.

## [Conclusion](#conclusion)

Integrating multiple input modalities, such as audio, visual, and textual, significantly enhances the model's performance across a diverse range of tasks. This multimodal approach allows for more comprehensive understanding and interaction, mirroring more closely how humans perceive and process information.

Currently, GPT-4o and GPT-4o mini in the API support text and image inputs, with audio capabilities coming soon. For the time being, use the `gpt-4o-audio-preview` model for audio inputs.

### Build a Weather Tool

In [39]:
import requests

@tool
def get_weather(query: str) -> list:
    """Search weatherapi to get the current weather."""
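    base_url = "http://api.weatherapi.com/v1/current.json"
    # Let requests URL-encode the parameters (queries such as "New York" contain spaces)
    params = {"key": WEATHER_API_KEY, "q": query}

    # A timeout keeps the tool from hanging if the API is unreachable
    response = requests.get(base_url, params=params, timeout=10)
    data = response.json()
    # Successful responses contain a "location" block; error responses do not
    if data.get("location"):
        return data
    else:
        return "Weather Data Not Found"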

In [40]:
get_weather.invoke("Bangalore")

{'location': {'name': 'Bangalore',
  'region': 'Karnataka',
  'country': 'India',
  'lat': 12.9833,
  'lon': 77.5833,
  'tz_id': 'Asia/Kolkata',
  'localtime_epoch': 1737589266,
  'localtime': '2025-01-23 05:11'},
 'current': {'last_updated_epoch': 1737588600,
  'last_updated': '2025-01-23 05:00',
  'temp_c': 18.2,
  'temp_f': 64.8,
  'is_day': 0,
  'condition': {'text': 'Mist',
   'icon': '//cdn.weatherapi.com/weather/64x64/night/143.png',
   'code': 1030},
  'wind_mph': 7.4,
  'wind_kph': 11.9,
  'wind_degree': 99,
  'wind_dir': 'E',
  'pressure_mb': 1015.0,
  'pressure_in': 29.97,
  'precip_mm': 0.0,
  'precip_in': 0.0,
  'humidity': 94,
  'cloud': 50,
  'feelslike_c': 18.2,
  'feelslike_f': 64.8,
  'windchill_c': 17.0,
  'windchill_f': 62.6,
  'heatindex_c': 17.0,
  'heatindex_f': 62.6,
  'dewpoint_c': 15.6,
  'dewpoint_f': 60.0,
  'vis_km': 2.0,
  'vis_miles': 1.0,
  'uv': 0.0,
  'gust_mph': 12.4,
  'gust_kph': 19.9}}

In [41]:
import rich

result = get_weather.invoke("Zurich")
rich.print_json(data=result)
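
The full payload is verbose, so it can help to reduce it to a few fields before handing it to an LLM. The following is a minimal sketch, not part of the original notebook: `summarize_weather` is a hypothetical helper, and it assumes only the `location` and `current` keys visible in the Bangalore output above.

```python
def summarize_weather(payload):
    """Return a one-line summary from a WeatherAPI-style payload, or None."""
    # The tool returns a plain string on failure, so guard against non-dict input.
    if not isinstance(payload, dict) or "location" not in payload:
        return None
    location = payload["location"]
    current = payload.get("current", {})
    condition = current.get("condition", {}).get("text", "unknown")
    temp_c = current.get("temp_c")
    temp = f"{temp_c} C" if temp_c is not None else "unknown temperature"
    return f"{location.get('name')}, {location.get('country')}: {condition}, {temp}"


# Hand-written sample that mirrors a subset of the fields shown above.
sample = {
    "location": {"name": "Bangalore", "country": "India"},
    "current": {"temp_c": 18.2, "condition": {"text": "Mist"}},
}
print(summarize_weather(sample))
print(summarize_weather("Weather Data Not Found"))
```

Returning `None` for the failure string keeps the caller's check simple and avoids a `KeyError` when the lookup did not succeed.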



## Explore LLM tool calling with custom tools

An agent is essentially an LLM that can automatically call relevant functions to carry out complex or tool-based tasks in response to human prompts.

Tool calling, also popularly known as function calling, is the ability of such LLMs to reliably call external tools and APIs.

We will leverage the custom tools created in the previous section and see whether the LLM can automatically call the right tools based on the input prompts.

### Tool calling for LLMs with native support for tool or function calling

Tool calling allows a model to respond to a given prompt by generating output that matches a user-defined schema. While the name implies that the model is performing some action, this is actually not the case! The model only produces the arguments for a tool; actually running the tool (or not) is up to the user, or to the agent the user defines.

Many LLM providers, including Anthropic, Cohere, Google, Mistral, OpenAI, and others, support variants of a tool calling feature. These features typically allow requests to the LLM to include available tools and their schemas, and for responses to include calls to these tools.
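
Because the model only proposes a call, the calling code decides whether and how to run it. The sketch below illustrates that separation in plain Python without any LLM: the `proposed` dictionary is hand-written to resemble a tool call, and `run_tool_call` and `REGISTRY` are hypothetical names introduced here, not LangChain APIs.

```python
import inspect


def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b


# Maps tool names to the Python callables that implement them.
REGISTRY = {"multiply": multiply}


def run_tool_call(tool_call, registry):
    """Run a model-proposed call only if it names a known tool with valid arguments."""
    name = tool_call.get("name")
    args = tool_call.get("args", {})
    func = registry.get(name)
    if func is None:
        return {"ok": False, "error": f"unknown tool: {name}"}
    try:
        # bind() raises TypeError when arguments are missing or unexpected.
        inspect.signature(func).bind(**args)
    except TypeError as exc:
        return {"ok": False, "error": str(exc)}
    return {"ok": True, "result": func(**args)}


proposed = {"name": "multiply", "args": {"a": 2.0, "b": 4.0}, "id": "call_1", "type": "tool_call"}
print(run_tool_call(proposed, REGISTRY))
print(run_tool_call({"name": "divide", "args": {}}, REGISTRY))
```

Checking the tool name and arguments before execution means a malformed or hallucinated call produces an error record instead of an exception.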



In [42]:
from langchain_openai import ChatOpenAI

chatgpt = ChatOpenAI(model="gpt-4o", temperature=0)



In [43]:
tools = [multiply, search_web_extract_info, get_weather]
chatgpt_with_tools = chatgpt.bind_tools(tools)

In [44]:
# LLMs are still not perfect in tool calling so you might need to play around with the following prompt
prompt = """
            Given only the tools at your disposal, mention tool calls for the following tasks:
            Do not change the query given for any search tasks
            1. What is 2.1 times 3.5
            2. What is the current weather in Greenland today
            3. What are the 4 major Agentic AI Design Patterns
         """

results = chatgpt_with_tools.invoke(prompt)

In [45]:
results

AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_ZnAT5ZuL6BpedH7FEvzSiydD', 'function': {'arguments': '{"a": 2.1, "b": 3.5}', 'name': 'multiply'}, 'type': 'function'}, {'id': 'call_nR7pKlXmDvBAJDz8Y9pNHjaW', 'function': {'arguments': '{"query": "Greenland"}', 'name': 'get_weather'}, 'type': 'function'}, {'id': 'call_ekGIBMjz1Hk3O2cYGRt9fAEb', 'function': {'arguments': '{"query": "4 major Agentic AI Design Patterns"}', 'name': 'search_web_extract_info'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 75, 'prompt_tokens': 182, 'total_tokens': 257, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_4691090a87', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-7dc3d55f-6fcf-42b6-a557-85deea052a72-0', 

In [46]:
results.tool_calls

[{'name': 'multiply',
  'args': {'a': 2.1, 'b': 3.5},
  'id': 'call_ZnAT5ZuL6BpedH7FEvzSiydD',
  'type': 'tool_call'},
 {'name': 'get_weather',
  'args': {'query': 'Greenland'},
  'id': 'call_nR7pKlXmDvBAJDz8Y9pNHjaW',
  'type': 'tool_call'},
 {'name': 'search_web_extract_info',
  'args': {'query': '4 major Agentic AI Design Patterns'},
  'id': 'call_ekGIBMjz1Hk3O2cYGRt9fAEb',
  'type': 'tool_call'}]

In [47]:
multiply

StructuredTool(name='multiply', description='use to multiply numbers', args_schema=<class '__main__.CalculatorInput'>, return_direct=True, func=<function multiply at 0x7b9c20f2efc0>)

In [48]:
toolkit = {
    "multiply": multiply,
    "search_web_extract_info": search_web_extract_info,
    "get_weather": get_weather
}

for tool_call in results.tool_calls:
    selected_tool = toolkit[tool_call["name"].lower()]
    print(f"Calling tool: {tool_call['name']}")
    tool_output = selected_tool.invoke(tool_call["args"])
    print(tool_output)
    print()

Calling tool: multiply
7.3500000000000005

Calling tool: get_weather
{'location': {'name': 'Nuuk', 'region': 'Vestgronland', 'country': 'Greenland', 'lat': 64.183, 'lon': -51.75, 'tz_id': 'America/Nuuk', 'localtime_epoch': 1737591145, 'localtime': '2025-01-22 22:12'}, 'current': {'last_updated_epoch': 1737590400, 'last_updated': '2025-01-22 22:00', 'temp_c': -6.7, 'temp_f': 19.9, 'is_day': 0, 'condition': {'text': 'Light snow', 'icon': '//cdn.weatherapi.com/weather/64x64/night/326.png', 'code': 1213}, 'wind_mph': 4.0, 'wind_kph': 6.5, 'wind_degree': 177, 'wind_dir': 'S', 'pressure_mb': 983.0, 'pressure_in': 29.03, 'precip_mm': 0.13, 'precip_in': 0.01, 'humidity': 79, 'cloud': 100, 'feelslike_c': -10.0, 'feelslike_f': 14.1, 'windchill_c': -11.5, 'windchill_f': 11.2, 'heatindex_c': -8.1, 'heatindex_f': 17.5, 'dewpoint_c': -9.0, 'dewpoint_f': 15.9, 'vis_km': 10.0, 'vis_miles': 6.0, 'uv': 0.0, 'gust_mph': 5.3, 'gust_kph': 8.5}}

Calling tool: search_web_extract_info


 40%|████      | 2/5 [00:00<00:00,  4.91it/s]

Extraction blocked for url:  https://www.otechtalks.tv/introduction-to-4-agentic-ai-design-patterns/


100%|██████████| 5/5 [00:06<00:00,  1.26s/it]

['Introduction to 4 Agentic AI Design Patterns\n[![](https://substackcdn.com/image/fetch/w_96,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad51036a-09b8-40d6-8ee9-cc1c865c9be9_500x500.png)](/)\n# [AI Tech Circle](/)\n\nSubscribeSign in\n#### Share this post\n\n[![](https://substackcdn.com/image/fetch/w_520,h_272,c_fill,f_auto,q_auto:good,fl_progressive:steep,g_auto/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff36e923f-1489-4c3b-82b4-1c5f5b7aecf3_1916x1038.png)![AI Tech Circle](https://substackcdn.com/image/fetch/w_36,h_36,c_fill,f_auto,q_auto:good,fl_progressive:steep,g_auto/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad51036a-09b8-40d6-8ee9-cc1c865c9be9_500x500.png)AI Tech CircleIntroduction to 4 Agentic AI Design Patterns](https://substack.com/home/post/p-155165327?utm_campaign=post&utm_medium=web)Copy linkFacebookEmailNotesMore\n# Introduction to 4 Agentic A




In [49]:
tools

[StructuredTool(name='multiply', description='use to multiply numbers', args_schema=<class '__main__.CalculatorInput'>, return_direct=True, func=<function multiply at 0x7b9c20f2efc0>),
 StructuredTool(name='search_web_extract_info', description='Search the web for a query and extracts useful information from the search links', args_schema=<class 'langchain_core.utils.pydantic.search_web_extract_info'>, func=<function search_web_extract_info at 0x7b9c0a3d3240>),
 StructuredTool(name='get_weather', description='Search weatherapi to get the current weather.', args_schema=<class 'langchain_core.utils.pydantic.get_weather'>, func=<function get_weather at 0x7b9c03fdc220>)]

### Tool calling for LLMs without native support for tool or function calling

Some models like ChatGPT have been fine-tuned for tool calling and provide a dedicated API for tool calling. Generally, such models are better at tool calling than non-fine-tuned models, and are recommended for use cases that require tool calling.

Here we explore an alternative way to invoke tools when the model does not natively support tool calling. We use ChatGPT here, which does support it, but the same approach applies to any LLM that does not.

We'll do this by simply writing a prompt that will get the model to invoke the appropriate tools.
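
With prompt-based tool calling, the model's reply is just text, so the calling code has to extract and check the JSON itself. Here is a minimal sketch of that step, assuming the reply contains a single JSON object with `name` and `arguments` keys, optionally wrapped in a Markdown code fence. `parse_tool_request` is a hypothetical helper for illustration; it is not how LangChain's `JsonOutputParser` is implemented.

```python
import json
import re

# Build the Markdown fence marker programmatically to keep this listing readable.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_tool_request(reply_text):
    """Extract a {'name': ..., 'arguments': {...}} dict from a model's text reply."""
    # Models often wrap JSON in a code fence; use the fenced body if one is present.
    match = FENCED_BLOCK.search(reply_text)
    candidate = match.group(1) if match else reply_text
    data = json.loads(candidate)
    if not isinstance(data, dict) or "name" not in data or "arguments" not in data:
        raise ValueError("reply is missing 'name' or 'arguments'")
    if not isinstance(data["arguments"], dict):
        raise ValueError("'arguments' must be a dictionary")
    return data


# Hand-written replies standing in for real model output.
plain_reply = '{"name": "get_weather", "arguments": {"query": "Greenland"}}'
fenced_reply = FENCE + 'json\n{"name": "multiply", "arguments": {"a": 2.1, "b": 3.5}}\n' + FENCE

print(parse_tool_request(plain_reply))
print(parse_tool_request(fenced_reply))
```

Validating the shape up front makes failures explicit, which matters more here than with native tool calling because nothing constrains the model's output format.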

In [50]:
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import render_text_description

rendered_tools = render_text_description(tools)
print(rendered_tools)

multiply(a: float, b: float) -> float - use to multiply numbers
search_web_extract_info(query: str) -> list - Search the web for a query and extracts useful information from the search links
get_weather(query: str) -> list - Search weatherapi to get the current weather.
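
Each rendered line combines a tool's name, signature, and description. As a rough illustration of where that information can come from, the sketch below builds a similar line for a plain Python function using the standard `inspect` module. `describe_tool` is a hypothetical helper, and this is not a claim about how `render_text_description` works internally.

```python
import inspect


def describe_tool(func):
    """Render 'name(signature) - first docstring line' for a plain Python function."""
    signature = inspect.signature(func)
    doc_lines = (inspect.getdoc(func) or "").splitlines()
    summary = doc_lines[0] if doc_lines else ""
    return f"{func.__name__}{signature} - {summary}"


def multiply(a: float, b: float) -> float:
    """use to multiply numbers"""
    return a * b


print(describe_tool(multiply))
```

Type hints and docstrings are the only information the model sees about a tool in this approach, so keeping them accurate directly affects which tool it picks.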


In [51]:
system_prompt = f"""\
You are an assistant that has access to the following set of tools.
Here are the names and descriptions for each tool:

{rendered_tools}

Given the user instructions, for each instruction do the following:
 - Return the name and input of the tool to use.
 - Return your response as a JSON blob with 'name' and 'arguments' keys.
 - The `arguments` should be a dictionary, with keys corresponding
   to the argument names and the values corresponding to the requested values.
"""

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("user", "{input}")
    ]
)

In [52]:
instructions = [
                  {"input" : "What is 2.1 times 3.5"},
                  {"input" : "What is the current weather in Greenland"},
                  {"input" : "Tell me about the current state of Agentic AI in the industry" }
               ]

In [53]:
# prompt -> LLM -> parse the JSON blob into a Python dict
chain = prompt | chatgpt | JsonOutputParser()

In [54]:
responses = chain.map().invoke(instructions)



In [55]:
responses

[{'name': 'multiply', 'arguments': {'a': 2.1, 'b': 3.5}},
 {'name': 'get_weather', 'arguments': {'query': 'Greenland'}},
 {'name': 'search_web_extract_info',
  'arguments': {'query': 'current state of Agentic AI in the industry 2023'}}]

In [56]:
toolkit = {
    "multiply": multiply,
    "search_web_extract_info": search_web_extract_info,
    "get_weather": get_weather
}

for tool_call in responses:
    selected_tool = toolkit[tool_call["name"].lower()]
    print(f"Calling tool: {tool_call['name']}")
    tool_output = selected_tool.invoke(tool_call["arguments"])
    print(tool_output)
    print()

Calling tool: multiply
7.3500000000000005

Calling tool: get_weather
{'location': {'name': 'Nuuk', 'region': 'Vestgronland', 'country': 'Greenland', 'lat': 64.183, 'lon': -51.75, 'tz_id': 'America/Nuuk', 'localtime_epoch': 1737591345, 'localtime': '2025-01-22 22:15'}, 'current': {'last_updated_epoch': 1737591300, 'last_updated': '2025-01-22 22:15', 'temp_c': -6.8, 'temp_f': 19.8, 'is_day': 0, 'condition': {'text': 'Overcast', 'icon': '//cdn.weatherapi.com/weather/64x64/night/122.png', 'code': 1009}, 'wind_mph': 4.0, 'wind_kph': 6.5, 'wind_degree': 177, 'wind_dir': 'S', 'pressure_mb': 983.0, 'pressure_in': 29.03, 'precip_mm': 0.13, 'precip_in': 0.01, 'humidity': 79, 'cloud': 100, 'feelslike_c': -10.1, 'feelslike_f': 13.9, 'windchill_c': -11.5, 'windchill_f': 11.2, 'heatindex_c': -8.1, 'heatindex_f': 17.5, 'dewpoint_c': -9.0, 'dewpoint_f': 15.9, 'vis_km': 10.0, 'vis_miles': 6.0, 'uv': 0.0, 'gust_mph': 5.3, 'gust_kph': 8.5}}

Calling tool: search_web_extract_info


 20%|██        | 1/5 [00:02<00:08,  2.16s/it]

Extraction blocked for url:  https://www.pwc.com/m1/en/publications/documents/2024/agentic-ai-the-new-frontier-in-genai-an-executive-playbook.pdf


100%|██████████| 5/5 [00:04<00:00,  1.20it/s]







In [57]:
for doc in tool_output:
    print(doc)
    print()

The state of AI in early 2024 | McKinsey
[Skip to main content](#skipToMain)
# The state of AI in early 2024: Gen AI adoption spikes and starts to generate value

May 30, 2024 | SurveyAs generative AI adoption accelerates, survey respondents report measurable benefits and increased mitigation of the risk of inaccuracy. A small group of high performers lead the way.

###

 [(23 pages)](#/download/%2F~%2Fmedia%2Fmckinsey%2Fbusiness%20functions%2Fquantumblack%2Four%20insights%2Fthe%20state%20of%20ai%2F2024%2Fthe-state-of-ai-in-early-2024-final.pdf%3FshouldIndex%3Dfalse)

**If 2023** was the year the world discovered [generative AI (gen AI)](/featured-insights/mckinsey-explainers/what-is-generative-ai), 2024 is the year organizations truly began using—and deriving business value from—this new technology. In the latest [McKinsey Global Survey](/featured-insights/mckinsey-global-surveys) on AI, 65 percent of respondents report that their organizations are regularly using gen AI, nearly doubl