# Skill 5: API Search - Make our bot talk to any API

We have observed the remarkable synergy created by combining **GPT LLMs with intelligent agents and detailed prompts**. This combination has consistently delivered impressive results. To capitalize on it further, we should integrate it with other systems through API communication. Essentially, in this notebook we can build what OpenAI's ChatGPT calls 'GPTs'.

Envision a bot that seamlessly integrates with:

- **CRM Systems:** Including Dynamics, Salesforce, and HubSpot.
- **ERP Systems:** Such as SAP, Dynamics, and Oracle.
- **CMS Systems:** Including Adobe, Oracle, and other content management platforms.

The objective is to connect our bot with data repositories, minimizing data duplication as much as possible. These systems typically offer APIs, facilitating programmatic data access.

In this notebook, we will develop an agent that can query an API to retrieve information and answer questions.
This time we will use an open API for currency and digital coin pricing: https://docs.kraken.com/rest/#tag/Market-Data

In [1]:
import os
import json
import requests
from time import sleep
from typing import Dict, List
from pydantic import BaseModel, Extra, root_validator

from langchain_openai import AzureChatOpenAI
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain.callbacks.manager import CallbackManager
from langchain.agents import initialize_agent, AgentType
from langchain.tools import BaseTool
from langchain.requests import RequestsWrapper
from langchain.chains import APIChain

from common.callbacks import StdOutCallbackHandler
from common.utils import num_tokens_from_string, reduce_openapi_spec
from common.prompts import APISEARCH_PROMPT

from IPython.display import Markdown, HTML, display  

from dotenv import load_dotenv
load_dotenv("credentials.env")

def printmd(string):
    display(Markdown(string.replace("$","USD ")))


In [2]:
# Set the ENV variables that Langchain needs to connect to Azure OpenAI
os.environ["OPENAI_API_VERSION"] = os.environ["AZURE_OPENAI_API_VERSION"]

In [7]:
cb_handler = StdOutCallbackHandler()
cb_manager = CallbackManager(handlers=[cb_handler])

COMPLETION_TOKENS = 2000


# This notebook needs GPT-4-Turbo (context size of 128k tokens)
llm = AzureChatOpenAI(deployment_name=os.environ["GPT4_DEPLOYMENT_NAME"], 
                      temperature=0.5, max_tokens=COMPLETION_TOKENS, 
                      streaming=True, callback_manager=cb_manager)

## The Logic

By now, you have probably inferred that the solution for an API agent looks something like this: give the API specification to the LLM as part of the system prompt, then have an agent plan the right steps to formulate the API call.

Let's do that. But first we must understand the industry standard: Swagger/OpenAPI.


## Introduction to OpenAPI (formerly Swagger)

The OpenAPI Specification, previously known as the Swagger Specification, is a specification for a machine-readable interface definition language for describing, producing, consuming and visualizing web services. Previously part of the Swagger framework, it became a separate project in 2016, overseen by the OpenAPI Initiative, an open-source collaboration project of the Linux Foundation.

OpenAPI Specification is an API description format for REST APIs. An OpenAPI file allows you to describe your entire API, including: Available endpoints (/users for example) and operations on each endpoint ( GET /users, POST /users), description, contact information, license, terms of use and other information.
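
To make this concrete, here is a minimal sketch of what an OpenAPI document looks like once it is loaded into Python, and how the endpoints and operations it describes can be listed. The `tiny_spec` dictionary and the `list_operations` helper are made up for illustration; they are not part of the Kraken specification or of this repository.

```python
# A tiny, made-up OpenAPI 3.0 document (illustrative only).
tiny_spec = {
    "openapi": "3.0.0",
    "info": {"title": "Example Users API", "version": "1.0.0"},
    "paths": {
        "/users": {
            "get": {"summary": "List users"},
            "post": {"summary": "Create a user"},
        },
        "/users/{id}": {
            "get": {"summary": "Get one user"},
        },
    },
}

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec: dict) -> list:
    """Return 'METHOD /path: summary' strings for every operation in the spec."""
    operations = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            # Path items can also hold non-operation keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                summary = operation.get("summary", "")
                operations.append(f"{method.upper()} {path}: {summary}")
    return operations


for line in list_operations(tiny_spec):
    print(line)
```

This list of operations, together with their parameters and descriptions, is what the LLM reads in order to decide which URL to call.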

### Let's get the OpenAPI (Swagger) spec from our desired API that we want to talk to
You can also download it from the Kraken website: https://docs.kraken.com/rest/

In [8]:
url = "https://datasetsgptsmartsearch.blob.core.windows.net/apispecs/openapi_kraken.json"
response = requests.get(url + os.environ['BLOB_SAS_TOKEN'])

# Check if the request was successful
if response.status_code == 200:
    spec = response.json()
else:
    spec = None
    print(f"Failed to retrieve data: Status code {response.status_code}")


Let's see how big this API specification is:

In [9]:
# You can check the function "reduce_openapi_spec()" in utils.py
reduced_api_spec = reduce_openapi_spec(spec)

In [10]:
api_tokens = num_tokens_from_string(str(spec))
print("API spec size in tokens:",api_tokens)
api_tokens = num_tokens_from_string(str(reduced_api_spec))
print("Reduced API spec size in tokens:",api_tokens)

API spec size in tokens: 66625
Reduced API spec size in tokens: 57394


Sometimes it makes sense to reduce the size of the API Specs by using the `reduce_openapi_spec` function. It's optional.
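
As a rough idea of what reducing a specification can mean, the sketch below recursively drops verbose keys (examples and code samples) from a spec dictionary. This is NOT the `reduce_openapi_spec` function from `common/utils.py`; the `shrink_spec` helper, the choice of keys to drop, and the sample spec are assumptions made for illustration.

```python
def shrink_spec(spec: dict, drop_keys=("examples", "example", "x-codeSamples")) -> dict:
    """Return a copy of the spec without the verbose keys listed in drop_keys.

    Hypothetical helper: which keys are safe to drop depends on the API.
    """
    def _strip(node):
        if isinstance(node, dict):
            return {k: _strip(v) for k, v in node.items() if k not in drop_keys}
        if isinstance(node, list):
            return [_strip(item) for item in node]
        return node

    return _strip(spec)


# Made-up fragment shaped like an OpenAPI path entry.
sample = {
    "paths": {
        "/public/Ticker": {
            "get": {
                "summary": "Get Ticker Information",
                "x-codeSamples": [{"lang": "Shell", "source": "curl https://api.example.com/Ticker"}],
                "responses": {"200": {"description": "OK", "example": {"result": {}}}},
            }
        }
    }
}

small = shrink_spec(sample)
print(len(str(sample)), "->", len(str(small)), "characters")
```

The endpoint paths, summaries, and parameters survive, which is the part the LLM needs to build a URL.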

#### NOTE: As you can see, a large-context LLM is needed. `GPT-4-Turbo` is necessary for this notebook to run successfully.

## Question
Let's ask a complicated question that requires calls to two different endpoints:

In [11]:
QUESTION = """
Tell me the price of bitcoin against USD, also the latest OHLC values for Ethereum,
also tell me the bid and ask for Euro
"""

## Use a chain to convert the natural language question to an API request using the API specification in the prompt

We can use a handy chain in LangChain called `APIChain`.

In [12]:
# Most of APIs require Authorization tokens, so we construct the headers using a lightweight python request wrapper called RequestsWrapper
access_token = "ABCDEFG123456" 
headers = {"Authorization": f"Bearer {access_token}"}
requests_wrapper = RequestsWrapper(headers=headers)

In [13]:
chain = APIChain.from_llm_and_api_docs(
    llm=llm,
    api_docs=str(reduced_api_spec),
    headers=headers,
    verbose=False,
    limit_to_domains=None,
    callback_manager=cb_manager
)


These are the prompts on the APIChain class (one to create the URL endpoint and the other one to use it and get the answer):

In [14]:
chain.api_request_chain.prompt.template

'You are given the below API Documentation:\n{api_docs}\nUsing this documentation, generate the full API url to call for answering the user question.\nYou should build the API url in order to get a response that is as short as possible, while still getting the necessary information to answer the question. Pay attention to deliberately exclude any unnecessary pieces of data in the API call.\n\nQuestion:{question}\nAPI url:'

In [15]:
chain.api_answer_chain.prompt.template

'You are given the below API Documentation:\n{api_docs}\nUsing this documentation, generate the full API url to call for answering the user question.\nYou should build the API url in order to get a response that is as short as possible, while still getting the necessary information to answer the question. Pay attention to deliberately exclude any unnecessary pieces of data in the API call.\n\nQuestion:{question}\nAPI url: {api_url}\n\nHere is the response from the API:\n\n{api_response}\n\nSummarize this response to answer the original question.\n\nSummary:'
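
The two prompts above suggest a simple flow: ask the model for a URL, call that URL once, then ask the model to summarize the response. The sketch below illustrates that flow with plain Python callables. It is not the actual `APIChain` implementation: the prompt strings are abbreviated, and `two_step_api_answer`, `fake_llm`, `fake_http_get`, and the `api.example.com` URL are hypothetical stand-ins so the example runs without a model or network access.

```python
from typing import Callable

# Abbreviated versions of the two templates printed above.
URL_PROMPT = (
    "You are given the below API Documentation:\n{api_docs}\n"
    "Using this documentation, generate the full API url to call "
    "for answering the user question.\n\nQuestion:{question}\nAPI url:"
)
ANSWER_PROMPT = (
    "Question:{question}\nAPI url: {api_url}\n\n"
    "Here is the response from the API:\n\n{api_response}\n\n"
    "Summarize this response to answer the original question.\n\nSummary:"
)


def two_step_api_answer(question: str, api_docs: str,
                        llm: Callable[[str], str],
                        http_get: Callable[[str], str]) -> str:
    # Step 1: the model turns the question into a URL.
    api_url = llm(URL_PROMPT.format(api_docs=api_docs, question=question)).strip()
    # Step 2: a single HTTP call. There is no retry and no second endpoint.
    api_response = http_get(api_url)
    # Step 3: the model summarizes the raw response.
    return llm(ANSWER_PROMPT.format(question=question, api_url=api_url,
                                    api_response=api_response))


# Stand-ins for the real model and the real HTTP client.
def fake_llm(prompt: str) -> str:
    if prompt.endswith("API url:"):
        return "https://api.example.com/price?pair=XBTUSD"
    return "The price is 100."


def fake_http_get(url: str) -> str:
    return '{"price": 100}'


print(two_step_api_answer("What is the price of bitcoin?",
                          "GET /price?pair=<pair>", fake_llm, fake_http_get))
```

Because the flow is strictly linear, one question maps to exactly one URL.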

In [16]:
%%time
try:
    chain.invoke(QUESTION)
except Exception as e:
    response = str(e)

https://api.kraken.com/0/public/Ticker?pair=XBTUSD,ETHUSD,XBTEUR
The current price of Bitcoin against USD is $58,675.60. The latest OHLC values for Ethereum in USD are as follows: Open - $3066.50, High - $3129.78, Low - $3024.37, Close - $3104.89. For Euro, the bid price is €54,142.70, and the ask price is €54,142.80.

CPU times: user 189 ms, sys: 26.1 ms, total: 215 ms
Wall time: 14.7 s


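Note that the chain made a single call, to the Ticker endpoint, so the "OHLC" values it reports did not come from the OHLC endpoint. As we have seen in prior notebooks, a single chain cannot reason, observe, or retry: it cannot call multiple endpoints, and it does not reflect on errors.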

## Creating a custom agent that uses the APIChain as a tool

To solve the above problem, we can build a ReAct-style agent that uses the APIChain as a tool to get the information. This agent will make as many calls as needed (using the chain tool) until it can answer the question.
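
Conceptually, "as many calls as needed" is a loop: decide on the next tool input, run the tool, record the observation, and stop when nothing more is needed. The sketch below shows only that control flow. It is not how LangChain's `AgentExecutor` is implemented; `run_tool_loop`, `scripted_decide`, and `fake_tool` are hypothetical names, and the "decision" is a fixed script standing in for the LLM.

```python
from typing import Callable, List, Optional, Tuple

Observation = Tuple[str, str]  # (tool input, tool result)


def run_tool_loop(question: str,
                  decide: Callable[[str, List[Observation]], Optional[str]],
                  tool: Callable[[str], str],
                  max_steps: int = 5) -> List[Observation]:
    """Call the tool repeatedly until decide() returns None or max_steps is hit."""
    observations: List[Observation] = []
    for _ in range(max_steps):
        tool_input = decide(question, observations)
        if tool_input is None:  # enough information has been gathered
            break
        try:
            result = tool(tool_input)
        except Exception as e:
            # Errors become observations, so the next decision can react to them.
            result = f"Error: {e}"
        observations.append((tool_input, result))
    return observations


# Scripted stand-in for the LLM's planning: one sub-question per step.
PLAN = ["bitcoin price in USD", "latest Ethereum OHLC", "EUR/USD bid and ask"]


def scripted_decide(question: str, observations: List[Observation]) -> Optional[str]:
    return PLAN[len(observations)] if len(observations) < len(PLAN) else None


def fake_tool(tool_input: str) -> str:
    return f"result for: {tool_input}"


steps = run_tool_loop("three-part question", scripted_decide, fake_tool)
for tool_input, result in steps:
    print(tool_input, "->", result)
```

The `max_steps` bound matters in practice: without it, a model that keeps asking for more calls would never terminate.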

In [17]:
class MyAPISearch(BaseTool):
    """APIChain as an agent tool"""
    
    name = "apisearch"
    description = "useful when the questions includes the term: apisearch.\n"

    llm: AzureChatOpenAI
    api_spec: str
    headers: dict = {}
    limit_to_domains: list = None
    verbose: bool = False
    
    def _run(self, query: str) -> str:
        
        chain = APIChain.from_llm_and_api_docs(
                            llm=self.llm,
                            api_docs=self.api_spec,
                            headers=self.headers,
                            verbose=self.verbose,
                            limit_to_domains=self.limit_to_domains
                            )
        try:
            sleep(2) # This is optional to avoid possible TPM rate limits
            response = chain.invoke(query)
        except Exception as e:
            # Return the error as text so the agent can read it and retry
            response = str(e)
        
        return response
            
    async def _arun(self, query: str) -> str:
        """Use the tool asynchronously."""
        print("I am running ASYNC")
        raise NotImplementedError("This Tool does not support async")

Notice below that we use the same GPT-4-Turbo deployment (`llm`) for both the tool and the agent.

In [18]:
tools = [MyAPISearch(llm=llm, api_spec=str(reduced_api_spec), limit_to_domains=None)]
agent = create_openai_tools_agent(llm, tools, APISEARCH_PROMPT)
agent_executor = AgentExecutor(agent=agent, tools=tools, 
                               return_intermediate_steps=True)


In [19]:
%%time 

# Since LLM responses are never the same, we retry in a loop in case the answer cannot be parsed according to our prompt instructions
for i in range(2):
    try:
        response = agent_executor.invoke({"question":QUESTION})["output"]
        break
    except Exception as e:
        response = str(e)
        continue
        
printmd(response)

https://api.kraken.com/0/public/Ticker?pair=XBTUSD
The current price of Bitcoin (BTC) in USD is $58,649.30.

https://api.kraken.com/0/public/OHLC?pair=ETHUSD
The latest Ethereum (ETH) OHLC (Open, High, Low, Close) values are as follows:
- Open: 3105.23 USD
- High: 3107.79 USD
- Low: 3105.23 USD
- Close: 3107.79 USD

https://api.kraken.com/0/public/Ticker?pair=EURUSD
The current bid price for EUR/USD is 1.08293, and the ask price is 1.08298.

Here are the details you requested:

- The current price of **Bitcoin (BTC)** in USD is **USD 58,649.30**.

- The latest **Ethereum (ETH) OHLC** (Open, High, Low, Close) values are:
  - Open: USD 3105.23
  - High: USD 3107.79
  - Low: USD 3105.23
  - Close: USD 3107.79

- For the **Euro (EUR/USD)**, the current bid price is **1.08293**, and the ask price is **1.08298**.

CPU times: user 601 ms, sys: 65.5 ms, total: 666 ms
Wall time: 38.5 s


**Great!** We now have an API agent that uses APIChain as a tool and can keep reasoning until it finds the answer.

## Simple APIs

What happens if the API is quite basic, meaning it's just a simple endpoint without a Swagger/OpenAPI definition? Let’s consider the following example:

[CountdownAPI](https://www.countdownapi.com/) is a streamlined version of the eBay API, available as a paid service. We can test it using their demo query, which does not require any Swagger or OpenAPI specification. In this scenario, our main task is to create a tool that retrieves the results. We then pass these results to an agent for analysis, providing answers to user queries, similar to our approach with the Bing Search agent.

An aspect we haven't discussed yet while constructing our API Agent using the APIChain tool is handling situations where either the API specification or the API call results are quite extensive. In such cases, we need to choose between using GPT-4-32k and GPT-4-Turbo.

In the example below, there is no API specification, but the response from the API is rather lengthy. This scenario needs a large-context model; the code below reuses the GPT-4-Turbo `llm` defined earlier.

In [19]:
# set up the request parameters
params = {
  'api_key': 'demo',
  'type': 'search',
  'ebay_domain': 'ebay.com',
  'search_term': 'memory cards'
}

# make the http GET request to Countdown API
api_result = requests.get('https://api.countdownapi.com/request', params)

num_tokens = num_tokens_from_string(str(api_result.json())) # this is a custom function we created in common/utils.py
print("Token count:",num_tokens,"\n")  

# print the first 2000 characters of JSON response from Countdown API
print(json.dumps(api_result.json())[:2000], "...")

Token count: 14423 

{"request_info": {"success": true, "demo": true}, "request_parameters": {"type": "search", "ebay_domain": "ebay.com", "search_term": "memory cards"}, "request_metadata": {"ebay_url": "https://www.ebay.com/sch/i.html?_nkw=memory+cards&_sacat=0&_dmd=1&_fcid=1"}, "search_results": [{"position": 1, "title": "Sandisk Micro SD Card Memory 32GB 64GB 128GB 256GB 512GB 1TB Lot Extreme Ultra", "epid": "203914554350", "link": "https://www.ebay.com/itm/203914554350", "image": "https://i.ebayimg.com/thumbs/images/g/A7wAAOSwemNjTz~l/s-l500.jpg", "condition": "Brand New", "seller_info": {"name": "terashack", "review_count": 62152, "positive_feedback_percent": 99.9}, "is_auction": false, "buy_it_now": false, "free_returns": true, "sponsored": false, "prices": [{"value": 9.99, "raw": "$9.99"}, {"value": 429.99, "raw": "$429.99"}], "price": {"value": 9.99, "raw": "$9.99"}}, {"position": 2, "title": "Micro SD Card Memory 32GB 64GB 128GB 256GB 512GB Lot Extreme Ultra US", "epid": "404

So, the response to this product query is about 14.4k tokens (the demo only works with 'memory cards'; to try any other query you need to sign up for their trial and get an API key). Combined with the prompt, this leaves us no option other than a large-context model such as GPT-4-32k or GPT-4-Turbo.
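
One way to ease the context pressure, not used in this notebook, is to trim the API response before handing it to the LLM. The sketch below keeps only the title, link, and price of each search result, using the field names visible in the JSON output above. The `trim_search_results` helper and the `max_items` cut-off are assumptions made for illustration.

```python
import json


def trim_search_results(api_json: dict, max_items: int = 10) -> str:
    """Keep only title, link and raw price for the first max_items results."""
    trimmed = []
    for item in api_json.get("search_results", [])[:max_items]:
        trimmed.append({
            "title": item.get("title"),
            "link": item.get("link"),
            # "price" may be missing on some items, so fall back to an empty dict.
            "price": (item.get("price") or {}).get("raw"),
        })
    return json.dumps(trimmed)


# Sample shaped like the first result in the response printed above.
sample_response = {
    "request_info": {"success": True, "demo": True},
    "search_results": [
        {
            "position": 1,
            "title": "Sandisk Micro SD Card Memory 32GB 64GB 128GB 256GB 512GB 1TB Lot Extreme Ultra",
            "link": "https://www.ebay.com/itm/203914554350",
            "image": "https://i.ebayimg.com/thumbs/images/g/A7wAAOSwemNjTz~l/s-l500.jpg",
            "seller_info": {"name": "terashack", "review_count": 62152},
            "price": {"value": 9.99, "raw": "$9.99"},
        }
    ],
}

print(trim_search_results(sample_response))
```

The trade-off is that the agent can no longer answer questions about dropped fields (seller, condition, images), so the fields to keep should match the questions you expect.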

In [20]:
class MySimpleAPISearch(BaseTool):
    """Tool for simple API calls that don't require OpenAPI 3.0 specs"""
    
    name = "apisearch"
    description = "useful when the questions includes the term: apisearch.\n"

    api_key: str
    
    def _run(self, query: str) -> str:
        
        params = {
          'api_key': self.api_key,
          'type': 'search',
          'ebay_domain': 'ebay.com',
          'search_term': query
        }

        try:
            # make the http GET request to Countdown API
            api_result = requests.get('https://api.countdownapi.com/request', params)
            response = json.dumps(api_result.json())
        except Exception as e:
            # Return the error as text so the agent can read it
            response = str(e)
        
        return response
            
    async def _arun(self, query: str) -> str:
        """Use the tool asynchronously."""
        raise NotImplementedError("This Tool does not support async")

In [21]:
tools = [MySimpleAPISearch(api_key='demo')]
agent = create_openai_tools_agent(llm, tools, APISEARCH_PROMPT)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=False, return_intermediate_steps=True )

This time let's use the `.stream()` method:

In [22]:
for chunk in agent_executor.stream({"question": 'what is the price for SanDisk "memory cards"? give me the links please', "language":"English"}):
    # Agent Action
    if "actions" in chunk:
        for action in chunk["actions"]:
            print(f"Calling Tool: `{action.tool}` with input `{action.tool_input}`")
    # Observation
    elif "steps" in chunk:
        continue
        # for step in chunk["steps"]:
        #     print(f"Tool Result: `{step.observation}`")
    # Final result
    elif "output" in chunk:
        printmd(f'Final Output: {chunk["output"]}')
    else:
        raise ValueError()
    print("---")

Calling Tool: `apisearch` with input `SanDisk memory cards price`
---
Calling Tool: `apisearch` with input `SanDisk memory cards buy online`
---
Final Output: Here are some prices and links for buying SanDisk memory cards:

1. **SanDisk Micro SD Card 16GB 32GB 64GB 128GB TF Class 10 for Smartphones Tablets**
   - Price: USD 5.20 - USD 18.90
   - [Link](https://www.ebay.com/itm/324736594273)

2. **SanDisk Ultra 128GB 80MB/S Class 10 Micro SD MicroSDXC TF Memory Card SDSQUNS**
   - Price: USD 12.99
   - [Link](https://www.ebay.com/itm/333222448982)

3. **Sandisk Micro SD Card Ultra Memory Card with MicroSD to SD Adapter Wholesale Lot**
   - Price: USD 8.44 - USD 352.81
   - [Link](https://www.ebay.com/itm/324163010105)

4. **Sandisk Micro SD Card Ultra Memory Card 32GB 64GB 128GB 512GB 1TB Wholesale lot**
   - Price: USD 2.48 - USD 203.56
   - [Link](https://www.ebay.com/itm/195635604530)

5. **SanDisk Industrial 8GB 16GB Micro SD Memory Card Class 10 UHS-I WHOLESALE PRICE**
   - Price: USD 9.87 - USD 363.12
   - [Link](https://www.ebay.com/itm/274312158070)

6. **Sandisk 256MB SD Memory Card (10 pcs)**
   - Price: USD 17.00
   - [Link](https://www.ebay.com/itm/126212649895)

7. **SanDisk SD Memory Card 2/4/8/16/32/64/128/256 GB Extreme Pro lot Ultra ORIGINAL**
   - Price: USD 3.75 - USD 56.95
   - [Link](https://www.ebay.com/itm/155349331966)

8. **Qty=1 SD CARD SANDISK 256MB**
   - Price: USD 9.75
   - [Link](https://www.ebay.com/itm/196000732047)

9. **Sandisk SDXC Memory Card 64GB (B11)PACK OF 6**
   - Price: USD 36.80
   - [Link](https://www.ebay.com/itm/116229610004)

10. **Sandisk Micro SD Card Ultra w/ Reader 16GB 32GB 64GB 128GB Memory Wholesale lot**
    - Price: USD 8.86 - USD 15.84
    - [Link](https://www.ebay.com/itm/274394522008)

I hope this information helps you find the SanDisk memory card you need.

---


# Summary

In this notebook, we learned how to create smart API agents, both for complex APIs described by a Swagger/OpenAPI specification and for simple APIs without one.
Once again, the key to success is the combination of agents with expert tools, GPT-4, and good prompts.

As homework, try to create a shopping assistant for the Etsy e-commerce site using the following API spec (you will need to register for free and create an API key):

- https://developers.etsy.com/documentation/
- https://www.etsy.com/openapi/generated/oas/3.0.0.json

# NEXT

The next notebook will show how to put everything together: how to combine the features of all the previous notebooks into a brain agent that can respond to any request accordingly.