<a target="_blank" href="https://colab.research.google.com/github/cohere-ai/notebooks/blob/main/notebooks/agents/Multi_Step_Tool_Use_Spotify_v2.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Multi-Step Tool Use with Spotify Dataset

This example demonstrates an agent that performs analysis on a Spotify tracks dataset (via a Python interpreter tool) while also having access to a web search tool.

## Setup

In [1]:
import json
import os

import cohere

co = cohere.ClientV2("COHERE_API_KEY") # Get your free API key: https://dashboard.cohere.com/api-keys

In [2]:
! pip install tavily-python --quiet

from tavily import TavilyClient
tavily_client = TavilyClient(api_key="TAVILY_API_KEY")

## Step 1: Define the tools

Here, we define the web search tool, which uses the Tavily Python client to perform web searches.



In [3]:
# Web search tool: returns the top 3 Tavily results for a query
def web_search(query: str) -> dict:
    response = tavily_client.search(query, max_results=3)['results']
    return {"results": response}

In [4]:
# the LLM is equipped with a description of the web search engine
web_search_tool = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Query to search the internet with"
                }
            },
            "required": ["query"]
        }
    }
}
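
The schema's `required` list can also be used on the application side. As a minimal sketch (not part of the original notebook), the hypothetical helper `missing_required_args` below checks a JSON argument string, such as the one a model returns in a tool call, against a schema of the shape shown above before the tool is invoked.

```python
import json

# A copy of the schema shape used above, repeated so this sketch runs on its own.
example_tool = {
    "type": "function",
    "function": {
        "name": "web_search",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}


def missing_required_args(tool: dict, arguments_json: str) -> list[str]:
    """Return the required parameter names absent from a JSON argument string.

    Hypothetical helper for illustration; it is not part of the Cohere SDK.
    """
    arguments = json.loads(arguments_json)
    required = tool["function"]["parameters"].get("required", [])
    return [name for name in required if name not in arguments]


print(missing_required_args(example_tool, '{"query": "Olivia Rodrigo age"}'))
print(missing_required_args(example_tool, "{}"))
```

An empty list means every required argument is present. A non-empty list could be returned to the model as a tool error instead of raising a `TypeError` at call time.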


Here, we define the Python interpreter tool, which uses the `exec` function to execute Python code.

In [5]:
# A Python console tool, which can be used to access the dataset and, more generally, to run code and produce plots
import contextlib
import io

def python_interpreter(code: str) -> dict:
    output = io.StringIO()
    try:
        # Redirect stdout to capture print statements
        with contextlib.redirect_stdout(output):
            exec(code, globals())
    except Exception as e:
        return {
            "error": str(e),
            "executed_code": code
        }
    # Return the captured stdout along with the code that produced it
    return {
        "console_output": output.getvalue(),
        "executed_code": code
    }
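
Because `exec(code, globals())` runs in the notebook's global namespace, variables created by one tool call (for example a DataFrame) remain available to the next, but model-written code can also overwrite the notebook's own variables. A minimal sketch of one alternative, assuming you want that persistence without sharing globals, is to pass a dedicated dictionary instead. The names `sandbox_namespace` and `python_interpreter_isolated` are hypothetical, and this is not a security sandbox: the code still runs with full access to the Python process.

```python
import contextlib
import io

# Hypothetical dedicated namespace; it persists across calls like globals() does,
# but keeps model-written variables separate from the notebook's own.
sandbox_namespace: dict = {}


def python_interpreter_isolated(code: str) -> dict:
    """Run code in sandbox_namespace and return captured stdout or the error."""
    output = io.StringIO()
    try:
        with contextlib.redirect_stdout(output):
            exec(code, sandbox_namespace)
    except Exception as e:
        return {"error": str(e), "executed_code": code}
    return {"console_output": output.getvalue(), "executed_code": code}


first = python_interpreter_isolated("x = 21")
second = python_interpreter_isolated("print(x * 2)")
failed = python_interpreter_isolated("print(undefined_name)")
print(second["console_output"])
print(failed["error"])
```

The second call can read `x` because both calls share `sandbox_namespace`, which mirrors how the agent in this example loads the dataset in one step and queries it in a later one.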



In [6]:
# the LLM is equipped with a description of a python console
python_interpreter_tool = {
    "type": "function",
    "function": {
        "name": "python_interpreter",
        "description": "Executes python code and returns the result. The code runs in a static sandbox without internet access and without interactive mode, so print output or save output to a file.",
        "parameters": {
            "type": "object",
            "properties": {
                "code": {
                    "type": "string",
                    "description": "Python code to execute"
                }
            },
            "required": ["code"]
        }
    }
}


In [None]:
functions_map = {
    "web_search": web_search,
    "python_interpreter": python_interpreter,
}
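
This map lets the workflow turn a tool call into a function call: look up the function by name, parse the JSON-encoded arguments, and unpack them as keyword arguments. The sketch below shows that dispatch in isolation, with stub tools and a hypothetical `dispatch` helper standing in for the real functions.

```python
import json


# Stub tools standing in for web_search and python_interpreter (illustration only).
def fake_web_search(query: str) -> dict:
    return {"results": [f"stub result for {query!r}"]}


def fake_python_interpreter(code: str) -> dict:
    return {"console_output": "", "executed_code": code}


stub_functions_map = {
    "web_search": fake_web_search,
    "python_interpreter": fake_python_interpreter,
}


def dispatch(name: str, arguments_json: str) -> dict:
    """Look up a tool by name and call it with JSON-decoded keyword arguments."""
    if name not in stub_functions_map:
        # Report unknown tools as a result instead of raising.
        return {"error": f"unknown tool: {name}"}
    return stub_functions_map[name](**json.loads(arguments_json))


print(dispatch("web_search", '{"query": "spotify"}'))
print(dispatch("calculator", "{}"))
```

Returning an error dictionary for an unknown tool name is one possible design choice; it gives the model a chance to recover rather than ending the run with a `KeyError`.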

We'll also need the spotify_data dataset, which contains information about Spotify tracks such as the track information, release information, popularity metrics, and musical characteristics. You can find the [dataset here](https://github.com/cohere-ai/notebooks/blob/main/notebooks/guides/advanced_rag/spotify_dataset.csv).

In [7]:
# Display the first few rows of the dataset
import pandas as pd
file_path = './spotify_dataset.csv'
spotify_data = pd.read_csv(file_path)
spotify_data.head(3)



| Unnamed: 0 | track_name | artist(s)_name | artist_count | released_year | released_month | released_day | in_spotify_playlists | in_spotify_charts | streams | in_apple_playlists | ... | key | mode | danceability | valence | energy | acousticness | instrumentalness | liveness | speechiness | release_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Seven (feat. Latto) (Explicit Ver.) | Latto, Jung Kook | 2 | 2023 | 7 | 14 | 553 | 147 | 141381703.0 | 43 | ... | B | Major | 80 | 89 | 83 | 31 | 0 | 8 | 4 | 2023-07-14 |
| 1 | LALA | Myke Towers | 1 | 2023 | 3 | 23 | 1474 | 48 | 133716286.0 | 48 | ... | C# | Major | 71 | 61 | 74 | 7 | 0 | 10 | 4 | 2023-03-23 |
| 2 | vampire | Olivia Rodrigo | 1 | 2023 | 6 | 30 | 1397 | 113 | 140003974.0 | 94 | ... | F | Major | 51 | 32 | 53 | 17 | 0 | 31 | 6 | 2023-06-30 |


Here is the task that the agent needs to perform:

In [9]:
message = """What's the age and citizenship of the artists who had the top 3 most streamed songs on Spotify in 2023?

You have access to a dataset with information about Spotify songs from the past 10 years, located at ./spotify_dataset.csv.
You also have access to the internet to search for information not available in the dataset.
You must use the dataset when you can, and if stuck you can use the internet.
Remember to inspect the dataset and get a list of its columns to understand its structure before trying to query it. Take it step by step.
"""

## Step 2: Run the tool use workflow

Next, we run the tool use workflow, which involves four steps:

- Get the user message
- Model generates tool calls, if any
- Execute tools based on the tool calls generated by the model
- Model either generates more tool calls or returns a response with citations
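
The control flow of these four steps can be sketched without calling the API. In the example below, `fake_chat` is a hypothetical stand-in for `co.chat` that requests one tool call and then answers, and plain dictionaries replace the SDK's response objects. The `max_steps` guard is an addition for illustration and is not in the notebook.

```python
import json


def fake_chat(messages: list[dict]) -> dict:
    """Hypothetical stand-in for the model: one tool call, then a final answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"id": "call_0", "name": "add",
                                "arguments": json.dumps({"a": 2, "b": 3})}]}
    return {"tool_calls": [], "text": f"The result is {messages[-1]['content']}."}


tools_by_name = {"add": lambda a, b: a + b}


def run_agent(user_message: str, max_steps: int = 5) -> str:
    # 1 - Get the user message
    messages = [{"role": "user", "content": user_message}]
    # 2 - Model generates tool calls, if any
    res = fake_chat(messages)
    steps = 0
    while res["tool_calls"] and steps < max_steps:
        messages.append({"role": "assistant", "tool_calls": res["tool_calls"]})
        # 3 - Execute each requested tool and append its result
        for tc in res["tool_calls"]:
            result = tools_by_name[tc["name"]](**json.loads(tc["arguments"]))
            messages.append({"role": "tool", "tool_call_id": tc["id"],
                             "content": json.dumps(result)})
        # 4 - Model either requests more tools or returns a response
        res = fake_chat(messages)
        steps += 1
    return res.get("text", "stopped: max_steps reached")


print(run_agent("What is 2 + 3?"))
```

The loop ends as soon as the model returns no tool calls, so the number of steps is decided by the model and not fixed in advance.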

Looking at the example output, the agent performs the task in a sequence of 3 steps:

- Inspect the dataset and get a list of its columns.
- Write and execute Python code to find the top 3 most streamed songs on Spotify in 2023 and their respective artists.
- Search for the age and citizenship of each artist on the internet.

In [10]:
model = "command-r-plus-08-2024"
tools = [web_search_tool,python_interpreter_tool]

# 1 - Get the user message
print(f"USER MESSAGE:\n{message}")
print("="*50)

messages = [{'role': 'user','content': message}]

# 2 - Model generates tool calls, if any
res = co.chat(model=model,
        messages=messages,
        tools=tools,
        temperature=0)

# Keep invoking tools as long as the model generates tool calls
while res.message.tool_calls:
    # Tool plan and tool calls
    print("\nTOOL PLAN:")
    print(res.message.tool_plan)

    print("\nTOOL CALLS:")
    for tc in res.message.tool_calls:
        if tc.function.name == "python_interpreter":
            print(f"Tool name: {tc.function.name}")
            # Indent each line of the generated code for display
            code_lines = json.loads(tc.function.arguments)["code"].splitlines()
            tool_call_prettified = "\n".join(f"  {line}" for line in code_lines)
            print(tool_call_prettified)
        else:
            print(f"Tool name: {tc.function.name} | Parameters: {tc.function.arguments}")

    messages.append({'role': 'assistant',
                    'tool_calls': res.message.tool_calls,
                    'tool_plan': res.message.tool_plan})

    # 3 - Execute tools based on the tool calls generated by the model
    print("\nTOOL RESULTS:")
    for tc in res.message.tool_calls:
        tool_result = functions_map[tc.function.name](**json.loads(tc.function.arguments))
        tool_content = [json.dumps(tool_result)]
        print(tool_result, "\n")
        
        messages.append({'role': 'tool',
                        'tool_call_id': tc.id,
                        'tool_content': tool_content}) 

    # 4 - Model either generates more tool calls or returns a response
    res = co.chat(model=model,
                messages=messages,
                tools=tools,
                temperature=0)
    
messages.append({"role": "assistant", "content": res.message.content[0].text})

print("\nRESPONSE:")
print(res.message.content[0].text)

if res.message.citations:
    print("\nCITATIONS:")
    for citation in res.message.citations:
        print(f"Start: {citation.start} | End: {citation.end} | Text: '{citation.text}'")
        print("Sources:")
        if citation.sources:
            for source in citation.sources:
                print(source.id)
        print("-"*50)

USER MESSAGE:
What's the age and citizenship of the artists who had the top 3 most streamed songs on Spotify in 2023?

You have access to a dataset with information about Spotify songs from the past 10 years, located at ./spotify_dataset.csv.
You also have access to the internet to search for information not available in the dataset.
You must use the dataset when you can, and if stuck you can use the internet.
Remember to inspect the dataset and get a list of its columns to understand its structure before trying to query it. Take it step by step.


TOOL PLAN:
I will first inspect the dataset to understand its structure and the columns it contains. Then, I will write and execute Python code to find the top 3 most streamed songs on Spotify in 2023 and their respective artists. Finally, I will search for the age and citizenship of each artist online.

TOOL CALLS:
Tool name: python_interpreter
  import pandas as pd
  
  df = pd.read_csv('spotify_dataset.csv')
  
  print(df.columns)

TO