# MLB Stats Report: Mixture of Agents with Agno and Groq

In this notebook, we showcase the concept of [Mixture of Agents (MoA)](https://arxiv.org/pdf/2406.04692) using [Agno Agents](https://github.com/agno-agi/agno) and the [Groq API](https://console.groq.com/playground).

The Mixture of Agents approach involves leveraging multiple AI agents, each equipped with different language models, to collaboratively complete a task. By combining the strengths and perspectives of various models, we can achieve a more robust and nuanced result. 

In our project, three MLB Writer agents, each using a different language model (`llama3-70b-8192`, `gemma2-9b-it`, and `mixtral-8x7b-32768`), independently generate game recap articles from game data collected by other Agno Agents. An MLB Editor agent then aggregates these drafts, synthesizing the best elements of each into a final, polished game recap. This process demonstrates collaborative AI and shows how combining multiple models can improve the quality of generated content.

## About Agno

Agno is a lightweight framework for building multi-modal Agents.


### Key features
Here’s why you should build Agents with Agno:

- Lightning Fast: Agno reports agent creation about 5000x faster than LangGraph.
- Model Agnostic: Use any model, any provider, no lock-in.
- Multi Modal: Native support for text, image, audio and video.
- Multi Agent: Delegate tasks across a team of specialized agents.
- Memory Management: Store user sessions and agent state in a database.
- Knowledge Stores: Use vector databases for Agentic RAG or dynamic few-shot.
- Structured Outputs: Make Agents respond with structured data.
- Monitoring: Track agent sessions and performance in real-time on agno.com.

Join us on our [Discord server](https://agno.link/discord) where we are actively building with the community. 


## How it works

- **Step 1:** Create an `Agent`
- **Step 2:** Add Tools (functions), Knowledge (vectordb) and Storage (database)
- **Step 3:** Serve using Streamlit, FastAPI or Django to build your AI application

### Setup

In [7]:
# Import packages
import os
import json
import statsapi
# Import the datetime class and timedelta directly; a bare `import datetime` would be shadowed by this line
from datetime import datetime, timedelta
import pandas as pd

from agno.agent import Agent
from agno.models.groq import Groq

In [8]:
api_key = os.getenv('GROQ_API_KEY')
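
`os.getenv` returns `None` when the variable is missing, and the failure then surfaces only on the first API call. A minimal sketch of an early check follows; `require_env` is a hypothetical helper name, not part of Agno or Groq.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the notebook.")
    return value
```

With this helper, the cell above could read `api_key = require_env("GROQ_API_KEY")`.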

We will configure multiple LLMs for our [Agno Agents](https://github.com/agno-agi/agno). All of them are served by Groq, so they share a single Groq API key, which you can create [here](https://console.groq.com/keys). The models include two variants of Meta's LLaMA 3, Google's Gemma 2, and Mistral AI's Mixtral. Different agents use different models to produce diverse outputs for the MLB game recap.

In [9]:
model_llama70b = Groq(id="llama3-70b-8192", api_key=api_key)
model_llama8b  = Groq(id="llama3-groq-8b-8192-tool-use-preview", api_key=api_key)  # defined for reference; not used by the agents below
model_gemma2   = Groq(id="gemma2-9b-it", api_key=api_key)
model_mixtral  = Groq(id="mixtral-8x7b-32768", api_key=api_key)

### Define Tools

First, we will define specialized tools that help some of the agents gather and process MLB game data. These tools fetch game information and player boxscores via the [MLB-Stats API](https://github.com/toddrob99/MLB-StatsAPI). Agents equipped with these tools can call them with the relevant details from the user prompt and ground our MLB game recaps in accurate, up-to-date external information.

- **get_game_info**: Fetches high-level information about an MLB game, including teams, scores, and key players.
- **get_batting_stats**: Retrieves detailed player batting statistics for a specified MLB game.
- **get_pitching_stats**: Retrieves detailed player pitching statistics for a specified MLB game.

For more information on tool use/function calling with Agno, check out the [Agno Documentation](https://docs.agno.com/introduction).
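
To illustrate the general idea of function calling, the sketch below shows how a tool call (a function name plus JSON-encoded arguments) can be routed to a Python function. It is a generic illustration, not Agno's internal implementation; `get_demo_score`, `TOOLS`, and `dispatch_tool_call` are hypothetical names, and the payload is made up.

```python
import json
from typing import Callable, Dict


def get_demo_score(team_name: str) -> str:
    """Hypothetical stand-in tool that returns a hard-coded JSON payload."""
    return json.dumps({"team": team_name, "score": 5})


# Registry mapping tool names (as the model would emit them) to callables
TOOLS: Dict[str, Callable[..., str]] = {"get_demo_score": get_demo_score}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Look up a tool by name and call it with JSON-decoded keyword arguments."""
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    kwargs = json.loads(arguments_json)
    return TOOLS[name](**kwargs)


result = dispatch_tool_call("get_demo_score", '{"team_name": "Yankees"}')
print(result)
```

Returning a JSON string, as the tools in this notebook do, keeps the result easy to pass back to the model as plain text.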

In [10]:
def get_game_info(game_date: str, team_name: str) -> str:
    """Gets high-level information on an MLB game.
    
    Params:
    game_date: The date of the game of interest, in the form "yyyy-mm-dd". 
    team_name: MLB team name. Both full name (e.g. "New York Yankees") or nickname ("Yankees") are valid. If multiple teams are mentioned, use the first one
    """
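    sched = statsapi.schedule(start_date=game_date, end_date=game_date)
    sched_df = pd.DataFrame(sched)
    # With no games on that date the frame has no 'summary' column, so guard before filtering
    if sched_df.empty:
        return json.dumps({"error": f"No games found on {game_date}"})
    game_info_df = sched_df[sched_df['summary'].str.contains(team_name, case=False, na=False)]
    # Return an error payload instead of raising IndexError when the team did not play
    if game_info_df.empty:
        return json.dumps({"error": f"No game found for {team_name} on {game_date}"})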

    game_info = {
        "game_id": str(game_info_df.game_id.tolist()[0]),
        "home_team": game_info_df.home_name.tolist()[0],
        "home_score": game_info_df.home_score.tolist()[0],
        "away_team": game_info_df.away_name.tolist()[0],
        "away_score": game_info_df.away_score.tolist()[0],
        "winning_team": game_info_df.winning_team.tolist()[0],
        "series_status": game_info_df.series_status.tolist()[0]
    }

    return json.dumps(game_info)


def get_batting_stats(game_id: str) -> str:
    """Gets player boxscore batting stats for a particular MLB game
    
    Params:
    game_id: The 6-digit ID of the game
    """
    boxscores=statsapi.boxscore_data(game_id)
    player_info_df = pd.DataFrame(boxscores['playerInfo']).T.reset_index()

    # Drop the first row with .iloc[1:]; .copy() avoids pandas' SettingWithCopyWarning on the assignment below
    away_batters_box = pd.DataFrame(boxscores['awayBatters']).iloc[1:].copy()
    away_batters_box['team_name'] = boxscores['teamInfo']['away']['teamName']

    home_batters_box = pd.DataFrame(boxscores['homeBatters']).iloc[1:].copy()
    home_batters_box['team_name'] = boxscores['teamInfo']['home']['teamName']

    batters_box_df = pd.concat([away_batters_box, home_batters_box]).merge(player_info_df, left_on = 'name', right_on = 'boxscoreName')
    batting_stats = batters_box_df[['team_name','fullName','position','ab','r','h','hr','rbi','bb','sb']].to_dict(orient='records')

    return json.dumps(batting_stats)


def get_pitching_stats(game_id: str) -> str:
    """Gets player boxscore pitching stats for a particular MLB game
    
    Params:
    game_id: The 6-digit ID of the game
    """
    boxscores=statsapi.boxscore_data(game_id)
    player_info_df = pd.DataFrame(boxscores['playerInfo']).T.reset_index()

    # Drop the first row with .iloc[1:]; .copy() avoids pandas' SettingWithCopyWarning on the assignment below
    away_pitchers_box = pd.DataFrame(boxscores['awayPitchers']).iloc[1:].copy()
    away_pitchers_box['team_name'] = boxscores['teamInfo']['away']['teamName']

    home_pitchers_box = pd.DataFrame(boxscores['homePitchers']).iloc[1:].copy()
    home_pitchers_box['team_name'] = boxscores['teamInfo']['home']['teamName']

    pitchers_box_df = pd.concat([away_pitchers_box,home_pitchers_box]).merge(player_info_df, left_on = 'name', right_on = 'boxscoreName')
    pitching_stats = pitchers_box_df[['team_name','fullName','ip','h','r','er','bb','k','note']].to_dict(orient='records')

    return json.dumps(pitching_stats)
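
The two boxscore functions share one pattern: skip the first row of each boxscore list, tag the remaining rows with the team name, and inner-join them to the player records on the boxscore name. The plain-Python sketch below mirrors that pattern on made-up rows so it can run without network access. It assumes the first row is a header row (which is what the `.iloc[1:]` slice suggests); `join_boxscore` and all sample values are hypothetical.

```python
# Made-up sample data shaped like the fields the functions above read
box_rows = [
    {"name": "Batters", "ab": "AB", "h": "H"},  # assumed header row, skipped like .iloc[1:]
    {"name": "Doe", "ab": "4", "h": "2"},
    {"name": "Roe", "ab": "3", "h": "0"},
]
player_info = {
    "ID1": {"boxscoreName": "Doe", "fullName": "John Doe"},
    "ID2": {"boxscoreName": "Roe", "fullName": "Richard Roe"},
}


def join_boxscore(rows, players, team_name):
    """Inner-join boxscore rows to player records on the boxscore name."""
    by_box_name = {p["boxscoreName"]: p for p in players.values()}
    joined = []
    for row in rows[1:]:  # skip the header row
        info = by_box_name.get(row["name"])
        if info is None:
            continue  # inner join: rows without a matching player are dropped
        joined.append({
            "team_name": team_name,
            "fullName": info["fullName"],
            "ab": row["ab"],
            "h": row["h"],
        })
    return joined


sample_stats = join_boxscore(box_rows, player_info, "Sample Team")
print(sample_stats)
```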

### Define Agents

In Agno, Agents are autonomous entities designed to execute a task using their Knowledge, Memory, and Tools. 

- **MLB Researcher**: Uses the `get_game_info` tool to gather high-level game information.
- **MLB Batting Statistician** and **MLB Pitching Statistician**: Two agents that retrieve player batting and pitching boxscore stats using the `get_batting_stats` and `get_pitching_stats` tools, respectively.
- **MLB Writers**: Three agents, each using a different LLM (LLaMA 3 70B, Gemma 2 9B, Mixtral 8x7B), to write game recap articles.
- **MLB Editor**: Edits the articles from the writers to create the final game recap.

#### Mixture of Agents

In this demo, the MLB Researcher and MLB Statistician agents use tool calling to gather data, but the Mixture of Agents itself consists of the three MLB Writer agents and the MLB Editor. This makes our MoA architecture a simple two-layer design in which the writers propose drafts and the editor aggregates them. More complex architectures with additional layers are possible and may improve the output further:

![Alt text](mixture_of_agents_diagram.png)
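
The layered structure can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the paper's reference implementation or Agno's API: each "model" is a stub function from prompt text to response text, and `make_stub`, `run_layer`, and `mixture_of_agents` are hypothetical names.

```python
from typing import Callable, List

# A "model" here is any function from prompt text to response text
Model = Callable[[str], str]


def make_stub(name: str) -> Model:
    """Build a hypothetical stand-in model that labels its output with its name."""
    def model(prompt: str) -> str:
        return f"[{name}] response to a {len(prompt)}-character prompt"
    return model


def run_layer(models: List[Model], task: str, previous: List[str]) -> List[str]:
    """Run every model in a layer on the task plus the previous layer's drafts."""
    prompt = task
    if previous:
        prompt += "\n\nPrevious drafts:\n" + "\n\n".join(previous)
    return [model(prompt) for model in models]


def mixture_of_agents(layers: List[List[Model]], aggregator: Model, task: str) -> str:
    """Pass drafts through each proposer layer, then let the aggregator synthesize."""
    drafts: List[str] = []
    for models in layers:
        drafts = run_layer(models, task, drafts)
    return aggregator(task + "\n\nCandidate drafts:\n" + "\n\n".join(drafts))


writers = [make_stub("llama"), make_stub("gemma"), make_stub("mixtral")]
final = mixture_of_agents([writers, writers], make_stub("editor"), "Write a recap.")
print(final)
```

Passing a single proposer layer (`[writers]`) reproduces the two-layer writers-plus-editor shape used in this notebook; adding layers lets later proposers refine earlier drafts before aggregation.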

In [16]:
default_date = datetime.now().date() - timedelta(1) # Set default date to yesterday in case no date is specified

mlb_researcher = Agent(
    model=model_mixtral,
    description="A detailed, accurate MLB researcher who extracts game information from the user question",
    instructions=[
        f"Parse the Team and date (use {default_date} if user does not specify) from the user question.",
        "Pass to your get_game_info tool",
        """
        Respond in the following format:
            game_id: game_id
            home_team: home_team
            home_score: home_score
            away_team: away_team
            away_score: away_score
            winning_team: winning_team
            series_status: series_status
        """        
    ],
    tools=[get_game_info],   
)

mlb_batting_statistician = Agent(
    model=model_mixtral,
    description="An industrious MLB Statistician analyzing player boxscore stats for the relevant game",
    instructions=[
        "Given information about an MLB game, retrieve ONLY boxscore player batting stats for the game identified by the MLB Researcher",
        "Your analysis should be at least 500 words long, and include inning-by-inning statistical summaries",
        ],
    tools=[get_batting_stats],
)

mlb_pitching_statistician = Agent(
    model=model_mixtral,
    description="An industrious MLB Statistician analyzing player boxscore stats for the relevant game",
    instructions=[
        "Given information about an MLB game, retrieve ONLY boxscore player pitching stats for the game identified by the MLB Researcher",
        "Your analysis should be at least 500 words long, and include inning-by-inning statistical summaries",
        ],
    tools=[get_pitching_stats],
)

mlb_writer_llama = Agent(
    model=model_llama70b,
    description="An experienced, honest, and industrious writer who does not make things up",
    instructions=[
        """
            Write a game recap article using the provided game information and stats.
            Key instructions:
            - Include things like final score, top performers and winning/losing pitcher.
            - Use ONLY the provided data and DO NOT make up any information, such as specific innings when events occurred, that isn't explicitly from the provided input.
            - Do not print the box score
        """,
        "Your recap from the stats should be at least 1000 words. Impress your readers!!!"
    ],
)

mlb_writer_gemma = Agent(
    model=model_gemma2,
    description="An experienced and honest writer who does not make things up",
    instructions=["Write a detailed game recap article using the provided game information and stats"],
)

mlb_writer_mixtral = Agent(
    model=model_mixtral,
    description="An experienced and honest writer who does not make things up",
    instructions=["Write a detailed game recap article using the provided game information and stats"],
)

mlb_editor = Agent(
    model=model_llama70b,
    description="An experienced editor that excels at taking the best parts of multiple texts to create the best final product",
    instructions=["Edit recap articles to create the best final product."],
)

In [12]:
user_prompt = 'write a recap of the Yankees game on July 14, 2024'
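
The prompt gives the date as "July 14, 2024", while `get_game_info` expects "yyyy-mm-dd", so the researcher agent relies on the LLM to do the conversion. The sketch below shows the target format with a deterministic conversion; `to_iso_date` is a hypothetical helper, and it assumes the default English month names and this one input layout.

```python
from datetime import datetime


def to_iso_date(text: str) -> str:
    """Convert a date written like 'July 14, 2024' to 'yyyy-mm-dd'."""
    return datetime.strptime(text, "%B %d, %Y").strftime("%Y-%m-%d")


print(to_iso_date("July 14, 2024"))  # 2024-07-14
```

Any other layout raises `ValueError`, which can be a useful signal that the date needs to be clarified before the tool is called.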

In [None]:
# Agent.run returns a response object; the generated text is in its .content attribute
game_information = mlb_researcher.run(user_prompt, stream=False).content
print(game_information)

batting_stats  = mlb_batting_statistician.run(game_information, stream=False).content
pitching_stats = mlb_pitching_statistician.run(game_information, stream=False).content

stats = f"Statistical summaries for the game:\n\nBatting stats:\n{batting_stats}\n\nPitching stats:\n{pitching_stats}"
llama_writer   = mlb_writer_llama.run(stats, stream=False).content
gemma_writer   = mlb_writer_gemma.run(stats, stream=False).content
mixtral_writer = mlb_writer_mixtral.run(stats, stream=False).content

# Edit final outputs: join the three drafts (plain strings) into one editor prompt
editor_inputs = [llama_writer, gemma_writer, mixtral_writer]
editor = mlb_editor.run("\n".join(editor_inputs), stream=False).content

print(editor)
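
Because the three writers work independently on the same input, their calls could in principle run concurrently. The sketch below shows the pattern with stub functions and the standard library's `ThreadPoolExecutor`. Whether Agno agents are safe to call from multiple threads is an assumption that would need checking, and `run_writers_concurrently` and `stub_writers` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_writers_concurrently(writers: Dict[str, Callable[[str], str]], stats: str) -> Dict[str, str]:
    """Submit every writer with the same input and collect drafts by writer name."""
    with ThreadPoolExecutor(max_workers=len(writers)) as pool:
        futures = {name: pool.submit(fn, stats) for name, fn in writers.items()}
        return {name: future.result() for name, future in futures.items()}


# Stub writers standing in for the real agents
stub_writers = {
    "llama": lambda s: f"llama recap of: {s}",
    "gemma": lambda s: f"gemma recap of: {s}",
    "mixtral": lambda s: f"mixtral recap of: {s}",
}

drafts = run_writers_concurrently(stub_writers, "sample stats")
print("\n".join(drafts.values()))
```

The result dictionary keeps the writers' original order, so the drafts can be joined for the editor exactly as in the sequential version.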

# Conclusion:

Using Agno Agents + Groq allows developers to build a cost-effective, high-quality solution for generating MLB game recaps. The Mixture-of-Agents approach leverages multiple AI agents, each backed by a different and relatively small language model, to complete a task.

We use multiple MLB Writer agents utilizing different language models to independently generate game recap articles based on game data collected from other Agno Agents. An MLB Editor agent then synthesizes the best elements from each article to create a final, polished game recap.

# Connect with us:
Join the [Agno Discord](https://agno.link/discord) and [Groq Discord](https://discord.com/invite/groq) servers to discuss your ideas and get support from the community. If you have questions, feel free to reach out to us on [Discourse](https://agno.link/community).

# Resources:
Mixture-of-Agents:
https://arxiv.org/pdf/2406.04692

Agno:
https://docs.agno.com/introduction