## [Riddler League Baseball](https://fivethirtyeight.com/features/which-baseball-team-will-win-the-riddler-fall-classic/)

Riddler League Baseball, also known as the RLB, consists of three teams: the Mississippi Moonwalkers, the Delaware Doubloons and the Tennessee Taters.

Each time a batter for the Moonwalkers comes to the plate, they have a 40 percent chance of getting a walk and a 60 percent chance of striking out. Each batter for the Doubloons, meanwhile, hits a double 20 percent of the time, driving in any teammates who are on base, and strikes out the remaining 80 percent of the time. Finally, each batter for the Taters has a 10 percent chance of hitting a home run and a 90 percent chance of striking out.

During the RLB season, each team plays an equal number of games against each opponent. Games are nine innings long and can go into extra innings just like in other baseball leagues. Which of the three teams is most likely to have the best record at the end of the season?
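
The `rlb_definitions` module used below is not shown in this notebook, so as a rough illustration of the rules above, here is a minimal sketch of how a single half-inning could be simulated. The names `simulate_half_inning` and `average_runs` are hypothetical and are not taken from `rlb_definitions`. The sketch assumes that walked runners advance only when forced and that a double scores every runner on base, as the puzzle states.

```python
import random


def simulate_half_inning(bb_rate, double_rate, hr_rate, rng):
    """Hypothetical helper: runs scored in one half-inning.

    Any plate appearance that is not a walk, double or home run
    counts as a strikeout.
    """
    outs, runs = 0, 0
    bases = [False, False, False]  # first, second, third
    while outs < 3:
        roll = rng.random()
        if roll < bb_rate:
            # Walk: runners advance only when forced.
            if all(bases):
                runs += 1
            elif bases[0] and bases[1]:
                bases[2] = True
            elif bases[0]:
                bases[1] = True
            bases[0] = True
        elif roll < bb_rate + double_rate:
            # Double: per the puzzle, every runner on base scores.
            runs += sum(bases)
            bases = [False, True, False]
        elif roll < bb_rate + double_rate + hr_rate:
            # Home run: every runner and the batter score.
            runs += sum(bases) + 1
            bases = [False, False, False]
        else:
            outs += 1
    return runs


def average_runs(bb_rate, double_rate, hr_rate, innings=20_000, seed=0):
    """Average runs per half-inning over many simulated half-innings."""
    rng = random.Random(seed)
    total = sum(
        simulate_half_inning(bb_rate, double_rate, hr_rate, rng)
        for _ in range(innings)
    )
    return total / innings


# (bb_rate, double_rate, hr_rate) for each team, from the puzzle statement
rates = {
    "Moonwalkers": (0.4, 0.0, 0.0),
    "Doubloons": (0.0, 0.2, 0.0),
    "Taters": (0.0, 0.0, 0.1),
}
for name, team_rates in rates.items():
    print(name, average_runs(*team_rates))
```

Passing the random number generator in as an argument keeps the helper easy to test with a scripted sequence of rolls.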

In [1]:
from rlb_definitions import Team, Game, make_random_team, simulation
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
# Riddler League Baseball Specifications:
moonwalkers = Team(name='Moonwalkers', k_rate=.6, bb_rate=.4, hr_rate=0, double_rate=0)
doubloons = Team(name='Doubloons', k_rate=.8, bb_rate=0, hr_rate=0, double_rate=.2)
taters = Team(name='Taters', k_rate=.9, bb_rate=0, hr_rate=.1, double_rate=0)
rlb_league = [moonwalkers, doubloons, taters]

In [3]:
rlb_results = simulation(teams=rlb_league, num=10_000)
rlb_results_df = pd.DataFrame(rlb_results)

In [4]:
rlb_results_df.head()

| | away | home | away_score | home_score | total_innings | winner | loser | extra_innings |
|---|---|---|---|---|---|---|---|---|
| 0 | Moonwalkers | Doubloons | 2 | 1 | 9.0 | Moonwalkers | Doubloons | False |
| 1 | Moonwalkers | Taters | 6 | 5 | 16.0 | Moonwalkers | Taters | True |
| 2 | Doubloons | Moonwalkers | 3 | 1 | 9.0 | Doubloons | Moonwalkers | False |
| 3 | Doubloons | Taters | 2 | 4 | 8.5 | Taters | Doubloons | False |
| 4 | Taters | Moonwalkers | 4 | 6 | 10.0 | Moonwalkers | Taters | True |


In [5]:
def winning_percentage(team, results_df):
    games_played = results_df[(results_df['away'] == team) | (results_df['home'] == team)].shape[0]
    games_won = results_df[results_df['winner'] == team].shape[0]
    return games_won / games_played

In [6]:
print(f"Taters' winning percentage: {winning_percentage('Taters', rlb_results_df)}")
print(f"Moonwalkers' winning percentage: {winning_percentage('Moonwalkers', rlb_results_df)}")
print(f"Doubloons' winning percentage: {winning_percentage('Doubloons', rlb_results_df)}")

Taters' winning percentage: 0.572275
Moonwalkers' winning percentage: 0.535125
Doubloons' winning percentage: 0.3926
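
To get a feel for how much of the gap between these winning percentages could be simulation noise, here is a rough sketch of a binomial standard error. It assumes each team played 40,000 games (which is what the run totals and per-game averages later in the notebook imply) and treats games as independent coin flips, so it is only an approximation. The helper name `win_pct_standard_error` is hypothetical.

```python
import math


def win_pct_standard_error(win_pct, games_played):
    """Binomial standard error of an observed winning percentage."""
    return math.sqrt(win_pct * (1 - win_pct) / games_played)


# Assumption: 40,000 games per team in the simulation above.
games_per_team = 40_000
observed = {"Taters": 0.572275, "Moonwalkers": 0.535125, "Doubloons": 0.3926}

for team, pct in observed.items():
    se = win_pct_standard_error(pct, games_per_team)
    print(f"{team}: {pct:.4f} +/- {se:.4f}")
```

Under these assumptions each standard error is on the order of 0.0025, which is small next to the roughly 0.037 gap between the Taters and the Moonwalkers.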


In [7]:
def runs_scored(team, results_df):
    team_away = (results_df['away'] == team)
    team_home = (results_df['home'] == team)
    games_played = results_df[team_away | team_home].shape[0]
    runs_scored = results_df[team_away]['away_score'].sum() + results_df[team_home]['home_score'].sum()
    return runs_scored, runs_scored / games_played

In [8]:
def runs_against(team, results_df):
    team_away = (results_df['away'] == team)
    team_home = (results_df['home'] == team)
    games_played = results_df[team_away | team_home].shape[0]
    opp_runs_scored = results_df[team_away]['home_score'].sum() + results_df[team_home]['away_score'].sum()
    return opp_runs_scored, opp_runs_scored / games_played

In [9]:
def run_differential(team, results_df):
    scored_total, scored_per_game = runs_scored(team, results_df)
    against_total, against_per_game = runs_against(team, results_df)
    return scored_total - against_total, scored_per_game - against_per_game

In [10]:
print(f"Taters scored {runs_scored('Taters', rlb_results_df)[0]} runs ({runs_scored('Taters', rlb_results_df)[1]} per game)")
print(f"Moonwalkers scored {runs_scored('Moonwalkers', rlb_results_df)[0]} runs ({runs_scored('Moonwalkers', rlb_results_df)[1]} per game)")
print(f"Doubloons scored {runs_scored('Doubloons', rlb_results_df)[0]} runs ({runs_scored('Doubloons', rlb_results_df)[1]} per game)")

Taters scored 121122 runs (3.02805 per game)
Moonwalkers scored 137079 runs (3.426975 per game)
Doubloons scored 96960 runs (2.424 per game)
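
As a sanity check on these per-game averages, expected runs per half-inning can also be worked out directly. This is a sketch under stated assumptions, not part of `rlb_definitions`. The number of successful plate appearances before the third strikeout follows a negative binomial distribution. Every Taters home run scores one run, the first Doubloons double only puts a runner on second, and the first three Moonwalkers walks only load the bases. The helper `expected_runs_per_inning` and its `stranded` parameter are hypothetical names for that idea.

```python
from math import comb


def expected_runs_per_inning(p_success, stranded, max_k=200):
    """Expected runs in a half-inning.

    p_success: chance a batter does not strike out.
    stranded:  number of successes that put runners on without scoring.
    max_k:     truncation point for the infinite sum (the tail is negligible here).
    """
    q = 1 - p_success
    total = 0.0
    for k in range(max_k + 1):
        # P(exactly k successes before the third strikeout)
        prob = comb(k + 2, 2) * q**3 * p_success**k
        total += prob * max(k - stranded, 0)
    return total


teams = {
    "Taters": (0.1, 0),
    "Doubloons": (0.2, 1),
    "Moonwalkers": (0.4, 3),
}
for name, (p, stranded) in teams.items():
    per_inning = expected_runs_per_inning(p, stranded)
    print(f"{name}: {per_inning:.4f} per inning, {9 * per_inning:.3f} per nine innings")
```

This works out to about 3.00, 2.36 and 3.36 runs per nine innings for the Taters, Doubloons and Moonwalkers. The simulated averages are in the same range but not identical, which is expected: games can run to extra innings, and the home team does not bat in the bottom of the ninth when it is already ahead.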


In [11]:
print(f"Taters scored {runs_scored('Taters', rlb_results_df)[0]} runs and gave up {runs_against('Taters', rlb_results_df)[0]} runs ({run_differential('Taters', rlb_results_df)[0]} differential)")
print(f"Moonwalkers scored {runs_scored('Moonwalkers', rlb_results_df)[0]} runs and gave up {runs_against('Moonwalkers', rlb_results_df)[0]} runs ({run_differential('Moonwalkers', rlb_results_df)[0]} differential)")
print(f"Doubloons scored {runs_scored('Doubloons', rlb_results_df)[0]} runs and gave up {runs_against('Doubloons', rlb_results_df)[0]} runs ({run_differential('Doubloons', rlb_results_df)[0]} differential)")

Taters scored 121122 runs and gave up 116252 runs (4870 differential)
Moonwalkers scored 137079 runs and gave up 109017 runs (28062 differential)
Doubloons scored 96960 runs and gave up 129892 runs (-32932 differential)


In [12]:
# Fraction of games that went to extra innings
rlb_results_df['extra_innings'].sum() / rlb_results_df.shape[0]

0.13688333333333333

In [13]:
# Add in a random team with 75% strikeout rate
win_totals = []
for _ in range(1_000):
    rlb_plus_one_random = [moonwalkers, doubloons, taters, make_random_team(name='Randos', k_rate=0.75)]
    rlb_plus_randos_results = simulation(teams=rlb_plus_one_random, num=40)
    rlb_plus_randos_results_df = pd.DataFrame(rlb_plus_randos_results)
    # Each team's share of all games won in this run (not its own winning percentage)
    sim_win_pcts = dict(rlb_plus_randos_results_df['winner'].value_counts(normalize=True))
    win_totals.append(sim_win_pcts)
random_win_totals = pd.DataFrame(win_totals)

In [14]:
# The Taters are juiced
random_win_totals.mean()

Taters         0.295083
Moonwalkers    0.278765
Doubloons      0.220862
Randos         0.205290
dtype: float64

It would probably be worth running more simulations for the "Random Team", but the Taters are still the champs!