# Value of Plays Using Run Expectancy

## The Run Expectancy Matrix

The **run expectancy matrix** is a fundamental sabermetrics tool: for each game state (a combination of occupied bases and outs), it gives the average number of runs scored from that state through the end of the half-inning.

**Key concepts:**
- 8 possible base states (each of 3 bases can be empty or occupied: 2³ = 8)
- 3 possible out states (0, 1, or 2 outs)
- **24 total game states** (8 × 3 = 24)
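
The count can be checked by enumerating the states directly. This is an illustrative sketch only: the `(first, second, third)` tuple encoding and the variable names are assumptions, not part of the analysis that follows.

```python
from itertools import product

# Each base is either empty (0) or occupied (1): 2**3 = 8 base states
base_states = list(product([0, 1], repeat=3))

# Outs before the play: 0, 1, or 2
out_states = [0, 1, 2]

# Every (bases, outs) pair is one game state
game_states = [(bases, outs) for bases in base_states for outs in out_states]

print(len(base_states))  # 8
print(len(game_states))  # 24
```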

## Runs Scored in the Remainder of the Inning

$$\text{runs}_{\text{roi}} = \text{runs}_{\text{Total in Inning}} - \text{runs}_{\text{So far in Inning}}$$

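This is the number of runs scored from the current game state through the end of the half-inning. In the code below it is computed from cumulative game scores: the combined score at the end of the half-inning (`MAX_RUNS`) minus the combined score before the play (`RUNS_BEFORE`).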
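
As a worked example with made-up numbers, suppose a half-inning has four plays on which 0, 2, 0, and 1 runs score. The sketch below applies the formula to each play; the function name `runs_roi` and the input list are hypothetical, and runs scored on a play count toward that play's remainder because "so far" is measured before the play.

```python
def runs_roi(runs_per_play):
    """Runs scored from each play through the end of the half-inning."""
    total_in_inning = sum(runs_per_play)
    remaining = []
    so_far = 0  # runs scored in this inning before the current play
    for runs in runs_per_play:
        remaining.append(total_in_inning - so_far)
        so_far += runs
    return remaining

print(runs_roi([0, 2, 0, 1]))  # [3, 3, 1, 1]
```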

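```python
import pandas as pd

# Column names for the play-by-play event file are stored in a separate CSV
fields = pd.read_csv('../data/fields.csv')
headers = fields['Header'].tolist()

# Supply the column names explicitly when reading the 2016 event file
retro2016 = pd.read_csv('../data/all2016.csv', names=headers, low_memory=False)
```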

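```python
# Combined score of both teams before each play
retro2016['RUNS_BEFORE'] = retro2016['AWAY_SCORE_CT'] + retro2016['HOME_SCORE_CT']

# Unique key for each half-inning: game, inning number, and whether the home team bats
retro2016['HALF_INNING'] = (
    retro2016[['GAME_ID', 'INN_CT', 'BAT_HOME_ID']].astype(str).agg(' '.join, axis=1)
)
```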

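```python
# A destination ID above 3 means the batter or runner scored on the play.
# Sum the comparisons row-wise so that every scoring runner is counted;
# chaining `+` on boolean Series would act as a logical OR instead.
dest_cols = ['BAT_DEST_ID', 'RUN1_DEST_ID', 'RUN2_DEST_ID', 'RUN3_DEST_ID']
retro2016['RUNS_SCORED'] = retro2016[dest_cols].gt(3).sum(axis=1)
```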
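
One detail worth checking when counting runs per play: in pandas, `+` between boolean Series behaves as a logical OR, so a play on which several runners score would still count as one. A minimal sketch on a hypothetical three-play frame (the values are invented; the column names follow the event file) compares that with a row-wise sum:

```python
import pandas as pd

# Hypothetical plays: a grand slam, a solo home run, and a strikeout
plays = pd.DataFrame({
    'BAT_DEST_ID':  [4, 4, 0],
    'RUN1_DEST_ID': [4, 0, 0],
    'RUN2_DEST_ID': [4, 0, 0],
    'RUN3_DEST_ID': [4, 0, 0],
})

# Adding boolean Series keeps the bool dtype (True + True is True)
naive = (plays['BAT_DEST_ID'] > 3) + (plays['RUN1_DEST_ID'] > 3)

# Summing the comparisons across columns counts every runner who scored
runs_scored = plays.gt(3).sum(axis=1)

print(naive.tolist())        # [True, True, False]
print(runs_scored.tolist())  # [4, 1, 0]
```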

```python
# Summarize each half-inning: outs recorded, runs scored, and score at its start
half_innings = retro2016.groupby('HALF_INNING').agg(
    OUTS_INNING=('EVENT_OUTS_CT', 'sum'),
    RUNS_INNING=('RUNS_SCORED', 'sum'),
    RUNS_START=('RUNS_BEFORE', 'first'),
).reset_index()

# Combined score at the end of the half-inning
half_innings['MAX_RUNS'] = half_innings['RUNS_INNING'] + half_innings['RUNS_START']
```

```python
# Attach the half-inning totals to every play
retro2016 = retro2016.merge(half_innings, on='HALF_INNING', how='inner')

# Runs scored from each play through the end of the half-inning
retro2016['RUNS_ROI'] = retro2016['MAX_RUNS'] - retro2016['RUNS_BEFORE']
```
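
To see the whole calculation on something small enough to verify by hand, here is a sketch on a hypothetical half-inning of four plays. The data are invented, and `groupby(...).transform` is used as a compact alternative to the aggregate-then-merge steps, not as the method used in this analysis.

```python
import pandas as pd

# Invented half-inning: the combined score is 3 when it starts
plays = pd.DataFrame({
    'HALF_INNING': ['GAME1 1 0'] * 4,
    'RUNS_BEFORE': [3, 3, 5, 5],
    'RUNS_SCORED': [0, 2, 0, 1],
})

grouped = plays.groupby('HALF_INNING')

# Combined score at the end of the half-inning, broadcast to every play
plays['MAX_RUNS'] = (
    grouped['RUNS_BEFORE'].transform('first')
    + grouped['RUNS_SCORED'].transform('sum')
)

# Runs scored from each play through the end of the half-inning
plays['RUNS_ROI'] = plays['MAX_RUNS'] - plays['RUNS_BEFORE']

print(plays['RUNS_ROI'].tolist())  # [3, 3, 1, 1]
```

The first two plays each have three runs still to come, and the last two have one, which matches a hand count of the `RUNS_SCORED` column.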
