# Massey's Method, Offense and Defense

Massey's Method rates teams by solving the following system of linear equations:

$Mr = p$,

where,

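- $M$ is an $n \times n$ matrix, where $n$ is the number of teams,
  - each $M_{ii}$ is the number of games played by the i-th team
  - each $M_{ij}$, $i \neq j$, is the negative of the number of games played by the i-th team against the j-th team
- $r$ is the vector of ratings we are trying to estimate, and
- $p$ is the vector of point differentials, where $p_i$ is the i-th team's total point differential across the games it played.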
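
To make the definitions concrete, here is a minimal sketch that builds $M$ and $p$ from a list of game results. The three teams and their scores are hypothetical and only serve as an illustration; the names `games`, `teams` and `idx` are introduced here and are not part of the notebook code.

```python
import numpy as np

# Hypothetical results: (team 1, team 2, score 1, score 2)
games = [("A", "B", 30, 20), ("B", "C", 25, 15), ("A", "C", 28, 21)]

teams = sorted({t for g in games for t in g[:2]})
idx = {t: i for i, t in enumerate(teams)}
n = len(teams)

M = np.zeros((n, n))
p = np.zeros(n)
for t1, t2, s1, s2 in games:
    i, j = idx[t1], idx[t2]
    # each game adds one to both teams' diagonal entries
    M[i, i] += 1
    M[j, j] += 1
    # and subtracts one from the two head-to-head entries
    M[i, j] -= 1
    M[j, i] -= 1
    # point differential from each team's point of view
    p[i] += s1 - s2
    p[j] += s2 - s1

print(M)
print(p)
```

Every row of $M$ sums to zero, so $M$ is singular. This is why the notebook code further down replaces the last row of $M$ with ones before solving.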

We can break down $M$, $r$ and $p$ into individual parts <cite data-cite="2012:langville"></cite>.

We can break down $M$ as follows.

$M = T - P$,

where,

- $T$ is a diagonal matrix and each $T_{ii}$ is the number of games played by the i-th team, and
- $P$ is a matrix with a zero diagonal where each $P_{ij}$, $i \neq j$, is the number of games played by the i-th team against the j-th team.
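
As a small illustration of the split, the sketch below recovers $T$ and $P$ from a Massey matrix. The matrix is a hypothetical three-team round robin, not data from the notebook.

```python
import numpy as np

# Hypothetical Massey matrix: 3 teams, each pair has played once
M = np.array([[2, -1, -1],
              [-1, 2, -1],
              [-1, -1, 2]], dtype=float)

# T keeps only the diagonal (games played per team)
T = np.diag(np.diag(M))
# P holds the head-to-head counts, so that M = T - P
P = T - M

print(T)
print(P)
```

Each row of $P$ sums to the matching diagonal entry of $T$, because every game a team plays is against exactly one opponent.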

We can break down $r$ as follows.

$r = o + d$,

where,

- $o$ is the offensive rating, and
- $d$ is the defensive rating.

We can break down $p$ as follows.

$p = f - a$,

where,

- $f$ is the vector of total points scored `for` each team, and
- $a$ is the vector of total points scored `against` each team.
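
The sketch below computes $f$, $a$ and $p$ for a hypothetical set of three games (the teams and scores are made up for illustration). Note that `get_fap` in the notebook code further down stores the `against` totals as negative numbers and adds them, which yields the same differential.

```python
import numpy as np

# Hypothetical results: (team 1, team 2, score 1, score 2)
games = [("A", "B", 30, 20), ("B", "C", 25, 15), ("A", "C", 28, 21)]

teams = sorted({t for g in games for t in g[:2]})
idx = {t: i for i, t in enumerate(teams)}

f = np.zeros(len(teams))  # points for
a = np.zeros(len(teams))  # points against
for t1, t2, s1, s2 in games:
    i, j = idx[t1], idx[t2]
    f[i] += s1
    a[i] += s2
    f[j] += s2
    a[j] += s1

p = f - a
print(f, a, p)
```

Every point scored for one team is a point against another, so $f$ and $a$ have the same total and the entries of $p$ sum to zero.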

With some algebraic manipulation, we can estimate $d$ and then $o$ <cite data-cite="2012:langville"></cite>. To estimate $d$, we need to solve the following.

$(T + P)d = Tr - f$
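
One way to reach the equation above is sketched here. It assumes that $Mr = p$ is split into one equation for points scored and one for points allowed; this split is an assumption of the sketch and is not stated in the text above.

$To - Pd = f$ and $Po - Td = a$.

Subtracting the second equation from the first gives $(T - P)(o + d) = f - a$, which is $Mr = p$ again. Substituting $o = r - d$ into the first equation gives

$T(r - d) - Pd = f$,

$Tr - (T + P)d = f$,

$(T + P)d = Tr - f$.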

Once we have $d$, then $o$ is computed as

$o = r - d$.
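
The full chain can be sketched on a hypothetical three-team example. This sketch uses `np.linalg.solve` rather than the `LinearRegression` model used in the notebook code further down, which fits an intercept by default and so may not reproduce an exact solve. It also follows the usual convention of replacing the last row of $M$ with ones and the last entry of $p$ with zero, so that the ratings sum to zero. All numbers are made up for illustration.

```python
import numpy as np

# Hypothetical 3-team round robin
T = np.diag([2.0, 2.0, 2.0])
P = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
f = np.array([58.0, 45.0, 36.0])  # points for
a = np.array([41.0, 45.0, 53.0])  # points against

M = T - P
p = f - a

# M is singular, so force the ratings to sum to zero:
# last row of M becomes ones, last entry of p becomes 0
M_hat = M.copy()
M_hat[-1, :] = 1.0
p_hat = p.copy()
p_hat[-1] = 0.0
r = np.linalg.solve(M_hat, p_hat)

# defensive ratings from (T + P)d = Tr - f, then offense from r = o + d
d = np.linalg.solve(T + P, T @ r - f)
o = r - d

print(r, o, d)
```

With an exact solve, $o$ and $d$ satisfy $To - Pd = f$ and $Po - Td = a$, which gives a quick sanity check on any implementation.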

## NCAAF, 2005

Here's the advanced version of Massey's Method applied to the ACC (NCAAF) 2005 season.

In [1]:
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression

def get_ncaaf():
    return pd.read_csv('./ranking/acc-2005-ncaaf.csv')

def get_teams(df):
    return sorted(list(set(df.t1) | set(df.t2)))

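def get_fap(df):
    # total points scored by team t, whether it is listed as t1 or t2
    def get_f(t):
        return df[df.t1 == t].s1.sum() + df[df.t2 == t].s2.sum()
    
    # total points allowed by team t, returned as a negative number
    def get_a(t):
        return -(df[df.t1 == t].s2.sum() + df[df.t2 == t].s1.sum())
    
    teams = get_teams(df)
    x = pd.DataFrame([{'for': get_f(t), 'against': get_a(t)} for t in teams], index=teams)
    # 'against' is already negated, so the sum below is p = f - a
    x['differential'] = x['for'] + x['against']
    return x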

def get_M(df):
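    def get_games_played(t1, t2):
        if t1 == t2:
            # diagonal entry: all games the team played, listed as t1 or t2
            return df[(df.t1 == t1) | (df.t2 == t1)].shape[0]
        else:
            # off-diagonal entry: negative count of head-to-head games
            q1 = (df.t1 == t1) & (df.t2 == t2)
            q2 = (df.t1 == t2) & (df.t2 == t1)
            q = q1 | q2
            return -df[q].shape[0]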

    teams = get_teams(df)
    mat = [[get_games_played(t1, t2) for t2 in teams] for t1 in teams]
    mat = pd.DataFrame(mat, index=teams, columns=teams)
    return mat

def get_MTP(df):
    M = get_M(df)
    
    teams = get_teams(df)
    T = pd.DataFrame(np.diag(pd.Series(np.diag(M))), index=teams, columns=teams)
    P = T - M
    return M, T, P

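def get_r(df):
    M = get_M(df)
    # M is singular (each row sums to zero), so the last row is replaced with ones
    M.iloc[-1,:] = 1
    
    # the standard formulation also sets the last entry of p to 0 so that the
    # ratings sum to zero; p is used unchanged here
    p = get_fap(df).differential
    
    # LinearRegression is used as a linear solver; note that it fits an
    # intercept by default (fit_intercept=True)
    model = LinearRegression()
    model.fit(M, p)
    
    return pd.Series(model.coef_, index=M.index)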

def get_ratings(df):
    f = get_fap(df)['for']
    r = get_r(df)
    _, T, P = get_MTP(df)
    
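    # set up (T + P)d = Tr - f
    X = T + P
    y = T.dot(r) - f
    
    # the fitted coefficients are taken as d (an intercept is fitted by default)
    model = LinearRegression()
    model.fit(X, y)
    d = pd.Series(model.coef_, index=X.index)
    # r = o + d, so o = r - d
    o = r - d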
    
    return pd.DataFrame({
        'r': r,
        'o': o,
        'd': d
    })

def get_rankings(df):
    return pd.DataFrame({c: df[c].sort_values(ascending=False).index for c in df.columns})

Pretty neat. Miami takes the first spot overall and on offense, while VT takes the first spot on defense.

In [2]:
get_ratings(get_ncaaf())

| team | r | o | d |
|---|---|---|---|
| Duke | -6.8 | 9.2 | -16.0 |
| Miami | 36.2 | 29.2 | 7.0 |
| UNC | 10.0 | 8.6 | 1.4 |
| UVA | 14.6 | 15.066667 | -0.466667 |
| VT | 36.0 | 27.933333 | 8.066667 |


In [3]:
get_rankings(get_ratings(get_ncaaf()))

| rank | r | o | d |
|---|---|---|---|
| 0 | Miami | Miami | VT |
| 1 | VT | VT | Miami |
| 2 | UVA | UVA | UNC |
| 3 | UNC | Duke | UVA |
| 4 | Duke | UNC | Duke |
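
To make the link between ratings and rankings concrete, the sketch below rebuilds the NCAAF rankings from the rounded rating values printed for `get_ratings(get_ncaaf())`. It uses the same per-column descending sort as `get_rankings`; the name `ratings_printed` is introduced here for illustration.

```python
import pandas as pd

# Ratings copied from the printed NCAAF output (rounded as shown there)
ratings_printed = pd.DataFrame(
    {
        "r": [-6.8, 36.2, 10.0, 14.6, 36.0],
        "o": [9.2, 29.2, 8.6, 15.066667, 27.933333],
        "d": [-16.0, 7.0, 1.4, -0.466667, 8.066667],
    },
    index=["Duke", "Miami", "UNC", "UVA", "VT"],
)

# For each column, list the teams from highest to lowest rating
rankings = pd.DataFrame(
    {
        c: ratings_printed[c].sort_values(ascending=False).index.tolist()
        for c in ratings_printed.columns
    }
)
print(rankings)
```

Each column is sorted independently, so a row of the rankings table does not describe a single team; it lists whichever team holds that position for each rating.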


## NBA, 2021

Here's the method applied to the NBA 2021 season up to Thanksgiving.

In [4]:
def get_nba():
    x = pd.read_csv('./nba/2021.csv')\
        .rename(columns={
            'a_team': 't1', 
            'h_team': 't2', 
            'a_score': 's1', 
            'h_score': 's2'})
    x = x[x.preseason == False]\
        .drop(columns=['preseason'])\
        .reset_index(drop=True)
    return x

In [5]:
get_ratings(get_nba())

| team | r | o | d |
|---|---|---|---|
| 76ers | 2.100543 | 3.144971 | -1.044428 |
| Bucks | 1.999356 | 5.413072 | -3.413715 |
| Bulls | 4.235154 | 8.977294 | -4.74214 |
| Cavaliers | 1.429301 | -3.493474 | 4.922775 |
| Celtics | 0.579776 | 1.640658 | -1.060882 |
| Clippers | 3.79229 | -3.760822 | 7.553112 |
| Grizzlies | -4.525625 | -0.503283 | -4.022343 |
| Hawks | 3.338752 | 11.853671 | -8.514919 |
| Heat | 6.488263 | 3.167715 | 3.320548 |
| Hornets | 1.227789 | 18.738008 | -17.510219 |


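It's interesting to note that the Warriors' success is driven by their defense and not their offense: they rank first overall and first on defense, but do not appear among the ten offensive rankings shown. Steph Curry?!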

In [6]:
get_rankings(get_ratings(get_nba()))

| rank | r | o | d |
|---|---|---|---|
| 0 | Warriors | Hornets | Warriors |
| 1 | Jazz | Lakers | Mavericks |
| 2 | Suns | Hawks | Nuggets |
| 3 | Heat | Bulls | Clippers |
| 4 | Nets | Pelicans | Spurs |
| 5 | Bulls | Suns | Thunder |
| 6 | Clippers | Pacers | Pistons |
| 7 | Hawks | Trail Blazers | Cavaliers |
| 8 | Pacers | Bucks | Jazz |
| 9 | Trail Blazers | Kings | Heat |


## NFL, 2021

Here's the method applied to the NFL 2021 season up to Thanksgiving.

In [7]:
def get_nfl():
    x = pd.read_csv('./nfl/2021.csv')\
        .rename(columns={
            'team1': 't1', 
            'team2': 't2', 
            'score1': 's1', 
            'score2': 's2'})\
        .drop(columns=['week'])
    x['t1'] = x['t1'].apply(lambda s: s.strip())
    x['t2'] = x['t2'].apply(lambda s: s.strip())
    
    return x

In [8]:
get_ratings(get_nfl())

| team | r | o | d |
|---|---|---|---|
| 49ers | 1.565063 | -1.060366 | 2.625429 |
| Bears | -9.32254 | -8.056754 | -1.265787 |
| Bengals | -0.607977 | -1.024311 | 0.416334 |
| Bills | 8.425259 | 4.682797 | 3.742462 |
| Broncos | -2.431469 | -7.665542 | 5.234073 |
| Browns | -2.723871 | -0.667741 | -2.05613 |
| Buccaneers | 6.321022 | 4.566463 | 1.754559 |
| Cardinals | 8.235687 | 5.565128 | 2.670558 |
| Chargers | -0.551756 | -0.169482 | -0.382274 |
| Chiefs | 2.706593 | 1.612262 | 1.09433 |


The Broncos have the best defense, but one of the worst offenses!

In [9]:
get_rankings(get_ratings(get_nfl()))

| rank | r | o | d |
|---|---|---|---|
| 0 | Bills | Cowboys | Broncos |
| 1 | Cardinals | Cardinals | Seahawks |
| 2 | Patriots | Bills | Patriots |
| 3 | Buccaneers | Buccaneers | Bills |
| 4 | Cowboys | Colts | Cardinals |
| 5 | Colts | Eagles | 49ers |
| 6 | Chiefs | Titans | Packers |
| 7 | Titans | Patriots | Panthers |
| 8 | Rams | Chiefs | Buccaneers |
| 9 | Eagles | Rams | Chiefs |
