# Ranking ACC Teams from 2018-19 Season

### Introduction
All 15 ACC teams play in the ACC men's basketball tournament, and seeds are determined by conference record. While conference record is an adequate way to rank these teams, we will create a statistic that aims to be more predictive of success in the ACC tournament. Below we use box score data from the 2018-19 season to seed ACC teams for the 2019 ACC tournament.

We will create an adjusted version of plus/minus to rank each ACC team during the 2018-19 season. The normal plus/minus statistic is the point differential between a team and its opponents over a season. For example, if team A ends the season with a +20 plus/minus, then team A scored 20 more total points than its opponents scored against it. We will adjust this statistic to take into account both the quality of the competition team A played and whether each game was home or away. Note that we rank the teams based on the 18 games each played in the season before the tournament.

### Explanation of Normal Plus/Minus Statistic
Let's say team A played 5 games in a season, with the plus/minus of those games listed here: +10, -10, +12, -3, -3. Team A won the first game by 10, lost the second by 10, and so on. We take the sum of those numbers to find that team A had a +6 plus/minus over the season.
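
The sum above can be checked in a few lines of Python. The list holds the five hypothetical game margins from the example; the variable names are illustrative.

```python
# Per-game point differentials for the hypothetical team A in the example above
game_margins = [10, -10, 12, -3, -3]

# Season plus/minus is the sum of the per-game margins
season_plus_minus = sum(game_margins)
print(season_plus_minus)  # 6
```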

### Explanation of Adjusted Plus/Minus Statistic
Let's say team A had a plus/minus of +1 for a home game and a plus/minus of +1 for an away game. All else being equal, winning by 1 point on the road is intuitively more difficult than winning by 1 point at home. Below we also show that the home team beat the away team by 1.28 points on average. Therefore, home-court advantage inflates plus/minus, and we should weight team A's +1 home-game win less than its +1 away-game win. Standardizing home-game and away-game plus/minus separately (subtracting each group's mean and dividing by its standard deviation) accomplishes this.

Let's denote standardized plus/minus as spm. Now let's say that team A had an spm of +1 for a home game and an spm of +1 for an away game. Unlike the regular plus/minus statistic, +1 spm has the same interpretation regardless of whether the game was played at home or away. All else being equal, achieving a home-game spm of +1 is just as difficult as achieving an away-game spm of +1.
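
As a minimal sketch of the standardization, the snippet below converts a one-point home win and a one-point road win to spm. It uses the mean and standard deviation of the home and away differentials that `describe()` reports later in this notebook; the helper name `standardize` is an illustrative assumption, not part of the notebook's code.

```python
# Summary statistics reported by describe() later in this notebook
HOME_MEAN, AWAY_MEAN = 1.281481, -1.281481
DIFF_STD = 14.477286  # home and away differentials have the same spread

def standardize(diff, mean, std):
    """Return the z-score of a single-game point differential."""
    return (diff - mean) / std

home_spm = standardize(1, HOME_MEAN, DIFF_STD)  # +1 win at home
away_spm = standardize(1, AWAY_MEAN, DIFF_STD)  # +1 win on the road

print(round(home_spm, 3), round(away_spm, 3))
```

Under these numbers the one-point home win lands slightly below zero while the one-point road win lands above zero, which is the reweighting the text describes.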

After standardizing each team's plus/minus for its individual games, we take into account the quality of the competition. Let's say team A beats Notre Dame by an spm of +1 in its first game of the season. Below we calculate that an average ACC team finished the 2018-19 season with 9 wins. Notre Dame finished the season with 3 wins, so Notre Dame had $\frac{1}{3}$ of the average number of wins. Now let $apm_i$ denote adjusted plus/minus and let $spm_i$ denote standardized plus/minus, where $i=1,2,\dots,18$ indexes the first, second, third, etc. game a team played in the season. Let $oppwins$ denote the opponent's number of wins and let $avewins$ denote the average number of wins in the ACC. Then we create the following formula:

For $spm_i \geq 0$, $apm_i = spm_i \cdot \frac{oppwins}{avewins}$

For $spm_i < 0$, $apm_i = spm_i \cdot \left(1+\frac{avewins-oppwins}{avewins}\right)$

Essentially, when the opponent's win total is x% above average, a non-negative $spm_i$ is scaled up by x% and a negative $spm_i$ is shrunk toward zero by x%. When the opponent's win total is x% below average, a non-negative $spm_i$ is scaled down by x% and a negative $spm_i$ becomes x% more negative. Therefore, $apm_i$ is weighted by the quality of the competition played.
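
The two cases can be sketched as a small function. The name `adjusted_plus_minus` and its parameter names are illustrative assumptions; the example reuses the Notre Dame numbers from above (3 wins against a conference average of 9).

```python
def adjusted_plus_minus(spm, opp_wins, ave_wins):
    """Weight a standardized plus/minus by opponent quality."""
    if spm >= 0:
        # Good results count for more against opponents with more wins
        return spm * opp_wins / ave_wins
    # Bad results are penalized more against opponents with fewer wins
    return spm * (1 + (ave_wins - opp_wins) / ave_wins)

# Notre Dame: 3 wins against a conference average of 9
win_vs_nd = adjusted_plus_minus(1.0, opp_wins=3, ave_wins=9)
loss_vs_nd = adjusted_plus_minus(-1.0, opp_wins=3, ave_wins=9)
print(round(win_vs_nd, 3), round(loss_vs_nd, 3))  # 0.333 -1.667
```

An spm of +1 against Notre Dame is worth only a third of a point, while an spm of -1 against the same opponent costs about 1.67 points.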

We rate each team by its total adjusted plus/minus over the season. Thus, each team's rating is $\sum_{i=1}^{18} apm_i$.

### Comparing Each Team's Predicted Rating to Actual Rating
In reality, these teams were seeded by their number of wins against other conference teams, with head-to-head records used as tie-breakers. The last table below shows that the predicted rating system had the exact same ranking as the actual system for the top 4 teams. The biggest difference between the two systems is that Georgia Tech is ranked 10th in the actual system but 14th in the predicted system. The predicted system pushed Georgia Tech to a lower ranking because of four key losses. Two losses by 20 or more points came against Clemson and Louisville. Both of those losses were at home, which exacerbated their effect in the predicted system. Georgia Tech also had two losses by 10 or more points to Miami and Notre Dame. These two teams had low win totals, which exacerbated the effects of those losses in the predicted system.
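
One way to make this kind of comparison concrete is to compute how far each team moves between the two orderings. The sketch below is an illustration only: it copies the first nine teams of each ordering from the comparison table at the end of this notebook (those nine rows contain the same teams in both columns), and the variable names are assumptions.

```python
import pandas as pd

# First nine teams of each ordering, copied from the comparison table
predicted = ["Virginia Cavaliers", "North Carolina Tar Heels", "Duke Blue Devils",
             "Florida State Seminoles", "Louisville Cardinals", "Virginia Tech Hokies",
             "Clemson Tigers", "North Carolina State Wolfpack", "Syracuse Orange"]
actual = ["Virginia Cavaliers", "North Carolina Tar Heels", "Duke Blue Devils",
          "Florida State Seminoles", "Virginia Tech Hokies", "Syracuse Orange",
          "Louisville Cardinals", "North Carolina State Wolfpack", "Clemson Tigers"]

# Map each team to its 1-based rank in each ordering
pred_rank = pd.Series(range(1, len(predicted) + 1), index=predicted)
act_rank = pd.Series(range(1, len(actual) + 1), index=actual)

# Positive shift: the predicted system ranks the team lower than the actual seeding
shift = (pred_rank - act_rank).sort_values(ascending=False)
print(shift)
```

Among these nine teams, Syracuse moves the most, dropping three places in the predicted ordering.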



### Import Packages and Clean Data

In [1]:
import os
import sqlite3
import pandas as pd
import numpy as np

In [2]:
con = sqlite3.connect("acc1819.db")

In [3]:
# create games dataframe
query = con.execute("SELECT * FROM games")
cols = [column[0] for column in query.description]
games = pd.DataFrame.from_records(data = query.fetchall(), columns = cols)

In [4]:
#create box_scores dataframe
query1 = con.execute("SELECT * FROM box_scores")
cols1 = [column[0] for column in query1.description]
box_scores = pd.DataFrame.from_records(data = query1.fetchall(), columns = cols1)
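
The two cells above build each DataFrame by hand from the cursor. As an alternative sketch, `pd.read_sql_query` runs the query and builds the DataFrame in one call. The example uses an in-memory database with a made-up two-row `games` table so that it runs on its own; the rows are illustrative and are not the notebook's data.

```python
import sqlite3
import pandas as pd

# In-memory stand-in for acc1819.db (illustrative rows only)
demo_con = sqlite3.connect(":memory:")
demo_con.execute(
    "CREATE TABLE games (GameId INTEGER, NeutralSite INTEGER, AwayTeam TEXT, HomeTeam TEXT)"
)
demo_con.executemany(
    "INSERT INTO games VALUES (?, ?, ?, ?)",
    [(1, 0, "Team A", "Team B"), (2, 1, "Team B", "Team A")],
)

# read_sql_query runs the query and takes column names from the result set
demo_games = pd.read_sql_query("SELECT * FROM games", demo_con)
demo_con.close()
print(demo_games)
```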

In [5]:
box_scores.head()

Unnamed: 0,GameId,Team,Home,Score,AST,TOV,STL,BLK,Rebounds,ORB,DRB,FGA,FGM,3FGM,3FGA,FTA,FTM,Fouls
0,1,Virginia Tech Hokies,1,81,19,7,5,1,24,2,22,55,33,11,18,5,4,13
1,1,Notre Dame Fighting Irish,0,66,13,11,2,5,30,13,17,56,23,13,34,13,7,10
2,2,North Carolina State Wolfpack,0,87,17,16,4,3,50,17,33,68,31,11,30,18,14,23
3,2,Miami (FL) Hurricanes,1,82,12,7,7,1,27,9,18,61,28,10,25,29,16,14
4,3,Duke Blue Devils,1,87,16,12,13,6,39,12,27,67,32,7,23,21,16,15


In [6]:
# filter out ACC tournament games
games1=games[games["NeutralSite"] == 0].reset_index(drop=True)

In [7]:
games1.head()

Unnamed: 0,GameId,GameDate,NeutralSite,AwayTeam,HomeTeam
0,1,1/1/2019 13:00,0,Notre Dame Fighting Irish,Virginia Tech Hokies
1,2,1/3/2019 19:00,0,North Carolina State Wolfpack,Miami (FL) Hurricanes
2,3,1/5/2019 3:27,0,Clemson Tigers,Duke Blue Devils
3,4,1/5/2019 12:00,0,Boston College Eagles,Virginia Tech Hokies
4,5,1/5/2019 12:00,0,Syracuse Orange,Notre Dame Fighting Irish


In [8]:
#merge away team stats to games1 table
games2 = games1.merge(box_scores,how="inner",left_on=['GameId','AwayTeam'],right_on=['GameId','Team'])[['GameId','GameDate','HomeTeam','AwayTeam','Score']].rename(columns={'Score':'AwayScore'}) 

In [9]:
#merge home team stats to games2 table
games3 = games2.merge(box_scores,how="inner",left_on=['GameId','HomeTeam'],right_on=['GameId','Team'])[['GameId','GameDate','AwayTeam','AwayScore','HomeTeam','Score']].rename(columns={'Score':'HomeScore'})

In [10]:
#create column with 1 if hometeam won, 0 otherwise
games3['HomeWin'] = (games3['HomeScore'] > games3['AwayScore']).astype(int)

#create column with 1 if awayteam won, 0 otherwise
games3['AwayWin'] = (games3['HomeScore'] < games3['AwayScore']).astype(int)

#create column for point differential for hometeam
games3['HomeDiff'] = games3['HomeScore'] - games3['AwayScore']

#create column for point differential for awayteam
games3['AwayDiff'] = games3['AwayScore'] - games3['HomeScore']

In [11]:
#games3 now contains a row for each game played and columns with point differentials
games3.head()

Unnamed: 0,GameId,GameDate,AwayTeam,AwayScore,HomeTeam,HomeScore,HomeWin,AwayWin,HomeDiff,AwayDiff
0,1,1/1/2019 13:00,Notre Dame Fighting Irish,66,Virginia Tech Hokies,81,1,0,15,-15
1,2,1/3/2019 19:00,North Carolina State Wolfpack,87,Miami (FL) Hurricanes,82,0,1,-5,5
2,3,1/5/2019 3:27,Clemson Tigers,68,Duke Blue Devils,87,1,0,19,-19
3,4,1/5/2019 12:00,Boston College Eagles,66,Virginia Tech Hokies,77,1,0,11,-11
4,5,1/5/2019 12:00,Syracuse Orange,72,Notre Dame Fighting Irish,62,0,1,-10,10


In [12]:
#prepare data to calculate total games each team won in the season
hometeam_wins = games3.groupby(['HomeTeam'])[['HomeWin']].sum().reset_index()
awayteam_wins = games3.groupby(['AwayTeam'])[['AwayWin']].sum().reset_index()

#calculate total wins each team had in the season
total_wins = hometeam_wins['HomeWin'] + awayteam_wins['AwayWin']

total_wins1 = pd.concat([awayteam_wins['AwayTeam'],pd.Series(total_wins)],axis=1).rename(columns={"AwayTeam":"Team",0:"Wins"})
total_wins1

Unnamed: 0,Team,Wins
0,Boston College Eagles,5
1,Clemson Tigers,9
2,Duke Blue Devils,14
3,Florida State Seminoles,13
4,Georgia Tech Yellow Jackets,6
5,Louisville Cardinals,10
6,Miami (FL) Hurricanes,5
7,North Carolina State Wolfpack,9
8,North Carolina Tar Heels,16
9,Notre Dame Fighting Irish,3
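
Adding `hometeam_wins['HomeWin']` and `awayteam_wins['AwayWin']` works here because both `groupby` results are sorted by team name and every team has both home and away games, so the rows line up by position. A more defensive sketch keeps the team name as the index so that pandas aligns the two sums by name. The small `demo` frame below is made up for illustration.

```python
import pandas as pd

# Made-up results: team C never plays at home
demo = pd.DataFrame({
    "HomeTeam": ["A", "B", "A"],
    "AwayTeam": ["B", "C", "C"],
    "HomeWin":  [1, 0, 1],
    "AwayWin":  [0, 1, 0],
})

# Sum wins per team, keeping the team name as the index
home_wins = demo.groupby("HomeTeam")["HomeWin"].sum()
away_wins = demo.groupby("AwayTeam")["AwayWin"].sum()

# add() aligns on team name; fill_value=0 covers teams missing from one side
wins = home_wins.add(away_wins, fill_value=0).astype(int)
wins = wins.rename_axis("Team").reset_index(name="Wins")
print(wins)
```

With name-based alignment, a team that is missing from one of the two groupings still receives the correct total instead of shifting every later row.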


In [13]:
#create column for wins that away team had in the season
games4 = games3.merge(total_wins1,how="left",left_on=["AwayTeam"],right_on=["Team"])

In [14]:
#create column for wins that home team had in the season
games5 = games4.merge(total_wins1,how="left",left_on=["HomeTeam"],right_on=["Team"])

In [15]:
#games6 now contains columns that list total wins the away team and the home team had in the season
games6 = games5.rename(columns={"Wins_x":"AwayWins","Wins_y":"HomeWins"}).drop(
    columns=['GameId','GameDate','AwayScore','HomeScore','HomeWin','AwayWin','Team_x','Team_y'])
games6.head()

Unnamed: 0,AwayTeam,HomeTeam,HomeDiff,AwayDiff,AwayWins,HomeWins
0,Notre Dame Fighting Irish,Virginia Tech Hokies,15,-15,3,12
1,North Carolina State Wolfpack,Miami (FL) Hurricanes,-5,5,9,5
2,Clemson Tigers,Duke Blue Devils,19,-19,9,14
3,Boston College Eagles,Virginia Tech Hokies,11,-11,5,12
4,Syracuse Orange,Notre Dame Fighting Irish,-10,10,10,3


In [16]:
#home team on average beat away team by 1.28 pts
games6[['HomeDiff','AwayDiff']].describe()

Unnamed: 0,HomeDiff,AwayDiff
count,135.0,135.0
mean,1.281481,-1.281481
std,14.477286,14.477286
min,-38.0,-30.0
25%,-8.0,-11.5
50%,3.0,-3.0
75%,11.5,8.0
max,30.0,38.0


In [17]:
#create column for mean wins the ACC teams had over the season
games6['mean_wins'] = total_wins.mean()
games6.head()

Unnamed: 0,AwayTeam,HomeTeam,HomeDiff,AwayDiff,AwayWins,HomeWins,mean_wins
0,Notre Dame Fighting Irish,Virginia Tech Hokies,15,-15,3,12,9.0
1,North Carolina State Wolfpack,Miami (FL) Hurricanes,-5,5,9,5,9.0
2,Clemson Tigers,Duke Blue Devils,19,-19,9,14,9.0
3,Boston College Eagles,Virginia Tech Hokies,11,-11,5,12,9.0
4,Syracuse Orange,Notre Dame Fighting Irish,-10,10,10,3,9.0


### Calculate Adjusted Plus Minus


In [18]:
#standardize home pt differential
games6['HomeDiffStand'] = (games6['HomeDiff']-games6['HomeDiff'].mean())/games6['HomeDiff'].std()
#standardize away pt differential
games6['AwayDiffStand'] = (games6['AwayDiff']-games6['AwayDiff'].mean())/games6['AwayDiff'].std()

In [19]:
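#adjusted plus/minus for the home team: weight its spm by the away team's wins relative to the mean
games6['HomeStat_sub'] = games6.apply(lambda x: x['HomeDiffStand']*(x['AwayWins']/x['mean_wins']) if x['HomeDiffStand']>=0
                                                                   else x['HomeDiffStand']*(2-(x['AwayWins']/x['mean_wins'])),axis=1)

#adjusted plus/minus for the away team: weight its spm by the home team's wins relative to the mean
games6['AwayStat_sub'] = games6.apply(lambda x: x['AwayDiffStand']*(x['HomeWins']/x['mean_wins']) if x['AwayDiffStand']>=0
                                                                   else x['AwayDiffStand']*(2-(x['HomeWins']/x['mean_wins'])),axis=1)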

In [20]:
#HomeDiffStand and AwayDiffStand are columns for spm of home and away team, respectively
games6.head()

Unnamed: 0,AwayTeam,HomeTeam,HomeDiff,AwayDiff,AwayWins,HomeWins,mean_wins,HomeDiffStand,AwayDiffStand,HomeStat_sub,AwayStat_sub
0,Notre Dame Fighting Irish,Virginia Tech Hokies,15,-15,3,12,9.0,0.947589,-0.947589,0.315863,-0.631726
1,North Carolina State Wolfpack,Miami (FL) Hurricanes,-5,5,9,5,9.0,-0.433885,0.433885,-0.433885,0.241047
2,Clemson Tigers,Duke Blue Devils,19,-19,9,14,9.0,1.223884,-1.223884,1.223884,-0.543948
3,Boston College Eagles,Virginia Tech Hokies,11,-11,5,12,9.0,0.671294,-0.671294,0.372941,-0.447529
4,Syracuse Orange,Notre Dame Fighting Irish,-10,10,10,3,9.0,-0.779254,0.779254,-0.69267,0.259751
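
As a cross-check, the home-side values in the table above can be reproduced without a row-wise `apply`. The sketch below copies `HomeDiffStand` and `AwayWins` from the five rows shown and uses `np.where` to pick the branch; note that `2 - ratio` is the same quantity as $1+\frac{avewins-oppwins}{avewins}$ in the formula. The frame name `rows` is an illustrative assumption.

```python
import numpy as np
import pandas as pd

# First five rows of the table above (home side only)
rows = pd.DataFrame({
    "HomeDiffStand": [0.947589, -0.433885, 1.223884, 0.671294, -0.779254],
    "AwayWins": [3, 9, 9, 5, 10],
})
mean_wins = 9.0

ratio = rows["AwayWins"] / mean_wins
# Non-negative spm is scaled by ratio, negative spm by (2 - ratio)
rows["HomeStat_sub"] = np.where(
    rows["HomeDiffStand"] >= 0,
    rows["HomeDiffStand"] * ratio,
    rows["HomeDiffStand"] * (2 - ratio),
)
print(rows["HomeStat_sub"].round(6).tolist())
```

The printed values should agree with the `HomeStat_sub` column above up to rounding.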


In [21]:
#sum adjusted plus/minus over each team's home games and away games
homestat_sub = games6.groupby(['HomeTeam'])['HomeStat_sub'].sum().reset_index()
awaystat_sub = games6.groupby(['AwayTeam'])['AwayStat_sub'].sum().reset_index()

In [22]:
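#season rating = home total + away total (rows align because both groupby results are sorted by team name)
adj_plus_minus = homestat_sub['HomeStat_sub'] + awaystat_sub['AwayStat_sub']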

In [23]:
adj_plus_minus1 = pd.concat([awaystat_sub['AwayTeam'],pd.Series(adj_plus_minus)],axis=1).rename(columns={"AwayTeam":"Team",0:"adj_plus_minus"})

In [24]:
#sort teams by rating (highest first) and add a 1-based Rank column
adj_plus_minus2 = adj_plus_minus1.sort_values(by=["adj_plus_minus"],ascending=False).reset_index(drop=True)
adj_plus_minus2.index = adj_plus_minus2.index + 1
adj_plus_minus3 = adj_plus_minus2.reset_index().rename(columns={"index":"Rank","adj_plus_minus":"Rating"})

In [25]:
adj_plus_minus3.to_csv("ACCRankings1819.csv",index=False)

In [26]:
actual_ranking = ['Virginia Cavaliers','North Carolina Tar Heels',
                 'Duke Blue Devils','Florida State Seminoles',
                 'Virginia Tech Hokies','Syracuse Orange',
                 'Louisville Cardinals','North Carolina State Wolfpack',
                 'Clemson Tigers','Georgia Tech Yellow Jackets',
                 'Boston College Eagles','Miami (FL) Hurricanes',
                 'Wake Forest Demon Deacons','Pittsburgh Panthers',
                 'Notre Dame Fighting Irish']
actual_rating = [16,16,14,13,12,10,10,9,9,6,5,5,4,3,3]

In [27]:
#compare our rankings to actual rankings
pd.concat((adj_plus_minus3[['Rank','Team','Rating']],pd.DataFrame(actual_rating,actual_ranking).reset_index()),axis=1).rename(columns={'Team':'Predicted Order','Rating':'Adj Plus Minus','index':'Actual Order',0:'Wins'})

Unnamed: 0,Rank,Predicted Order,Adj Plus Minus,Actual Order,Wins
0,1,Virginia Cavaliers,14.237373,Virginia Cavaliers,16
1,2,North Carolina Tar Heels,12.041113,North Carolina Tar Heels,16
2,3,Duke Blue Devils,8.427221,Duke Blue Devils,14
3,4,Florida State Seminoles,5.212195,Florida State Seminoles,13
4,5,Louisville Cardinals,5.165009,Virginia Tech Hokies,12
5,6,Virginia Tech Hokies,4.630554,Syracuse Orange,10
6,7,Clemson Tigers,2.679492,Louisville Cardinals,10
7,8,North Carolina State Wolfpack,0.513249,North Carolina State Wolfpack,9
8,9,Syracuse Orange,-0.291645,Clemson Tigers,9
9,10,Miami (FL) Hurricanes,-4.701561,Georgia Tech Yellow Jackets,6
