## **All Games Analysis** 

### **Data Structure**

- The cells below import the Python libraries used for the data analysis.
- The game data was exported as CSV files from the Hudl Data Exports folder in Google Drive and loaded into VSCode.

### **Goals**

- Our ultimate goal is to build a recommendation engine that identifies which training strategies are most conducive to improving team performance.
    - This will take time to refine, because the engine needs a large amount of practice and game data across several teams before it can provide consistently reliable recommendations.

- However, since we are working only with Plymouth High School for now, we will analyze their season in depth through several lenses.
    - We will continue to determine which variables matter most for winning in high school basketball.
    - Given those variables, we will check whether the team is improving in those areas over the course of the season.
    - Then we will use regression analysis to see whether the drills recorded in the practice training data align with the improvements the team is making.

### **Import Libraries and Game Data**

In [156]:
import pandas as pd
import matplotlib.pyplot as plt

DCD = pd.read_csv("DCD_game_data.csv")
Novi = pd.read_csv("Novi_game_data.csv")
Pioneer = pd.read_csv("Pioneer_game_data.csv")
Rochester = pd.read_csv("Rochester_game_data.csv")
SLE = pd.read_csv("SLE_game_data.csv")
Summit = pd.read_csv("Summit_game_data.csv")
num_games = 6

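The six `read_csv` calls follow one naming pattern, so they could also be driven from a list of opponent names. The sketch below is only an illustration: `load_game_files` and its `folder` argument are hypothetical names, and it assumes every export is named `<Opponent>_game_data.csv` as in the cell above.

```python
from pathlib import Path

import pandas as pd

# Opponent names taken from the file names used in the import cell.
OPPONENTS = ["Summit", "SLE", "Pioneer", "DCD", "Rochester", "Novi"]


def load_game_files(folder, opponents):
    """Read '<opponent>_game_data.csv' for each opponent into a dict of DataFrames."""
    folder = Path(folder)
    return {name: pd.read_csv(folder / f"{name}_game_data.csv") for name in opponents}


# Example call (assumes the CSV files sit in the working directory):
# frames = load_game_files(".", OPPONENTS)
# num_games = len(frames)
```

Deriving `num_games` from the number of loaded files would keep the count in step with the data if more games are added.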
### **Concatenating Dataframes into One**

In [157]:
Games = pd.concat([Summit, SLE, Pioneer, DCD, Rochester, Novi])
Games.index = range(1, (len(Games)+1))
Games.rename(columns = {"Period": "Opponent"}, inplace = True)
Games

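| | Opponent | eFG% | TO% | OREB% | DREB% | FTF | VPS | FGM | FGA | FG% | ... | SLOB% | PPSLOB | BLOB | BLOB% | PPBLOB | DEFL | STL | BLK | FOUL | CHG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Summit | 53.00% | 46.80% | 36.40% | 57.50% | 0.36 | 0.61 | 15 | 33 | 45.50% | ... | 25.00% | 0.5 | 6 | 0.00% | 0.0 | 6 | 4 | 1 | 16 | 0 |
| 2 | SLE | 47.70% | 20.20% | 51.40% | 73.70% | 0.23 | 1.2 | 29 | 64 | 45.30% | ... | 80.00% | 1.2 | 5 | 40.00% | 0.8 | 14 | 11 | 0 | 13 | 2 |
| 3 | Pioneer | 47.20% | 20.90% | 36.70% | 73.30% | 0.41 | 1.02 | 23 | 54 | 42.60% | ... | 20.00% | 0.4 | 10 | 20.00% | 0.3 | 13 | 7 | 2 | 13 | 2 |
| 4 | DCD | 51.90% | 21.60% | 44.80% | 69.70% | 0.54 | 1.07 | 25 | 52 | 48.10% | ... | 40.00% | 0.6 | 7 | 57.10% | 1.29 | 17 | 11 | 1 | 21 | 0 |
| 5 | Rochester | 52.00% | 14.30% | 34.50% | 75.00% | 0.37 | 0.99 | 25 | 51 | 49.00% | ... | 28.60% | 0.71 | 11 | 36.40% | 0.64 | 4 | 10 | 0 | 19 | 0 |
| 6 | Novi | 46.30% | 19.50% | 32.30% | 58.30% | 0.15 | 0.77 | 23 | 54 | 42.60% | ... | 0.00% | 0.0 | 6 | 83.30% | 1.5 | 5 | 4 | 2 | 19 | 0 |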

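The concatenation step stacks the per-game frames and renumbers the rows in game order. A minimal sketch of the same idea with `ignore_index=True` follows; the two stand-in frames hold only a few columns (values copied from the output above) and are not the full Hudl exports.

```python
import pandas as pd

# Stand-in single-row frames; the real frames come from the CSV exports.
summit = pd.DataFrame({"Period": ["Summit"], "FGM": [15], "FGA": [33]})
sle = pd.DataFrame({"Period": ["SLE"], "FGM": [29], "FGA": [64]})

# ignore_index=True discards each frame's own index and builds a fresh 0..n-1 index.
games = pd.concat([summit, sle], ignore_index=True)

# Shift to a 1-based index so the row label doubles as the game number.
games.index = games.index + 1

# The export labels the opponent column "Period"; rename it for readability.
games = games.rename(columns={"Period": "Opponent"})
print(games)
```

Returning a new frame from `rename` instead of using `inplace=True` keeps each step easy to re-run in a notebook.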

### **Dropping Duplicate and Unneeded Columns**

In [158]:
Games = Games.drop(["OREB%.1", "DREB%.1", "eFG%.1", "TP", "PPG", "TO%.1", "SLOB%", "BLOB%", "CHG"], axis = 1)
Games

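| | Opponent | eFG% | TO% | OREB% | DREB% | FTF | VPS | FGM | FGA | FG% | ... | TO | A/TO | SLOB | PPSLOB | BLOB | PPBLOB | DEFL | STL | BLK | FOUL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Summit | 53.00% | 46.80% | 36.40% | 57.50% | 0.36 | 0.61 | 15 | 33 | 45.50% | ... | 34 | 0.18 | 4 | 0.5 | 6 | 0.0 | 6 | 4 | 1 | 16 |
| 2 | SLE | 47.70% | 20.20% | 51.40% | 73.70% | 0.23 | 1.2 | 29 | 64 | 45.30% | ... | 18 | 0.33 | 5 | 1.2 | 5 | 0.8 | 14 | 11 | 0 | 13 |
| 3 | Pioneer | 47.20% | 20.90% | 36.70% | 73.30% | 0.41 | 1.02 | 23 | 54 | 42.60% | ... | 17 | 0.53 | 5 | 0.4 | 10 | 0.3 | 13 | 7 | 2 | 13 |
| 4 | DCD | 51.90% | 21.60% | 44.80% | 69.70% | 0.54 | 1.07 | 25 | 52 | 48.10% | ... | 18 | 0.5 | 5 | 0.6 | 7 | 1.29 | 17 | 11 | 1 | 21 |
| 5 | Rochester | 52.00% | 14.30% | 34.50% | 75.00% | 0.37 | 0.99 | 25 | 51 | 49.00% | ... | 10 | 0.5 | 7 | 0.71 | 11 | 0.64 | 4 | 10 | 0 | 19 |
| 6 | Novi | 46.30% | 19.50% | 32.30% | 58.30% | 0.15 | 0.77 | 23 | 54 | 42.60% | ... | 14 | 0.5 | 1 | 0.0 | 6 | 1.5 | 5 | 4 | 2 | 19 |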


In [159]:
Games["TO%"] = (Games["TO"] / (Games["FGA"] + Games["AST"] + Games["TO"] + (Games["FTA"] * 0.44))).round(2)
Games["SLOBP"] = (Games["SLOB"] * Games["PPSLOB"]).round()
Games["BLOBP"] = (Games["BLOB"] * Games["PPBLOB"]).round()

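The recomputed `TO%` divides turnovers by an estimate of the plays that could have ended in one: field goal attempts, assists, turnovers, and free throw attempts weighted by 0.44. The sketch below writes that formula as a standalone function. `turnover_pct` is a hypothetical helper and the stat line is made up for illustration. Because the season-level `TO%` computed later in this notebook leaves assists out of the denominator, the `ast` argument is optional so both variants can be compared.

```python
def turnover_pct(to, fga, fta, ast=0.0):
    """Turnovers per estimated play: TO / (FGA + AST + TO + 0.44 * FTA)."""
    plays = fga + ast + to + 0.44 * fta
    return round(to / plays, 2)


# Made-up stat line (not from the Hudl data): 15 TO, 50 FGA, 10 FTA, 8 AST.
with_ast = turnover_pct(15, 50, 10, ast=8)   # assists counted, as in the per-game cell
without_ast = turnover_pct(15, 50, 10)       # assists left out, as in the season-level cell
print(with_ast, without_ast)
```

Including assists enlarges the denominator, so the per-game figures come out lower than they would under the season-level formula. Whichever variant is chosen, using it in both places keeps the numbers comparable.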

### **Converting Percentage Columns to Float Type**

In [160]:
pct_cols = ["eFG%", "FG%", "FT%", "2FG%", "3FG%", "OREB%", "DREB%"]
for col in pct_cols:
    Games[col] = (Games[col].str.replace('%', '').astype(float) / 100).round(2)
Games

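| | Opponent | eFG% | TO% | OREB% | DREB% | FTF | VPS | FGM | FGA | FG% | ... | SLOB | PPSLOB | BLOB | PPBLOB | DEFL | STL | BLK | FOUL | SLOBP | BLOBP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Summit | 0.53 | 0.43 | 0.36 | 0.57 | 0.36 | 0.61 | 15 | 33 | 0.46 | ... | 4 | 0.5 | 6 | 0.0 | 6 | 4 | 1 | 16 | 2.0 | 0.0 |
| 2 | SLE | 0.48 | 0.19 | 0.51 | 0.74 | 0.23 | 1.2 | 29 | 64 | 0.45 | ... | 5 | 1.2 | 5 | 0.8 | 14 | 11 | 0 | 13 | 6.0 | 4.0 |
| 3 | Pioneer | 0.47 | 0.19 | 0.37 | 0.73 | 0.41 | 1.02 | 23 | 54 | 0.43 | ... | 5 | 0.4 | 10 | 0.3 | 13 | 7 | 2 | 13 | 2.0 | 3.0 |
| 4 | DCD | 0.52 | 0.2 | 0.45 | 0.7 | 0.54 | 1.07 | 25 | 52 | 0.48 | ... | 5 | 0.6 | 7 | 1.29 | 17 | 11 | 1 | 21 | 3.0 | 9.0 |
| 5 | Rochester | 0.52 | 0.13 | 0.34 | 0.75 | 0.37 | 0.99 | 25 | 51 | 0.49 | ... | 7 | 0.71 | 11 | 0.64 | 4 | 10 | 0 | 19 | 5.0 | 7.0 |
| 6 | Novi | 0.46 | 0.18 | 0.32 | 0.58 | 0.15 | 0.77 | 23 | 54 | 0.43 | ... | 1 | 0.0 | 6 | 1.5 | 5 | 4 | 2 | 19 | 0.0 | 9.0 |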

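Every percentage column goes through the same conversion: strip the `%` sign, cast to float, divide by 100, and round to two decimals. A minimal sketch of that step as a reusable function is shown here; `pct_to_fraction` is a hypothetical helper name, and the sample values are the first three `eFG%` entries from the raw export.

```python
import pandas as pd


def pct_to_fraction(series):
    """Convert strings such as '53.00%' to fractions such as 0.53."""
    return (series.str.rstrip("%").astype(float) / 100).round(2)


raw = pd.Series(["53.00%", "47.70%", "47.20%"], name="eFG%")
converted = pct_to_fraction(raw)
print(converted.tolist())
```

Applied in a loop over the percentage column names, one helper like this would replace the repeated per-column statements.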

### **Splitting Dataframe into Basic and Advanced Stats Dataframes**

In [161]:
Games = Games[["Opponent", "PF", "PA", "+/-", "FGM", "FGA", "FG%", "2FGM", "2FGA", "2FG%", "3FGM", "3FGA", "3FG%", "FTM", "FTA", "FT%", "OREB", "DREB", "REB", "AST", "TO", "A/TO", "DEFL", "STL", "BLK", "FOUL", "MINS", "PPP", "eFG%", "OREB%", "DREB%", "TO%", "PoT", "SCP", "PiP", "VPS", "FTF", "SLOB", "PPSLOB", "SLOBP", "BLOB", "PPBLOB", "BLOBP"]]
Basic = Games.iloc[:, :27].copy()
Basic["+/-"] = Basic["PF"] - Basic["PA"]
Basic

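| | Opponent | PF | PA | +/- | FGM | FGA | FG% | 2FGM | 2FGA | 2FG% | ... | DREB | REB | AST | TO | A/TO | DEFL | STL | BLK | FOUL | MINS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Summit | 36 | 78 | -42 | 15 | 33 | 0.46 | 10 | 24 | 0.42 | ... | 23 | 31 | 6 | 34 | 0.18 | 6 | 4 | 1 | 16 | 32 |
| 2 | SLE | 75 | 57 | 18 | 29 | 64 | 0.45 | 26 | 51 | 0.51 | ... | 28 | 46 | 6 | 18 | 0.33 | 14 | 11 | 0 | 13 | 34 |
| 3 | Pioneer | 62 | 60 | 2 | 23 | 54 | 0.43 | 18 | 37 | 0.49 | ... | 22 | 33 | 9 | 17 | 0.53 | 13 | 7 | 2 | 13 | 34 |
| 4 | DCD | 73 | 67 | 6 | 25 | 52 | 0.48 | 21 | 38 | 0.55 | ... | 23 | 36 | 9 | 18 | 0.5 | 17 | 11 | 1 | 21 | 37 |
| 5 | Rochester | 63 | 54 | 9 | 25 | 51 | 0.49 | 22 | 38 | 0.58 | ... | 15 | 25 | 5 | 10 | 0.5 | 4 | 10 | 0 | 19 | 34 |
| 6 | Novi | 53 | 60 | -7 | 23 | 54 | 0.43 | 19 | 42 | 0.45 | ... | 14 | 24 | 7 | 14 | 0.5 | 5 | 4 | 2 | 19 | 34 |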

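Adding a column to a slice of another DataFrame can trigger pandas' `SettingWithCopyWarning`, because pandas cannot tell whether the slice is a view or a copy. The sketch below shows one way to sidestep that by taking an explicit `.copy()` before adding `+/-`. It uses three games' `PF`, `PA`, and `PPP` values from the outputs in this notebook, and it reads `PF` and `PA` as points for and points against, which is an assumption about the Hudl column names.

```python
import pandas as pd

games = pd.DataFrame({
    "Opponent": ["Summit", "SLE", "Pioneer"],
    "PF": [36, 75, 62],          # assumed: points for
    "PA": [78, 57, 60],          # assumed: points against
    "PPP": [0.56, 1.05, 0.88],   # stands in for the advanced columns
})

# .copy() makes the subset independent, so the assignment below
# cannot be mistaken for a write into `games`.
basic = games.iloc[:, :3].copy()
basic["+/-"] = basic["PF"] - basic["PA"]
print(basic)
```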

In [162]:
Advanced = Games.iloc[:, 27:].copy()
Advanced.insert(0, "Opponent", Games["Opponent"])
Advanced.insert(1, "PPFGA", ((Basic["PF"] - Basic["FTM"]) / Basic["FGA"]).round(2))
Advanced

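| | Opponent | PPFGA | PPP | eFG% | OREB% | DREB% | TO% | PoT | SCP | PiP | VPS | FTF | SLOB | PPSLOB | SLOBP | BLOB | PPBLOB | BLOBP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Summit | 1.06 | 0.56 | 0.53 | 0.36 | 0.57 | 0.43 | 4 | 4 | 20 | 0.61 | 0.36 | 4 | 0.5 | 2.0 | 6 | 0.0 | 0.0 |
| 2 | SLE | 0.95 | 1.05 | 0.48 | 0.51 | 0.74 | 0.19 | 24 | 21 | 50 | 1.2 | 0.23 | 5 | 1.2 | 6.0 | 5 | 0.8 | 4.0 |
| 3 | Pioneer | 0.94 | 0.88 | 0.47 | 0.37 | 0.73 | 0.19 | 11 | 9 | 32 | 1.02 | 0.41 | 5 | 0.4 | 2.0 | 10 | 0.3 | 3.0 |
| 4 | DCD | 1.04 | 1.04 | 0.52 | 0.45 | 0.7 | 0.2 | 19 | 11 | 30 | 1.07 | 0.54 | 5 | 0.6 | 3.0 | 7 | 1.29 | 9.0 |
| 5 | Rochester | 1.04 | 1.05 | 0.52 | 0.34 | 0.75 | 0.13 | 11 | 6 | 40 | 0.99 | 0.37 | 7 | 0.71 | 5.0 | 11 | 0.64 | 7.0 |
| 6 | Novi | 0.93 | 0.86 | 0.46 | 0.32 | 0.58 | 0.18 | 14 | 10 | 34 | 0.77 | 0.15 | 1 | 0.0 | 0.0 | 6 | 1.5 | 9.0 |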

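`PPFGA` is the number of points scored from field goals per field goal attempt. Subtracting `FTM` from `PF` removes the free throw points, so the same figure can be rebuilt from made twos and threes as a cross-check. The sketch below does that for the first three games using `FGM`, `2FGM`, and `FGA` from the outputs in this notebook, and it assumes made threes equal `FGM - 2FGM`.

```python
import pandas as pd

shots = pd.DataFrame({
    "Opponent": ["Summit", "SLE", "Pioneer"],
    "FGM": [15, 29, 23],
    "2FGM": [10, 26, 18],
    "FGA": [33, 64, 54],
})

# Assumption: every made field goal is either a two or a three.
three_made = shots["FGM"] - shots["2FGM"]

# Points from field goals only, equivalent to PF - FTM.
fg_points = 2 * shots["2FGM"] + 3 * three_made

shots["PPFGA"] = (fg_points / shots["FGA"]).round(2)
print(shots[["Opponent", "PPFGA"]])
```

For these three games the rebuilt values agree with the `PPFGA` column computed from `PF` and `FTM`.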

In [163]:
Basic_Averages = Basic.drop("Opponent", axis = 1)
Basic_Averages = Basic_Averages.mean().round(2)
Basic_Averages = Basic_Averages.to_frame().T
Basic_Averages["FG%"] = (Basic_Averages["FGM"] / Basic_Averages["FGA"]).round(2)
Basic_Averages["2FG%"] = (Basic_Averages["2FGM"] / Basic_Averages["2FGA"]).round(2)
Basic_Averages["3FG%"] = (Basic_Averages["3FGM"] / Basic_Averages["3FGA"]).round(2)
Basic_Averages["A/TO"] = (Basic_Averages["AST"] / Basic_Averages["TO"]).round(2)
Basic_Averages["FT%"] = (Basic_Averages["FTM"] / Basic_Averages["FTA"]).round(2)
Basic_Averages

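| | PF | PA | +/- | FGM | FGA | FG% | 2FGM | 2FGA | 2FG% | 3FGM | ... | DREB | REB | AST | TO | A/TO | DEFL | STL | BLK | FOUL | MINS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 60.33 | 62.67 | -2.33 | 23.33 | 51.33 | 0.45 | 19.33 | 38.33 | 0.5 | 4.0 | ... | 20.83 | 32.5 | 7.0 | 18.5 | 0.38 | 9.83 | 7.83 | 1.0 | 16.83 | 34.17 |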

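The averages cell recomputes each shooting percentage from averaged makes and attempts instead of keeping the mean of the per-game percentages. The two approaches can disagree, because a plain mean gives a 33-attempt game the same weight as a 64-attempt game. The sketch below compares them for `FG%` using the six games' `FGM` and `FGA` values from the Basic table. The variable names are illustrative.

```python
import pandas as pd

shooting = pd.DataFrame({
    "FGM": [15, 29, 23, 25, 25, 23],
    "FGA": [33, 64, 54, 52, 51, 54],
})

# Pooled: season makes over season attempts (attempt-weighted).
pooled = round(shooting["FGM"].sum() / shooting["FGA"].sum(), 2)

# Unweighted: mean of the six per-game percentages.
unweighted = round((shooting["FGM"] / shooting["FGA"]).mean(), 2)

print(pooled, unweighted)
```

The pooled figure matches the `FG%` shown in `Basic_Averages`. The unweighted mean comes out slightly higher here.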

In [167]:
Advanced_Averages = Advanced.drop("Opponent", axis = 1)
Advanced_Averages = Advanced_Averages.mean().round(2)
Advanced_Averages = Advanced_Averages.to_frame().T
Advanced_Averages["eFG%"] = (((Basic_Averages["FGM"] * num_games) + (0.5 * (Basic_Averages["3FGM"] * num_games))) / (Basic_Averages["FGA"] * num_games)).round(2)
Advanced_Averages["TO%"] = (((Basic_Averages["TO"] * num_games)) / ((Basic_Averages["FGA"] * num_games) + (0.44 * (Basic_Averages["FTA"] * num_games)) + (Basic_Averages["TO"] * num_games))).round(2)
Advanced_Averages["PPFGA"] = (((Basic_Averages["PF"] * num_games) - (Basic_Averages["FTM"] * num_games)) / (Basic_Averages["FGA"] * num_games)).round(2)
Advanced_Averages

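| | PPFGA | PPP | eFG% | OREB% | DREB% | TO% | PoT | SCP | PiP | VPS | FTF | SLOB | PPSLOB | SLOBP | BLOB | PPBLOB | BLOBP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.99 | 0.91 | 0.49 | 0.39 | 0.68 | 0.24 | 13.83 | 10.17 | 34.33 | 0.94 | 0.34 | 4.5 | 0.57 | 3.0 | 7.5 | 0.76 | 5.33 |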

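The season-level `eFG%` uses the formula (FGM + 0.5 × 3FGM) / FGA, which counts a made three as one and a half made shots. The cell multiplies every average by `num_games` to approximate season totals, but the factor appears in both the numerator and the denominator, so it cancels. The sketch below illustrates this with the per-game averages shown in `Basic_Averages`. `efg_pct` is a hypothetical helper name.

```python
def efg_pct(fgm, three_made, fga):
    """Effective field goal percentage: (FGM + 0.5 * 3FGM) / FGA."""
    return (fgm + 0.5 * three_made) / fga


num_games = 6
# Per-game averages taken from the Basic_Averages output.
avg_fgm, avg_3fgm, avg_fga = 23.33, 4.0, 51.33

from_averages = round(efg_pct(avg_fgm, avg_3fgm, avg_fga), 2)
from_totals = round(
    efg_pct(avg_fgm * num_games, avg_3fgm * num_games, avg_fga * num_games), 2
)
print(from_averages, from_totals)
```

Both calls give the same result, which matches the `eFG%` in `Advanced_Averages`. The same cancellation applies to the `TO%` and `PPFGA` lines in that cell.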