#### Author: Wade Bryson
#### GitHub: https://github.com/WadeBryson
#### Class: 44-608 Data Analytics Fundamentals
#### Module 6 Project 6: Tell Your Own Story with Data

# **<p style="text-align: center;">North Andrew Boys Basketball</p>**
### *<p style="text-align: center;">A Statistical Analysis of the Post Hudl Era</p>*
***

## **North Andrew Background Information**

<img align="right" width="125" height="125" src="https://static.hudl.com/users/temp/5021822_b3d38657e36a4706b8bff82999688cdc.gif">

##### North Andrew is a small school district located along Highway 71 at Rosendale, MO. North Andrew’s constituents live in the small towns of Bolckow, Rosendale, Fillmore, and their surrounding rural areas. North Andrew’s Boys Basketball program has season statistics dating back to 1967. For years North Andrew kept only basic stats, but its ability to record and analyze them improved dramatically in the 2014-2015 season, when the program purchased Hudl. Hudl is a subscription-based service that ties real-time data analysis to video footage and produces advanced statistics that are easily accessible. With all of this information readily available, high school programs that have found ways to read and analyze their statistics, and then build strategies around them, have thrived.



## **Initial Look at Data Set**

##### For years North Andrew only kept track of points scored, three-pointers made, and free throws. Now, with help from Hudl, we track 31 numeric statistics for each athlete each season. Let’s take a look at the Post-Hudl Era Database below.

In [4]:
import pandas as pd
Hudl_Era = pd.read_csv('Post_Hudl_Era_Stats.csv')

# By default pandas shows only the first few and last few columns of a wide DataFrame. Change the display setting so that all columns are shown.
pd.set_option('display.max_columns', None)
Hudl_Era.head(7)

Unnamed: 0,Athletes,Year,Grade,GP,eFG%,VPS,FGM,FGA,FG%,2FGM,2FGA,2FG%,3FGM,3FGA,3FG%,FTM,FTA,FT%,PF,PPG,+/-,MINS,SCP,PiP,OREB,DREB,REB,AST,TO,A/TO,STL,BLK,FOUL,CHG
0,Tanner McDaniel,2020-2021,12th,32,60.80%,1.3,273,528,51.70%,177,271,65.30%,96,257,37.40%,95,122,77.90%,737,23.0,319,946,82,306,36,78,114,88,114,0.77,66,9,69,0
1,Levi Linville,2018-2019,12th,25,68.90%,1.9,175,254,68.90%,175,253,69.20%,0,1,0.00%,92,146,63.00%,442,17.7,349,584,122,346,110,86,196,30,40,0.75,39,9,64,1
2,Owen Graham,2021-2022,12th,28,49.5%,1.69,187,379,49.3%,186,367,50.7%,1,12,8.3%,89,145,61.4%,464,16.6,109,861,99,356,97,257,354,113,93,1.22,26,96,68,0
3,Ryan Hughes,2016-2017,11th,31,57.90%,1.9,101,215,47.00%,54,113,47.80%,47,102,46.10%,82,100,82.00%,331,10.7,285,376,26,90,12,38,50,145,54,2.69,45,3,25,1
4,Tanner McDaniel,2019-2020,11th,25,51.40%,1.15,166,359,46.20%,129,233,55.40%,37,126,29.40%,72,104,69.20%,441,17.6,197,681,51,216,50,75,125,68,80,0.85,44,21,72,0
5,Braxon Linville,2021-2022,9th,26,50.7%,1.12,104,213,48.8%,96,192,50.0%,8,21,38.1%,67,110,60.9%,283,10.9,144,746,49,192,80,79,159,49,78,0.63,23,0,53,1
6,Caleb Patterson,2016-2017,11th,29,59.50%,1.66,153,259,59.10%,151,254,59.40%,2,5,40.00%,60,97,61.90%,368,12.7,308,350,111,292,96,109,205,36,30,1.2,33,17,71,2


In [7]:
# Number of rows in the database (one row per athlete per season)
numb_athletes = len(Hudl_Era)
print("Number of Athletes:", numb_athletes)

# Number of columns in the database
numb_columns = len(Hudl_Era.columns)
print("Number of Columns:", numb_columns)

Number of Athletes: 144
Number of Columns: 34


##### The Post-Hudl Era Database has three columns containing categorical information about a player: their name, the year, and their grade. It also has 31 columns containing numeric statistics accumulated throughout the season. This drastic increase in available information has forever changed the North Andrew Boys’ Basketball Program.
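##### One practical note on the numeric columns: the percentage columns (such as eFG%) are displayed with a trailing percent sign, so pandas most likely reads them as text rather than numbers. The sketch below shows one way to convert such a column, using three eFG% values copied from the table above. The column name `eFG_pct` is a hypothetical name chosen for illustration.

```python
import pandas as pd

# Three eFG% values copied from the table above. Because of the trailing
# percent sign, these are stored as strings, not numbers.
sample = pd.DataFrame({
    "Athletes": ["Tanner McDaniel", "Levi Linville", "Owen Graham"],
    "eFG%": ["60.80%", "68.90%", "49.5%"],
})

# Strip the percent sign and convert to float so the column supports
# numeric operations. "eFG_pct" is a hypothetical column name.
sample["eFG_pct"] = sample["eFG%"].str.rstrip("%").astype(float)

# The converted column can now be averaged, sorted, or described.
print(sample["eFG_pct"].mean())
```

##### The same conversion would presumably apply to the other percentage columns (FG%, 2FG%, 3FG%, and FT%).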

## **Introduction to Modern Statistics**

#### **Value Points System (VPS)** 

##### Let’s now dive into some of the new advanced statistics that we track. The first one we will focus on is the Value Points System, “VPS”. The numerator awards points for statistics that are widely accepted as positive outcomes: Points + Rebounds + 2 × (Assists + Charges + Steals + Blocks). The denominator does the same for negative outcomes: FT Misses + 2 × (FG Misses + Fouls + Turnovers). VPS is the first number divided by the second, and a VPS of 1 is considered average. Let’s look at some basic VPS statistics for the North Andrew Boys’ Basketball Program.
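##### To make the formula concrete, here is a minimal sketch that recomputes VPS for two rows copied from the table above. It assumes that the PF column holds total points, that “Misses” means missed field goals (FGA minus FGM), and that FOUL holds personal fouls. The function name `value_points` and the column name `VPS_check` are hypothetical.

```python
import pandas as pd

# Two rows copied from the table above (only the columns the formula needs).
rows = pd.DataFrame({
    "Athletes": ["Tanner McDaniel", "Levi Linville"],
    "PF":   [737, 442],   # assumed to be total points
    "REB":  [114, 196],
    "AST":  [88, 30],
    "CHG":  [0, 1],
    "STL":  [66, 39],
    "BLK":  [9, 9],
    "FGM":  [273, 175],
    "FGA":  [528, 254],
    "FTM":  [95, 92],
    "FTA":  [122, 146],
    "FOUL": [69, 64],
    "TO":   [114, 40],
})

def value_points(df):
    """Hypothetical helper: VPS as described in the text above."""
    positive = df["PF"] + df["REB"] + 2 * (df["AST"] + df["CHG"] + df["STL"] + df["BLK"])
    ft_misses = df["FTA"] - df["FTM"]
    fg_misses = df["FGA"] - df["FGM"]  # assumption: "Misses" means field goal misses
    negative = ft_misses + 2 * (fg_misses + df["FOUL"] + df["TO"])
    return positive / negative

rows["VPS_check"] = value_points(rows)
print(rows[["Athletes", "VPS_check"]].round(2))
```

##### Under these assumptions the results come out to about 1.30 and 1.90, which agree with the VPS values of 1.3 and 1.9 that Hudl reports for those two seasons.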

In [17]:
# Setting the precision to 2 decimal places
pd.set_option("display.precision", 2)

Hudl_Era['VPS'].describe()

count    144.00
mean       1.20
std        0.96
min        0.00
25%        0.88
50%        1.12
75%        1.37
max        9.00
Name: VPS, dtype: float64

##### As with most statistics, the Value Points System has flaws. Athletes who play very few minutes can have their VPS inflated, which likely explains outliers such as the maximum of 9.00. Overall, both our mean (1.20) and our median (1.12) are above the standardized average of 1.0, which is a good sign.
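##### One possible way to limit the low-minutes problem is to summarize VPS only for athletes above a minimum number of minutes. The sketch below uses three rows copied from the table above plus one hypothetical low-minute row added purely for illustration. The 100-minute cutoff is also a hypothetical choice, not a value used by Hudl or by our program.

```python
import pandas as pd

sample = pd.DataFrame({
    # The last row is hypothetical and exists only to show the effect of the filter.
    "Athletes": ["Tanner McDaniel", "Levi Linville", "Owen Graham", "Hypothetical Reserve"],
    "MINS": [946, 584, 861, 4],
    "VPS": [1.3, 1.9, 1.69, 9.0],
})

MIN_MINUTES = 100  # hypothetical cutoff; adjust to taste

# Keep only the athlete-seasons with enough minutes to make VPS meaningful.
regulars = sample[sample["MINS"] >= MIN_MINUTES]

print(regulars["VPS"].describe())
```

##### On the full database the same filter would be applied to `Hudl_Era` before calling `describe()`.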


#### **Plus-Minus (+/-)**

##### Now let's take a look at **MY FAVORITE** stat, plus-minus. Plus-minus is an integer that represents the score differential while an athlete was on the court. At the small-school level, it is not realistic to expect every athlete to be a great scorer, rebounder, or ball handler. But we do expect every athlete to compete, and plus-minus purely reflects how much we outscored, or were outscored by, our opponents while that player was on the floor. Let’s look at the top 10 plus-minus values in the North Andrew Boys’ Basketball Program and each athlete’s stats from that season.

In [18]:
Hudl_Era.nlargest(10, '+/-')

Unnamed: 0,Athletes,Year,Grade,GP,eFG%,VPS,FGM,FGA,FG%,2FGM,2FGA,2FG%,3FGM,3FGA,3FG%,FTM,FTA,FT%,PF,PPG,+/-,MINS,SCP,PiP,OREB,DREB,REB,AST,TO,A/TO,STL,BLK,FOUL,CHG
11,Ryan Hughes,2017-2018,12th,29,59.60%,1.67,96,197,48.70%,53,100,53.00%,43,97,44.30%,47,65,72.30%,282,9.7,613,670,10,104,4,36,40,144,67,2.15,49,0,36,1
7,Caleb Patterson,2017-2018,12th,30,57.50%,1.99,137,240,57.10%,135,231,58.40%,2,9,22.20%,59,84,70.20%,335,11.2,583,703,76,262,80,126,206,46,26,1.77,31,13,41,2
19,Jacob Powelson,2017-2018,12th,30,55.30%,1.53,118,256,46.10%,71,135,52.60%,47,121,38.80%,37,55,67.30%,320,10.7,533,601,60,118,59,86,145,61,30,2.03,41,14,56,9
13,Lance Streeby,2017-2018,12th,30,60.90%,1.49,124,238,52.10%,82,129,63.60%,42,109,38.50%,43,62,69.40%,333,11.1,530,655,39,152,22,79,101,73,40,1.82,37,2,58,1
40,Cole Thorburn,2017-2018,12th,30,48.00%,1.23,64,153,41.80%,45,101,44.60%,19,52,36.50%,18,36,50.00%,165,5.5,394,414,17,82,23,50,73,31,22,1.41,46,18,55,1
38,Drake Simmons,2018-2019,12th,25,46.30%,1.22,105,260,40.40%,74,150,49.30%,31,110,28.20%,21,31,67.70%,262,10.5,390,609,30,112,17,105,122,96,74,1.3,42,25,61,4
1,Levi Linville,2018-2019,12th,25,68.90%,1.9,175,254,68.90%,175,253,69.20%,0,1,0.00%,92,146,63.00%,442,17.7,349,584,122,346,110,86,196,30,40,0.75,39,9,64,1
53,Aidan DeLong,2017-2018,12th,23,57.10%,1.51,92,170,54.10%,82,144,56.90%,10,26,38.50%,12,19,63.20%,206,9.0,343,443,38,136,36,86,122,24,23,1.04,26,31,59,2
47,Ryan Shultz,2017-2018,12th,30,61.80%,1.52,51,85,60.00%,48,77,62.30%,3,8,37.50%,14,23,60.90%,119,4.0,339,352,15,78,23,50,73,27,21,1.29,10,5,31,0
44,Jaden Baker,2018-2019,11th,25,58.50%,1.8,51,100,51.00%,36,49,73.50%,15,51,29.40%,16,24,66.70%,133,5.3,321,547,19,72,19,84,103,82,37,2.22,47,5,54,7
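##### For readers new to the stat, the definition of plus-minus given above can be sketched in a few lines. Hudl computes this for us, so the code below is only an illustration. The stint data and the function name `plus_minus` are hypothetical.

```python
# Hypothetical stints for one athlete: each pair is
# (points our team scored, points the opponent scored) while the athlete was on the court.
stints = [(12, 8), (7, 10), (15, 9)]

def plus_minus(stints):
    """Sum the score differential across every stint the athlete played."""
    return sum(team - opponent for team, opponent in stints)

season_plus_minus = plus_minus(stints)
print(season_plus_minus)
```

##### A positive total means we outscored opponents while that athlete played, regardless of how many points the athlete personally scored.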
