# Basketball Analytics

Use the attached data to calculate plus/minus for each player in each game. Plus/minus is defined as the team’s score differential (points scored minus points allowed) while the player is on the court.

Note: When a player is substituted out before or during a set of free throws but was on the court at the time of the foul that caused the free throws, he is considered to be on the court for those free throws for the purposes of plus/minus. A player substituted in after a foul but before the resulting free throws is not considered to be on the court until the free throws have concluded.
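
The substitution rule above can be sketched on a toy event stream. Everything in this example is hypothetical: the event kinds (`"score"`, `"foul"`, `"sub"`, `"ft"`), the `shooting` and `last` flags, and the function name `plus_minus` are illustrative and are not columns of the real play-by-play file. The idea is to queue any substitution made while free throws are pending and apply it only after the last free throw.

```python
from collections import defaultdict


def plus_minus(events, on_court):
    """Toy plus/minus tracker over an ordered list of event dicts.

    on_court maps team -> iterable of player ids on the floor at the start.
    The event schema is hypothetical (see the paragraph above).
    """
    on_court = {team: set(players) for team, players in on_court.items()}
    totals = defaultdict(int)
    ft_pending = False   # True between a shooting foul and the last free throw
    queued_subs = []     # substitutions deferred until the free throws end

    def credit(team, points):
        # Players on the scoring team gain the points; opponents lose them.
        for t, players in on_court.items():
            sign = 1 if t == team else -1
            for p in players:
                totals[p] += sign * points

    def apply_sub(team, out, inn):
        on_court[team].discard(out)
        on_court[team].add(inn)

    for ev in events:
        kind = ev["kind"]
        if kind == "foul":
            if ev.get("shooting"):
                ft_pending = True
        elif kind == "sub":
            if ft_pending:
                queued_subs.append((ev["team"], ev["out"], ev["in"]))
            else:
                apply_sub(ev["team"], ev["out"], ev["in"])
        elif kind == "ft":
            if ev["made"]:
                credit(ev["team"], 1)
            if ev["last"]:
                ft_pending = False
                for sub in queued_subs:
                    apply_sub(*sub)
                queued_subs.clear()
        elif kind == "score":
            credit(ev["team"], ev["points"])
    return dict(totals)


events = [
    {"kind": "score", "team": "A", "points": 2},
    {"kind": "foul", "shooting": True},
    {"kind": "sub", "team": "B", "out": "b1", "in": "b2"},  # deferred
    {"kind": "ft", "team": "A", "made": True, "last": False},
    {"kind": "ft", "team": "A", "made": True, "last": True},
    {"kind": "score", "team": "B", "points": 3},
]
result = plus_minus(events, {"A": ["a1"], "B": ["b1"]})
print(result)  # a1: +1, b1: -4, b2: +3
```

Player `b1` is charged for both free throws even though the substitution was recorded before them, and `b2` is only credited for the basket scored after the free throws end.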

This folder includes three data sets: Play by Play, Event Code Description and Game Lineups. Please submit your answer in a .csv file and save your code, spreadsheets and all other work in a zip file.

Please submit your answer in a spreadsheet titled “Your_Team_Name_Q1_BBALL.csv”, substituting the name of your team for “Your_Team_Name.” The final product should have 3 columns. Column 1: Game_ID, Column 2: Player_ID, Column 3: Player_Plus/Minus.
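
As a minimal sketch of the required output format, the three columns can be produced from a pandas DataFrame. The rows and the `team_name` value below are placeholders, not real results.

```python
import pandas as pd

# Placeholder rows; real values would come from the plus/minus calculation.
results = pd.DataFrame(
    {
        "Game_ID": ["game_a", "game_a"],
        "Player_ID": ["player_1", "player_2"],
        "Player_Plus/Minus": [5, -5],
    }
)

team_name = "Your_Team_Name"  # placeholder from the instructions
out_path = f"{team_name}_Q1_BBALL.csv"

# index=False keeps the file to exactly the three required columns.
csv_text = results.to_csv(index=False)
print(csv_text)

# To write the file itself: results.to_csv(out_path, index=False)
```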

In [1]:
# import packages
import pandas as pd

### Load 'NBA Hackathon - Event Codes.txt'

This dataset provides lookup values for the event message types and action types found in the play by play dataset. Each code maps to an English-language description of the event.

In [2]:
Event_Codes = pd.read_table('NBA Hackathon - Event Codes.txt', sep='\t', header=0)
Event_Codes.head(10)

Unnamed: 0,Event_Msg_Type,Action_Type,Event_Msg_Type_Description,Action_Type_Description
0,1,0,Made Shot,No Shot
1,1,1,Made Shot,Jump Shot
2,1,3,Made Shot,Hook Shot
3,1,4,Made Shot,Tip Shot
4,1,5,Made Shot,Layup Shot
5,1,6,Made Shot,Driving Layup Shot
6,1,7,Made Shot,Dunk Shot
7,1,8,Made Shot,Slam Dunk Shot
8,1,9,Made Shot,Driving Dunk Shot
9,1,40,Made Shot,Layup Shot
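
One way to use this lookup is a left merge on the (`Event_Msg_Type`, `Action_Type`) pair, which attaches the descriptions to each event row. The sketch below uses small stand-in frames (`codes`, `plays`) built from the column names shown in this notebook. It is an illustration, not the loaded data.

```python
import pandas as pd

# Stand-in for the event code table, using rows shown in the output above.
codes = pd.DataFrame(
    {
        "Event_Msg_Type": [1, 1],
        "Action_Type": [1, 5],
        "Event_Msg_Type_Description": ["Made Shot", "Made Shot"],
        "Action_Type_Description": ["Jump Shot", "Layup Shot"],
    }
)

# Stand-in for a few play-by-play rows (hypothetical values).
plays = pd.DataFrame(
    {"Event_Num": [10, 11], "Event_Msg_Type": [1, 1], "Action_Type": [5, 1]}
)

# A left merge keeps every play, even if a code pair has no description.
labelled = plays.merge(codes, on=["Event_Msg_Type", "Action_Type"], how="left")
print(labelled[["Event_Num", "Action_Type_Description"]])
```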


### Load 'NBA Hackathon - Game Lineup Data Sample (50 Games).txt'

This dataset provides start of period player availability.
* Game_id – A unique game code for each game
* Period (Quarter) – The associated period of the lineup (overtime periods are indicated by values greater than 4)
* Person_id – A unique identifier for each player
* Team_id – A unique identifier for each team
* status – A variable indicating whether a player is active (A) or inactive (I)

In [3]:
Game_Lineup = pd.read_table('NBA Hackathon - Game Lineup Data Sample (50 Games).txt', sep='\t', header=0)
Game_Lineup.head(10)

Unnamed: 0,Game_id,Period,Person_id,Team_id,status
0,021fd159b55773fba8157e2090fe0fe2,1,881f83d2dee3f18c7d1751659406144e,012059d397c0b7e5a30a5bb89c0b075e,A
1,021fd159b55773fba8157e2090fe0fe2,1,27ea17a8685c4919f157e83fe9cb2d9e,cff694c8186a4bd377de400e4f60fe47,A
2,021fd159b55773fba8157e2090fe0fe2,1,57bbd7e30bc694aeee9ee40c583e6811,cff694c8186a4bd377de400e4f60fe47,A
3,021fd159b55773fba8157e2090fe0fe2,1,cec898a1d355dbfbad8c760615fde1af,012059d397c0b7e5a30a5bb89c0b075e,A
4,021fd159b55773fba8157e2090fe0fe2,1,33963fe856a1523ff46438ba07d1d99f,cff694c8186a4bd377de400e4f60fe47,A
5,021fd159b55773fba8157e2090fe0fe2,1,a99f44bbff39e352191a870e17f04537,012059d397c0b7e5a30a5bb89c0b075e,A
6,021fd159b55773fba8157e2090fe0fe2,1,c00264c3114d23bac482e9de50fb7d28,cff694c8186a4bd377de400e4f60fe47,A
7,021fd159b55773fba8157e2090fe0fe2,1,89706b99ddd00dc05d37ef5cafc04276,012059d397c0b7e5a30a5bb89c0b075e,A
8,021fd159b55773fba8157e2090fe0fe2,1,2b313e2bcef0268bc8e9415132ba9997,012059d397c0b7e5a30a5bb89c0b075e,A
9,021fd159b55773fba8157e2090fe0fe2,1,307beab25b1021a548b4a47550bc4b25,cff694c8186a4bd377de400e4f60fe47,A
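
A possible way to turn this table into per-period starting lineups is to keep the active rows and collect player ids by game, period, and team. The frame below is a toy stand-in with made-up ids. Whether the rows marked `A` for a period are exactly the players on the floor at its start is an assumption to verify against the real data.

```python
import pandas as pd

# Toy stand-in for Game_Lineup, using the column names shown above.
lineup = pd.DataFrame(
    {
        "Game_id": ["g1", "g1", "g1", "g1"],
        "Period": [1, 1, 1, 2],
        "Person_id": ["p1", "p2", "p3", "p1"],
        "Team_id": ["t1", "t1", "t2", "t1"],
        "status": ["A", "I", "A", "A"],
    }
)

active = lineup[lineup["status"] == "A"]

# Map (Game_id, Period, Team_id) -> set of active player ids.
starters = {
    key: set(group["Person_id"])
    for key, group in active.groupby(["Game_id", "Period", "Team_id"])
}
print(starters)
```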


### Load 'NBA Hackathon - Play by Play Data Sample (50 Games).txt'

This dataset provides play by play information on the event level for each game.

To properly sort the events in a game, sort on the following columns in order: Period (ascending), PC_Time (descending), WC_Time (ascending), Event_Num (ascending).
* Event_Num – An ordered counter for each event in a game. Note that this number may not be perfectly sequential, so please use the sorting methodology outlined above.
* Event_Msg_Type, Action_Type – Coded descriptions of what happened during the event
* WC_Time – The in-arena time of the event in Unix format. It is coded in tenths of a second.
* PC_Time – The time on the game clock in tenths of a second (e.g. 7200 corresponds to 720 seconds/12 minutes remaining in the quarter)
* Option1 – On a shot attempt, this column gives the point value of the shot
    * On free throw attempts, a value of 1 in this column means the free throw was made; any other value means it was missed.
* Person1, Person2 – The person_ids of the players who are directly associated with the event (e.g. if the event is an assisted made basket, Person1 is the shot maker and Person2 is the player who assisted)
    * In the case of a substitution, the Event_Msg_Type will be 8, Person1 will be the ID for the player leaving the game, and Person2 will be the ID for the player entering the game.
* Team_id – The team_id associated with Person1
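
The sort order described above can be expressed with a single `sort_values` call. This sketch runs on a toy frame with made-up values. Putting `Game_id` first is an addition here so that each game's events stay together.

```python
import pandas as pd

# Toy events, deliberately out of order (values are made up).
pbp = pd.DataFrame(
    {
        "Game_id": ["g1", "g1", "g1", "g1"],
        "Event_Num": [3, 1, 2, 4],
        "Period": [1, 1, 1, 2],
        "PC_Time": [7000, 7200, 7000, 7200],
        "WC_Time": [105, 100, 110, 200],
    }
)

# Period ascending, game clock descending, then wall clock and event number.
pbp_sorted = pbp.sort_values(
    by=["Game_id", "Period", "PC_Time", "WC_Time", "Event_Num"],
    ascending=[True, True, False, True, True],
).reset_index(drop=True)

print(pbp_sorted["Event_Num"].tolist())  # [1, 3, 2, 4]
```

Events 3 and 2 share the same game clock value, so the wall-clock time decides their order rather than the event number.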

In [4]:
Play_by_Play = pd.read_table('NBA Hackathon - Play by Play Data Sample (50 Games).txt', sep='\t', header=0)
Play_by_Play.head(10)

Unnamed: 0,Game_id,Event_Num,Event_Msg_Type,Period,WC_Time,PC_Time,Action_Type,Option1,Option2,Option3,Team_id,Person1,Person2,Team_id_type
0,021fd159b55773fba8157e2090fe0fe2,0,12,1,546427,7200,0,0,0,0,1473d70e5646a26de3c52aa1abd85b1f,6bcf6c1f8c373d25fca1579bc4464a91,6bcf6c1f8c373d25fca1579bc4464a91,0
1,021fd159b55773fba8157e2090fe0fe2,1,10,1,546495,7200,0,0,0,0,012059d397c0b7e5a30a5bb89c0b075e,89706b99ddd00dc05d37ef5cafc04276,307beab25b1021a548b4a47550bc4b25,2
2,021fd159b55773fba8157e2090fe0fe2,2,2,1,546665,7050,1,3,0,0,012059d397c0b7e5a30a5bb89c0b075e,cec898a1d355dbfbad8c760615fde1af,6bcf6c1f8c373d25fca1579bc4464a91,2
3,021fd159b55773fba8157e2090fe0fe2,3,4,1,546714,6960,0,0,0,0,012059d397c0b7e5a30a5bb89c0b075e,307beab25b1021a548b4a47550bc4b25,6bcf6c1f8c373d25fca1579bc4464a91,2
4,021fd159b55773fba8157e2090fe0fe2,6,6,1,546886,6920,4,0,0,0,cff694c8186a4bd377de400e4f60fe47,c00264c3114d23bac482e9de50fb7d28,89706b99ddd00dc05d37ef5cafc04276,3
5,021fd159b55773fba8157e2090fe0fe2,7,5,1,546887,6920,5,0,0,0,cff694c8186a4bd377de400e4f60fe47,c00264c3114d23bac482e9de50fb7d28,6bcf6c1f8c373d25fca1579bc4464a91,3
6,021fd159b55773fba8157e2090fe0fe2,8,6,1,547110,6820,1,0,0,0,012059d397c0b7e5a30a5bb89c0b075e,57bbd7e30bc694aeee9ee40c583e6811,89706b99ddd00dc05d37ef5cafc04276,2
7,021fd159b55773fba8157e2090fe0fe2,9,1,1,547220,6740,49,2,0,0,012059d397c0b7e5a30a5bb89c0b075e,a99f44bbff39e352191a870e17f04537,881f83d2dee3f18c7d1751659406144e,2
8,021fd159b55773fba8157e2090fe0fe2,10,1,1,547395,6580,1,2,0,0,cff694c8186a4bd377de400e4f60fe47,57bbd7e30bc694aeee9ee40c583e6811,c00264c3114d23bac482e9de50fb7d28,3
9,021fd159b55773fba8157e2090fe0fe2,12,2,1,547524,6450,1,3,0,0,012059d397c0b7e5a30a5bb89c0b075e,881f83d2dee3f18c7d1751659406144e,6bcf6c1f8c373d25fca1579bc4464a91,2
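
The `Option1` rules described earlier suggest a simple way to derive points scored per event. In this sketch, `MADE_SHOT = 1` follows the event code table shown above, while `FREE_THROW = 3` is an assumption that should be checked against that table. The frame and the `Points` column are illustrative.

```python
import pandas as pd

MADE_SHOT = 1    # "Made Shot" in the event code table
FREE_THROW = 3   # assumption: confirm this code in the event code table

# Toy events (made-up values) covering the cases of interest.
events = pd.DataFrame(
    {
        "Event_Msg_Type": [1, 2, 3, 3, 8],
        "Option1": [3, 3, 1, 2, 0],
    }
)


def points_scored(row):
    # Made field goal: Option1 holds the point value of the shot.
    if row["Event_Msg_Type"] == MADE_SHOT:
        return int(row["Option1"])
    # Free throw: Option1 == 1 means made, anything else means missed.
    if row["Event_Msg_Type"] == FREE_THROW and row["Option1"] == 1:
        return 1
    return 0


events["Points"] = events.apply(points_scored, axis=1)
print(events)
```

Only made shots and made free throws contribute points. Every other event type, including substitutions (type 8), scores zero.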
