<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#1.-Import-Packages" data-toc-modified-id="1.-Import-Packages-1">1. Import Packages</a></span></li><li><span><a href="#2.-Read-in-Data" data-toc-modified-id="2.-Read-in-Data-2">2. Read in Data</a></span></li><li><span><a href="#3.-Combining-Dataframes" data-toc-modified-id="3.-Combining-Dataframes-3">3. Combining Dataframes</a></span></li></ul></div>

# 1. Import Packages

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# 2. Read in Data

In [2]:
inj = pd.read_csv('../data/injuries_clean.csv')
nba18 = pd.read_csv('../data/2017_2018_game_logs.csv')
nba17 = pd.read_csv('../data/2016_2017_game_logs.csv')
nba16 = pd.read_csv('../data/2015_2016_game_logs.csv')

# 3. Combining Dataframes

To track whether an injury was suffered in a given game, we want to create a new data frame that combines the injury data with the game log data. For that to work, the columns of the frames must align, so we add empty columns to the injury frame for every column it is missing (and later do the same for the game log frames). We also restrict the injury data frame to the dates covered by the 2015-2018 NBA game logs we have.

In [3]:
# first, drop the leftover 'Unnamed: 0' index column from each frame
inj.drop(columns = ['Unnamed: 0'], inplace = True)
nba18.drop(columns = ['Unnamed: 0'], inplace = True)
nba17.drop(columns = ['Unnamed: 0'], inplace = True)
nba16.drop(columns = ['Unnamed: 0'], inplace = True)
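
An `Unnamed: 0` column usually appears when a frame was written to CSV with its index and then read back without `index_col`. The sketch below illustrates this with a small in-memory CSV; the rows are made up and only stand in for the real files, whose origin is an assumption here.

```python
import io
import pandas as pd

# Hypothetical stand-in for a CSV that was saved together with its index.
csv_text = ",date,player\n0,2017-10-17,A\n1,2017-10-18,B\n"

# Reading without index_col leaves the saved index as a column named 'Unnamed: 0'.
raw = pd.read_csv(io.StringIO(csv_text))

# Reading with index_col=0 treats that first column as the index instead.
clean = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(raw.columns))
print(list(clean.columns))
```

If the cleaned files are written by an earlier notebook, passing `index=False` to `to_csv` there would avoid the extra column in the first place.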

In [5]:
# start and end dates for each season of data
start_18 = '2017-10-17'
end_18 = '2018-06-08'
start_17 = '2016-10-25'
end_17 = '2017-06-12'
start_16 = '2015-10-27'
end_16 = '2016-06-19'
# resetting our injury data frame to only be for the range of nba game logs that we have
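# (string comparison works here as long as 'date' is stored as YYYY-MM-DD)
# .copy() gives an independent frame, so adding columns later does not
# raise a SettingWithCopyWarning
inj = inj[((inj['date'] >= start_18) & (inj['date'] <= end_18)) |
    ((inj['date'] >= start_17) & (inj['date'] <= end_17)) |
    ((inj['date'] >= start_16) & (inj['date'] <= end_16))].copy()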
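
The same filter can be sketched with real datetimes, which does not depend on how the date strings are formatted. The season boundaries are copied from the cell above; `inj_demo` and its rows are hypothetical and only illustrate the idea.

```python
import pandas as pd

# Hypothetical injury rows; the real `inj` frame has more columns.
inj_demo = pd.DataFrame({
    'date': ['2015-08-01', '2015-11-03', '2016-07-15', '2017-01-20', '2018-06-08'],
    'player': ['A', 'B', 'C', 'D', 'E'],
})

# (start, end) pairs for the three seasons.
seasons = [
    ('2015-10-27', '2016-06-19'),
    ('2016-10-25', '2017-06-12'),
    ('2017-10-17', '2018-06-08'),
]

dates = pd.to_datetime(inj_demo['date'])

# Build one boolean mask that is True for any date inside a season window.
in_season = pd.Series(False, index=inj_demo.index)
for start, end in seasons:
    # between() is inclusive on both ends by default
    in_season |= dates.between(pd.Timestamp(start), pd.Timestamp(end))

# .copy() so that later column assignments act on an independent frame.
inj_in_season = inj_demo[in_season].copy()
print(inj_in_season['player'].tolist())
```

Keeping the boundaries in a list makes it straightforward to add another season without extending the boolean expression by hand.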

In [6]:
# adding new columns to the injury data frame for the merge
inj_new_cols = list(set(nba18.columns) - set(inj.columns))
for col in inj_new_cols:
    inj[col] = ''

In [10]:
nba_new_cols = list(set(inj.columns) - set(nba18.columns))
for col in nba_new_cols:
    nba18[col] = ''
    nba17[col] = ''
    nba16[col] = ''
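
Assuming the frames are eventually stacked row-wise, `pd.concat` can also align the columns by itself, filling gaps with `NaN` rather than empty strings. This is an alternative sketch, not necessarily the approach used in the rest of this notebook; the frames and column names below are hypothetical.

```python
import pandas as pd

# Hypothetical miniature frames; column names are illustrative only.
games_demo = pd.DataFrame({'date': ['2017-10-17'], 'player': ['A'], 'pts': [20]})
injuries_demo = pd.DataFrame({'date': ['2017-10-18'], 'player': ['A'],
                              'notes': ['sprained ankle']})

# concat takes the union of the columns; values missing from a frame become NaN.
combined = pd.concat([games_demo, injuries_demo], ignore_index=True, sort=False)

print(combined)
```

One difference to keep in mind: `NaN` placeholders work with `isna()` and `fillna()`, whereas `''` placeholders turn numeric columns into object columns.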