# Python SportsDataIO NCAAF BoxScore - Template

This notebook is a Python template for analysing SportsDataIO (SDIO) API box score data for a single game. We can use it to work through the data quickly and gather what we need.

The ability to work with basic data structures will be key for this task.

In [None]:
import json # Python built-in package json for encoding and decoding JSON data.
import requests # library for working with HTTP requests in Python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Below is an alternative way to pull in our API data, using a GET request, in the event that we do not already have a pre-defined file.

In [None]:
url = ("https://api.sportsdata.io/v3/cfb/stats/json/BoxScore/14447?key=x")
print(url)
print(type(url))

NOTE

The URL will not work as written: the key has been removed for data privacy reasons, because this is not open source data.
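Since the key cannot be published, one option is to keep it outside the notebook and build the URL at run time. The sketch below assumes the key is stored in an environment variable; the name `SDIO_API_KEY` and the helper `build_box_score_url` are hypothetical and not part of the SDIO API.

```python
import os

# Same endpoint as the URL above, without the game id and key.
BASE_URL = "https://api.sportsdata.io/v3/cfb/stats/json/BoxScore"

def build_box_score_url(game_id, api_key):
    """Return the box score URL for one game, in the same shape as the URL above."""
    return f"{BASE_URL}/{game_id}?key={api_key}"

# SDIO_API_KEY is a hypothetical variable name; "x" is the placeholder used in this notebook.
api_key = os.environ.get("SDIO_API_KEY", "x")
example_url = build_box_score_url(14447, api_key)

# Print only the part before the query string so the key never appears in the output.
print(example_url.split("?")[0])
```

With this approach the notebook can be shared without editing the key in and out of the URL by hand.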

In [None]:
response = requests.get(url) # get request

data = response.json() # parse the JSON body into Python objects
print(type(response))
print(type(data))

Below we will read in our pre-defined JSON file.
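A minimal sketch of reading a saved box score file is shown here. The file name `boxscore_sample.json` and the helper `load_box_score` are assumptions for illustration, and the sample contents are invented so that the example runs on its own.

```python
import json
import tempfile
from pathlib import Path

def load_box_score(path):
    """Read a saved box score JSON file and return the parsed Python object."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)

# Write a tiny stand-in file first; the real file name and contents will differ.
sample = [{"Game": {"GameID": 14447}, "Periods": [], "PlayerGames": [],
           "TeamGames": [], "ScoringPlays": []}]
sample_path = Path(tempfile.gettempdir()) / "boxscore_sample.json"
sample_path.write_text(json.dumps(sample), encoding="utf-8")

# Load it back the same way a real pre-defined file would be loaded.
loaded = load_box_score(sample_path)
print(type(loaded), list(loaded[0].keys()))
```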

We can see above that the outer layer of this JSON data is now a Python list. As we already know, the elements inside it are dictionaries, so we can index into the data to pull out the required information. Each box score holds the following keys:
- Game
- Periods
- PlayerGames
- TeamGames
- ScoringPlays
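Before digging into each section, it can help to check what type each one is and how many items it holds. The sketch below uses an invented stand-in record with the five section names listed above; `summarise_sections` is a hypothetical helper, and with real data it would be called on `data[0]`.

```python
def summarise_sections(box_score):
    """Map each top-level key of one box score dictionary to its type name and length."""
    summary = {}
    for key, value in box_score.items():
        # Only containers have a meaningful length here.
        size = len(value) if isinstance(value, (list, dict)) else None
        summary[key] = (type(value).__name__, size)
    return summary

# Stand-in record with the five section names listed above; the values are invented.
sample_box_score = {
    "Game": {"GameID": 14447},
    "Periods": [{"Number": 1}, {"Number": 2}],
    "PlayerGames": [],
    "TeamGames": [{}, {}],
    "ScoringPlays": [],
}

section_summary = summarise_sections(sample_box_score)
for name, (kind, size) in section_summary.items():
    print(f"{name}: {kind} with {size} item(s)")
```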

### Game

In [None]:
print(len(data))
print(type(data[0]['Game']))
print(data[0]['Game'])

### Periods

In [None]:
var = data[0]['Periods']
print(var)

In [None]:
df = pd.DataFrame(var) # convert the list of period dictionaries to a DataFrame
df

Visualize the data

In [None]:
# Away score per period (scatter plot)
plt.figure(figsize=(5,3))
sns.set_style('darkgrid')
sns.scatterplot(x=df.Number, y=df.AwayScore, color="r", alpha=1.0).set(title="Period | Away Score")
# Home score per period (scatter plot)
plt.figure(figsize=(5,3))
sns.set_style('darkgrid')
sns.scatterplot(x=df.Number, y=df.HomeScore, color="b", alpha=1.0).set(title="Period | Home Score")
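The per-period plots can be complemented with a running total for each team. This is a sketch under one assumption: that `AwayScore` and `HomeScore` hold the points scored within each period rather than the cumulative score. The numbers are invented; only the column names come from the Periods data used above.

```python
import pandas as pd

# Invented per-period scores using the column names from the Periods data above.
sample_periods = pd.DataFrame({
    "Number": [1, 2, 3, 4],
    "AwayScore": [7, 3, 0, 14],
    "HomeScore": [0, 10, 7, 7],
})

# Assumption: each row holds the points scored in that period only,
# so a cumulative sum over the ordered periods gives the running score.
running = sample_periods.sort_values("Number").assign(
    AwayTotal=lambda d: d["AwayScore"].cumsum(),
    HomeTotal=lambda d: d["HomeScore"].cumsum(),
)
print(running)
```

If the API already reports cumulative scores, this step is unnecessary and the columns can be plotted directly.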

### Player Games

In [None]:
print(data[0]['PlayerGames'][0].keys())

How many Player Games do we have for this fixture? This gives important context when writing programs that pull out specific Player Games data.

In [None]:
print(len(data[0]['PlayerGames']))

Below, we write a small program that pulls out the player information from each individual PlayerGames record. We save this data into Python lists, because a list maintains the order of our data. We then use these lists to build a pandas DataFrame and export it to an Excel file, as this won't be a lot of data (roughly 60 records).

In [None]:
# Lists preserve insertion order, so the three lists stay aligned record by record
player_ids = []
names = []
positions = []

# Collect the player info from each PlayerGames record
for container in data[0]['PlayerGames']:
    # Append the values to our empty lists
    player_ids.append(container['PlayerID'])
    names.append(container['Name'])
    positions.append(container['Position'])

# Create a DataFrame from the three lists
players = pd.DataFrame({'PlayerIds': player_ids, 'PlayerNames': names, 'PlayerPositions': positions})
# Keep a copy of the original DataFrame (good practice in case we manipulate players later)
original_player_dataframe = players.copy(deep=True)

# Display the dataset information, head, summary and size of the DataFrame
players.info()  # info() prints directly and returns None, so call it on its own
display(players.head(), players.describe(include=object), players.shape)
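As an alternative to appending to three separate lists, pandas can build the same table straight from the list of dictionaries. The sketch below uses invented records; only the field names `PlayerID`, `Name` and `Position` come from the loop above, and the extra `Team` field is hypothetical, included to show that unwanted keys are dropped.

```python
import pandas as pd

# Invented records that mimic the three PlayerGames fields used above.
sample_player_games = [
    {"PlayerID": 101, "Name": "Player A", "Position": "QB", "Team": "AAA"},
    {"PlayerID": 102, "Name": "Player B", "Position": "WR", "Team": "BBB"},
]

# Passing columns= keeps only the listed keys, in this order; row order is preserved.
wanted = ["PlayerID", "Name", "Position"]
sample_players = pd.DataFrame(sample_player_games, columns=wanted).rename(
    columns={"PlayerID": "PlayerIds", "Name": "PlayerNames", "Position": "PlayerPositions"}
)
print(sample_players)
```

With real data, `sample_player_games` would be replaced by `data[0]['PlayerGames']`.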

We can export the DataFrame to an Excel file as seen below.

In [None]:
players.to_excel(r'C:\file_location\ncaaf_players_export_dataframe.xlsx', index=False)

### Team Games
How many Team Games are included in our data for this specific fixture?

In [None]:
print(len(data[0]['TeamGames']))

In [None]:
print(data[0]['TeamGames'][0].keys())
print(data[0]['TeamGames'][0].values())

In [None]:
print(data[0]['TeamGames'][1])
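Printing each TeamGames dictionary separately makes the two teams hard to compare. A minimal sketch of a side-by-side view follows. The field names `Team`, `Score` and `FirstDowns` and all values are placeholders chosen for illustration; the real keys are the ones printed by `.keys()` above.

```python
import pandas as pd

# Invented stand-in for data[0]['TeamGames']; field names here are placeholders.
sample_team_games = [
    {"Team": "AAA", "Score": 24, "FirstDowns": 18},
    {"Team": "BBB", "Score": 17, "FirstDowns": 21},
]

# One row per team, then transpose so each team becomes a column and each stat a row.
team_table = pd.DataFrame(sample_team_games).set_index("Team").T
print(team_table)
```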

### ScoringPlays
How many Scoring Plays are included in our data for this specific fixture?

In [None]:
print(len(data[0]['ScoringPlays']))

In [None]:
print(data[0]['ScoringPlays'][0])

In [None]:
plays = data[0]['ScoringPlays']

In [None]:
df2 = pd.DataFrame(plays) # convert the list of scoring play dictionaries to a DataFrame
# Drop the columns we do not need for the export
df2 = df2.drop(columns=['GameID', 'TimeRemainingMinutes', 'TimeRemainingSeconds', 'DriveSummary', 'ScoringTeamID'])
df2.head()

We can export the DataFrame to an Excel file as seen below.

In [None]:
df2.to_excel(r'C:\file_location\ncaaf_scoingPlays_export_dataframe.xlsx', index=False)