## Sportsball 2018 - Visualizing a Fantasy Football Season
### by John Larson

## Investigation Overview

My fantasy football league is managed through ESPN. The league page offers only a limited number of features for gaining insight into player performance, which can make it difficult for league members to manage their teams effectively. The goal of this project is to create visualizations that help league members understand their teams' strengths and weaknesses and where they rank in the league.

## Dataset Overview

Two dataframes were created through ESPN's accessible fantasy football API. [Steven Morse](https://stmorse.github.io/), an instructor in the Department of Mathematics at the U.S. Military Academy, posted a couple of articles containing instructions and code that were instrumental in helping me efficiently create [seasonscores](https://stmorse.github.io/journal/espn-fantasy-python.html) and [boxscores](https://stmorse.github.io/journal/espn-fantasy-2-python.html) csv files using ESPN's API. Steven's articles also inspired me to make boxplots and radial charts to visualize data for my league.

In [1]:
# import all packages and set plots to be embedded inline:
import numpy as np
import pandas as pd
from plotly.offline import plot, iplot, init_notebook_mode
import plotly.graph_objs as go
# Make plotly work with Jupyter notebook
init_notebook_mode(connected = True)

# suppress warnings from final output
import warnings
warnings.simplefilter("ignore")

In [2]:
# load in the datasets as pandas dataframes:
boxscores = pd.read_csv('espn-api-to-csv/boxscores.csv')
seasonscores = pd.read_csv('espn-api-to-csv/seasonscores.csv')
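
For readers who don't have the csv files, here is a minimal sketch of the shape the rest of this notebook relies on. The column names are taken from the cells below; every row is a made-up placeholder (including the non-bench `slotId` values), and the real files may well contain more columns.

```python
import pandas as pd

# Hypothetical stand-in rows; only the column names come from this notebook.
boxscores_demo = pd.DataFrame({
    'teamName': ['Team A', 'Team A', 'Team A'],
    'matchupPeriodId': [1, 1, 1],
    'playerName': ['Player 1', 'Player 2', 'Player 3'],
    'position': ['QB', 'RB', 'WR'],
    'slotId': [0, 2, 20],  # 20 is the bench slot filtered out in later cells
    'appliedStatTotal': [18.5, 12.0, 7.5],
    'wonMatchup': [True, True, True],
})

seasonscores_demo = pd.DataFrame({
    'Team': ['Team A', 'Team B'],
    'Week': [1, 1],
    'Score': [101.5, 95.0],
})

# Drop bench rows the same way the later cells do:
starters_demo = boxscores_demo[boxscores_demo['slotId'] != 20]
print(starters_demo['playerName'].tolist())
```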

## Top Players for Team 4

James Conner led my team to a championship this year, with 210 fantasy points over 13 weeks.

Other notable contributors were my quarterback committee of Rodgers and Ryan, my top wide receiver Stefon Diggs, and a surprisingly successful George Kittle.

In [3]:
selected_team = 'Team 4'
# Isolate one team:
relevant_team = boxscores[boxscores['teamName'] == selected_team]
# Eliminate bench players:
relevant_players = relevant_team[relevant_team['slotId'] != 20]
# Identify the top eight players for the season
top_performers_list = relevant_players.groupby('playerName')['appliedStatTotal']\
    .sum().sort_values(ascending = False)[0:8].index.tolist()
# Restrict relevant_players to the top performers:
relevant_player_scores = relevant_players[relevant_players['playerName'].isin(top_performers_list)]
# Top performers series (season total per player):
top_performers = relevant_player_scores.groupby('playerName')['appliedStatTotal']\
    .sum().sort_values(ascending = False)

data = [go.Bar(x = top_performers.keys(), y = top_performers.tolist())]

layout = go.Layout(
        title = 'Top Players for {}'.format(selected_team),
        yaxis = {'title':'Total Season Score'},
        hovermode = 'closest',
        autosize = False,
        width = 720,
        height = 480,
        margin = dict(
            l = 50,
            r = 50,
            b = 100,
            t = 100,
            pad = 4),
        bargroupgap = 0.1)

fig = {'data':data,'layout':layout}
iplot(fig)
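
The selection above can also be written more compactly with `Series.nlargest`. The following is only a sketch on invented rows: `top_scorers` is a hypothetical helper name, and the players and scores are placeholders rather than league data.

```python
import pandas as pd

# Made-up rows that reuse the boxscores column names
demo = pd.DataFrame({
    'teamName': ['Team 4'] * 6,
    'playerName': ['A', 'B', 'C', 'A', 'B', 'C'],
    'slotId': [0, 2, 20, 0, 2, 4],
    'appliedStatTotal': [20.0, 10.0, 30.0, 15.0, 5.0, 8.0],
})

def top_scorers(df, team, n=8, bench_slot=20):
    """Return the n highest season totals among one team's non-bench players."""
    starters = df[(df['teamName'] == team) & (df['slotId'] != bench_slot)]
    return starters.groupby('playerName')['appliedStatTotal'].sum().nlargest(n)

top2 = top_scorers(demo, 'Team 4', n=2)
print(top2)
```

Player C's 30-point week is ignored here because it was scored from the bench slot, which mirrors the bench filter in the cell above.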

## 2018 Regular Season Fantasy Scoring
Scores for the season were all over the map, ranging from Team 2 putting up a measly 60 points in Week 8 to Team 5 posting 169 points in Week 4. There doesn't seem to be a relationship between the variance of a team's scores and their magnitude. In other words, a team's scoring consistency does not appear to be correlated with its success in terms of how many points it scores on average.

My team (Team 4) ranks 4th in average score. One thing I remember about my fantasy season that's easily seen in this boxplot is my three weeks in a row of scoring 121 points.

Sharing this chart with leaguemates would be a helpful way for them to understand their team's scoring consistency and how they stack up amongst the competition.

In [4]:
# Reorder teams by mean so that they're plotted nicely
teamorder = np.array(seasonscores.groupby('Team')['Score'].mean().sort_values().index)

# Plot season scores
data = []
for team in teamorder:
    # 'z' is a team-specific dataframe
    z = seasonscores[seasonscores['Team'] == team]
    trace = go.Box(
        y = z['Score'],
        name = team,
        text = 'Week ' + z['Week'].astype(str),
        boxpoints = 'all',
        jitter = 0.4,
        pointpos = 0)
    data.append(trace)
                   
layout = go.Layout(
    title = '2018 Regular Season Fantasy Scoring',
    showlegend = False,
    yaxis = dict(title = 'Score', range = [50,170], nticks = 10),
    hovermode = 'closest',
    autosize = False,
    width = 720,
    height = 480,
    margin = dict(
        l = 50,
        r = 50,
        b = 100,
        t = 100,
        pad = 4))

fig = {'data':data,'layout':layout}
iplot(fig)
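
The observation about consistency versus scoring level is made by eye from the boxplot. One way to put a number on it is to correlate each team's mean score with its standard deviation. This is a sketch that assumes only the `Team` and `Score` columns used above; `mean_std_correlation` is a hypothetical helper and the demo scores are invented.

```python
import pandas as pd

def mean_std_correlation(scores):
    """Pearson correlation between per-team mean score and per-team standard deviation."""
    summary = scores.groupby('Team')['Score'].agg(['mean', 'std'])
    return summary['mean'].corr(summary['std'])

# Invented scores for three teams
demo = pd.DataFrame({
    'Team': ['A'] * 3 + ['B'] * 3 + ['C'] * 3,
    'Score': [100, 110, 120, 90, 95, 100, 80, 120, 130],
})

r = mean_std_correlation(demo)
print(round(r, 2))
```

Calling `mean_std_correlation(seasonscores)` would give a coefficient for the real league. A value near zero would be consistent with the observation above, though with one data point per team it is only a rough indicator.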

## Average Positional Scoring in Wins and Losses for Team 4
Each purple trace shows an individual week. The black line shows the average positional score.

On average, in matchups that I won, my QB scored 20.1, RBs averaged 16.7, WRs averaged 12.8, TE scored 10.2, and D/ST scored 7.8.

There's a smaller sample size in the losses radial chart, which makes sense considering I only lost three matchups this season. On average, in matchups that I lost, my QB scored 20.8, RBs averaged 10.5, WRs averaged 9.7, TE scored 14.2, and D/ST scored 3.3.

The biggest difference in positional scoring between wins and losses is at the RB position (6.2 points). This means I probably lost matchups due mostly to lackluster RB performance. Even though QBs generally score the most on fantasy teams, these visualizations are evidence that solid performances out of other positions are actually more vital to winning matchups.
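
The win/loss gaps can be worked out directly from the averages quoted above. This small sketch uses only those rounded figures, not the underlying data:

```python
# Average positional scores quoted in the text above
wins = {'QB': 20.1, 'RB': 16.7, 'WR': 12.8, 'TE': 10.2, 'D/ST': 7.8}
losses = {'QB': 20.8, 'RB': 10.5, 'WR': 9.7, 'TE': 14.2, 'D/ST': 3.3}

# Win-minus-loss gap per position, rounded to one decimal place
gaps = {pos: round(wins[pos] - losses[pos], 1) for pos in wins}

# Position with the largest gap in favor of wins
largest = max(gaps, key=gaps.get)

print(gaps)
print(largest)
```

RB comes out on top at 6.2, followed by D/ST at 4.5 and WR at 3.1. The QB and TE gaps are negative (-0.7 and -4.0), meaning those positions actually averaged more in losses than in wins.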

In [5]:
# Filter out bench players ('slotId' = 20):
positional_stats = (boxscores[(boxscores['slotId'] != 20)]
 .filter(items=['teamName', 'matchupPeriodId', 'position', 'appliedStatTotal', 'wonMatchup'], axis=1)
 # Group by team, matchup, and position, then take the mean positional score using .agg:
 .groupby(['teamName', 'matchupPeriodId', 'position'])
 .agg({'appliedStatTotal': 'mean'})
 # Pivot table on 'position' to create new columns:
 .unstack('position')
 .reset_index())
# Create 'Won' column by taking the min of 'wonMatchup':
positional_stats['Won'] = boxscores.groupby(['teamName', 'matchupPeriodId']).agg({'wonMatchup': 'min'}).reset_index(drop=True)
# Rename columns (this flattens the column MultiIndex left by unstack):
positional_stats.columns = ['Team', 'Matchup', 'D/ST', 'QB', 'RB', 'TE', 'WR', 'Won']
# Round floats:
positional_stats = positional_stats.round(2)

In [6]:
# Box scores for selected team:
df_team = positional_stats[positional_stats['Team'] == selected_team]
# Box scores for selected team in wins:
df_team_win = df_team[df_team['Won'] == True]
# List of positions. 'D/ST' is listed twice to connect the radial trace:
positions = ['D/ST', 'QB', 'RB', 'WR', 'TE', 'D/ST']
# Loop through matchups:
data_win = []
for m in df_team_win['Matchup']:
    trace1 = go.Scatterpolar(
        # Loop through positions:
        r = [df_team_win[df_team_win['Matchup'] == m][p].item() for p in positions],
        theta = positions,
        fill = 'toself',
        opacity = 0.4,
        mode = 'lines+markers',
        line = dict(width = 0.5, color = 'rgb(131, 90, 241)'),
        marker = dict(size = 1),
        name = 'Week {}'.format(m))
    data_win.append(trace1)

# Create trace for average positional scores:
trace2 = go.Scatterpolar(
    r = [df_team_win[p].mean() for p in positions],
    theta = positions,
    name = 'Season avg.',
    opacity = 1,
    line = dict(width = 2,color = 'black'))
data_win.append(trace2)

layout = go.Layout(
    title = 'Average Positional Scoring in Wins for {}'.format(selected_team),
    hovermode = 'closest',
    polar = dict(radialaxis = dict(visible = True, range = [0,df_team[positions].max().max()])),
    showlegend = False,
    autosize = False,
    width = 800,
    height = 520,
    margin = dict(
        l = 50,
        r = 50,
        b = 100,
        t = 100,
        pad = 4))

fig = {'data':data_win,'layout':layout}
iplot(fig)

In [7]:
# Box scores for selected team in losses:
df_team_loss = df_team[df_team['Won'] == False]
# List of positions. 'D/ST' is listed twice to connect the radial trace:
positions = ['D/ST', 'QB', 'RB', 'WR', 'TE', 'D/ST']
# Loop through matchups:
data_loss = []
for m in df_team_loss['Matchup']:
    trace1 = go.Scatterpolar(
        # Loop through positions:
        r = [df_team_loss[df_team_loss['Matchup'] == m][p].item() for p in positions],
        theta = positions,
        fill = 'toself',
        opacity = 0.4,
        mode = 'lines+markers',
        line = dict(width = 0.5, color = 'rgb(131, 90, 241)'),
        marker = dict(size = 1),
        name = 'Week {}'.format(m))
    data_loss.append(trace1)

# Create trace for average positional scores:
trace2 = go.Scatterpolar(
    r = [df_team_loss[p].mean() for p in positions],
    theta = positions,
    name = 'Season avg.',
    opacity = 1,
    line = dict(width = 2,color = 'black'))
data_loss.append(trace2)

layout = go.Layout(
    title = 'Average Positional Scoring in Losses for {}'.format(selected_team),
    hovermode = 'closest',
    polar = dict(radialaxis = dict(visible = True, range = [0,df_team[positions].max().max()])),
    showlegend = False,
    autosize = False,
    width = 800,
    height = 520,
    margin = dict(
        l = 50,
        r = 50,
        b = 100,
        t = 100,
        pad = 4))
    
fig = {'data':data_loss,'layout':layout}
iplot(fig)
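
The win and loss cells differ only in the dataframe they loop over, so the trace-building step could be pulled into one function. The sketch below is an assumption-laden alternative, not the code that produced the charts: `radial_traces` is a hypothetical helper, it returns plain dicts with a `type` key instead of `go.Scatterpolar` objects, and the demo rows are invented.

```python
import pandas as pd

def radial_traces(df, positions, color='rgb(131, 90, 241)'):
    """Build one scatterpolar trace per matchup plus an average trace, as plain dicts."""
    traces = []
    for _, row in df.iterrows():
        traces.append({
            'type': 'scatterpolar',
            'r': [float(row[p]) for p in positions],
            'theta': positions,
            'fill': 'toself',
            'opacity': 0.4,
            'mode': 'lines+markers',
            'line': {'width': 0.5, 'color': color},
            'marker': {'size': 1},
            'name': 'Week {}'.format(int(row['Matchup'])),
        })
    # Average positional scores across the rows that were passed in
    traces.append({
        'type': 'scatterpolar',
        'r': [float(df[p].mean()) for p in positions],
        'theta': positions,
        'name': 'Season avg.',
        'opacity': 1,
        'line': {'width': 2, 'color': 'black'},
    })
    return traces

# Invented rows with the same columns as positional_stats (minus 'Team')
demo = pd.DataFrame({
    'Matchup': [1, 2],
    'D/ST': [5.0, 9.0], 'QB': [20.0, 22.0], 'RB': [15.0, 11.0],
    'TE': [8.0, 6.0], 'WR': [12.0, 14.0],
    'Won': [True, False],
})
positions = ['D/ST', 'QB', 'RB', 'WR', 'TE', 'D/ST']

win_traces = radial_traces(demo[demo['Won']], positions)
print(len(win_traces))
```

With the real data, `radial_traces(df_team_win, positions)` and `radial_traces(df_team_loss, positions)` should be usable as the `data` entry of the figure dicts passed to `iplot`, leaving only the layout title to differ between the two charts.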

## Check Out My [Sportsball Dashboard](https://sportsball-2018-dashboard.herokuapp.com/)!

Visualization is my favorite part of the data analysis process. Static, team-specific charts covering the whole season seemed restrictive to me, so I made an interactive dashboard with Plotly's Dash framework.

In addition to some tweaks to the existing visualizations, I added a team selection dropdown and a week selection slider so that users can explore the data on their own.