## Question 1: Is there a relationship between quarterback fantasy points scored and team defense fantasy points scored on the same team?
In other words, does a quarterback who scores more fantasy points (and presumably more real NFL points for his team) have any effect on how his team's defense performs?
- **What is your hypothesis?**: I hypothesize that quarterback fantasy scoring has no effect on team defense fantasy scoring. The quarterback plays on the offensive side of the ball, so his performance should not statistically influence his defense's performance: the two units are never on the field at the same time (teams alternate between playing offense and defense).
- **How does this relate to the researcher's question?**: The researchers were trying to build a model that predicts the number of fantasy points a quarterback or team defense might score. If quarterback fantasy scoring were a strong predictor of team defense scoring (or vice versa), the researchers could have added it to their model.
- **How does this relate to Part 1?**: This question is a more specific form of the research question I asked in Part 1. Here I restrict it to quarterbacks and team defenses from the same team.
- **Why use this data?**: This data records which team each QB/DST belonged to, as well as how many fantasy points each scored in every week of the NFL season.
- **Which features will you be using?**: I will be using fantasy points scored (the `Actuals` column), `TEAM`, and `PLAYER` to complete this assignment.
- **How many observations are there for each feature?**: Each feature has the same number of observations, which is 15 times the number of players at that position.
    - For QBs: 585
    - For DSTs: 480
    - There are more QB observations than DST observations because some teams played multiple quarterbacks. To simplify the analysis, I will combine the QBs who played for a given team and treat them as if they were a single QB.
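
The idea of combining a team's quarterbacks can be sketched on toy data. This is a minimal illustration, assuming the column names `TEAM`, `WEEK`, `PLAYER`, and `Actuals` used in the notebook; the team codes, player names, and point values are made up, and `qb_toy`, `n_qbs`, and `team_qb` are hypothetical names.

```python
import pandas as pd

# Toy data (made-up numbers): team "AAA" used two quarterbacks in week 1.
qb_toy = pd.DataFrame({
    "TEAM":    ["AAA", "AAA", "BBB", "AAA", "BBB"],
    "WEEK":    [1, 1, 1, 2, 2],
    "PLAYER":  ["QB one", "QB two", "QB three", "QB one", "QB three"],
    "Actuals": [12.0, 3.0, 20.0, 18.0, 9.0],
})

# How many distinct quarterbacks each team used over the season
n_qbs = qb_toy.groupby("TEAM")["PLAYER"].nunique()

# Sum the fantasy points of every QB who played for a team in a given week,
# so each team-week pair is treated as a single "team quarterback".
team_qb = (
    qb_toy.groupby(["TEAM", "WEEK"], as_index=False)["Actuals"]
    .sum()
    .rename(columns={"Actuals": "QB Scoring"})
)

print(n_qbs)
print(team_qb)
```

After this aggregation there is one QB row per team-week, which is the shape needed to pair each row with that team's defense.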

In [2]:
# Code from earlier sections to import and clean the dataset

# import packages and dataset
import pandas as pd

# These were imported for each fantasy football position type (QB or DST)
QB = pd.read_excel('../Data/Accuscore Evaluation.xlsx', sheet_name='QB Projections')
DST = pd.read_excel('../Data/Accuscore Evaluation.xlsx', sheet_name= 'DST Projections')

# Remove duplicate columns (PLAYERID/ESPNID is simply an alias for PLAYER)
QB = QB.drop(['PLAYERID', 1], axis=1)
# Since defenses are played per team, as long as we have the TEAM data, we know what the name of the player is
DST = DST.drop(['ESPNID', 'PLAYER'], axis=1)
# Rename ORDER column to WEEK for clarity (since order describes the week of the NFL season)
QB = QB.rename(columns={'ORDER':'WEEK'})
DST = DST.rename(columns={'ORDER':'WEEK'})


# Remove Ben Roethlisberger's and Andrew Luck's bye-week rows (Week 4), since the authors forgot to
ben_bye = ((QB['WEEK'] == 4) & (QB['PLAYER'] == 'Ben Roethlisberger')).astype(int).idxmax()
luck_bye = ((QB['WEEK'] == 4) & (QB['PLAYER'] == 'Andrew Luck')).astype(int).idxmax()
QB = QB.drop([ben_bye, luck_bye], axis=0)
# Get a list of all of the NFL teams as a set
teams = set(QB['TEAM'])
# Get a list of the weeks of the NFL season as a set (we'll need these to figure out how many points the QBs for a team scored in a given week)
weeks = set(QB['WEEK'])
display(teams, weeks)
# Additional cleaning to be able to make this plot
import seaborn as sns
# Create a df where rows will be team-week pairs, and columns will be QB and DST scoring
team_scoring_df = pd.DataFrame()
# For each NFL team
for team in teams:
    # For each week of the NFL season from 1-16
    for week in weeks:
        # Figure out how many fantasy points that team's quarterback(s) scored that week, and add it to the df with the row as a team-week pair, and the column as 'QB Scoring'
        team_scoring_df.loc[team + '-' + str(week), 'QB Scoring'] = QB.loc[(QB['WEEK'] == week) & (QB['TEAM'] == team), 'Actuals'].sum()
        # Add however many points the team's defense unit scored to the df, and add it to the df with the row as a team-week pair, and the column as 'DST Scoring'
        team_scoring_df.loc[team + '-' + str(week), 'DST Scoring'] = DST.loc[(DST['WEEK'] == week) & (DST['TEAM'] == team), 'Actuals'].sum()

{'ARI',
 'ATL',
 'BAL',
 'BUF',
 'CAR',
 'CHI',
 'CIN',
 'CLE',
 'DAL',
 'DEN',
 'DET',
 'GB',
 'HOU',
 'IND',
 'JAC',
 'KC',
 'MIA',
 'MIN',
 'NE',
 'NO',
 'NYG',
 'NYJ',
 'OAK',
 'PHI',
 'PIT',
 'SD',
 'SEA',
 'SF',
 'STL',
 'TB',
 'TEN',
 'WAS'}

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}
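
One thing to watch in the loop above: summing an empty selection in pandas returns 0, so any team-week with no rows in either table (for example a bye week, assuming bye weeks are absent from both sheets) ends up in `team_scoring_df` as a (0, 0) point. A possible alternative, sketched here on made-up toy data rather than the real sheets, is to aggregate each table and keep only team-weeks present in both. The names `qb_toy`, `dst_toy`, and `team_scoring_toy` are hypothetical.

```python
import pandas as pd

# Toy stand-ins for the cleaned QB and DST tables (made-up numbers).
# Team "AAA" has no week-2 rows in either table, mimicking a bye week.
qb_toy = pd.DataFrame({
    "TEAM":    ["AAA", "AAA", "BBB", "BBB"],
    "WEEK":    [1, 1, 1, 2],
    "Actuals": [12.0, 3.0, 20.0, 9.0],
})
dst_toy = pd.DataFrame({
    "TEAM":    ["AAA", "BBB", "BBB"],
    "WEEK":    [1, 1, 2],
    "Actuals": [7.0, 2.0, 11.0],
})

# Total points per team-week for each position group
qb_by_team = (
    qb_toy.groupby(["TEAM", "WEEK"])["Actuals"].sum()
    .rename("QB Scoring").reset_index()
)
dst_by_team = (
    dst_toy.groupby(["TEAM", "WEEK"])["Actuals"].sum()
    .rename("DST Scoring").reset_index()
)

# Inner join: keep only team-week pairs that have data in both tables,
# so missing weeks are dropped instead of being recorded as 0 points.
team_scoring_toy = pd.merge(qb_by_team, dst_by_team, on=["TEAM", "WEEK"], how="inner")
print(team_scoring_toy)
```

Whether dropping those rows is appropriate depends on how the real sheets encode bye weeks, so this is worth checking against the data before swapping it in.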

In [None]:
# Model 1 - Same team, QB-def pairs
import statsmodels.api as sm
# Add a constant to the data (need mx + b, not just mx)
to_model = sm.add_constant(team_scoring_df)
# Create input data (X), which is the constant and QB scoring
X = to_model.loc[:, ['const', 'QB Scoring']]
# Create output data (y), which is the DST scoring
y = to_model['DST Scoring']
# Fit a linear model to the DST vs QB data
model = sm.OLS(y,X).fit()
# Print out a summary of the statistics of the model
model.summary()
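
To make the connection to the hypothesis concrete, here is a small sketch of what the fitted line measures, using NumPy on synthetic data instead of the real team-week table. The two score arrays are generated independently of each other (an assumption chosen to mimic the "no effect" hypothesis), so the slope and R-squared should both come out close to zero. In statsmodels the corresponding quantities are available as `model.params`, `model.pvalues`, and `model.rsquared`.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic, independent scores (illustrative only, not the Accuscore data)
qb_scoring = rng.normal(loc=16.0, scale=7.0, size=500)
dst_scoring = rng.normal(loc=8.0, scale=6.0, size=500)

# Design matrix with a constant column, mirroring sm.add_constant
X = np.column_stack([np.ones_like(qb_scoring), qb_scoring])

# Least-squares estimates of the intercept (b) and slope (m) in y = m*x + b
(intercept, slope), *_ = np.linalg.lstsq(X, dst_scoring, rcond=None)

# R-squared: share of the variance in DST scoring explained by the fitted line
residuals = dst_scoring - X @ np.array([intercept, slope])
r_squared = 1.0 - residuals.var() / dst_scoring.var()

print(f"intercept={intercept:.2f}, slope={slope:.3f}, R^2={r_squared:.4f}")
```

If the real data behaved like this, the `QB Scoring` coefficient in the summary table would be near zero with a large p-value, which is what the hypothesis predicts.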

In [None]:
# Model 2 - What about on different teams?

In [None]:
# Model 3 - Modeling QB fpts as a function of defensive points scored by the opposing team, and defensive fpts scored by the QB's team

In [3]:
# Model 4 - Modeling QB fpts as a function of mean defensive points scored by the opposing team, and mean QB points scored by the opposing team

Unnamed: 0,TEAM,WEEK,Actuals,2013-06-10 00:00:00,2013-08-09 00:00:00,2013-08-16 00:00:00,2013-08-23 00:00:00,2013-08-30 00:00:00,2013-09-06 00:00:00,2013-09-13 00:00:00,...,2013-10-25 00:00:00,2013-11-01 00:00:00,2013-11-08 00:00:00,2013-11-15 00:00:00,2013-11-22 00:00:00,2013-11-29 00:00:00,2013-12-06 00:00:00,2013-12-13 00:00:00,2013-12-20 00:00:00,Variation
0,ATL,1,9,9.7,9.7,9.9,10.2,10.2,10.3,,...,,,,,,,,,,0.072000
1,BUF,1,-2,12.8,11.6,11.6,11.9,12.4,12.0,,...,,,,,,,,,,0.223000
2,CHI,1,13,17.9,18.2,19.4,18.8,17.4,17.4,,...,,,,,,,,,,0.633667
3,CIN,1,1,8.4,8.5,8.7,8.6,9.1,8.8,,...,,,,,,,,,,0.061667
4,CLE,1,29,4.2,7.1,6.9,7.1,7.2,7.2,,...,,,,,,,,,,1.413667
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
475,WAS,16,12,6.2,6.2,6.1,6.0,6.1,5.9,6.9,...,8.0,9.2,9.4,9.4,9.6,10.8,8.7,10.8,10.8,3.463571
476,CAR,16,14,14.4,9.4,10.5,10.7,10.3,9.5,9.6,...,13.4,12.8,12.8,12.2,11.5,11.3,11.6,11.5,12.1,1.703476
477,JAC,16,7,3.6,3.5,3.7,3.6,3.5,3.6,3.7,...,3.0,2.6,2.6,2.7,3.1,2.9,2.7,3.0,2.6,0.156619
478,BAL,16,7,12.0,10.8,11.0,10.9,9.8,11.0,10.2,...,9.0,10.6,11.5,10.8,11.5,12.1,12.6,10.9,9.6,1.683476
