# Fantasy Football in 1966

## Introduction
This is a quick project that was used in a [post](https://www.palmercjones.com) on [my website](https://www.palmercjones.com). The premise is: "If fantasy football had been popularized in 1966, what would the best starting roster be?" Go check out the post if you want to know more about the history of the NFL/AFL in 1966 and the idea behind this analysis. If you're just here for the data science, then continue on below!

## Tools needed for this analysis
I used the [pro-football-reference.com](https://www.pro-football-reference.com/years/1966/) database, ~~which is conveniently accessible in Python through the [sportsipy](https://github.com/davidjkrause/sportsipy) package (see below). This package doesn't have great documentation for beginners; I recommend asking ChatGPT for help in determining what data is available. Also, I had issues with it not being maintained very well as of February 2024, so I ended up using a fork made by [davidjkrause](https://github.com/davidjkrause). Huge thanks to David!~~ Update: pro-football-reference.com has instituted a limit on the number of requests you can make per minute, which basically makes the package unusable for the purpose of this project. A much more straightforward approach was to simply go to the website itself and download the relevant data as CSV files. These are available on my GitHub; the original licenses and fair use policies from pro-football-reference.com apply. For scoring, I used the settings from my personal league, which is hosted through [Sleeper](https://sleeper.com/) (this is a half-point-per-reception league).

#### Install Pandas
We will be using pandas to work with our data, so let's go ahead and install and import it now.

In [59]:
pip install pandas

Note: you may need to restart the kernel to use updated packages.


In [60]:
import pandas as pd
from functools import reduce #also need this tool for joining more than 2 dataframes at a time

#### Defining the scoring system
Let's define all of the stats that we care about and their respective scoring attributes:
- (+0.04) Passing Yard
- (+4.0) Passing TD
- (+2.0) 2-Point Conversion (pass, run, or reception)
- (-2.0) Pass Intercepted
- ~~(-1.0) Pick 6 Thrown~~ *wasn't able to find data on pick 6s*
- (+0.1) Rushing Yard
- (+6.0) Rushing TD
- (-2.0) Fumble Lost
<br> *I was only able to find the total number of fumbles (not specifically lost) and this also counts special teams fumbles. Best I could do!*
- (+0.5) Reception
- (+0.1) Receiving Yard
- (+6.0) Receiving TD
- ~~(+6.0) Defensive TD~~
- ~~(+10.0) 0 Points Allowed~~
- ~~(+6.0) 1-6 Points Allowed~~
- ~~(+4.0) 7-13 Points Allowed~~
- ~~(+2.0) 14-20 Points Allowed~~
- ~~(+0.0) 21-27 Points Allowed~~
- ~~(-1.0) 28-34 Points Allowed~~
- ~~(-4.0) 35+ Points Allowed~~
- ~~(+1.0) Sack~~
- ~~(+2.0) Interception~~
- ~~(+2.0) Fumble Recovery~~
- ~~(+2.0) Safety~~
- ~~(+1.0) Forced Fumble~~
- ~~(+2.0) Blocked Kick~~
- ~~(+6.0) Fumble Recovery TD~~
<br>
*Removed the defensive stats because I decided not to do all the week-by-week calculations for defenses*

Now let's load this data into a dataframe. This will serve as a good reference point for how we want to structure our player data later on.

In [61]:
# create a dataframe (or series) that has the columns with the scoring rules coefficients
scoringCoefs = pd.DataFrame([[0.04, 4.0, 2.0, -2.0, 0.1, 6.0, -2.0, 0.5, 0.1, 6.0]],
                            columns=['PassYds', 'PassTD', '2PM', 'Int', 'RushYds', 'RushTD', 'Fmb', 'Rec', 'RecYds', 'RecTD'])
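
To make the scoring rules concrete, here is a small worked example: multiply each stat by its coefficient and add everything up. The stat line below is made up for illustration (it is not a real 1966 player), and `coefs` and `example_line` are names I'm introducing just for this sketch.

```python
import pandas as pd

# Same coefficients as the scoring list above, stored as a Series for easy alignment
coefs = pd.Series({'PassYds': 0.04, 'PassTD': 4.0, '2PM': 2.0, 'Int': -2.0,
                   'RushYds': 0.1, 'RushTD': 6.0, 'Fmb': -2.0,
                   'Rec': 0.5, 'RecYds': 0.1, 'RecTD': 6.0})

# Hypothetical season stat line (made-up numbers, not taken from the data)
example_line = pd.Series({'PassYds': 0, 'PassTD': 0, '2PM': 1, 'Int': 0,
                          'RushYds': 50, 'RushTD': 1, 'Fmb': 2,
                          'Rec': 40, 'RecYds': 600, 'RecTD': 5})

# Element-wise multiply (aligned on the stat names), then sum
# 2 + 5 + 6 - 4 + 20 + 60 + 30 = 119
example_score = round(float((example_line * coefs).sum()), 2)
print(example_score)
```

Half a point per catch adds up quickly: the 40 receptions alone are worth 20 points here.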

#### Player Data
Let's load the data that was manually downloaded from pro-football-reference.com. Note that the NFL and AFL data were on separate pages, so we'll need to combine them.

In [62]:
# NFL Stats
NFL_Passing = pd.read_csv('NFL_passing_1966.csv')
NFL_Rushing = pd.read_csv('NFL_rushing_1966.csv')
NFL_Receiving = pd.read_csv('NFL_receiving_1966.csv')

# AFL Stats
AFL_Passing = pd.read_csv('AFL_passing_1966.csv')
AFL_Rushing = pd.read_csv('AFL_rushing_1966.csv')
AFL_Receiving = pd.read_csv('AFL_receiving_1966.csv')
AFL_Scoring = pd.read_csv('AFL_scoring_1966.csv') # this is only being used because the 2-point conversion stats don't show up on the other tables (only the AFL had a 2-point conversion in the rules in 1966)

# Combine the AFL stats onto the NFL stats (these were separated because in 1966 these were functionally different leagues)
NFL_Passing = pd.concat([NFL_Passing, AFL_Passing])
NFL_Rushing = pd.concat([NFL_Rushing, AFL_Rushing])
NFL_Receiving = pd.concat([NFL_Receiving, AFL_Receiving])
NFL_Scoring = AFL_Scoring # only the AFL had a 2-point conversion in 1966, no need to combine with NFL

# Ok, before we merge these 4 dataframes together, let's just filter out the columns we really care about. This will make merging them a lot easier later.
# Note: We have to rename some columns because they have the same name between tables (like "yards" being used on passing and rushing).
NFL_Passing = NFL_Passing[["Player-additional", "Player", "Tm", "Age", "Pos", "G", "Cmp", "Att", "Yds", "TD", "Int"]]
NFL_Passing = NFL_Passing.rename(columns={"Att": "PassAtt", "Yds": "PassYds", "TD": "PassTD"})

NFL_Rushing = NFL_Rushing[["Player-additional", "Player", "Tm", "Age", "Pos", "G", "Att", "Yds", "TD", "Fmb"]]
NFL_Rushing = NFL_Rushing.rename(columns={"Att": "RushAtt", "Yds": "RushYds", "TD": "RushTD", "Fmb": "RushFmb"})

NFL_Receiving = NFL_Receiving[["Player-additional", "Player", "Tm", "Age", "Pos", "G", "Rec", "Yds", "TD", "Fmb"]]
NFL_Receiving = NFL_Receiving.rename(columns={"Yds": "RecYds", "TD": "RecTD", "Fmb": "RecFmb"})

NFL_Scoring = NFL_Scoring[["Player-additional", "Player", "Tm", "Age", "Pos", "G", "2PM"]]

# Finally, we'll combine these 4 dataframes into 1 dataframe with a merge
NFL_Player = reduce(lambda  left,right: pd.merge(left,right,on=['Player-additional', 'Player', 'Tm', 'Age', 'Pos', 'G'],
                                            how='outer', copy=False), [NFL_Passing, NFL_Rushing, NFL_Receiving, NFL_Scoring])

print(NFL_Player)


    Player-additional           Player   Tm  Age Pos   G   Cmp  PassAtt  \
0            AlleDu00      Duane Allen  CHI   29  TE  11   NaN      NaN   
1            AlliJi00      Jim Allison  SDG   23  RB  14   NaN      NaN   
2            AlwoLa00  Lance Alworth*+  SDG   26  FL  13   NaN      NaN   
3            AndeBi00    Bill Anderson  GNB   30  TE  10   NaN      NaN   
4            AndeDo00   Donny Anderson  GNB   23  RB  14   NaN      NaN   
..                ...              ...  ...  ...  ..  ..   ...      ...   
381          WoodDi00        Dick Wood  MIA   30  QB  14  83.0    230.0   
382          WoodGa00        Gary Wood  NYG   24  QB  14  81.0    170.0   
383          WoodTo01   Tom Woodeshick  PHI   25  RB  14   NaN      NaN   
384          WrigLo20    Lonnie Wright  DEN   21  SS  14   NaN      NaN   
385          YewcTo00       Tom Yewcic  BOS   34   P   7   NaN      NaN   

     PassYds  PassTD   Int  RushAtt  RushYds  RushTD  RushFmb   Rec  RecYds  \
0        NaN     NaN
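
If the `reduce` plus outer merge pattern is new to you, here is a minimal sketch on toy data. The players, teams, and the shortened key list are all made up for illustration. The idea is that an outer merge keeps every player who shows up in any table and fills the stats they don't have with `NaN`.

```python
from functools import reduce

import pandas as pd

# Toy tables (hypothetical players and teams), keyed on fewer columns than the real data
keys = ['Player', 'Tm']
toy_passing = pd.DataFrame({'Player': ['QB One'], 'Tm': ['AAA'], 'PassYds': [2000]})
toy_rushing = pd.DataFrame({'Player': ['QB One', 'RB Two'], 'Tm': ['AAA', 'BBB'],
                            'RushYds': [100, 800]})
toy_receiving = pd.DataFrame({'Player': ['RB Two', 'WR Three'], 'Tm': ['BBB', 'CCC'],
                              'RecYds': [200, 900]})

# reduce applies the merge pairwise: (passing + rushing) first, then + receiving
toy_merged = reduce(lambda left, right: pd.merge(left, right, on=keys, how='outer'),
                    [toy_passing, toy_rushing, toy_receiving])
print(toy_merged)
```

One thing to keep in mind: a player only lands on a single row if every key column matches exactly across the tables. If a key differed between two tables, the outer merge would put that player on two separate rows.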

In [63]:
# Add a new column to the NFL_Player dataframe, and go row by row multiplying out the scores and adding them to the new column

# Adding new column (as a float, so fractional scores can be added without a dtype warning)
NFL_Player["Score"] = 0.0
NFL_Player = NFL_Player.fillna(0)

# for loop going row by row
for i in NFL_Player.index:
    
    NFL_Player.loc[i, 'Score'] += (NFL_Player.loc[i, 'PassYds'] * scoringCoefs.loc[0, 'PassYds'])
    NFL_Player.loc[i, 'Score'] += (NFL_Player.loc[i, 'PassTD'] * scoringCoefs.loc[0, 'PassTD'])
    NFL_Player.loc[i, 'Score'] += (NFL_Player.loc[i, '2PM'] * scoringCoefs.loc[0, '2PM'])
    NFL_Player.loc[i, 'Score'] += (NFL_Player.loc[i, 'Int'] * scoringCoefs.loc[0, 'Int'])
    NFL_Player.loc[i, 'Score'] += (NFL_Player.loc[i, 'RushYds'] * scoringCoefs.loc[0, 'RushYds'])
    NFL_Player.loc[i, 'Score'] += (NFL_Player.loc[i, 'RushTD'] * scoringCoefs.loc[0, 'RushTD'])
    NFL_Player.loc[i, 'Score'] += (max(NFL_Player.loc[i, 'RushFmb'], NFL_Player.loc[i, 'RecFmb']) * scoringCoefs.loc[0, 'Fmb'])
    NFL_Player.loc[i, 'Score'] += (NFL_Player.loc[i, 'Rec'] * scoringCoefs.loc[0, 'Rec'])
    NFL_Player.loc[i, 'Score'] += (NFL_Player.loc[i, 'RecYds'] * scoringCoefs.loc[0, 'RecYds'])
    NFL_Player.loc[i, 'Score'] += (NFL_Player.loc[i, 'RecTD'] * scoringCoefs.loc[0, 'RecTD'])
    
    # rounding to 2 decimal places
    NFL_Player.loc[i, 'Score'] = round(NFL_Player.loc[i, 'Score'],2)
    
    # changing positions to fit modern descriptions
    if NFL_Player.loc[i, 'Pos'] in ['FL', 'LE', 'SE', 'LE/TE', 'FL/HB']:
        NFL_Player.loc[i, 'Pos'] = 'WR'
    if NFL_Player.loc[i, 'Pos'] in ['E', 'TE/FL', 'TE/LE']:
        NFL_Player.loc[i, 'Pos'] = 'TE'
    if NFL_Player.loc[i, 'Pos'] in ['HB', 'FB', 'HB/FB', 'FB/RB']:
        NFL_Player.loc[i, 'Pos'] = 'RB'
    
    # printing
    print(NFL_Player.loc[i, 'Pos'], NFL_Player.loc[i, 'Score'], NFL_Player.loc[i, 'Player'])
    



TE 4.3 Duane Allen
RB 47.2 Jim Allison
WR 253.8 Lance Alworth*+
TE 2.4 Bill Anderson
RB 20.7 Donny Anderson
TE 42.5 Taz Anderson
TE 67.5 Fred Arbanas+
RB 27.0 Jon Arnett
RB 128.7 Willie Asbury
RB -3.0 Pervis Atkins
LHB 112.9 Joe Auer
RCB/FS 0.0 Bill Baird
K 1.5 Sam Baker
WR 112.8 Gary Ballman
RB 3.4 Pete Banaszak
RB 10.1 Billy Ray Barnes
WR 27.3 Gary Barnes
RB 5.0 Tom Barrington
RB 189.9 Dick Bass*
WR 16.0 Glenn Bass
QB 46.32 Pete Beathard
LLB 0.0 Bobby Bell*+
RB 14.7 Joe Bellino
QB 3.8 Bob Berry
WR 148.6 Raymond Berry
RDE 0.0 Verlon Biggs*
WR 49.7 Fred Biletnikoff
TE -0.4 Charlie Bivins
QB 90.66 George Blanda
RB 64.4 Sid Blanks
RB 92.8 Emerson Boozer*
RLB 0.0 John Bramlett*
QB 35.46 Zeke Bratkowski
QB 144.2 John Brodie
K 0.0 Tommy Brooker
RB 165.3 Bill Brown
RB 136.4 Timmy Brown
RB 3.1 Charlie Bryant
QB 69.72 Rudy Bukich
RB 27.2 Amos Bullocks
RB 52.04 Ronnie Bull
WR 150.8 Chris Burford
WR 54.8 Vern Burke
RB 177.5 Bobby Burnett*
WR 1.4 John Burrell
RB 121.46 Ode Burrell
RB 32.7 Cannonb
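
The row-by-row loop works fine for a few hundred players, but the same calculation can also be done on whole columns at once. Below is a rough sketch of that alternative on made-up toy data (the three players are hypothetical). It is not the code that produced the output above, just another way to express the same rules, including taking the larger of the two fumble columns and remapping the old position names.

```python
import pandas as pd

# Toy data with the same column names as NFL_Player (hypothetical players)
toy_players = pd.DataFrame({
    'Player': ['Passer A', 'Runner B', 'Catcher C'],
    'Pos': ['QB', 'HB', 'FL'],
    'PassYds': [2500.0, 0.0, 0.0], 'PassTD': [20.0, 0.0, 0.0],
    '2PM': [0.0, 1.0, 0.0], 'Int': [10.0, 0.0, 0.0],
    'RushYds': [50.0, 900.0, 0.0], 'RushTD': [1.0, 8.0, 0.0],
    'RushFmb': [3.0, 4.0, 1.0], 'RecFmb': [0.0, 4.0, 1.0],
    'Rec': [0.0, 20.0, 60.0], 'RecYds': [0.0, 150.0, 1000.0],
    'RecTD': [0.0, 1.0, 9.0],
})

coefs = pd.Series({'PassYds': 0.04, 'PassTD': 4.0, '2PM': 2.0, 'Int': -2.0,
                   'RushYds': 0.1, 'RushTD': 6.0, 'Fmb': -2.0,
                   'Rec': 0.5, 'RecYds': 0.1, 'RecTD': 6.0})

# Same fumble rule as the loop: use the larger of the two fumble columns
toy_players['Fmb'] = toy_players[['RushFmb', 'RecFmb']].max(axis=1)

# Multiply each stat column by its coefficient, sum across columns, round to 2 places
toy_players['Score'] = toy_players[coefs.index].mul(coefs, axis=1).sum(axis=1).round(2)

# Same position groupings as the loop, written as a lookup table
pos_map = {'FL': 'WR', 'LE': 'WR', 'SE': 'WR', 'LE/TE': 'WR', 'FL/HB': 'WR',
           'E': 'TE', 'TE/FL': 'TE', 'TE/LE': 'TE',
           'HB': 'RB', 'FB': 'RB', 'HB/FB': 'RB', 'FB/RB': 'RB'}
toy_players['Pos'] = toy_players['Pos'].replace(pos_map)

print(toy_players[['Pos', 'Score', 'Player']])
```

Any position that isn't in `pos_map` is left unchanged, which matches what the `if` checks in the loop do.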

In [64]:
# Filter by position, and save results to CSV file

NFL_QBs = NFL_Player.query("Pos == 'QB'").sort_values(by=['Score'], ascending=False)
print(NFL_QBs)
NFL_QBs.to_csv('QBs_1966.csv', index=False)

NFL_RBs = NFL_Player.query("Pos == 'RB'").sort_values(by=['Score'], ascending=False)
print(NFL_RBs)
NFL_RBs.to_csv('RBs_1966.csv', index=False)

NFL_WRs = NFL_Player.query("Pos == 'WR'").sort_values(by=['Score'], ascending=False)
print(NFL_WRs)
NFL_WRs.to_csv('WRs_1966.csv', index=False)

NFL_TEs = NFL_Player.query("Pos == 'TE'").sort_values(by=['Score'], ascending=False)
print(NFL_TEs)
NFL_TEs.to_csv('TEs_1966.csv', index=False)

    Player-additional            Player   Tm  Age Pos   G    Cmp  PassAtt  \
235          MereDo00     Don Meredith*  DAL   28  QB  13  177.0    344.0   
300          RyanFr00       Frank Ryan*  CLE   30  QB  14  200.0    382.0   
175          JurgSo00  Sonny Jurgensen*  WAS   32  QB  14  254.0    436.0   
86           DawsLe00      Len Dawson*+  KAN   31  QB  14  159.0    284.0   
131          HadlJo00         John Hadl  SDG   26  QB  14  200.0    375.0   
..                ...               ...  ...  ...  ..  ..    ...      ...   
236          MeyeRo00         Ron Meyer  PIT   22  QB   4    7.0     19.0   
252          MyerTo00         Tom Myers  DET   23  QB   1    0.0      1.0   
280          RakeLa00   Larry Rakestraw  CHI   24  QB   1    0.0      0.0   
307          ShinDi00       Dick Shiner  WAS   24  QB  14    0.0      5.0   
195          LeexJa00         Jacky Lee  HOU   27  QB   8    4.0      8.0   

     PassYds  PassTD  ...  RushAtt  RushYds  RushTD  RushFmb  Rec  RecYds  
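
Once the players are sorted by position, picking a starting roster is just a matter of taking the top few at each spot. Here is a minimal sketch of that step on made-up scores. The players are hypothetical, and the roster slots (1 QB, 2 RB, 2 WR, 1 TE) are an assumption for illustration, not necessarily the settings of my league.

```python
import pandas as pd

# Toy scores (hypothetical players and numbers)
toy_scores = pd.DataFrame({
    'Player': ['QB1', 'QB2', 'RB1', 'RB2', 'RB3', 'WR1', 'WR2', 'WR3', 'TE1', 'TE2'],
    'Pos':    ['QB',  'QB',  'RB',  'RB',  'RB',  'WR',  'WR',  'WR',  'TE',  'TE'],
    'Score':  [200.0, 150.0, 190.0, 177.0, 165.0, 250.0, 150.0, 148.0, 67.0, 42.0],
})

# Assumed roster format: number of starters per position
roster_slots = {'QB': 1, 'RB': 2, 'WR': 2, 'TE': 1}

# For each position, keep the n highest-scoring players, then stack the results
starters = pd.concat(
    [toy_scores[toy_scores['Pos'] == pos].nlargest(n, 'Score')
     for pos, n in roster_slots.items()]
)
print(starters)
```

With the real data, `toy_scores` would be replaced by `NFL_Player`, and `roster_slots` would be adjusted to match whatever league format you care about.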