# // Getting NFL Data
___
The first step in getting this project underway is collecting **massive** amounts of NFL data from the web. I'm working in this notebook to "show my work" so that others can learn how to do the same if they're curious. Ultimately, I'll also turn it into a regular `.py` Python script that you can run if you're so inclined.

For that, we're going to start out using the `requests` and `BeautifulSoup` libraries, as well as API calls, to glean information from:
- [Reddit recommendations](https://www.reddit.com/r/fantasyfootball/comments/34mbth/datasets_for_fantasy_football/)
- **nflgame** [here](http://wseaton.com/pulling-data-with-nflgame.html) and [here](https://pypi.org/project/nflgame/)
- [fantasydata](https://fantasydata.com/)
- [Pro Football Reference](https://www.pro-football-reference.com/)

EDIT: After further review (see some commentary below), the three I'm going to try to work with the most are:
- [Pro Football Reference](https://www.pro-football-reference.com/)
- [FFToday](http://www.fftoday.com/stats/)
- [The Football Database](https://www.footballdb.com/fantasy-football/index.html)

In [4]:
# Importing our necessary libraries
import requests
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

from bs4 import BeautifulSoup
from selenium import webdriver
from time import sleep
%matplotlib inline

## Let's take a look at ~~nflgame~~

___About___ `nflgame`:
nflgame is an API to retrieve and read NFL Game Center JSON data. It can work with real-time data, which can be used for fantasy football.

nflgame works by parsing the same JSON data that powers NFL.com’s live GameCenter. Therefore, nflgame can be used to report game statistics while a game is being played.

The package comes pre-loaded with game data from every preseason and regular-season game from 2009 up to the present (the author tries to update it every week). Therefore, querying such data does not actually ping NFL.com.

However, if you search for data from a game that is currently being played, the JSON data will be downloaded from NFL.com on each request (so be careful not to query too often while a game is in progress). If you ask for data for a particular game that hasn’t been cached to disk but is no longer being played, it will be automatically cached to disk so that no further downloads are required.

- Going to work through a little example of one-off coding first.
- **Important:** If you haven't installed nflgame yet, you'll need to install it by running: `pip install nflgame`

In [2]:
import nflgame
import datetime

nflgame requires Python 2.6+ and does not yet work with Python 3
You are running Python version 3.6


SystemExit: 1

  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)


**Welp. Going to switch gears for a bit then...**
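
If you'd rather have the notebook keep going when a package bails out like this, one option is a guarded import. This is only a sketch: `try_import` is a helper name I made up for illustration, and it leans on the fact (visible in the output above) that nflgame raises `SystemExit` on Python 3 rather than a plain `ImportError`.

```python
import importlib


def try_import(module_name):
    """Return the imported module, or None if it is missing or aborts on import."""
    try:
        return importlib.import_module(module_name)
    except (ImportError, SystemExit) as exc:
        # nflgame exits on Python 3 (see the SystemExit above), so SystemExit
        # is caught here alongside the usual ImportError.
        print(f"Could not import {module_name}: {exc!r}")
        return None


nflgame = try_import('nflgame')
if nflgame is None:
    print('Skipping the nflgame section.')
```

Catching `SystemExit` is unusual and is only reasonable here because the goal is to skip an optional section, not to hide real failures.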

## Looking at ~~fantasydata~~
- Requires a membership and a paid subscription
- Going to avoid this for now since I want people to be able to follow along for free if they want.

## Scraping Pro Football Reference:

## Now looking at FFToday
- Need to register for free
- Looking at .5 PPR scoring

In [78]:
res = requests.get('http://www.fftoday.com/stats/playerstats.php?Season=2018&GameWeek=1&PosID=10&LeagueID=193033')
soup = BeautifulSoup(res.content, 'lxml')
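
The URL above carries its settings as query parameters, so it can be rebuilt from parts instead of being hard-coded. The sketch below uses only the standard library. `build_stats_url` is a hypothetical helper, and I'm assuming (from the URL and the results further down) that `GameWeek` selects the week and that `PosID=10` means quarterbacks; the `LeagueID` value is simply copied from the request above.

```python
from urllib.parse import urlencode

BASE_URL = 'http://www.fftoday.com/stats/playerstats.php'


def build_stats_url(season, week, pos_id=10, league_id=193033):
    """Build a stats URL with the same parameter names as the request above."""
    params = {
        'Season': season,
        'GameWeek': week,
        'PosID': pos_id,        # assumption: 10 = quarterbacks
        'LeagueID': league_id,  # copied from the original URL
    }
    return BASE_URL + '?' + urlencode(params)


# Example: URLs for the first three weeks of 2018 (week range is illustrative)
week_urls = [build_stats_url(2018, week) for week in range(1, 4)]
```

Each URL in `week_urls` could then be passed to `requests.get` the same way as the single URL above, ideally with a `sleep` between requests to go easy on the site.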

In [79]:
fftoday_qbs = pd.DataFrame(columns= ['player', 'team', 'game', 'pass_comp',
                                'pass_att', 'pass_yds', 'pass_TD', 'pass_INT',
                                'rush_att', 'rush_yds', 'rush_TD', 'fpoints'])

In [80]:
fftoday_qbs

| player | team | game | pass_comp | pass_att | pass_yds | pass_TD | pass_INT | rush_att | rush_yds | rush_TD | fpoints |
|---|---|---|---|---|---|---|---|---|---|---|---|


In [81]:
player = []
team = []
week = []
pass_comp = []
pass_att = []
pass_yds = []
pass_TD = []
pass_INT = []
rush_att = []
rush_yds = []
rush_TD = []
fpoints = []

In [82]:
qb_columns_lists = [player,
                    team,
                    week,
                    pass_comp,
                    pass_att,
                    pass_yds,
                    pass_TD,
                    pass_INT,
                    rush_att,
                    rush_yds,
                    rush_TD,
                    fpoints]

In [83]:
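# Skip the first two rows (assumed to be headers), then file each cell under its column list
for row in soup.find('table', {'cellpadding':2}).find_all('tr')[2:]:
    cells = row.find_all('td')
    if len(cells) < len(qb_columns_lists):
        continue  # ignore short/spacer rows instead of raising IndexError
    for index, selection in enumerate(qb_columns_lists):
        selection.append(cells[index].text.strip())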
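
Once the lists are filled, they can be zipped back into a DataFrame with the same columns as `fftoday_qbs`. This is a rough sketch rather than the final pipeline: `lists_to_frame` is a hypothetical helper, and the values in `placeholder_lists` are dummy placeholders for illustration, not scraped stats.

```python
import pandas as pd

# Same column names as the empty fftoday_qbs frame defined earlier
column_names = ['player', 'team', 'game', 'pass_comp', 'pass_att', 'pass_yds',
                'pass_TD', 'pass_INT', 'rush_att', 'rush_yds', 'rush_TD', 'fpoints']


def lists_to_frame(column_lists, names):
    """Pair each list with its column name and convert stat columns to numbers."""
    frame = pd.DataFrame(dict(zip(names, column_lists)))
    numeric_cols = [c for c in names if c not in ('player', 'team')]
    # Scraped cells are strings; anything that isn't a number becomes NaN
    frame[numeric_cols] = frame[numeric_cols].apply(pd.to_numeric, errors='coerce')
    return frame


# Placeholder data only: two rows of zeros standing in for scraped strings
placeholder_lists = [['Player A', 'Player B'], ['AAA', 'BBB']]
placeholder_lists += [['0', '0'] for _ in range(9)]
placeholder_lists.append(['0.0', '0.0'])

example_frame = lists_to_frame(placeholder_lists, column_names)
```

In the notebook itself the call would be `lists_to_frame(qb_columns_lists, column_names)`, assuming the loop above has already run.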

In [84]:
player

['1. Ryan Fitzpatrick',
 '2. Drew Brees',
 '3. Philip Rivers',
 '4. Patrick Mahomes',
 '5. Tyrod Taylor',
 '6. Case Keenum',
 '7. Aaron Rodgers',
 '8. Russell Wilson',
 '9. Tom Brady',
 '10. Joe Flacco',
 '11. Andrew Luck',
 '12. Kirk Cousins',
 '13. Alex Smith',
 '14. Ben Roethlisberger',
 '15. Andy Dalton',
 '16. Cam Newton',
 '17. Ryan Tannehill',
 '18. Jared Goff',
 '19. Matthew Stafford',
 '20. Mitchell Trubisky',
 '21. Sam Darnold',
 '22. Blake Bortles',
 '23. Jimmy Garoppolo',
 '24. Deshaun Watson',
 '25. Derek Carr',
 '26. Matt Ryan',
 '27. Eli Manning',
 '28. Dak Prescott',
 '29. Sam Bradford',
 '30. Marcus Mariota',
 '31. Josh Allen',
 '32. Lamar Jackson',
 '33. Blaine Gabbert',
 '34. Nick Foles',
 '35. DeShone Kizer',
 '36. Nathan Peterman',
 '37. Matt Cassel']
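
The names come back with their rank glued to the front (`'1. Ryan Fitzpatrick'`), which will get in the way of matching players across sites later. A small regex can strip that prefix. This is a sketch, and `strip_rank` is a name I picked for illustration; it assumes the prefix is always digits, a period, then optional whitespace, as in the output above.

```python
import re

# One or more digits, a literal period, then any whitespace, at the start only
RANK_PREFIX = re.compile(r'^\d+\.\s*')


def strip_rank(name):
    """Remove a leading rank such as '12. ' from a scraped player name."""
    return RANK_PREFIX.sub('', name)


# A few entries copied from the output above
sample = ['1. Ryan Fitzpatrick', '2. Drew Brees', '37. Matt Cassel']
clean = [strip_rank(n) for n in sample]
```

Applied to the full list, that would be `player = [strip_rank(n) for n in player]`.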