Two APIs are used to extract the fantasy football data:
* https://fantasy.premierleague.com/drf/elements/: contains player IDs, names and other general details
* https://fantasy.premierleague.com/drf/element-summary/{id}: contains detailed stats for the footballer with the given id

# Building blocks

## Exploring the API with general details

Import the required modules.

In [None]:
# import modules
import pandas as pd
import numpy as np
import requests
from time import sleep

In [None]:
# general details of players
main_api = requests.get('https://fantasy.premierleague.com/drf/elements/').json()

In [None]:
main_api[1] # Cech - dictionary of stats

In [None]:
main_api[3]['element_type'] # Koscielny (element type is 2)

In [None]:
main_api[14]['element_type'] # Ramsey (element type is 3)

In [None]:
main_api[20]['element_type'] # Welbeck (element type is 4)

From the above, it appears that **element_type** encodes the player's position: 2 for a defender, 3 for a midfielder and 4 for a forward.

From the general API, the following are useful:
* id: unique id that will be needed to pull out gameweek level stats
* first_name and second_name (surname): combined into the player's full name
* element_type: position
* team: numeric id that needs to be mapped to team names
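
The numeric codes can be turned into readable labels with plain dictionary lookups. The sketch below is illustrative only: the `POSITION_NAMES` and `TEAM_NAMES` tables and the `label_player` helper are hypothetical, goalkeeper as code 1 is an assumption (the exploration above only confirms 2, 3 and 4), and the team ids are placeholders that would need checking against the API.

```python
# Hypothetical lookup tables, not taken from the API.
# Codes 2-4 follow the exploration above; code 1 (goalkeeper) is an assumption.
POSITION_NAMES = {1: 'Goalkeeper', 2: 'Defender', 3: 'Midfielder', 4: 'Forward'}

# Placeholder team ids for illustration only; verify the real ids before use.
TEAM_NAMES = {1: 'Arsenal', 17: 'Spurs'}


def label_player(player, positions=POSITION_NAMES, teams=TEAM_NAMES):
    """Return a copy of a player dictionary with readable position and team names."""
    return {**player,
            'position_name': positions.get(player['position'], 'Unknown'),
            'team_name': teams.get(player['team'], 'Unknown')}


# Made-up player record; the key names are illustrative
example = {'fpl_id': 394, 'full_name': 'Harry Kane', 'position': 4, 'team': 17}
print(label_player(example))
```

Using `.get()` with a default means an unexpected code shows up as 'Unknown' rather than raising a `KeyError`.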

In [None]:
# collect key details in a list of dictionaries
fpl_data = []
for details in main_api: # each entry is the dictionary for one player
    player = {'fpl_id': details['id'],
              'full_name': details['first_name'] + ' ' + details['second_name'],
              'position': details['element_type'],
              'team': details['team']}
    fpl_data.append(player)

In [None]:
fpl_data

## Exploring the API with detailed (gameweek-level) stats

We now call the second API to extend our dataset with gameweek-level stats.

We define a function that returns the details for a given player id.

In [None]:
def player_details(i):
    return requests.get('https://fantasy.premierleague.com/drf/element-summary/'+str(i)).json()

We call the function for one player from the fpl_data list, Harry Kane.

In [None]:
kane = player_details(394) # Harry Kane's details
kane

The response contains a lot of detail spread over several levels of nested JSON. To navigate it, we use the .keys() method to see which keys are available at each level, and then index into the one we need. The goal is to reach the gameweek-level data.

In [None]:
kane.keys()

In [None]:
kane['history'] # gameweek level details

In [None]:
kane['history'][18].items() # details for a single gameweek (the entry at index 18)
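
The level-by-level exploration above can also be done in one pass. The following is a minimal sketch, not part of the original workflow: `list_levels` is a hypothetical helper, and the sample dictionary is made up, with field names that are assumptions about the shape of the response.

```python
def list_levels(data, prefix='', max_depth=2):
    """Return the key paths found in nested dictionaries and lists, up to max_depth."""
    paths = []
    if max_depth == 0:
        return paths
    if isinstance(data, dict):
        for key, value in data.items():
            path = prefix + '/' + str(key) if prefix else str(key)
            paths.append(path)
            paths.extend(list_levels(value, path, max_depth - 1))
    elif isinstance(data, list) and data:
        # inspect only the first item, assuming all items share the same keys
        paths.extend(list_levels(data[0], prefix + '[0]', max_depth))
    return paths


# Made-up sample shaped like a player summary; the field names are assumptions
sample = {'history': [{'round': 1, 'total_points': 2}],
          'fixtures': [{'event': 29}]}
print(list_levels(sample))
```

Calling `list_levels(kane)` would list the top-level keys together with the keys of the first entry under each of them.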

We can append these details to our existing list. Let's test this for Harry Kane.

In [None]:
# look Kane up by id (394): a player's index in fpl_data does not match their id
fpl_data_kane = next(p for p in fpl_data if p['fpl_id'] == 394)
fpl_data_kane

In [None]:
# one row per gameweek: general details merged with that week's stats
# iterating over the history works for however many gameweeks are available
kane_details = [{**fpl_data_kane, **gameweek} for gameweek in kane['history']]
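
The dictionary merge used above, `{**a, **b}` (available from Python 3.5), builds a new dictionary and keeps the right-hand value when both sides share a key. The sketch below shows this on made-up data; `merge_rows` is a hypothetical helper and the gameweek field names are assumptions.

```python
def merge_rows(general, history):
    """Return one merged dictionary per gameweek entry in history."""
    return [{**general, **gameweek} for gameweek in history]


general = {'fpl_id': 394, 'full_name': 'Harry Kane'}
# Made-up gameweek entries; the field names are assumptions for illustration
history = [{'round': 1, 'total_points': 2},
           {'round': 2, 'total_points': 9}]

rows = merge_rows(general, history)
print(rows[1])

# when both dictionaries share a key, the right-hand value is kept
clash = {**{'id': 394, 'team': 17}, **{'id': 7001}}
print(clash)
```

If the gameweek entries carried their own `id` key, it would overwrite a player `id` in the merged row, which is one reason to store the player id under a distinct name such as `fpl_id`.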

We can now build our master function...

# Main functions to build dataset

In [None]:
# import modules
import pandas as pd
import numpy as np
import requests
from time import sleep

In [None]:
# variables for each api
# main api for all players
main_api = requests.get('https://fantasy.premierleague.com/drf/elements/').json()
# api with detailed stats for given player id
def detailed_api(i):
    return requests.get('https://fantasy.premierleague.com/drf/element-summary/'+str(i)).json()

In [None]:
# collect key details from main api (main_api was already fetched in the cell above)
fpl_data_ids = []
for details in main_api: # each entry is the dictionary for one player
    player = {'fpl_id': details['id'],
              'full_name': details['first_name'] + ' ' + details['second_name'],
              'position': details['element_type'],
              'team': details['team']}
    fpl_data_ids.append(player)

In [None]:
# build one row per player per gameweek
fpl_data = [] # empty list that rows will be added to
latest_gw = 28 # update this as required

for player in fpl_data_ids: # each 'player' is a dictionary of general details
    player_id = player['fpl_id'] # named player_id to avoid shadowing the built-in id()
    player_stats = detailed_api(player_id) # detailed stats (parsed JSON) for this player

    # players without stats for every week (e.g. January transfers) have fewer entries,
    # so slice the history instead of indexing each week; this also avoids leaving
    # rows that hold general details only when a week is missing
    for gameweek in player_stats['history'][:latest_gw]:
        # general details are repeated on every row and merged with that gameweek's stats
        fpl_data.append({**player, **gameweek})

    print(player['full_name'] + ' done..')
    sleep(0.1) # pause between requests so as not to overload the api
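
Because the loop above makes one request per player, a single failed request would stop the whole run. One possible safeguard is to retry with a growing pause, sketched below. `fetch_with_retry` is a hypothetical helper, not part of the original notebook, and `flaky_fetch` is a stand-in for the real request so that the example runs without network access.

```python
from time import sleep


def fetch_with_retry(fetch, player_id, retries=3, delay=0.1):
    """Call fetch(player_id), pausing and retrying if it raises an exception."""
    for attempt in range(retries):
        try:
            return fetch(player_id)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: let the caller see the error
            sleep(delay * 2 ** attempt)  # wait longer after each failure


# Stand-in for the real request, failing once before succeeding
calls = []

def flaky_fetch(player_id):
    calls.append(player_id)
    if len(calls) < 2:
        raise ConnectionError('temporary failure')
    return {'history': []}

print(fetch_with_retry(flaky_fetch, 394, delay=0))
```

In the loop above this could be used as `fetch_with_retry(detailed_api, player_id)`. Catching `Exception` keeps the sketch short; with requests, a narrower choice would be `requests.exceptions.RequestException`.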

In [None]:
fpl_data_df = pd.DataFrame(fpl_data)

In [303]:
fpl_data_df.to_csv('fpldata.csv',encoding='utf-8')

In [308]:
fpl_data_df.to_excel('fpldata.xlsx') # takes longer but doesn't garble characters; .xlsx because recent pandas versions no longer write the legacy .xls format
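
If the garbled characters only appear when the CSV is opened in Excel, a likely cause is that Excel does not recognise plain UTF-8 without a byte order mark. Writing the file with the 'utf-8-sig' encoding adds that mark, and pandas `to_csv` accepts the same encoding name. The sketch below uses the standard library rather than pandas so that the effect on the bytes is visible; `rows_to_csv_bytes` is a hypothetical helper and the row is made up.

```python
import csv
import io


def rows_to_csv_bytes(rows, encoding='utf-8-sig'):
    """Write a list of dictionaries to CSV and return the encoded bytes."""
    buffer = io.BytesIO()
    text = io.TextIOWrapper(buffer, encoding=encoding, newline='')
    writer = csv.DictWriter(text, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    text.flush()
    data = buffer.getvalue()
    text.detach()  # stop the wrapper from closing the buffer
    return data


# Made-up row with a non-ASCII name
data = rows_to_csv_bytes([{'full_name': 'Mesut Özil', 'total_points': 5}])
print(data[:3])  # the byte order mark written by 'utf-8-sig'
```

Whether this resolves the garbling depends on how the file is opened, so it is worth checking the result in the target application.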