Two API endpoints are used to extract fantasy football data:
* https://fantasy.premierleague.com/drf/elements/: contains player IDs, names and other general details
* https://fantasy.premierleague.com/drf/element-summary/{id}: contains detailed stats for the footballer with the given id

# Building blocks

## Exploring the API with general details

Import the required modules.

In [None]:
#import modules
import pandas as pd
import numpy as np
import requests
from time import sleep

In [None]:
# general details of players
main_api = requests.get('https://fantasy.premierleague.com/drf/elements/').json()

In [None]:
main_api[1] # Cech - dictionary of stats

In [None]:
main_api[3]['element_type'] # Koscielny (element type is 2)

In [None]:
main_api[14]['element_type'] # Ramsey (element type is 3)

In [None]:
main_api[20]['element_type'] # Welbeck (element type is 4)

From the above, it appears that **element_type** is position.

From the general API, the following are useful:
* id: unique id that will be needed to pull out gameweek level stats
* first_name and second_name (surname): combined to give the full name
* element_type: position
* team: numeric id that needs to be mapped to team names
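
As a minimal sketch of how the numeric position code could be made readable, the lookup below maps each element_type to a label. The labels are an assumption inferred from the players inspected above (Koscielny is 2, Ramsey is 3, Welbeck is 4) plus the guess that 1 means goalkeeper. `POSITION_LABELS` and `position_label` are hypothetical names, not part of the API.

```python
# Assumed mapping, inferred from the players inspected above; not documented by the API.
POSITION_LABELS = {1: 'Goalkeeper', 2: 'Defender', 3: 'Midfielder', 4: 'Forward'}

def position_label(element_type):
    """Return a readable position, or 'Unknown' for an unexpected code."""
    return POSITION_LABELS.get(element_type, 'Unknown')

print(position_label(2))
```

The team id could be handled with the same kind of lookup once the id-to-name pairs are known.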

In [None]:
# collect key details in a list of dictionaries
fpl_data = []
for player_details in main_api: # each element is the dictionary for one player
    player = {'fpl_id': player_details['id'], 
              'full_name': player_details['first_name'] + ' ' + player_details['second_name'], 
              'position': player_details['element_type'], 
              'team': player_details['team']}
    fpl_data.append(player)

In [None]:
fpl_data

## Exploring the API with detailed (gameweek-level) stats

We now call the second API to extend our dataset with gameweek-level stats.

We define a function that returns the details for a given player id.

In [None]:
def player_details(i):
    return requests.get('https://fantasy.premierleague.com/drf/element-summary/'+str(i)).json()

Call the function on a single player from the fpl_data list, Harry Kane (id 394), as an example.

In [None]:
kane = player_details(394) # Harry Kane's details
kane

The response is deeply nested JSON. To navigate it, we use the .keys() method to see which keys are available at each level, then index into the one we need. The goal is to reach the gameweek-level data.

In [None]:
kane.keys()

In [None]:
kane['history'] # gameweek level details

In [None]:
kane['history'][18].items() # details for a single gameweek (the entry at index 18)
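
The same key-by-key exploration can be tried offline. The sketch below uses a hypothetical, heavily trimmed stand-in for one response: only the 'history' key is taken from the notebook, while 'fixtures', 'round', 'minutes' and 'total_points' are assumed names with made-up values.

```python
# Hypothetical, trimmed stand-in for one element-summary response.
sample = {
    'history': [
        {'round': 1, 'minutes': 90, 'total_points': 2},
        {'round': 2, 'minutes': 90, 'total_points': 13},
    ],
    'fixtures': [],
}

print(list(sample.keys()))         # top-level sections of the response
first_week = sample['history'][0]  # 'history' holds one dictionary per gameweek entry
print(sorted(first_week.keys()))   # stats available for that entry
```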

We can append these details to our existing list. Let's test this for Harry Kane.

In [None]:
fpl_data_kane = fpl_data[510] # whilst Kane's id is 394, some ids are skipped in the fpl_data list
fpl_data_kane

In [None]:
kane_details = []
for week in range(28):
    # general details repeated on every row, merged with that gameweek's stats
    kane_details.append({**fpl_data_kane, **kane['history'][week]})
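
The `{**a, **b}` merge used above has two properties worth knowing: when both dictionaries share a key, the right-hand value wins, and neither input is modified. A small sketch with made-up values (the 'round' and 'minutes' keys are assumptions for illustration):

```python
general = {'fpl_id': 394, 'full_name': 'Harry Kane', 'minutes': None}
weekly = {'round': 1, 'minutes': 90}  # hypothetical gameweek stats

merged = {**general, **weekly}  # builds a new dictionary
print(merged['minutes'])        # value taken from weekly, the right-hand dictionary
print(general['minutes'])       # general is unchanged, so it can be reused each week
```

If a gameweek dictionary reuses a key from the general details, the general value is therefore overwritten in the merged row.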

We can now build our master function...

# Main functions to build dataset

In [25]:
# import modules
import pandas as pd
import numpy as np
import requests
from time import sleep

In [26]:
# variables for each api
# main api for all players
main_api = requests.get('https://fantasy.premierleague.com/drf/elements/').json()
# api with detailed stats for given player id
def detailed_api(i):
    return requests.get('https://fantasy.premierleague.com/drf/element-summary/'+str(i)).json()

In [27]:
main_api[10]

{'assists': 1,
 'bonus': 8,
 'bps': 488,
 'chance_of_playing_next_round': 100,
 'chance_of_playing_this_round': 100,
 'clean_sheets': 8,
 'code': 69140,
 'cost_change_event': 0,
 'cost_change_event_fall': 0,
 'cost_change_start': -1,
 'cost_change_start_fall': 1,
 'creativity': '104.0',
 'dreamteam_count': 2,
 'ea_index': 0,
 'element_type': 2,
 'ep_next': '5.7',
 'ep_this': '0.0',
 'event_points': 0,
 'first_name': 'Shkodran',
 'form': '4.7',
 'goals_conceded': 23,
 'goals_scored': 3,
 'ict_index': '96.1',
 'id': 12,
 'in_dreamteam': False,
 'influence': '674.2',
 'loaned_in': 0,
 'loaned_out': 0,
 'loans_in': 0,
 'loans_out': 0,
 'minutes': 1751,
 'news': '',
 'news_added': '2018-03-11T16:01:19Z',
 'now_cost': 54,
 'own_goals': 0,
 'penalties_missed': 0,
 'penalties_saved': 0,
 'photo': '69140.jpg',
 'points_per_game': '4.2',
 'red_cards': 0,
 'saves': 0,
 'second_name': 'Mustafi',
 'selected_by_percent': '3.3',
 'special': False,
 'squad_number': 20,
 'status': 'a',
 'team': 1,
 'te

In [28]:
# collect key details from main api
main_api = requests.get('https://fantasy.premierleague.com/drf/elements/').json()
fpl_data_ids = []
img_url = 'https://platform-static-files.s3.amazonaws.com/premierleague/photos/players/110x140/'
for player_url in main_api: # each element is the dictionary for one player
    player = {'fpl_id': player_url['id'], 
              'full_name': player_url['first_name'] + ' ' + player_url['second_name'], 
              'position': player_url['element_type'], 
              'team': player_url['team'],
              'image': img_url + 'p' + player_url['photo'].replace('jpg','png')}
    fpl_data_ids.append(player)

In [29]:
fpl_data_ids

[{'fpl_id': 1,
  'full_name': 'David Ospina',
  'image': 'https://platform-static-files.s3.amazonaws.com/premierleague/photos/players/110x140/p48844.png',
  'position': 1,
  'team': 1},
 {'fpl_id': 2,
  'full_name': 'Petr Cech',
  'image': 'https://platform-static-files.s3.amazonaws.com/premierleague/photos/players/110x140/p11334.png',
  'position': 1,
  'team': 1},
 {'fpl_id': 3,
  'full_name': 'Damian Emiliano Martinez',
  'image': 'https://platform-static-files.s3.amazonaws.com/premierleague/photos/players/110x140/p98980.png',
  'position': 1,
  'team': 1},
 {'fpl_id': 4,
  'full_name': 'Laurent Koscielny',
  'image': 'https://platform-static-files.s3.amazonaws.com/premierleague/photos/players/110x140/p51507.png',
  'position': 2,
  'team': 1},
 {'fpl_id': 5,
  'full_name': 'Per Mertesacker',
  'image': 'https://platform-static-files.s3.amazonaws.com/premierleague/photos/players/110x140/p17127.png',
  'position': 2,
  'team': 1},
 {'fpl_id': 6,
  'full_name': 'Gabriel Armando de Abr

In [30]:
# save variables 
fpl_data = [] # empty list that data will be added to
latest_gw = 31 # update this as required

for player in fpl_data_ids: # store each dictionary within the list as 'player'
    player_id = player['fpl_id'] # get their id (named to avoid shadowing the built-in id)
    player_stats = detailed_api(player_id) # parsed JSON with their detailed stats
    for week in range(latest_gw): # loop through all available gameweeks
        # generate one row of data at each iteration of this loop
        try:
            # general details repeated on every row, merged with that gameweek's stats
            fpl_data.append({**player, **player_stats['history'][week]})
        except (IndexError, KeyError):
            # player has no stats for this week (e.g. January transfers), so no row is added
            continue
    print(player['full_name'] + ' done..')
    sleep(0.01) # to not overload api

David Ospina done..
Petr Cech done..
Damian Emiliano Martinez done..
Laurent Koscielny done..
Per Mertesacker done..
Gabriel Armando de Abreu done..
Héctor Bellerín done..
Carl Jenkinson done..
Nacho Monreal done..
Rob Holding done..
Shkodran Mustafi done..
Sead Kolasinac done..
Mesut Özil done..
Santiago Cazorla done..
Aaron Ramsey done..
Francis Coquelin done..
Alex Iwobi done..
Mohamed Elneny done..
Granit Xhaka done..
Jack Wilshere done..
Danny Welbeck done..
Lucas Pérez done..
Alexandre Lacazette done..
Henrikh Mkhitaryan done..
Calum Chambers done..
Reiss Nelson done..
Ainsley Maitland-Niles done..
Matt Macey done..
Edward Nketiah done..
Mathieu Debuchy done..
Konstantinos Mavropanos done..
Pierre-Emerick Aubameyang done..
Joseph Willock done..
Asmir Begovic done..
Artur Boruc done..
Adam Federici done..
Simon Francis done..
Steve Cook done..
Charlie Daniels done..
Adam Smith done..
Tyrone Mings done..
Bradley Smith done..
Nathan Aké done..
Harry Arter done..
Marc Pugh done..
And

Ciaran Clark done..
Jamaal Lascelles done..
Paul Dummett done..
Grant Hanley done..
Jesús Gámez Duarte done..
Florian Lejeune done..
Matt Ritchie done..
Jonjo Shelvey done..
Yoan Gouffran done..
Mohamed Diamé done..
Jack Colback done..
Christian Atsu done..
Dwight Gayle done..
Ayoze Pérez done..
Aleksandar Mitrovic done..
Daryl Murphy done..
Massadio Haidara done..
Siem de Jong done..
Rolando Aarons done..
Javier Manquillo done..
Jacob Murphy done..
Jose Luis Mato Sanmartín done..
Chancel Mbemba done..
Mikel Merino done..
Isaac Hayden done..
Henri Saivet done..
Freddie Woodman done..
Martin Dubravka done..
Alex McCarthy done..
Fraser Forster done..
Ryan Bertrand done..
Maya Yoshida done..
Matt Targett done..
Cédric Soares done..
Jérémy Pied done..
Jack Stephens done..
Sam McQueen done..
Jan Bednarek done..
Steven Davis done..
James Ward-Prowse done..
Dusan Tadic done..
Jordy Clasie done..
Oriol Romeu Vidal done..
Pierre-Emile Højbjerg done..
Sofiane Boufal done..
Joshua Sims done..
Nat
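
The loop above can also be written to iterate over whatever gameweeks a player actually has, which removes the need for `latest_gw` and the `try` block. This is a sketch under assumptions, not the notebook's method: `build_rows` and `fake_api` are hypothetical names, the stats are made up, and the stub stands in for `detailed_api` so the example runs without network access.

```python
def build_rows(players, fetch):
    """Return one merged row per player per available gameweek."""
    rows = []
    for player in players:
        # fall back to an empty list if the response has no 'history' key
        history = fetch(player['fpl_id']).get('history', [])
        for week_stats in history:
            rows.append({**player, **week_stats})
    return rows

def fake_api(player_id):
    """Hypothetical stub returning made-up stats in the assumed response shape."""
    data = {1: [{'round': 1, 'total_points': 3}, {'round': 2, 'total_points': 6}],
            2: [{'round': 2, 'total_points': 1}]}
    return {'history': data.get(player_id, [])}

players = [{'fpl_id': 1, 'full_name': 'Player A'},
           {'fpl_id': 2, 'full_name': 'Player B'},
           {'fpl_id': 3, 'full_name': 'Player C'}]  # Player C has no history at all
rows = build_rows(players, fake_api)
print(len(rows))
```

With the real endpoint, the equivalent call would presumably be `build_rows(fpl_data_ids, detailed_api)`, keeping a short `sleep` between requests.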

In [31]:
fpl_data_df = pd.DataFrame(fpl_data)

In [32]:
fpl_data_df.to_csv('fpldata.csv',encoding='utf-8')

In [None]:
fpl_data_df.to_excel('fpldata.xlsx') # slower than CSV, but accented characters are not garbled
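
If the garbled characters only appear when the CSV is opened in Excel, a likely cause is the missing byte order mark: Excel often reads a UTF-8 file without one as a legacy encoding. The sketch below uses the standard-library `csv` module and the `utf-8-sig` codec to show the idea; the file name and position values are made up for illustration. pandas accepts the same codec name, so `fpl_data_df.to_csv('fpldata.csv', encoding='utf-8-sig')` should behave equivalently.

```python
import csv
import os
import tempfile

# Names taken from the output above; positions are illustrative only.
rows = [{'full_name': 'Héctor Bellerín', 'position': 2},
        {'full_name': 'Mesut Özil', 'position': 3}]

path = os.path.join(tempfile.mkdtemp(), 'fpl_sample.csv')  # hypothetical file name
with open(path, 'w', newline='', encoding='utf-8-sig') as f:
    writer = csv.DictWriter(f, fieldnames=['full_name', 'position'])
    writer.writeheader()
    writer.writerows(rows)

with open(path, 'rb') as f:
    raw = f.read()
print(raw[:3] == b'\xef\xbb\xbf')  # True when the byte order mark was written
```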