## Python Imports

1. **`requests`:** Used for making HTTP requests to fetch data from the Fantasy Premier League API.

2. **`json`:** Provides methods for working with JSON data, used to parse the JSON response from the API.

3. **`os`:** Allows interaction with the operating system, used for handling file paths and creating directories.

4. **`MinMaxScaler` from `sklearn.preprocessing`:** A class from scikit-learn used for scaling numerical features, applied to normalize player data attributes.


In [1]:
import requests
import json
import os
from sklearn.preprocessing import MinMaxScaler



## Fetching Premier League Data

To retrieve data from the Fantasy Premier League API, the script uses the `requests` module. The code below sends a GET request to
"https://fantasy.premierleague.com/api/bootstrap-static/" and parses the JSON response body into a Python dictionary using `r.json()`.
The resulting data is stored in the `raw_premier_league_data` variable for further processing.


In [2]:
url = "https://fantasy.premierleague.com/api/bootstrap-static/"

r = requests.get(url, timeout=10)
r.raise_for_status()  # stop early on an HTTP error instead of parsing an error page

raw_premier_league_data = r.json()
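
The `json` and `os` imports suit caching the response locally so that repeated runs do not need to call the API again. A minimal sketch, assuming a hypothetical `data/` directory and file name (neither appears in the original script):

```python
import json
import os


def save_json(data, directory="data", filename="bootstrap_static.json"):
    """Write data to directory/filename as JSON and return the file path.

    The directory and file names are illustrative assumptions.
    """
    os.makedirs(directory, exist_ok=True)  # create the folder if it is missing
    path = os.path.join(directory, filename)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return path


def load_json(path):
    """Read a JSON file written by save_json back into Python objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

With these helpers, `path = save_json(raw_premier_league_data)` would store the response, and `load_json(path)` would read it back in a later session.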

## Displaying Keys from Premier League Data

In [3]:
print("Keys:", raw_premier_league_data.keys())

Keys: dict_keys(['events', 'game_settings', 'phases', 'teams', 'total_players', 'elements', 'element_stats', 'element_types'])
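
Beyond the key names, it can help to see what kind of value sits under each key before drilling in. The sketch below reports the type and size of each top-level value; the function name `summarize` and the sample data are hypothetical stand-ins, not part of the original script.

```python
def summarize(data):
    """Return {key: (type name, length or None)} for a dict of API sections."""
    summary = {}
    for key, value in data.items():
        # Only containers have a meaningful length here
        size = len(value) if isinstance(value, (list, dict)) else None
        summary[key] = (type(value).__name__, size)
    return summary


# Stand-in data shaped like a small slice of the response (values are made up)
sample = {
    "teams": [{"id": 1}, {"id": 2}],
    "total_players": 123,
    "game_settings": {"a": 1},
}
for key, info in summarize(sample).items():
    print(key, info)
```

Calling `summarize(raw_premier_league_data)` would give the same overview for the real response.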


## Extracting and Displaying Team Data

To work with team data from the Premier League dataset, the script performs the following steps:

- Extracts the team data from the API response fetched earlier.
- Prints the keys of the first team in the dataset.
- Specifies the desired keys to keep (`'code', 'id', 'name', 'short_name'`).
- Creates a new list with dictionaries containing only the desired keys for each team.
- Displays the shortened list of teams.


In [4]:
raw_teams_data = raw_premier_league_data["teams"]

print(raw_teams_data[0].keys())
print()

# Specify the keys to keep
desired_keys = ['code', 'id', 'name','short_name']

# Create a new list with dictionaries containing only the desired keys
teams = [{key: value for key, value in entry.items() if key in desired_keys} for entry in raw_teams_data]

# Display the shortened list
print(teams)

dict_keys(['code', 'draw', 'form', 'id', 'loss', 'name', 'played', 'points', 'position', 'short_name', 'strength', 'team_division', 'unavailable', 'win', 'strength_overall_home', 'strength_overall_away', 'strength_attack_home', 'strength_attack_away', 'strength_defence_home', 'strength_defence_away', 'pulse_id'])

[{'code': 3, 'id': 1, 'name': 'Arsenal', 'short_name': 'ARS'}, {'code': 7, 'id': 2, 'name': 'Aston Villa', 'short_name': 'AVL'}, {'code': 91, 'id': 3, 'name': 'Bournemouth', 'short_name': 'BOU'}, {'code': 94, 'id': 4, 'name': 'Brentford', 'short_name': 'BRE'}, {'code': 36, 'id': 5, 'name': 'Brighton', 'short_name': 'BHA'}, {'code': 90, 'id': 6, 'name': 'Burnley', 'short_name': 'BUR'}, {'code': 8, 'id': 7, 'name': 'Chelsea', 'short_name': 'CHE'}, {'code': 31, 'id': 8, 'name': 'Crystal Palace', 'short_name': 'CRY'}, {'code': 11, 'id': 9, 'name': 'Everton', 'short_name': 'EVE'}, {'code': 54, 'id': 10, 'name': 'Fulham', 'short_name': 'FUL'}, {'code': 14, 'id': 11, 'name': 'Liverp

## Creating Team Mapping

To facilitate matching team names with team IDs used in player information, the script creates a dictionary mapping team IDs to their corresponding names. This is particularly useful when dealing with player data that references team IDs.

In [5]:
# Create a dictionary mapping ids to corresponding team names
team_mapping = {team['id']: team['name'] for team in teams}

print(team_mapping)

{1: 'Arsenal', 2: 'Aston Villa', 3: 'Bournemouth', 4: 'Brentford', 5: 'Brighton', 6: 'Burnley', 7: 'Chelsea', 8: 'Crystal Palace', 9: 'Everton', 10: 'Fulham', 11: 'Liverpool', 12: 'Luton', 13: 'Man City', 14: 'Man Utd', 15: 'Newcastle', 16: "Nott'm Forest", 17: 'Sheffield Utd', 18: 'Spurs', 19: 'West Ham', 20: 'Wolves'}


## Extracting and Displaying Player Positions

To analyse player positions in the Premier League dataset, the script performs the following tasks:

- Fetches player position data from the data collected earlier.
- Prints the keys of the first player position entry to show which keys are available.
- Specifies the desired keys to keep (`'id', 'singular_name', 'singular_name_short'`).
- Creates a new list with dictionaries containing only the desired keys for each player position.
- Displays the shortened list of player positions.

In [6]:
raw_positions_data = raw_premier_league_data["element_types"]

print(raw_positions_data[0].keys())
print()

# Specify the keys you want to keep
desired_keys = ['id', 'singular_name', 'singular_name_short']

# Create a new list with dictionaries containing only the desired keys
positions = [{key: value for key, value in entry.items() if key in desired_keys} for entry in raw_positions_data]

# Display the shortened list
print(positions)

dict_keys(['id', 'plural_name', 'plural_name_short', 'singular_name', 'singular_name_short', 'squad_select', 'squad_min_play', 'squad_max_play', 'ui_shirt_specific', 'sub_positions_locked', 'element_count'])

[{'id': 1, 'singular_name': 'Goalkeeper', 'singular_name_short': 'GKP'}, {'id': 2, 'singular_name': 'Defender', 'singular_name_short': 'DEF'}, {'id': 3, 'singular_name': 'Midfielder', 'singular_name_short': 'MID'}, {'id': 4, 'singular_name': 'Forward', 'singular_name_short': 'FWD'}]
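
The same key-filtering comprehension is used for both teams and positions, so it could be factored into a small helper. This is a sketch only; the name `keep_keys` is hypothetical and the sample records are a cut-down stand-in for the real data.

```python
def keep_keys(records, keys):
    """Return new dictionaries containing only the requested keys.

    Keys absent from a record are skipped, as in the list comprehension
    used for teams and positions.
    """
    wanted = set(keys)  # set membership is a constant-time lookup
    return [{k: v for k, v in record.items() if k in wanted} for record in records]


# Stand-in records (a subset of the real position fields)
sample_positions = [
    {"id": 1, "plural_name": "Goalkeepers", "singular_name": "Goalkeeper", "singular_name_short": "GKP"},
    {"id": 2, "plural_name": "Defenders", "singular_name": "Defender", "singular_name_short": "DEF"},
]
print(keep_keys(sample_positions, ["id", "singular_name", "singular_name_short"]))
```

In the script, `keep_keys(raw_teams_data, ['code', 'id', 'name', 'short_name'])` would produce the same list as `teams`.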


## Handling None Values in Player Data

To ensure consistent data and handle potential None values in the player data, the script performs the following steps:

- Retrieves player data from the original Premier League dataset.
- Iterates through each player's data.
- Replaces any None values with a default value (e.g., 0).

In [7]:
raw_player_data = raw_premier_league_data["elements"]

# Replace None values with a default value (0)
for player in raw_player_data:
    for key, value in player.items():
        if value is None:
            player[key] = 0
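
The loop above edits each record in place and uses 0 for every field. Where some fields need a different default, a non-mutating variant might look like the sketch below. The function name `fill_none` is hypothetical, and the choice of 100 for `chance_of_playing_next_round` is purely an illustrative assumption, not something the original script does.

```python
def fill_none(record, default=0, per_key_defaults=None):
    """Return a copy of record with None values replaced.

    per_key_defaults overrides default for the keys it lists.
    """
    per_key_defaults = per_key_defaults or {}
    return {
        key: (per_key_defaults.get(key, default) if value is None else value)
        for key, value in record.items()
    }


# Stand-in player record; field values are made up
player = {"id": 1, "form": "0.0", "chance_of_playing_next_round": None, "news_added": None}
cleaned = fill_none(player, per_key_defaults={"chance_of_playing_next_round": 100})
print(cleaned)
```

Because the function returns a new dictionary, the raw record keeps its `None` values, which makes it possible to tell "missing" apart from a genuine 0 later on.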

## Specifying Keys for Player Positions

To streamline the analysis of player positions, the script defines a specific list of keys for each position. Each list combines a general set shared by all players with the attributes relevant to that position.

In [8]:
# Specify the keys you want to keep for each position
desired_keys_general = ['code', 'id', 'first_name', 'second_name', 'element_type', 'form', 'chance_of_playing_next_round', 'now_cost',
                        'points_per_game','selected_by_percent', 'team', 'team_code', 'total_points', 'yellow_cards', 'red_cards','bonus','bps',
                        'influence', 'team_name']
desired_keys_outfield = ['goals_scored', 'assists', 'penalties_missed', 'creativity', 'threat', 'ict_index','expected_goals',
                                 'expected_assists', 'expected_goal_involvements'] + desired_keys_general
desired_keys_midfielders = ['clean_sheets'] + desired_keys_outfield
desired_keys_defenders = ['goals_conceded', 'clean_sheets','expected_goals_conceded', 'own_goals'] + desired_keys_outfield
desired_keys_goalkeepers = ['goals_conceded', 'clean_sheets', 'expected_goals_conceded', 'penalties_saved', 'saves'] + desired_keys_general

## Creating Dictionaries for Each Player Position

This section builds one list per position, each holding one dictionary per player. The team name is also added to each player's record, looked up from the player's team ID. Here's an overview of the process:

- **Team Name Addition:**
  - The team name is added to each player's information by referencing the `team_mapping` dictionary, using the team ID.

- **Player Categorization:**
  - Players are appended to the matching list (`defenders`, `attackers`, `midfielders`, `goalkeepers`) based on their `element_type`, keeping only the keys defined for that position.


In [9]:
# Create a list of player dictionaries for each position
defenders = []
attackers = []
midfielders = []
goalkeepers = []

for entry in raw_player_data:
    position = entry.get('element_type')

    # Add the team name based on the team id (None if the id is unknown),
    # so the 'team_name' key always exists when the keys are selected below
    team_id = entry.get('team')
    entry['team_name'] = team_mapping.get(team_id)
    
    if position == 1:
        keys_to_use = desired_keys_goalkeepers
        goalkeepers.append({key: entry[key] for key in keys_to_use})
    elif position == 2:
        keys_to_use = desired_keys_defenders
        defenders.append({key: entry[key] for key in keys_to_use})
    elif position == 3:
        keys_to_use = desired_keys_midfielders
        midfielders.append({key: entry[key] for key in keys_to_use})
    elif position == 4:
        keys_to_use = desired_keys_outfield
        attackers.append({key: entry[key] for key in keys_to_use})

# Display the first dictionary for each position
print("Attackers:", attackers[0])
print()
print("Midfielders:", midfielders[0])
print()
print("Defenders:", defenders[0])
print()
print("Goalkeepers:", goalkeepers[0])

Attackers: {'goals_scored': 0, 'assists': 0, 'penalties_missed': 0, 'creativity': '0.0', 'threat': '0.0', 'ict_index': '0.0', 'expected_goals': '0.00', 'expected_assists': '0.00', 'expected_goal_involvements': '0.00', 'code': 232223, 'id': 1, 'first_name': 'Folarin', 'second_name': 'Balogun', 'element_type': 4, 'form': '0.0', 'chance_of_playing_next_round': 0, 'now_cost': 44, 'points_per_game': '0.0', 'selected_by_percent': '0.2', 'team': 1, 'team_code': 3, 'total_points': 0, 'yellow_cards': 0, 'red_cards': 0, 'bonus': 0, 'bps': 0, 'influence': '0.0', 'team_name': 'Arsenal'}

Midfielders: {'clean_sheets': 0, 'goals_scored': 0, 'assists': 1, 'penalties_missed': 0, 'creativity': '0.5', 'threat': '2.0', 'ict_index': '0.4', 'expected_goals': '0.00', 'expected_assists': '0.00', 'expected_goal_involvements': '0.00', 'code': 153256, 'id': 3, 'first_name': 'Mohamed', 'second_name': 'Elneny', 'element_type': 3, 'form': '0.0', 'chance_of_playing_next_round': 0, 'now_cost': 44, 'points_per_game':
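
The four-way `if`/`elif` chain could also be written as a lookup from position code to key list. The following is a sketch under that assumption; `split_by_position` and the sample data are hypothetical and only illustrate the pattern.

```python
def split_by_position(players, keys_by_position, team_mapping):
    """Group players by element_type, keeping only each position's keys."""
    groups = {position: [] for position in keys_by_position}
    for player in players:
        position = player.get("element_type")
        if position not in keys_by_position:
            continue  # skip position codes that have no key list
        # Work on a copy so the input records are left untouched
        enriched = dict(player, team_name=team_mapping.get(player.get("team")))
        groups[position].append({key: enriched[key] for key in keys_by_position[position]})
    return groups


# Stand-in data; the real script would pass raw_player_data and its key lists
sample_players = [
    {"id": 1, "element_type": 4, "team": 1, "goals_scored": 2},
    {"id": 2, "element_type": 1, "team": 2, "saves": 5},
]
sample_keys = {1: ["id", "saves", "team_name"], 4: ["id", "goals_scored", "team_name"]}
groups = split_by_position(sample_players, sample_keys, {1: "Arsenal", 2: "Aston Villa"})
print(groups[4])
print(groups[1])
```

With the script's own names, the mapping would be `{1: desired_keys_goalkeepers, 2: desired_keys_defenders, 3: desired_keys_midfielders, 4: desired_keys_outfield}`.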

## Normalisation

This section defines the attributes that will be normalised for different player positions. The normalisation process ensures that all values fall within a consistent scale for fair comparison across players.

In [14]:
# Specify the attributes you want to normalize
attributes_to_normalize_general = ['form', 'chance_of_playing_next_round', 'now_cost','points_per_game','selected_by_percent', 'total_points', 
                                   'yellow_cards', 'red_cards','bonus','bps','influence']
attributes_to_normalize_outfield = ['goals_scored', 'assists', 'penalties_missed', 'creativity', 'threat', 'ict_index','expected_goals', 
                                    'expected_assists', 'expected_goal_involvements'] + attributes_to_normalize_general
attributes_to_normalize_midfield = ['clean_sheets'] + attributes_to_normalize_outfield
attributes_to_normalize_defence = ['goals_conceded', 'clean_sheets','expected_goals_conceded', 'own_goals'] + attributes_to_normalize_outfield
attributes_to_normalize_goalkeeper = ['goals_conceded', 'clean_sheets', 'expected_goals_conceded', 'penalties_saved', 'saves'] + attributes_to_normalize_general


## Min-Max Normalisation for attributes

In this step, the attackers' attributes are normalised using Min-Max scaling, which rescales each attribute so that its values fall between 0 and 1 (the `MinMaxScaler` default) and are therefore comparable across players. Here's a breakdown of the process:

1. **Extract Attribute Values:**
   - The values for the selected attributes for attackers are extracted into a 2D list.

2. **Min-Max Scaling:**
   - The `MinMaxScaler` from scikit-learn is used to perform Min-Max scaling on the selected attribute values.

3. **Update Normalised Values:**
   - The normalised attribute values are then written back to each attacker's dictionary.
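
For reference, with the scaler's default output range of 0 to 1, each value $x$ of an attribute is mapped as follows, where $x_{\min}$ and $x_{\max}$ are the smallest and largest values of that attribute among the players being scaled:

$$
x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}
$$

The player with the lowest value of an attribute therefore gets 0 and the player with the highest gets 1. Because the minimum and maximum are taken within one position group, the scaled values are comparable within that group rather than across positions.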

In [15]:
# Extract the selected attribute values into a 2D list
selected_attributes_values = [[player[attr] for attr in attributes_to_normalize_outfield] for player in attackers]

# Use Min-Max scaling to normalise the selected attributes
scaler = MinMaxScaler()
normalized_attributes = scaler.fit_transform(selected_attributes_values)

# Update the normalised values back to the dictionaries
for i, player in enumerate(attackers):
    for j, attr in enumerate(attributes_to_normalize_outfield):
        player[attr] = normalized_attributes[i][j]

The next cell applies the same process to the other positions.

In [16]:
# Normalising for midfielders
selected_attributes_values = [[player[attr] for attr in attributes_to_normalize_midfield] for player in midfielders]

scaler = MinMaxScaler()
normalized_attributes = scaler.fit_transform(selected_attributes_values)

for i, player in enumerate(midfielders):
    for j, attr in enumerate(attributes_to_normalize_midfield):
        player[attr] = normalized_attributes[i][j]

# Normalising for defenders
selected_attributes_values = [[player[attr] for attr in attributes_to_normalize_defence] for player in defenders]

scaler = MinMaxScaler()
normalized_attributes = scaler.fit_transform(selected_attributes_values)

for i, player in enumerate(defenders):
    for j, attr in enumerate(attributes_to_normalize_defence):
        player[attr] = normalized_attributes[i][j]

# Normalising for goalkeepers
selected_attributes_values = [[player[attr] for attr in attributes_to_normalize_goalkeeper] for player in goalkeepers]

scaler = MinMaxScaler()
normalized_attributes = scaler.fit_transform(selected_attributes_values)

for i, player in enumerate(goalkeepers):
    for j, attr in enumerate(attributes_to_normalize_goalkeeper):
        player[attr] = normalized_attributes[i][j]
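
The blocks above differ only in the player list and the attribute list, so the pattern could be factored into one helper. The sketch below is a dependency-free stand-in that applies (x - min) / (max - min) to each attribute directly instead of calling `MinMaxScaler`; the name `min_max_normalize` and the sample data are hypothetical.

```python
def min_max_normalize(players, attributes):
    """Rescale each attribute to [0, 1] across players, in place.

    Values are converted with float(), since the API returns some numbers
    as strings (for example '0.0'). A constant column is mapped to 0.0.
    """
    for attr in attributes:
        column = [float(player[attr]) for player in players]
        if not column:
            continue  # nothing to scale for an empty player list
        low, high = min(column), max(column)
        span = high - low
        for player, value in zip(players, column):
            player[attr] = (value - low) / span if span else 0.0


# Stand-in players; the real script would pass e.g. midfielders and its attribute list
sample = [
    {"name": "A", "form": "2.0", "bonus": 0},
    {"name": "B", "form": "4.0", "bonus": 0},
    {"name": "C", "form": "3.0", "bonus": 0},
]
min_max_normalize(sample, ["form", "bonus"])
print(sample)
```

A single loop over pairs such as `(midfielders, attributes_to_normalize_midfield)` could then replace the repeated blocks. The results should agree with `MinMaxScaler` up to floating-point rounding, assuming its default range.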

We then display the first normalised record for each position.

In [17]:
# Display the normalized data
print(attackers[0])
print(midfielders[0])
print(defenders[0])
print(goalkeepers[0])

{'goals_scored': 0.0, 'assists': 0.0, 'penalties_missed': 0.0, 'creativity': 0.0, 'threat': 0.0, 'ict_index': 0.0, 'expected_goals': 0.0, 'expected_assists': 0.0, 'expected_goal_involvements': 0.0, 'code': 232223, 'id': 1, 'first_name': 'Folarin', 'second_name': 'Balogun', 'element_type': 4, 'form': 0.058823529411764705, 'chance_of_playing_next_round': 0.0, 'now_cost': 0.02061855670103091, 'points_per_game': 0.0, 'selected_by_percent': 0.0036101083032490976, 'team': 1, 'team_code': 3, 'total_points': 0.0, 'yellow_cards': 0.0, 'red_cards': 0.0, 'bonus': 0.0, 'bps': 0.0, 'influence': 0.0, 'team_name': 'Arsenal'}
{'clean_sheets': 0.0, 'goals_scored': 0.0, 'assists': 0.125, 'penalties_missed': 0.0, 'creativity': 0.0005817335660267597, 'threat': 0.002129925452609159, 'ict_index': 0.001716001716001716, 'expected_goals': 0.0, 'expected_assists': 0.0, 'expected_goal_involvements': 0.0, 'code': 153256, 'id': 3, 'first_name': 'Mohamed', 'second_name': 'Elneny', 'element_type': 3, 'form': 0.06666