## Import necessary libraries and define utility functions.
1. Exploring bootstrap-static Endpoint: Fetch general game and player data.
2. Exploring Player-Specific Data: Use element-summary endpoint to get detailed player stats.
3. Exploring Fixture Data: Use the fixtures endpoint to get fixture information.
4. Exploring Gameweek Data: Fetch live gameweek data for specific gameweeks.
5. Exploring League Data: Fetch standings from a sample league.
6. Summary and Next Steps: Summarize findings and outline next steps for analysis.

In [1]:
# Dependencies
import requests
import pandas as pd

In [2]:
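# Function to fetch data from an endpoint
def fetch_data(url, timeout=10):
    """Return the parsed JSON body for `url`, or None if the request fails."""
    try:
        # A timeout stops the notebook from hanging on an unresponsive server
        response = requests.get(url, timeout=timeout)
    except requests.RequestException as exc:
        print(f"Request failed: {exc}")
        return None
    if response.status_code == 200:
        return response.json()
    print(f"Failed to retrieve data. Status code: {response.status_code}")
    return None

# Base URL shared by all endpoints
base_url = 'https://fantasy.premierleague.com/api/'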

In [3]:
# Exploring bootstrap-static Endpoint
# Fetch data from bootstrap-static endpoint
bootstrap_url = base_url + 'bootstrap-static/'
bootstrap_data = fetch_data(bootstrap_url)

if bootstrap_data:
    # Display keys in the bootstrap data
    print("Keys in bootstrap data:", bootstrap_data.keys())

    # Explore player data
    players_df = pd.DataFrame(bootstrap_data['elements'])
    #print(players_df.head())
    print("-----")
    print(players_df.columns)

Keys in bootstrap data: dict_keys(['events', 'game_settings', 'phases', 'teams', 'total_players', 'elements', 'element_stats', 'element_types'])
-----
Index(['chance_of_playing_next_round', 'chance_of_playing_this_round', 'code',
       'cost_change_event', 'cost_change_event_fall', 'cost_change_start',
       'cost_change_start_fall', 'dreamteam_count', 'element_type', 'ep_next',
       'ep_this', 'event_points', 'first_name', 'form', 'id', 'in_dreamteam',
       'news', 'news_added', 'now_cost', 'photo', 'points_per_game',
       'second_name', 'selected_by_percent', 'special', 'squad_number',
       'status', 'team', 'team_code', 'total_points', 'transfers_in',
       'transfers_in_event', 'transfers_out', 'transfers_out_event',
       'value_form', 'value_season', 'web_name', 'minutes', 'goals_scored',
       'assists', 'clean_sheets', 'goals_conceded', 'own_goals',
       'penalties_saved', 'penalties_missed', 'yellow_cards', 'red_cards',
       'saves', 'bonus', 'bps', 'influence

### bootstrap-static Endpoint Discovery
Use this endpoint to build a foundational dataset of player and team information. This will be the primary source for player attributes and current season stats.

#### Keys in the bootstrap-static Data
1. `events`: 
* Information about each gameweek, including deadlines, average scores, and chips played.
2. `game_settings`: 
* General settings of the game, like squad size, chip availability, and other configuration parameters.
3. `phases`: 
* Data about the different phases or segments of the season.
4. `teams`: 
* Information about each Premier League team, including their names, short names, and codes.
5. `total_players`: 
* Total number of players participating in the Fantasy Premier League.
6. `elements`: 
* Detailed player data. This is a crucial dataset that includes all active players and their associated stats.
7. `element_stats`: 
* Available statistical categories, potentially used to display player data.
8. `element_types`: 
* Information about player positions (e.g., goalkeeper, defender, midfielder, forward).
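
The `team` and `element_type` columns in `elements` hold numeric IDs, so a common first step is to resolve them against `teams` and `element_types`. The sketch below uses a small hand-written payload (`sample_bootstrap`, hypothetical values) instead of a live request. The helper name `build_player_table` is illustrative, and the `singular_name_short` field is an assumption about the shape of `element_types` that should be checked against the real response.

```python
import pandas as pd

# Hypothetical, heavily trimmed payload shaped like the bootstrap-static response.
sample_bootstrap = {
    "teams": [
        {"id": 1, "name": "Arsenal", "short_name": "ARS"},
        {"id": 2, "name": "Aston Villa", "short_name": "AVL"},
    ],
    "element_types": [
        {"id": 1, "singular_name_short": "GKP"},  # assumed field name
        {"id": 3, "singular_name_short": "MID"},
    ],
    "elements": [
        {"id": 10, "web_name": "Player A", "team": 1, "element_type": 3},
        {"id": 11, "web_name": "Player B", "team": 2, "element_type": 1},
    ],
}


def build_player_table(bootstrap):
    """Attach readable team and position labels to the player rows."""
    players = pd.DataFrame(bootstrap["elements"])
    teams = pd.DataFrame(bootstrap["teams"])
    positions = pd.DataFrame(bootstrap["element_types"])

    # Build id -> label lookups, then map the ID columns through them.
    team_names = teams.set_index("id")["name"]
    position_names = positions.set_index("id")["singular_name_short"]
    players["team_name"] = players["team"].map(team_names)
    players["position"] = players["element_type"].map(position_names)
    return players


player_table = build_player_table(sample_bootstrap)
print(player_table[["web_name", "team_name", "position"]])
```

With real data, `build_player_table(bootstrap_data)` should work the same way, provided the field names match.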

#### Columns in the Player Data (elements)
Player Information:
1. `first_name`, `second_name`, `web_name`: 
* Names of the players.
2. `team`, `team_code`: 
* Team ID and code indicating the player's current team.
3. `element_type`: 
* Position type ID (e.g., 1 for goalkeepers, 2 for defenders).

Playing Status: 
1. `chance_of_playing_next_round`, `chance_of_playing_this_round`: 
* Percentage chance (0 to 100) that the player will play. The value can be null when there is no availability news for the player.
2. `status`: 
* Player's status (e.g., available, injured, suspended).

Fantasy Game Metrics:
1. `now_cost`: 
* Current price of the player in the game (multiplied by 10, e.g., 50 represents 5.0m).
2. `points_per_game`: 
* Average points scored by the player per game.
3. `selected_by_percent`: 
* Percentage of fantasy teams that have selected the player.

Performance Metrics:
1. `minutes`: 
* Total minutes played.
2. `goals_scored`, `assists`, `clean_sheets`, `goals_conceded`, `own_goals`: 
* Key performance indicators.
3. `yellow_cards`, `red_cards`: 
* Number of yellow and red cards received.
4. `bonus`, `bps`: 
* Bonus points and Bonus Points System score.

Advanced Metrics:
1. `influence`, `creativity`, `threat`, `ict_index`: 
* ICT index components measuring player influence, creativity, and threat.
2. `expected_goals`, `expected_assists`, `expected_goal_involvements`, `expected_goals_conceded`: 
* Expected stats based on shot quality, position, and historical data.
3. `starts`: 
* Number of games started.

Per 90 Metrics:
* `expected_goals_per_90`, `saves_per_90`, `expected_assists_per_90`, `expected_goal_involvements_per_90`, `expected_goals_conceded_per_90`, `goals_conceded_per_90`, `clean_sheets_per_90`: Metrics normalized per 90 minutes to allow fair comparison across players.
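
Because `now_cost` is stored at ten times the displayed price, it usually needs rescaling before use. The sketch below does that on a hand-written frame (`sample_players`, hypothetical values) and also casts `selected_by_percent` and `points_per_game` to numbers, on the assumption that the API may deliver them as strings. The helper `tidy_player_metrics` and the derived `points_per_million` column are illustrative names, not part of the API.

```python
import pandas as pd

# Hypothetical rows using column names from the `elements` data.
sample_players = pd.DataFrame([
    {"web_name": "Player A", "now_cost": 50, "selected_by_percent": "12.3",
     "points_per_game": "4.5", "total_points": 90},
    {"web_name": "Player B", "now_cost": 125, "selected_by_percent": "45.0",
     "points_per_game": "7.1", "total_points": 180},
])


def tidy_player_metrics(df):
    """Return a copy with prices in millions and numeric metric columns."""
    out = df.copy()
    # now_cost is ten times the displayed price, so 50 becomes 5.0.
    out["price_m"] = out["now_cost"] / 10
    # Assumption: these columns may arrive as strings; coerce them to floats.
    for col in ["selected_by_percent", "points_per_game"]:
        out[col] = pd.to_numeric(out[col], errors="coerce")
    # A simple value measure: season points per million of current price.
    out["points_per_million"] = out["total_points"] / out["price_m"]
    return out


tidy_players = tidy_player_metrics(sample_players)
print(tidy_players[["web_name", "price_m", "points_per_million"]])
```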

In [4]:
# Exploring Player-Specific Data
# Example player ID (can be any valid player ID)
player_id = 1
player_url = base_url + f'element-summary/{player_id}/'
player_data = fetch_data(player_url)

if player_data:
    # Display keys in player data
    print("Keys in player data:", player_data.keys())

    # Explore past history
    player_history_df = pd.DataFrame(player_data['history_past'])
    #print(player_history_df.head())
    print(player_history_df.columns)

Keys in player data: dict_keys(['fixtures', 'history', 'history_past'])
Index(['season_name', 'element_code', 'start_cost', 'end_cost', 'total_points',
       'minutes', 'goals_scored', 'assists', 'clean_sheets', 'goals_conceded',
       'own_goals', 'penalties_saved', 'penalties_missed', 'yellow_cards',
       'red_cards', 'saves', 'bonus', 'bps', 'influence', 'creativity',
       'threat', 'ict_index', 'starts', 'expected_goals', 'expected_assists',
       'expected_goal_involvements', 'expected_goals_conceded'],
      dtype='object')


### Player-Specific Data Discovery
This endpoint gives detailed information about a specific player, including their fixtures, current season performance, and past season performance.

#### Keys in Player Data:
1. `fixtures`:
* Contains information about upcoming fixtures for the player’s team. This can include details about the match such as the opponent, whether the match is home or away, and the fixture's status.
2. `history`:
* Provides detailed statistics for each gameweek in the current season. This data is key for analyzing a player’s week-by-week performance.
3. `history_past`:
* Contains summary data of the player's performance in past seasons. This is useful for historical analysis and understanding a player’s consistency over time.


#### Columns in the history_past DataFrame:
1. `season_name`: 
* The name or identifier of the season (e.g., "2022/23"). This helps distinguish data from different seasons.
2. `element_code`: 
* A unique identifier for the player. This can be useful when linking data across different datasets or endpoints.
3. `start_cost` and `end_cost`: 
* The player's cost at the start and end of the season. This data can show how a player's value has changed based on performance and demand.
4. `total_points`: 
* The total fantasy points the player accumulated over the season. This is a key metric for assessing a player's overall value in fantasy football.
5. `minutes`:
* The total minutes the player was on the pitch over the season. It indicates the player's playing time and can correlate with reliability and fitness.
6. `goals_scored`:
* The number of goals the player scored during the season. This is crucial for assessing offensive players but also can be useful for defensive players who contribute offensively.
7. `assists`: 
* The number of assists provided by the player, another important offensive contribution.
8. `clean_sheets`: 
* The number of matches in which the player’s team kept a clean sheet while the player was on the pitch. Essential for defenders and goalkeepers.
9. `goals_conceded`:
* The total number of goals conceded by the player's team while the player was on the field. This metric helps assess the defensive capability of defenders and goalkeepers.
10. `own_goals`:
* Number of own goals scored by the player. This negatively affects a player’s points and is rare.
11. `penalties_saved` and `penalties_missed`: 
* The number of penalties saved (important for goalkeepers) and missed by the player.
12. `yellow_cards` and `red_cards`: 
* The number of yellow and red cards received. Important for assessing discipline and can affect player availability.
13. `saves`: 
* Specific to goalkeepers, the number of saves made. Indicates the goalkeeping activity.
14. `bonus` and `bps`: 
* Bonus points and Bonus Points System (BPS) score. BPS measures a player's overall contribution using various metrics. Bonus points are awarded to the best performers in a match.
15. `influence`, `creativity`, `threat`, `ict_index`: 
* These metrics, collectively known as the ICT Index, represent a player’s Influence (impact on the game), Creativity (assist potential), and Threat (goal threat), along with a composite `ict_index` score.
16. `starts`: 
* The number of matches started. It helps gauge the player's importance to the team.
17. `expected_goals`, `expected_assists`, `expected_goal_involvements`, `expected_goals_conceded`: 
* These are advanced metrics indicating the expected number of goals, assists, total goal involvements, and goals conceded, based on the quality of chances. Can be useful for predictive analysis.
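
As a rough illustration of how the `history_past` columns can be combined, the sketch below derives a season price change and a points-per-90 rate. The rows in `sample_history_past` are made up, the helper `summarise_past_seasons` is a hypothetical name, and it assumes `start_cost` and `end_cost` use the same tenths-of-a-million scale as `now_cost`.

```python
import pandas as pd

# Hypothetical rows using column names from the history_past output above.
sample_history_past = [
    {"season_name": "2021/22", "start_cost": 60, "end_cost": 58,
     "total_points": 120, "minutes": 2700},
    {"season_name": "2022/23", "start_cost": 55, "end_cost": 61,
     "total_points": 150, "minutes": 3000},
]


def summarise_past_seasons(history_past):
    """Add price-change and points-per-90 columns to past-season rows."""
    df = pd.DataFrame(history_past)
    # Assumption: costs are stored in tenths of a million, like now_cost.
    df["cost_change_m"] = (df["end_cost"] - df["start_cost"]) / 10
    # Replace zero minutes with NaN so the rate is NaN rather than infinite.
    played = df["minutes"].where(df["minutes"] > 0)
    df["points_per_90"] = df["total_points"] / played * 90
    return df


past_summary = summarise_past_seasons(sample_history_past)
print(past_summary[["season_name", "cost_change_m", "points_per_90"]])
```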

In [5]:
# Exploring Fixture Data
# Fetch data from fixtures endpoint
fixtures_url = base_url + 'fixtures/'
fixtures_data = fetch_data(fixtures_url)

if fixtures_data:
    # Convert to DataFrame and explore
    fixtures_df = pd.DataFrame(fixtures_data)
    #print(fixtures_df.head())
    print(fixtures_df.columns)

Index(['code', 'event', 'finished', 'finished_provisional', 'id',
       'kickoff_time', 'minutes', 'provisional_start_time', 'started',
       'team_a', 'team_a_score', 'team_h', 'team_h_score', 'stats',
       'team_h_difficulty', 'team_a_difficulty', 'pulse_id'],
      dtype='object')


### Fixtures Endpoint Discovery
This endpoint offers valuable information about the fixtures (matches) scheduled for the current season, including details about the teams involved, match timings, and difficulty ratings.

#### Columns in the Fixtures Data
1. `code`:
* A unique identifier for the fixture. This code may be useful for linking or referencing specific fixtures across different datasets.
2. `event`:
* Indicates the gameweek number in which the fixture takes place. This is useful for grouping and analyzing fixtures by gameweek.
3. `finished`:
* A boolean indicating whether the fixture has been completed. True means the match is finished, and False means it is yet to be played.
4. `finished_provisional`:
* Similar to finished, this indicates whether the match has been provisionally marked as finished. This may handle cases where results are not yet final or are under review.
5. `id`:
* A unique identifier for the fixture. This can be used to identify and reference specific fixtures uniquely.
6. `kickoff_time`:
* The scheduled kickoff time for the fixture. This is in ISO 8601 format (e.g., "2023-08-14T19:00:00Z"). It provides precise timing information and can be converted to different time zones as needed.
7. `minutes`:
* The number of minutes played in the fixture. For completed matches, this would typically be 90 (or more if there was extra time).
8. `provisional_start_time`:
* A boolean indicating whether the start time of the fixture is provisional. If True, the kickoff time might change.
9. `started`:
* A boolean indicating whether the fixture has started. True means the match is underway or completed, while False indicates it has not started yet.
10. `team_a` and `team_h`:
* These fields represent the IDs of the away (team_a) and home (team_h) teams. These IDs can be mapped to actual team names using the teams data from the bootstrap-static endpoint.
11. `team_a_score` and `team_h_score`:
* The number of goals scored by the away team (team_a_score) and the home team (team_h_score). For finished matches, these fields show the final scores, providing a direct measure of match outcomes.
12. `stats`:
* This field typically contains a nested list of statistics for the fixture, such as goals, assists, and other relevant performance metrics. It may need further parsing to analyze individual stats.
13. `team_h_difficulty` and `team_a_difficulty`:
* Numerical ratings indicating the difficulty of the fixture for the home (team_h_difficulty) and away (team_a_difficulty) teams. These ratings are usually on a scale (e.g., 1-5) and can help assess the strength of the opposition.
14. `pulse_id`:
* Another unique identifier for the fixture, possibly used internally or for linking with other data.
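
The columns above can be made easier to read by parsing `kickoff_time` and resolving the team IDs. The following is a minimal sketch on hand-written rows (`sample_fixtures`, hypothetical values). The `sample_teams` lookup stands in for an id-to-name mapping built from the bootstrap-static `teams` data, and `prepare_fixtures` is an illustrative helper name.

```python
import pandas as pd

# Hypothetical id -> name lookup, as could be built from the `teams` data.
sample_teams = {1: "Arsenal", 2: "Aston Villa", 3: "Bournemouth"}

# Hypothetical rows using column names from the fixtures output above.
sample_fixtures = [
    {"id": 1, "event": 1, "kickoff_time": "2023-08-14T19:00:00Z",
     "finished": True, "team_h": 1, "team_a": 2,
     "team_h_score": 2, "team_a_score": 1},
    {"id": 2, "event": 2, "kickoff_time": "2023-08-19T14:00:00Z",
     "finished": False, "team_h": 3, "team_a": 1,
     "team_h_score": None, "team_a_score": None},
]


def prepare_fixtures(fixtures, team_names):
    """Parse kickoff times and add readable home/away team names."""
    df = pd.DataFrame(fixtures)
    # ISO 8601 strings become timezone-aware UTC timestamps.
    df["kickoff_time"] = pd.to_datetime(df["kickoff_time"], utc=True)
    df["home"] = df["team_h"].map(team_names)
    df["away"] = df["team_a"].map(team_names)
    return df


fixtures_table = prepare_fixtures(sample_fixtures, sample_teams)
# Fixtures still to be played are those not yet marked as finished.
upcoming = fixtures_table[~fixtures_table["finished"]]
print(upcoming[["event", "kickoff_time", "home", "away"]])
```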

In [6]:
# Exploring Gameweek Data
# Example gameweek ID
gameweek_id = 1
gameweek_url = base_url + f'event/{gameweek_id}/live/'
gameweek_data = fetch_data(gameweek_url)

if gameweek_data:
    # Display keys in gameweek data
    print("Keys in gameweek data:", gameweek_data.keys())

    # Explore elements key for live player stats
    elements = gameweek_data.get('elements', [])
    if elements:
        gameweek_df = pd.DataFrame(elements)
        #print(gameweek_df.head())
        print(gameweek_df.columns)

Keys in gameweek data: dict_keys(['elements'])
Index(['id', 'stats', 'explain'], dtype='object')


### Gameweek Data Endpoint Discovery
This endpoint is valuable for retrieving live or real-time player performance data during a specific gameweek, providing insights into how players are performing as the matches unfold.

#### Breakdown of Gameweek Data
**Keys in Gameweek Data:**
1. `elements`:
* This key contains live performance data for players. It typically includes a list of dictionaries, with each dictionary representing a player's live stats during the specific gameweek.

**Columns in the elements DataFrame:**
1. `id`:
* This is the player's unique ID. It can be used to link or reference the player with other datasets, such as those from the bootstrap-static endpoint where comprehensive player information is available.
2. `stats`:
* This column usually contains a nested dictionary with detailed statistical data for the player during the gameweek. It might include metrics such as points scored, goals, assists, clean sheets, and other relevant fantasy football metrics.
3. `explain`:
* This column generally provides a breakdown or explanation of how the player's points are calculated for the gameweek. It includes details on specific actions (e.g., goals scored, assists, minutes played) and the corresponding points awarded for those actions.
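
Since `stats` is nested, it is not directly usable as DataFrame columns. One possible way to flatten it is sketched below, using a hand-written payload (`sample_live`, hypothetical values and a reduced set of stat keys). The helper name `flatten_live_stats` is illustrative.

```python
import pandas as pd

# Hypothetical, trimmed payload shaped like the live gameweek response.
sample_live = {
    "elements": [
        {"id": 10,
         "stats": {"minutes": 90, "goals_scored": 1, "total_points": 8},
         "explain": []},
        {"id": 11,
         "stats": {"minutes": 0, "goals_scored": 0, "total_points": 0},
         "explain": []},
    ]
}


def flatten_live_stats(live):
    """Return one row per player, with each stat as its own column."""
    rows = [
        {"id": element["id"], **element["stats"]}
        for element in live.get("elements", [])
    ]
    return pd.DataFrame(rows)


live_stats = flatten_live_stats(sample_live)
print(live_stats)
```

The resulting `id` column can then be joined to the player table from the bootstrap-static endpoint.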

In [7]:
# Exploring League Data
# Example league ID
league_id = 314  # Replace with a valid league ID
league_url = base_url + f'leagues-classic/{league_id}/standings/'
league_data = fetch_data(league_url)

if league_data:
    # Display keys in league data
    print("Keys in league data:", league_data.keys())

    # Explore standings
    standings_df = pd.DataFrame(league_data['standings']['results'])
    #print(standings_df.head())
    print(standings_df.columns)


Keys in league data: dict_keys(['new_entries', 'last_updated_data', 'league', 'standings'])
Index(['id', 'event_total', 'player_name', 'rank', 'last_rank', 'rank_sort',
       'total', 'entry', 'entry_name'],
      dtype='object')


### League Data Discovery
This endpoint returns the standings of a classic league, including each entry's rank and points total. It is not needed for the player-level analysis planned here.


## Summary and Next Steps

Based on the exploration of the endpoints above, the following data can be useful for analysis:

1. **Player Data**: Current season stats, historical performance, and live gameweek data.
2. **Fixtures**: Information on past and upcoming matches, which can be used for contextual analysis.

### Next Steps:
- **Deeper Analysis**: Use the historical player data to identify trends and patterns.
- **Feature Engineering**: Create new features based on fixture difficulty, player form, etc.
- **Modeling**: Use the data to build predictive models for player performance, team selection, and more.
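
As one possible starting point for the feature engineering step, the sketch below computes each team's average difficulty over its next few unfinished fixtures. The rows in `sample_fixtures` are made up, and `upcoming_difficulty` and its `n_fixtures` parameter are hypothetical names chosen for illustration.

```python
import pandas as pd

# Hypothetical rows using column names from the fixtures endpoint.
sample_fixtures = pd.DataFrame([
    {"event": 5, "finished": False, "team_h": 1, "team_a": 2,
     "team_h_difficulty": 2, "team_a_difficulty": 4},
    {"event": 6, "finished": False, "team_h": 3, "team_a": 1,
     "team_h_difficulty": 5, "team_a_difficulty": 3},
    {"event": 4, "finished": True, "team_h": 2, "team_a": 3,
     "team_h_difficulty": 3, "team_a_difficulty": 3},
])


def upcoming_difficulty(fixtures, n_fixtures=3):
    """Mean difficulty of each team's next `n_fixtures` unfinished fixtures."""
    upcoming = fixtures[~fixtures["finished"]]
    # Reshape to one row per (team, fixture) from both the home and away sides.
    home = upcoming[["event", "team_h", "team_h_difficulty"]].rename(
        columns={"team_h": "team", "team_h_difficulty": "difficulty"})
    away = upcoming[["event", "team_a", "team_a_difficulty"]].rename(
        columns={"team_a": "team", "team_a_difficulty": "difficulty"})
    long = pd.concat([home, away], ignore_index=True).sort_values("event")
    # Keep the earliest n fixtures per team, then average their difficulty.
    nearest = long.groupby("team").head(n_fixtures)
    return nearest.groupby("team")["difficulty"].mean()


difficulty_by_team = upcoming_difficulty(sample_fixtures)
print(difficulty_by_team)
```

The result is indexed by team ID, so it can be mapped onto the `team` column of the player data as a new feature.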
