## Import Libraries
Let's import our `fpl_draft_league` tool and alias it as `fpl`, along with its `utils` module.

In [1]:
import logging

import sys
# Configure the logging output to go to stdout
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
sys.path.append("../") # Enables importing from parent directory
import fpl_draft_league.fpl_draft_league as fpl
import fpl_draft_league.utils as utils

DEBUG:matplotlib:matplotlib data path: /Users/msparre/.virtualenvs/fpl_draft_league/lib/python3.9/site-packages/matplotlib/mpl-data
DEBUG:matplotlib:CONFIGDIR=/Users/msparre/.matplotlib
DEBUG:matplotlib:interactive is False
DEBUG:matplotlib:platform is darwin
DEBUG:matplotlib:CACHEDIR=/Users/msparre/.matplotlib
DEBUG:matplotlib.font_manager:Using fontManager instance from /Users/msparre/.matplotlib/fontlist-v330.json


## Getting the data from draft.premierleague.com


In [2]:
utils.get_json()

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): users.premierleague.com:443


SSLError: HTTPSConnectionPool(host='users.premierleague.com', port=443): Max retries exceeded with url: /accounts/login/ (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1129)')))

## Inspecting the Data

Using `utils.get_data` we can load 3 useful dataframes:
* League entries
* Matches
* Current standings

In [3]:
league_entry_df = utils.get_data('league_entries')
matches_df = utils.get_data('matches')
standings_df = utils.get_data('standings')

The league entries dataframe contains all league participants, with some IDs, names and waiver picks. The most useful bit here is probably the lookup between player names, team names and IDs. The waiver pick may also be interesting to compare against performance!

In [4]:
league_entry_df

Unnamed: 0,entry_id,entry_name,id,joined_time,player_first_name,player_last_name,short_name,waiver_pick
0,301781,Mbappe to Brighton,303734,2023-08-06T20:41:50.138791Z,Michael,Sparre,MS,2
1,301945,Misery Loves Kompany,303901,2023-08-06T20:47:39.198833Z,Bryce,Allred,BA,7
2,301962,Hwanging and banging,303918,2023-08-06T20:48:15.247736Z,Rory,McGinnis,RM,5
3,301967,relegation parade,303924,2023-08-06T20:48:29.893578Z,Josh,Gumacal,JG,4
4,302050,Kante stole my wife,304012,2023-08-06T20:51:50.893775Z,Jack,Thurber,JT,8
5,309454,Untitled FC,311625,2023-08-07T06:34:03.772588Z,Vedant,Sahu,VS,3
6,341081,StevieG08,343640,2023-08-08T03:11:18.954889Z,Christian,Pinho,CP,6
7,341436,dirty Mike&the boyz,344002,2023-08-08T03:42:04.013511Z,Jackson,Nagle,JN,1
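The lookup mentioned above can be sketched with plain pandas. This is a minimal example, assuming the column names shown in the output above; the two rows are copied from that output, and the names `sample_entries` and `id_to_team` are made up for illustration. Judging by the values, the `league_entry` column in the standings dataframe further down seems to correspond to `id` rather than `entry_id`, so `id` is used as the key here.

```python
import pandas as pd

# Two rows copied from the league entries output above (subset of columns)
sample_entries = pd.DataFrame(
    {
        "entry_id": [301781, 301945],
        "entry_name": ["Mbappe to Brighton", "Misery Loves Kompany"],
        "id": [303734, 303901],
        "short_name": ["MS", "BA"],
        "waiver_pick": [2, 7],
    }
)

# Map the league-level `id` to the team name
id_to_team = sample_entries.set_index("id")["entry_name"].to_dict()

print(id_to_team[303734])
```

A dictionary like this can then be passed to `Series.map` to turn ID columns in the other dataframes into readable team names.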


The standings dataframe is again fairly self-explanatory: a row for each team with their points, their score and their rank. The only catch is that this is a "BC" (Business Current) view, i.e. a snapshot of the current state. It would be cool to see the rankings over time so you can spot the movers and shakers.

In [5]:
standings_df

Unnamed: 0,last_rank,league_entry,matches_drawn,matches_lost,matches_played,matches_won,points_against,points_for,rank,rank_sort,total
0,,303734,0,0,0,0,0,0,,,0
1,,303901,0,0,0,0,0,0,,,0
2,,303918,0,0,0,0,0,0,,,0
3,,303924,0,0,0,0,0,0,,,0
4,,304012,0,0,0,0,0,0,,,0
5,,311625,0,0,0,0,0,0,,,0
6,,343640,0,0,0,0,0,0,,,0
7,,344002,0,0,0,0,0,0,,,0


The matches dataframe has every match, including unplayed matches, with details about who played whom, how many points each side scored and so on. The `winning_league_entry` and `winning_method` columns are all "None", so I'm not exactly sure what they are meant to hold.

In [6]:
matches_df.columns

Index(['event', 'finished', 'league_entry_1', 'league_entry_1_points',
       'league_entry_2', 'league_entry_2_points', 'started',
       'winning_league_entry', 'winning_method'],
      dtype='object')
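Since `winning_league_entry` is empty, the winner of a finished match could be derived from the two points columns instead. A minimal sketch, assuming the column names listed above; the scores are invented for illustration and `derive_winner` is a hypothetical helper, not part of `fpl_draft_league`.

```python
import pandas as pd

# Invented sample rows using the column names from matches_df
sample_matches = pd.DataFrame(
    {
        "event": [1, 1, 2],
        "finished": [True, True, False],
        "league_entry_1": [303734, 303918, 303734],
        "league_entry_1_points": [52, 40, 0],
        "league_entry_2": [303901, 303924, 303918],
        "league_entry_2_points": [47, 40, 0],
    }
)


def derive_winner(row):
    """Return the winning league entry id, or None for a draw or unfinished match."""
    if not row["finished"]:
        return None
    if row["league_entry_1_points"] > row["league_entry_2_points"]:
        return row["league_entry_1"]
    if row["league_entry_2_points"] > row["league_entry_1_points"]:
        return row["league_entry_2"]
    return None  # draw


# to_dict("records") yields one plain dict per row
winners = [derive_winner(row) for row in sample_matches.to_dict("records")]
print(winners)
```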

In [7]:
print(len(league_entry_df))

8


## Standings Over Time

The first thing I want to explore is league standings over time (week by week). 

I realise that with all of the match data in `matches_df` I can essentially rebuild the history of standings. The only tricky thing is that `matches_df` has a row per matchup, not a row per team's match. This makes it difficult to plot, because I need one row per team, week and result.

The `fpl.get_points_over_time` function produces a row per team's match and then plots the standings over time for you. The cell below calls `fpl.get_matches_stacked` to build just the stacked dataframe.

In [8]:
stacked_df = fpl.get_matches_stacked(matches_df, league_entry_df)
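To illustrate the reshaping described above, here is a minimal sketch that stacks matchup rows into one row per team per match and then accumulates league points week by week. It is not the implementation of `fpl.get_matches_stacked`; the function name `stack_matches`, the output column names, the invented scores and the 3/1/0 points for a win/draw/loss are all assumptions made for the example.

```python
import pandas as pd

# Invented scores; input column names follow matches_df
sample_matches = pd.DataFrame(
    {
        "event": [1, 1, 2, 2],
        "league_entry_1": [1, 3, 1, 2],
        "league_entry_1_points": [50, 40, 30, 45],
        "league_entry_2": [2, 4, 3, 4],
        "league_entry_2_points": [45, 40, 35, 20],
    }
)


def stack_matches(matches):
    """Turn one row per matchup into one row per team per match."""
    side_1 = matches.rename(
        columns={
            "league_entry_1": "team",
            "league_entry_1_points": "score",
            "league_entry_2": "opponent",
            "league_entry_2_points": "opponent_score",
        }
    )
    side_2 = matches.rename(
        columns={
            "league_entry_2": "team",
            "league_entry_2_points": "score",
            "league_entry_1": "opponent",
            "league_entry_1_points": "opponent_score",
        }
    )
    stacked = pd.concat([side_1, side_2], ignore_index=True)
    # Assumed scoring: 3 for a win, 1 for a draw, 0 for a loss
    stacked["points"] = 0
    stacked.loc[stacked["score"] > stacked["opponent_score"], "points"] = 3
    stacked.loc[stacked["score"] == stacked["opponent_score"], "points"] = 1
    return stacked.sort_values(["event", "team"]).reset_index(drop=True)


stacked = stack_matches(sample_matches)
# Running total of league points per team, week by week
stacked["total_points"] = stacked.groupby("team")["points"].cumsum()
print(stacked[["event", "team", "score", "points", "total_points"]])
```

Ranking each week's `total_points` (for example with `groupby("event")["total_points"].rank(ascending=False)`) would then give a standings history that can be plotted as one line per team.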

## Streaks
The next thing I want to explore is winning streaks.
* Who holds the record?!
* Who is on a hot streak right now and worth watching out for?

In [None]:
df = fpl.get_streaks(stacked_df)
df.head()
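For anyone curious how a winning streak can be counted, here is a minimal sketch in plain pandas. It is not the implementation of `fpl.get_streaks`; the `win` column, the sample results and the helper name `add_win_streak` are assumptions for illustration. The idea is to start a new block every time a team fails to win, and then count wins cumulatively within each block.

```python
import pandas as pd

# Invented results: one row per team per match
results = pd.DataFrame(
    {
        "team": ["A", "A", "A", "A", "B", "B", "B", "B"],
        "match": [1, 2, 3, 4, 1, 2, 3, 4],
        "win": [True, True, False, True, False, True, True, True],
    }
)


def add_win_streak(df):
    """Add a `streak` column: consecutive wins up to and including each match."""
    df = df.sort_values(["team", "match"]).copy()
    # A new block starts at every non-win, counted separately for each team
    block = (~df["win"]).astype(int).groupby(df["team"]).cumsum().rename("block")
    # Within a block, the running sum of wins is the current streak
    df["streak"] = df["win"].astype(int).groupby([df["team"], block]).cumsum()
    return df


streaks = add_win_streak(results)
print(streaks)
```

With a `streak` column like this, the record streak per team is just the per-team maximum, which is what the next cell computes.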

### What are people's record streaks?

In [None]:
df[['team', 'streak']].groupby(['team']).max().sort_values(by='streak', ascending=False)

### Who's on the hot streak now?

In [None]:
df[df['match'] == df.match.max()].sort_values(by='streak', ascending=False)

In [None]:
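# Group the stacked dataframe (stacked_df from above) by match week
matches_group = stacked_df.groupby('match')

In [None]:
matches_group.groups

In [None]:
# idxmax returns index labels, so select with .loc rather than .iloc
gw_highscores = stacked_df.loc[matches_group['score'].idxmax()]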

In [None]:
gw_highscores

In [None]:
gw_highscores[['team','score']].groupby('team').count().sort_values(by='score', ascending=False)

In [None]:
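def find_highscores(group):
    # Work on a copy so the original dataframe is not modified in place
    group = group.copy()
    # Index label of the top-scoring row within this match week
    group['gw_highscore_index'] = group['score'].idxmax()
    return group

In [None]:
# Apply the function to each group; calling it on the GroupBy object itself fails
df = matches_group.apply(find_highscores)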