## New analysis goal: fixed skill sets from tourneys

Use `df_matches` to identify the group ids that use competitive rulesets.
The matches in those groups give us the team ids we need.

Then we can get the team roster, which contains the player ids we need, using https://fumbbl.com/api/team/get/1102662

The chosen skills come from https://fumbbl.com/api/team/getOptions/1102662
This returns a string of player ids with skill ids.

A list of skills is available through https://fumbbl.com/api/skill/list
The 2020 list is at https://fumbbl.com/api/skill/list/2020
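
The endpoints above can be wrapped in a few small helpers. This is a minimal sketch, not part of the notebook: the helper names are made up for illustration, and it assumes the endpoints return JSON (the response layout is not shown here, so nothing is parsed beyond decoding). It uses the standard library so it runs without extra dependencies; `requests` would work just as well.

```python
import json
from urllib.request import urlopen

# Base URL taken from the endpoints listed above.
API_BASE = "https://fumbbl.com/api"


def team_url(team_id):
    """URL of the team roster endpoint (contains the player ids)."""
    return f"{API_BASE}/team/get/{team_id}"


def team_options_url(team_id):
    """URL of the team options endpoint (player ids with chosen skill ids)."""
    return f"{API_BASE}/team/getOptions/{team_id}"


def skill_list_url(ruleset_year=None):
    """URL of the skill list; pass 2020 for the 2020 list."""
    if ruleset_year is None:
        return f"{API_BASE}/skill/list"
    return f"{API_BASE}/skill/list/{ruleset_year}"


def fetch_json(url, timeout=30):
    """Download a URL and decode it, assuming the body is JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires network access):
# roster = fetch_json(team_url(1102662))
# options = fetch_json(team_options_url(1102662))
```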

* World Cup training: 9941 (Dec 2020), 2 matches in Dec 2020 using BB2016, then in Oct 2021 for real. 429 matches.
* Super League: 15615
* Templars road to WC: 11605
* entrainment tournois: 12879
* NAF online tournaments: 9298
* Tacklezone: 12013
* Doppelbock: 13198
* Eurobowl practice league: 15643
* Eurobowl 2020 training: 12087 (Eurobowl Warsaw)
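
The group ids listed above can be kept in one lookup table and used to select the matching rows. A minimal sketch, assuming the group id is stored in the `league` column of `df_matches` (as in the team id selection further down); `COMPETITIVE_GROUPS` and `matches_in_groups` are illustrative names, not existing objects in this notebook.

```python
import pandas as pd

# Group ids and names copied from the list above.
COMPETITIVE_GROUPS = {
    9941: "World Cup training",
    15615: "Super League",
    11605: "Templars road to WC",
    12879: "entrainment tournois",
    9298: "NAF online tournaments",
    12013: "Tacklezone",
    13198: "Doppelbock",
    15643: "Eurobowl practice league",
    12087: "Eurobowl 2020 training",
}


def matches_in_groups(df_matches, group_ids):
    """Return the rows of df_matches whose 'league' value is in group_ids."""
    return df_matches[df_matches["league"].isin(list(group_ids))]


# Example:
# df_competitive = matches_in_groups(df_matches, COMPETITIVE_GROUPS)
# df_competitive.groupby("league").size()
```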

In [2]:
import pandas as pd
import numpy as np
import plotnine as p9
import requests

from mizani.formatters import date_format

# point this to the location of the CSV datasets
path_to_datasets = '../datasets/current/'

# FUMBBL matches
target = 'df_matches.csv'
df_matches = pd.read_csv(path_to_datasets + target) 

target = 'df_mbt.csv'
df_mbt = pd.read_csv(path_to_datasets + target) 

In [3]:
# FUMBBL inducements
target = 'inducements.csv'
inducements = pd.read_csv(path_to_datasets + target) 

# FUMBBL skills
target = 'df_skills.csv'
df_skills = pd.read_csv(path_to_datasets + target) 

In [5]:
%run ../../fumbbl_scraping/src/read_json_file.py
%run ../../fumbbl_scraping/src/write_json_file.py

%run get_team_roster.py

The ruleset endpoint, e.g. https://fumbbl.com/api/ruleset/get/188,
contains info about gold and skills.

From the roster id we can infer the BB version.

In [6]:
df_matches['match_date'] = pd.to_datetime(df_matches['match_date'])
df_matches['week_date'] = pd.to_datetime(df_matches['week_date'])


df_matches['quarter'] = df_matches['match_date'].dt.to_period('Q')
df_matches['month'] = df_matches['match_date'].dt.to_period('M')
df_matches['quarter_date'] = pd.PeriodIndex(df_matches['quarter'] , freq='Q').to_timestamp()
df_matches['month_date'] = pd.PeriodIndex(df_matches['month'] , freq='M').to_timestamp()

df_matches.loc[df_matches['scheduler'].str.contains("Blackbox"), 'division_name'] = 'Blackbox'

df_matches['cr_diff2_bin'] = pd.cut(df_matches['cr_diff2'], bins = [-1*float("inf"), -30, -20, -10, -5, 5, 10, 20, 30, float("inf")], 
 labels=['[-Inf,-30]', '[-30,-20]', '[-20,-10]', '[-10,-5]', '[-5,5]', '[5,10]', '[10,20]', '[20,30]', '[30,Inf]']) 

df_mbt['match_date'] = pd.to_datetime(df_mbt['match_date'])
df_mbt['quarter'] = df_mbt['match_date'].dt.to_period('Q')
df_mbt['month'] = df_mbt['match_date'].dt.to_period('M')
df_mbt['quarter_date'] = pd.PeriodIndex(df_mbt['quarter'] , freq='Q').to_timestamp()
df_mbt['month_date'] = pd.PeriodIndex(df_mbt['month'] , freq='M').to_timestamp()

# Gather Team roster info

We need either tournament ids or group ids (a group is called `league` in `df_matches`).

In [7]:
is_tournament = 1

# Only one tournament_ids assignment should be active; the others are kept for reference.
# NAF EurOpen 2020 (1.1M, BB2016)
#tournament_ids = [53038, 53037, 53040, 53041]
# NAF GBFU 2021 (1.1M, BB2020, no skills available)
#tournament_ids = [56214, 56208, 56212, 56213]


# NAF Road to Malta 2022 (1.15M, BB2020)
tournament_ids = [58323, 58324, 58322, 58321]

#is_tournament = 0

# World Cup Training (1.15M BB2020, mix of EB and WC)
#league_ids = [9941]
# Super league [BB2020, no stars, only bribes and master chef, 1.15M, bespoke tiering system]
#league_ids = [15615]



if is_tournament:
    team_ids = []

    for tournament_id in tournament_ids:
        
        tmp_list = (df_matches.query('tournament_id == @tournament_id')['team1_id'].tolist() + 
            df_matches.query('tournament_id == @tournament_id')['team2_id'].tolist())
            
        tmp_list = list(set(tmp_list))

        team_ids.extend(tmp_list)
else:
    team_ids = []

    for league_id in league_ids:
        
        tmp_list = (df_matches.query('league == @league_id')['team1_id'].tolist() + 
            df_matches.query('league == @league_id')['team2_id'].tolist())
            
        tmp_list = list(set(tmp_list))

        team_ids.extend(tmp_list)
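
The two branches above differ only in the column they filter on, so the selection can be expressed as a single helper. This is a sketch under stated assumptions, not the notebook's code: `collect_team_ids` is a hypothetical name, and unlike the loops above it removes duplicates across all given ids, so a team that played in two of the tournaments is returned once rather than twice.

```python
import pandas as pd


def collect_team_ids(df_matches, id_column, ids):
    """Return the sorted, unique team ids that played in the given tournaments or groups.

    id_column is 'tournament_id' or 'league', matching the columns used above.
    """
    selected = df_matches[df_matches[id_column].isin(list(ids))]
    # Stack home and away team ids into one Series before deduplicating.
    teams = pd.concat([selected["team1_id"], selected["team2_id"]])
    return sorted(teams.unique().tolist())


# Example:
# team_ids = collect_team_ids(df_matches, "tournament_id", tournament_ids)
# team_ids = collect_team_ids(df_matches, "league", [9941])
```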

In [8]:
# how many teams?
len(team_ids)
#46*4

184

The `get_team_roster()` function is the workhorse here, piecing together the roster containing skills and inducements. For this it needs match data, as inducements are considered part of a match, not of a roster.

In [None]:
df_rosters = get_team_roster(team_ids[0], df_skills, df_matches, inducements)

In [None]:
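# The first team was already fetched in the previous cell, so skip it here.
# Slicing instead of team_ids.pop(0) keeps team_ids intact when the cell is re-run.
for team_id in team_ids[1:]:
    df_rosters_tmp = get_team_roster(team_id, df_skills, df_matches, inducements)
    df_rosters = pd.concat([df_rosters, df_rosters_tmp], ignore_index=True)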
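
The loop above grows `df_rosters` with one `pd.concat` per team. An alternative is to collect all per-team frames first and concatenate once, which also removes the need to treat the first team separately. A minimal sketch: `build_rosters` is a hypothetical helper, and the roster fetcher is passed in as a function so the sketch does not depend on how `get_team_roster()` is implemented.

```python
import pandas as pd


def build_rosters(team_ids, fetch_roster):
    """Fetch one roster DataFrame per team id and concatenate them once.

    fetch_roster is any function that takes a team id and returns a DataFrame.
    """
    frames = [fetch_roster(team_id) for team_id in team_ids]
    if not frames:
        # pd.concat raises on an empty list, so return an empty frame instead.
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# Example with the function used in this notebook:
# df_rosters = build_rosters(
#     team_ids,
#     lambda team_id: get_team_roster(team_id, df_skills, df_matches, inducements),
# )
```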

In [None]:
df_rosters

In [None]:
#target = 'datasets/current/df_rosters_eurobowl2020'
target = 'datasets/current/df_rosters_road_to_malta'
#target = 'datasets/current/df_rosters_super_league'
#target = 'datasets/current/df_rosters_wc_training'

#df_rosters.to_hdf(target + '.h5', key='df_rosters', mode='w', format = 't',  complevel = 9)
df_rosters.to_csv(target + '.csv')

