# FPL League Report — Cleaned & Commented
This notebook fetches data from the public FPL API, computes league and player metrics, and assembles a textual summary.
Refactor goals:
- Keep outputs and logic identical  
- Standardize imports and formatting  
- Improve readability with sections and comments

## Extract data

## 🔧 Setup & Bootstrap Data

In [93]:
# Standardized imports (consolidated)
import requests
import pandas as pd
from collections import defaultdict
from dotenv import load_dotenv
import os
from openai import OpenAI

league_id = 815838
BASE_URL = "https://fantasy.premierleague.com/api"

bootstrap = requests.get(f"{BASE_URL}/bootstrap-static/").json()

# Map: player_id -> player object for quick lookup
players = {p["id"]: p for p in bootstrap["elements"]}
# Map: team_id -> team name
teams_lookup = {t["id"]: t["name"] for t in bootstrap["teams"]}
# Determine the current gameweek from events
current_gw = next(event["id"] for event in bootstrap["events"] if event["is_current"])
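The `next(...)` lookup above raises `StopIteration` when no event is flagged `is_current`, which can happen before a season starts. Below is a minimal sketch of a more defensive lookup. The helper name `find_current_gw` is hypothetical, and falling back to the latest finished event is an assumption about the desired behaviour, not something the notebook specifies.

```python
def find_current_gw(events):
    """Return the id of the current gameweek, or a fallback.

    `events` is the list stored under bootstrap["events"]. Falling back to
    the most recent finished event is an assumption made for this sketch.
    """
    for event in events:
        if event.get("is_current"):
            return event["id"]
    finished = [e["id"] for e in events if e.get("finished")]
    return max(finished) if finished else None


# Synthetic events shaped like the bootstrap payload (illustrative values only)
sample_events = [
    {"id": 1, "is_current": False, "finished": True},
    {"id": 2, "is_current": True, "finished": False},
    {"id": 3, "is_current": False, "finished": False},
]
print(find_current_gw(sample_events))
```

Returning `None` rather than raising lets the caller decide whether to stop the report or default to gameweek 1.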

Form helper functions

In [94]:
"""Return formation like "3-5-2" based on starters only."""
def get_formation(picks):
    formation = {"DEF": 0, "MID": 0, "FWD": 0}

    for pick in picks["picks"]:
        if pick["position"] <= 11:  # starters only - to counter bboost
            if pick["element_type"] == 2:  # DEF
                formation["DEF"] += 1
            elif pick["element_type"] == 3:  # MID
                formation["MID"] += 1
            elif pick["element_type"] == 4:  # FWD
                formation["FWD"] += 1

    return f"{formation['DEF']}-{formation['MID']}-{formation['FWD']}"

def get_full_league_standings_and_name(league_id: int):
    standings = []
    page = 1
    while True:
        url = f"{BASE_URL}/leagues-classic/{league_id}/standings/?page_standings={page}"
        resp = requests.get(url).json()

        results = resp["standings"]["results"]
        standings.extend(results)

        if resp["standings"]["has_next"]:
            page += 1
        else:
            break

    league_name = resp["league"]["name"]

    return standings, league_name
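To make the starter filter in `get_formation` concrete, here is a small offline sketch on a synthetic picks payload. `formation_from_picks` and `POSITION_LABELS` are illustrative names, and the payload contains only the two keys the helper reads; real responses carry more fields.

```python
from collections import Counter

# element_type codes as used in get_formation above (1 = goalkeeper is ignored)
POSITION_LABELS = {2: "DEF", 3: "MID", 4: "FWD"}


def formation_from_picks(picks):
    """Count starters (position 1-11) per outfield line, e.g. "3-5-2"."""
    counts = Counter(
        POSITION_LABELS[p["element_type"]]
        for p in picks["picks"]
        if p["position"] <= 11 and p["element_type"] in POSITION_LABELS
    )
    return f"{counts['DEF']}-{counts['MID']}-{counts['FWD']}"


# Synthetic squad: 1 GK, 3 DEF, 5 MID, 2 FWD starting, then 4 bench players
element_types = [1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 1, 2, 3, 4]
sample = {
    "picks": [
        {"position": i, "element_type": t}
        for i, t in enumerate(element_types, start=1)
    ]
}
print(formation_from_picks(sample))  # 3-5-2
```

The four bench entries (positions 12 to 15) are skipped, so the result stays a valid 10-outfield-player formation even when a Bench Boost chip gives bench players a non-zero multiplier.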

In [95]:
league, league_name = get_full_league_standings_and_name(league_id)
entries = [team["entry"] for team in league]
gw = current_gw

all_picks = []
for entry_id in entries:
    picks = requests.get(f"{BASE_URL}/entry/{entry_id}/event/{gw}/picks/").json()
    all_picks.append(picks)
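The standings helper called above keeps requesting pages until `has_next` is false. A sketch of the same loop with the HTTP call injected as a function, so the paging logic can be exercised without the network; `collect_standings`, `fetch_page` and `FAKE_PAGES` are hypothetical names used only for this illustration.

```python
def collect_standings(fetch_page):
    """Gather results from every standings page.

    `fetch_page(page)` must return a dict shaped like the standings response:
    {"league": {...}, "standings": {"results": [...], "has_next": bool}}.
    """
    results, page = [], 1
    while True:
        resp = fetch_page(page)
        results.extend(resp["standings"]["results"])
        if not resp["standings"]["has_next"]:
            return results, resp["league"]["name"]
        page += 1


# Fake two-page league used only to exercise the loop offline
FAKE_PAGES = {
    1: {"league": {"name": "Demo"},
        "standings": {"results": [{"entry": 1}, {"entry": 2}], "has_next": True}},
    2: {"league": {"name": "Demo"},
        "standings": {"results": [{"entry": 3}], "has_next": False}},
}
rows, name = collect_standings(FAKE_PAGES.__getitem__)
print([r["entry"] for r in rows], name)
```

In the notebook, `fetch_page` would wrap the `requests.get(...).json()` call for the `leagues-classic` URL.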

Form the footballers table

In [96]:
picks_count = defaultdict(int)
captain_count = defaultdict(int)

for team in all_picks:
    active_picks = {p["element"]: p for p in team["picks"]}

    # Determine who was captain + vice
    cap_id = next(p["element"] for p in team["picks"] if p["is_captain"])
    vice_id = next(p["element"] for p in team["picks"] if p["is_vice_captain"])

    # Fetch autosubs info (to check if captain missed out)
    autosubs = team.get("automatic_subs", [])

    # Count all picks
    for p in active_picks.values():
        picks_count[p["element"]] += 1

    # If captain is autosubbed out, vice takes over
    cap_out = any(s["element_out"] == cap_id for s in autosubs)
    if cap_out:
        vice_out = any(s["element_out"] == vice_id for s in autosubs)
        if vice_out:  # Both captain and vice-captain were subbed out: count neither
            pass
        else:
            captain_count[vice_id] += 1
    else:
        captain_count[cap_id] += 1

# Build the footballers DataFrame
footballers_data = []
for p in bootstrap["elements"]:
    footballers_data.append({
        "Footballer ID": p["id"],
        "Footballer name": p["web_name"],
        "Total points": p["total_points"],
        "GW points": p["event_points"],
        "Real team name": teams_lookup[p["team"]],
        "Real team ID": p["team"],
        "Price (in Millions £)": p["now_cost"] / 10,
        "Price last GW (in Millions £)": p["cost_change_event"] / 10 + (p["now_cost"] / 10),
        "Price difference (in Millions £)": p["cost_change_event"] / 10,
        "Times chosen in squad": picks_count[p["id"]],
        "Times captained": captain_count[p["id"]]
    })

df_footballers = pd.DataFrame(footballers_data)
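The captain and vice-captain rule used in the counting loop above can be isolated into one small function, which makes the three cases easy to check. This is a sketch: `effective_captain` is a hypothetical name, and the element ids in the sample are illustrative.

```python
def effective_captain(picks, autosubs):
    """Return the element id that counts as captain, or None.

    Mirrors the rule used above: the vice-captain takes over only when the
    captain was auto-subbed out; if both were subbed out, nobody is counted.
    """
    cap_id = next(p["element"] for p in picks if p["is_captain"])
    vice_id = next(p["element"] for p in picks if p["is_vice_captain"])
    subbed_out = {s["element_out"] for s in autosubs}
    if cap_id not in subbed_out:
        return cap_id
    return None if vice_id in subbed_out else vice_id


squad = [
    {"element": 235, "is_captain": True, "is_vice_captain": False},
    {"element": 249, "is_captain": False, "is_vice_captain": True},
    {"element": 525, "is_captain": False, "is_vice_captain": False},
]
print(effective_captain(squad, []))
print(effective_captain(squad, [{"element_out": 235, "element_in": 53}]))
print(effective_captain(squad, [{"element_out": 235, "element_in": 53},
                                {"element_out": 249, "element_in": 38}]))
```

The three calls cover the captain playing, the captain being subbed out, and both being subbed out.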

Final footballers dataframe

In [97]:
df_footballers.iloc[220:270]

| | Footballer ID | Footballer name | Total points | GW points | Real team name | Real team ID | Price (in Millions £) | Price last GW (in Millions £) | Price difference (in Millions £) | Times chosen in squad | Times captained |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 220 | 170 | Milner | 1 | 0 | Brighton | 6 | 5.0 | 5.0 | 0.0 | 0 | 0 |
| 221 | 171 | Sarmiento | 0 | 0 | Brighton | 6 | 5.0 | 5.0 | 0.0 | 0 | 0 |
| 222 | 172 | Watson | 0 | 0 | Brighton | 6 | 5.0 | 5.0 | 0.0 | 0 | 0 |
| 223 | 173 | Wieffer | 7 | 3 | Brighton | 6 | 5.0 | 5.0 | 0.0 | 0 | 0 |
| 224 | 174 | Yalcouye | 0 | 0 | Brighton | 6 | 5.0 | 5.0 | 0.0 | 0 | 0 |
| 225 | 175 | Knight | 0 | 0 | Brighton | 6 | 4.5 | 4.5 | 0.0 | 0 | 0 |
| 226 | 176 | Mazilu | 0 | 0 | Brighton | 6 | 4.5 | 4.5 | 0.0 | 0 | 0 |
| 227 | 177 | Moran | 0 | 0 | Brighton | 6 | 4.5 | 4.5 | 0.0 | 0 | 0 |
| 228 | 178 | Welbeck | 1 | 0 | Brighton | 6 | 6.4 | 6.3 | -0.1 | 0 | 0 |
| 229 | 179 | Ferguson | 0 | 0 | Brighton | 6 | 5.5 | 5.5 | 0.0 | 0 | 0 |


Form the teams table

## 📊 League Table & Entries

In [98]:
def get_player_points(player_id, gw=current_gw):
    """Return the points a player scored in gameweek `gw` (0 if they had no fixture)."""
    url = f"{BASE_URL}/element-summary/{player_id}/"
    data = requests.get(url).json()
    # Sum over fixtures so that a double gameweek is counted in full
    return sum(match["total_points"] for match in data["history"] if match["round"] == gw)
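Each call to `get_player_points` issues one HTTP request, and the teams loop further down calls it for every pick in every gameweek. One possible way to reduce that is to fetch each player's history once and reuse it. The sketch below assumes a hypothetical `fetch_history(player_id)` callable that returns the `history` list of the element-summary response; it is injected so the example runs offline.

```python
from functools import lru_cache


def make_points_lookup(fetch_history):
    """Build a points lookup that fetches each player's history at most once."""

    @lru_cache(maxsize=None)
    def round_totals(player_id):
        totals = {}
        for match in fetch_history(player_id):
            # Add up fixtures so two matches in one round are both counted
            totals[match["round"]] = totals.get(match["round"], 0) + match["total_points"]
        return totals

    def points(player_id, gw):
        return round_totals(player_id).get(gw, 0)

    return points


calls = []


def fake_history(player_id):
    """Stand-in for the network call; records how often it is invoked."""
    calls.append(player_id)
    return [
        {"round": 1, "total_points": 2},
        {"round": 2, "total_points": 6},
        {"round": 2, "total_points": 3},
    ]


points = make_points_lookup(fake_history)
print(points(82, 1), points(82, 2), points(82, 3), len(calls))
```

Three lookups for the same player trigger a single fetch, and a round with no fixture yields 0.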

Start rough work -----------

Captaincy auto-switch testing
(Should be fixed once the gameweek fully ends tomorrow)

In [None]:
picks = requests.get(f"{BASE_URL}/entry/{4540331}/event/{2}/picks/").json()

for p in picks["picks"]:
    print(p["element"], players[p["element"]]["web_name"], get_player_points(p["element"])) 

print("VC Joao Pedro points not double counted despite Captain Palmer not playing")

220 Sánchez 3
8 J.Timber 24
370 Frimpong 0
568 Pedro Porro 5
235 Palmer 0
582 Kudus 3
427 Reijnders 2
382 Wirtz 0
249 João Pedro 15
525 Wood 2
64 Watkins 2
470 Dúbravka 6
38 Konsa 0
53 Malen 1
610 Wan-Bissaka 0
VC Joao Pedro points not double counted despite Captain Palmer not playing


In [121]:
picks = requests.get(f"{BASE_URL}/entry/{4013674}/event/{2}/picks/").json()
for p in picks["picks"]:
    print(p["element"], players[p["element"]]["web_name"], get_player_points(p["element"])) 

print("Semenyo points doubled for being captain")

287 Pickford 15
474 Schär 0
5 Gabriel 6
291 Tarkowski 8
507 Aina 1
450 Cunha 2
515 Gibbs-White 2
381 M.Salah 0
82 Semenyo 6
283 Mateta 2
525 Wood 2
139 Verbruggen 1
299 Ndiaye 8
43 Alex Moreno 0
181 Kostoulas 0
Semenyo points doubled for being captain


End rough work --------------

In [100]:
# Build the per-entry teams table: manager info, picks, transfers and contribution metrics

teams_data = []
all_picks = []
captaincy_effectiveness_ratio = {}
all_top_3_contributors_ids = {}

for entry_id in entries:
    entry_info = requests.get(f"{BASE_URL}/entry/{entry_id}/").json()
    player_name = f"{entry_info['player_first_name']} {entry_info['player_last_name']}"
    favourite_team = teams_lookup[entry_info.get("favourite_team")] if entry_info.get("favourite_team") else None
    picks = requests.get(f"{BASE_URL}/entry/{entry_id}/event/{gw}/picks/").json()
    all_picks.append(picks)
    transfers = requests.get(f"{BASE_URL}/entry/{entry_id}/transfers/").json()

    # Transfers of recent GW
    gw_transfers = [
        (players[t["element_in"]]["web_name"],
         players[t["element_out"]]["web_name"],
         players[t["element_in"]]["event_points"] - players[t["element_out"]]["event_points"])
        for t in transfers if t["event"] == gw
    ]

    # Formation
    formation = get_formation(picks)

    # Captain & VC
    captain_id = next(p["element"] for p in picks["picks"] if p["is_captain"])
    captain_name = players[captain_id]["web_name"]
    captain_times_chosen = df_footballers[df_footballers["Footballer ID"] == captain_id]["Times captained"].iloc[0]
    captain_tuple = (captain_name, captain_times_chosen)
    vice_captain = players[next(p["element"] for p in picks["picks"] if p["is_vice_captain"])]["web_name"]

    # Players with a non-zero multiplier count towards the score; the rest are on the bench
    playing_XI = [
        (players[p["element"]]["web_name"], p["element"])
        for p in picks["picks"] if p["multiplier"] != 0
    ]
    bench = [
        (players[p["element"]]["web_name"], p["element"])
        for p in picks["picks"] if p["multiplier"] == 0
    ]
    playing_XI_times_chosen = {}
    for name, pid in playing_XI:
        times_chosen = int(df_footballers[df_footballers["Footballer ID"] == pid]["Times chosen in squad"].iloc[0])
        playing_XI_times_chosen[pid] = times_chosen
    
    playing_XI_frequency = [(players[pid]["web_name"], frequency) for pid, frequency in playing_XI_times_chosen.items()]

    transfers_hits_this_gw = 0
    if picks["entry_history"]["event"] == gw:
        transfers_hits_this_gw = picks["entry_history"]["event_transfers_cost"]

    contributions = defaultdict(int)
    total_hits = 0
    total_captain_points = 0
    total_best_points = 0
    for gameweek in range(1, gw + 1):
        url = f"https://fantasy.premierleague.com/api/entry/{entry_id}/event/{gameweek}/picks/"
        data = requests.get(url).json()
        all_players_gw_points = []
        captain_points = 0

        if "picks" not in data:
            continue

        total_hits += data["entry_history"]["event_transfers_cost"]

        for p in data["picks"]:
            player_id = p["element"]
            player_points = get_player_points(player_id, gameweek)  # fetch once per pick
            # Captaincy is not rewarded in this metric: treat a x2 multiplier as x1
            multiplier = 1 if p["multiplier"] == 2 else p["multiplier"]
            contributions[player_id] += player_points * multiplier

            all_players_gw_points.append(player_points)
            if p.get("is_captain"):
                captain_points = player_points  # raw points before multiplying

        best_points = max(all_players_gw_points) if all_players_gw_points else 0

        total_captain_points += captain_points
        total_best_points += best_points

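    # Top 3 contributors for the season so far, computed once after the gameweek loop
    top3 = sorted(contributions.items(), key=lambda x: x[1], reverse=True)[:3]

    # Share of gross points (net total plus hits added back) owed to each top contributor
    gross_points = entry_info["summary_overall_points"] + total_hits
    top3_percentage_dict = {}
    for pid, pts in top3:
        top3_percentage_dict[players[pid]["web_name"]] = round(100 * pts / gross_points, 2) if gross_points else 0
    top3_percentage = list(top3_percentage_dict.items())
    all_top_3_contributors_ids[entry_id] = top3_percentage

    top_3_contributors = [(players[pid]["web_name"], pts) for pid, pts in top3]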

    if total_best_points == 0:
        ratio = 0
    else:
        ratio = round(total_captain_points/total_best_points, 3)
    captaincy_effectiveness_ratio[entry_id] = ratio


    teams_data.append({
        "Entry ID": entry_id,
        "Player name": player_name,
        "Team name": entry_info["name"],
        "Favourite team": favourite_team,
        "Total points": entry_info["summary_overall_points"],
        "GW points": entry_info["summary_event_points"],
        "Transfers (In, Out, Points gained)": gw_transfers,
        "Transfer hits this GW": transfers_hits_this_gw,
        "Transfer hits overall": total_hits,
        "Formation": formation,
        "Captain with times captained this GW": captain_tuple,
        "Vice Captain": vice_captain,
        "Playing XI with ID": playing_XI,
        "Playing XI with times chosen this GW": playing_XI_frequency,
        "Bench with ID": bench,
        "Chips used": picks.get("active_chip"),
        "Top 3 contributors overall with points": top_3_contributors,
        "Percentage contribution overall": top3_percentage,
        "Captaincy effectiveness ratio overall": ratio
    })

df_teams = pd.DataFrame(teams_data)
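The captaincy effectiveness ratio built in the loop above divides the captain's raw points, summed over gameweeks, by the best raw score in the squad, summed over the same gameweeks. A worked sketch with made-up numbers (the function name and the sample values are illustrative, not league data):

```python
def captaincy_effectiveness(gameweeks):
    """Ratio of captain points to best-in-squad points across gameweeks.

    Each item is (captain_raw_points, [raw points of every squad member]).
    """
    captain_total = sum(cap for cap, _ in gameweeks)
    best_total = sum(max(squad) for _, squad in gameweeks if squad)
    return round(captain_total / best_total, 3) if best_total else 0


# Illustrative numbers only
history = [
    (6, [6, 15, 2, 8]),   # captain scored 6, best squad member scored 15
    (12, [12, 3, 9, 1]),  # captain was also the top scorer
]
print(captaincy_effectiveness(history))  # (6 + 12) / (15 + 12) = 0.667
```

A ratio of 1.0 means the captain was the squad's top scorer every week; values near 0 mean the armband was consistently on a low scorer.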


Add ranking history data to the dataframe

## 📈 Rank History & Movements

In [101]:
def relative_ranks(overall_ranks_dict):
    # Find max number of gameweeks present
    max_gw = max(len(v) for v in overall_ranks_dict.values())
    
    # Initialize result dict with empty lists
    result = {entry_id: [] for entry_id in overall_ranks_dict}
    
    # Process each GW
    for gw in range(max_gw):
        # Collect (entry_id, overall_rank) for this GW
        gw_ranks = {
            entry_id: ranks[gw]
            for entry_id, ranks in overall_ranks_dict.items()
            if gw < len(ranks)  # some teams might have fewer entries
        }
        
        # Sort by overall rank (lower is better)
        sorted_entries = sorted(gw_ranks.items(), key=lambda x: x[1])
        
        # Assign relative ranks (1,2,3,…)
        for rel_rank, (entry_id, _) in enumerate(sorted_entries, start=1):
            result[entry_id].append(rel_rank)
    
    return result
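For comparison, the same per-gameweek ranking can be sketched with `DataFrame.rank`. This is an alternative formulation rather than the notebook's method; `relative_ranks_pandas` is a hypothetical name, and `method="first"` is chosen so that ties are broken by dictionary order, which is how the stable sort above behaves.

```python
import pandas as pd


def relative_ranks_pandas(overall_ranks_dict):
    """Rank entries against each other per gameweek (1 = best overall rank)."""
    # Columns are entries, rows are gameweeks; shorter histories are padded with NaN
    frame = pd.DataFrame(
        {k: pd.Series(v, dtype="float64") for k, v in overall_ranks_dict.items()}
    )
    ranked = frame.rank(axis=1, method="first")
    return {k: ranked[k].dropna().astype(int).tolist() for k in ranked.columns}


# Entry 303 has only one gameweek of history
sample = {101: [100, 50], 202: [80, 60], 303: [90]}
print(relative_ranks_pandas(sample))
```

In gameweek 1 entry 202 has the best overall rank (80), so it is placed first in the league; entry 303 is simply left out of gameweek 2.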

In [102]:
rank_history = {}
for entry_id in entries:
    url = f"{BASE_URL}/entry/{entry_id}/history/"
    resp = requests.get(url).json()
    history = resp.get("current", [])
    rank_history[entry_id] = [gw["overall_rank"] for gw in history]

relative_ranked_teams = relative_ranks(rank_history)

df_rank_history = pd.DataFrame(list(relative_ranked_teams.items()), columns=["Entry ID", "Rankings history"])

# Merge with your main df
df_teams = df_teams.merge(df_rank_history, on="Entry ID", how="left")

Add adjacent teams points difference

In [103]:
diff_col = df_teams["Total points"].diff().fillna(0).astype(int)

# Insert the diff column immediately after "Total points"
col_position = df_teams.columns.get_loc("Total points") + 1
df_teams.insert(col_position, "Adjacent points difference", diff_col)
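A short illustration of what `diff()` produces here, using the first four totals from the teams table. Note that the result depends on row order, so it is only meaningful while the DataFrame is in standings order.

```python
import pandas as pd

# First four "Total points" values from the teams table, in standings order
totals = pd.Series([127, 118, 107, 106], name="Total points")

# Each value is the gap to the team directly above; the leader has no one above
gaps = totals.diff().fillna(0).astype(int)
print(gaps.tolist())  # [0, -9, -11, -1]
```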


Final teams DataFrame

In [104]:
df_teams


Unnamed: 0,Entry ID,Player name,Team name,Favourite team,Total points,Adjacent points difference,GW points,"Transfers (In, Out, Points gained)",Transfer hits this GW,Transfer hits overall,...,Captain with times captained this GW,Vice Captain,Playing XI with ID,Playing XI with times chosen this GW,Bench with ID,Chips used,Top 3 contributors overall with points,Percentage contribution overall,Captaincy effectiveness ratio overall,Rankings history
0,4013674,Kait Wojtaszek,K8 the Gr8,Arsenal,127,0,50,"[(Mateta, Bowen, 0)]",0,0,...,"(Semenyo, 1)",Wood,"[(Pickford, 287), (Schär, 474), (Gabriel, 5), ...","[(Pickford, 3), (Schär, 1), (Gabriel, 3), (Tar...","[(Verbruggen, 139), (Ndiaye, 299), (Alex Moren...",,"[(Semenyo, 21), (Pickford, 17), (Wood, 15)]","[(Semenyo, 16.54), (Pickford, 13.39), (Wood, 1...",0.467,"[2, 1]"
1,3920569,Eric Zurita,Hustle & Flo,Liverpool,118,-9,40,"[(Reijnders, Sarr, -6)]",0,0,...,"(Haaland, 1)",Ekitiké,"[(Raya, 1), (Andersen, 317), (Aït-Nouri, 402),...","[(Raya, 1), (Andersen, 3), (Aït-Nouri, 3), (Mu...","[(Areola, 600), (Baleba, 167), (Alex Moreno, 4...",,"[(João Pedro, 17), (Raya, 16), (Haaland, 15)]","[(João Pedro, 14.41), (Raya, 13.56), (Haaland,...",0.536,"[1, 2]"
2,4540331,Tim Woodhouse,All Change,Spurs,107,-11,56,"[(Reijnders, Marmoush, 1)]",0,0,...,"(Palmer, 3)",João Pedro,"[(Sánchez, 220), (J.Timber, 8), (Frimpong, 370...","[(Sánchez, 5), (J.Timber, 2), (Frimpong, 1), (...","[(Dúbravka, 470), (Konsa, 38), (Malen, 53), (W...",,"[(J.Timber, 24), (João Pedro, 17), (Wood, 15)]","[(J.Timber, 22.43), (João Pedro, 15.89), (Wood...",0.081,"[6, 3]"
3,4140344,Gerd Woort-Menker,IwobiKlose,,106,-1,46,"[(Aït-Nouri, Frimpong, 1)]",0,0,...,"(Palmer, 3)",Wirtz,"[(Dúbravka, 470), (Aït-Nouri, 402), (Reinildo,...","[(Dúbravka, 9), (Aït-Nouri, 3), (Reinildo, 1),...","[(Verbruggen, 139), (Murillo, 506), (Piroe, 36...",,"[(João Pedro, 17), (Wood, 15), (Kudus, 13)]","[(João Pedro, 16.04), (Wood, 14.15), (Kudus, 1...",0.286,"[3, 4]"
4,5823307,Jasmine Benzine,Arroz Blanco FC,,103,-3,60,[],0,0,...,"(Palmer, 3)",Pickford,"[(Pickford, 287), (Muñoz, 256), (Diouf, 603), ...","[(Pickford, 3), (Muñoz, 2), (Diouf, 1), (Gabri...","[(Dúbravka, 470), (Zubimendi, 26), (Dorgu, 441...",,"[(Pickford, 17), (João Pedro, 17), (Gabriel, 12)]","[(Pickford, 16.5), (João Pedro, 16.5), (Gabrie...",0.115,"[11, 5]"
5,5304099,D'Arcy Williams,TheatreOfMemes,Man Utd,96,-7,41,"[(Ballard, Frimpong, 1)]",0,0,...,"(M.Salah, 4)",Palmer,"[(Sánchez, 220), (Van de Ven, 575), (Wan-Bissa...","[(Sánchez, 5), (Van de Ven, 2), (Wan-Bissaka, ...","[(Dúbravka, 470), (Konsa, 38), (A.Ramsey, 211)...",,"[(João Pedro, 17), (Van de Ven, 12), (Reijnder...","[(João Pedro, 17.71), (Van de Ven, 12.5), (Rei...",0.32,"[4, 6]"
6,7888775,Kamal Logue,Saka Potatoes,Chelsea,89,-7,36,"[(Mateta, Isak, 2)]",0,0,...,"(Ekitiké, 1)",Mateta,"[(Pickford, 287), (James, 225), (Van de Ven, 5...","[(Pickford, 3), (James, 1), (Van de Ven, 2), (...","[(Dúbravka, 470), (Romero, 569), (Bruun Larsen...",,"[(Pickford, 17), (Kudus, 13), (Van de Ven, 12)]","[(Pickford, 19.1), (Kudus, 14.61), (Van de Ven...",0.038,"[5, 7]"
7,9114697,Christopher Kelly Jr,Elland Roadie,Man Utd,85,-4,41,[],0,0,...,"(Bowen, 1)",Aït-Nouri,"[(Sels, 502), (Cucurella, 224), (Kerkez, 371),...","[(Sels, 1), (Cucurella, 1), (Kerkez, 1), (Pedr...","[(Sánchez, 220), (J.Timber, 8), (Cunha, 450), ...",,"[(Johnson, 17), (Cucurella, 13), (Gyökeres, 13)]","[(Johnson, 20.0), (Cucurella, 15.29), (Gyökere...",0.121,"[10, 8]"
8,7547590,John Deutsch,Deutschmeister,Arsenal,83,-2,35,[],0,0,...,"(M.Salah, 4)",Watkins,"[(Sánchez, 220), (Murillo, 506), (Estève, 191)...","[(Sánchez, 5), (Murillo, 6), (Estève, 3), (And...","[(Dúbravka, 470), (Pedro Porro, 568), (Dorgu, ...",,"[(João Pedro, 17), (Sánchez, 11), (M.Salah, 8)]","[(João Pedro, 20.48), (Sánchez, 13.25), (M.Sal...",0.348,"[9, 9]"
9,4993729,Rick Thorley,Gameofthrowins,,82,-1,33,[],0,0,...,"(M.Salah, 4)",Palmer,"[(Sánchez, 220), (Virgil, 373), (Murillo, 506)...","[(Sánchez, 5), (Virgil, 2), (Murillo, 6), (Tar...","[(Dúbravka, 470), (Wan-Bissaka, 610), (Marc Gu...",,"[(João Pedro, 17), (Tarkowski, 12), (Sánchez, ...","[(João Pedro, 20.73), (Tarkowski, 14.63), (Sán...",0.348,"[7, 10]"


## Metrics ##

Helper functions

In [122]:
live = requests.get(f"https://fantasy.premierleague.com/api/event/{current_gw}/live/").json()
points = [p["stats"]["total_points"] for p in live["elements"] if p["stats"]["minutes"] > 0]
avg_points = sum(points) / len(points)

In [123]:
# Utility: return the index labels of all rows tied for the max/min (optionally by absolute value)
def all_extremes(series, metric="max", use_abs=False):
    values = series.abs() if use_abs else series

    if metric == "max":
        extreme_value = values.max()
    elif metric == "min":
        extreme_value = values.min()
    else:
        raise ValueError("metric must be 'max' or 'min'")

    return series[values == extreme_value].index.tolist()

def names_from_indices(df, indices, column="Team name"):
    return [df.iloc[i][column] for i in indices if i < len(df)]

def value_from_first_index(df, indices, column):
    if indices:
        return df.iloc[indices[0]][column]
    return None
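The helpers above return index labels and then read rows with `iloc`, which lines up only while the DataFrame keeps its default 0..n-1 index. A sketch of a mask-based variant that keeps ties and works with any index; `names_at_extreme` is a hypothetical helper, not part of the notebook.

```python
import pandas as pd


def names_at_extreme(df, value_col, name_col, metric="max"):
    """Return every name tied for the max (or min) of `value_col`."""
    if metric not in ("max", "min"):
        raise ValueError("metric must be 'max' or 'min'")
    extreme = getattr(df[value_col], metric)()
    return df.loc[df[value_col] == extreme, name_col].tolist()


# Non-default index on purpose: the boolean mask does not depend on row labels
demo = pd.DataFrame(
    {"Team name": ["A", "B", "C"], "GW points": [60, 21, 60]},
    index=[10, 20, 30],
)
print(names_at_extreme(demo, "GW points", "Team name"))
print(names_at_extreme(demo, "GW points", "Team name", "min"))
```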

Logic

In [181]:
rule_based_metrics = []

# 1 League topper
idx_league_topper = all_extremes(df_teams["Total points"], "max")
all_league_toppers = names_from_indices(df_teams, idx_league_topper, "Team name")
rule_based_metrics.append(f"{all_league_toppers} topped the league")

# 2 League bottom
idx_league_bottom = all_extremes(df_teams["Total points"], "min")
all_league_bottoms = names_from_indices(df_teams, idx_league_bottom, "Team name")
rule_based_metrics.append(f"{all_league_bottoms} are at the bottom of the league")

# 4 Max change in GW points (absolute)
idx_max_change = all_extremes(df_teams["GW points"], "max", use_abs=True)
all_teams_max_change = names_from_indices(df_teams, idx_max_change, "Team name")
value_max_change = value_from_first_index(df_teams, idx_max_change, "GW points")

if value_max_change is not None:
    for idx in idx_max_change:
        # Report each tied team with its own figures
        team_name = df_teams.iloc[idx]["Team name"]
        total_points = int(df_teams.iloc[idx]["Total points"])
        gw_points = int(df_teams.iloc[idx]["GW points"])
        prev_points = total_points - gw_points
        max_change_in_points = (
            f"{[team_name]} showed the maximum change of points over the last week. "
            f"Their change: {gw_points} points from {prev_points} to {total_points}."
        )
        rule_based_metrics.append(max_change_in_points)

# 5 Most points gained
idx_most_points_gained = all_extremes(df_teams["GW points"], "max")
all_teams_most_points_gained = names_from_indices(df_teams, idx_most_points_gained, "Team name")
value_most_points_gained = value_from_first_index(df_teams, idx_most_points_gained, "GW points")
if value_most_points_gained is not None:
    rule_based_metrics.append(
        f"{all_teams_most_points_gained} gained the most points ({value_most_points_gained}) this Gameweek."
    )

# 6 Least points gained
idx_least_points_gained = all_extremes(df_teams["GW points"], "min")
all_teams_least_points_gained = names_from_indices(df_teams, idx_least_points_gained, "Team name")
value_least_points_gained = value_from_first_index(df_teams, idx_least_points_gained, "GW points")
if value_least_points_gained is not None:
    rule_based_metrics.append(
        f"{all_teams_least_points_gained} gained the least points ({value_least_points_gained}) this Gameweek."
    )

# 7 Highest scoring footballer
idx_highest_scoring_footballer = all_extremes(df_footballers["GW points"], "max")
all_highest_scoring_footballers = names_from_indices(df_footballers, idx_highest_scoring_footballer, "Footballer name")
value_highest_scoring_footballers = value_from_first_index(df_footballers, idx_highest_scoring_footballer, "GW points")
if value_highest_scoring_footballers is not None:
    rule_based_metrics.append(
        f"{all_highest_scoring_footballers} gained the most points ({value_highest_scoring_footballers}) this Gameweek."
    )

# 8 Lowest scoring footballer
idx_lowest_scoring_footballer = all_extremes(df_footballers["GW points"], "min")
all_lowest_scoring_footballers = names_from_indices(df_footballers, idx_lowest_scoring_footballer, "Footballer name")
value_lowest_scoring_footballers = value_from_first_index(df_footballers, idx_lowest_scoring_footballer, "GW points")
if value_lowest_scoring_footballers is not None:
    rule_based_metrics.append(
        f"{all_lowest_scoring_footballers} gained the least points ({value_lowest_scoring_footballers}) this Gameweek."
    )

# 9 Highest scoring real team
team_scores = {
    team_id: df_footballers.loc[df_footballers["Real team ID"] == team_id, "GW points"].sum()
    for team_id in teams_lookup  # every real team id from bootstrap, not a hardcoded 1-20
}
if team_scores:
    max_score = max(team_scores.values())
    max_teams = [tid for tid, score in team_scores.items() if score == max_score]
    highest_scoring_real_teams_list = [teams_lookup[tid] for tid in max_teams]
    rule_based_metrics.append(f"{highest_scoring_real_teams_list} scored the most points ({max_score}).")

# 10 Lowest scoring real team
if team_scores:
    min_score = min(team_scores.values())
    min_teams = [tid for tid, score in team_scores.items() if score == min_score]
    lowest_scoring_real_teams_list = [teams_lookup[tid] for tid in min_teams]
    rule_based_metrics.append(f"{lowest_scoring_real_teams_list} scored the least points ({min_score}).")

# 11-12 Best/worst transfers
max_value, min_value = float("-inf"), float("inf")
max_transfers, min_transfers = [], []

for idx, row in df_teams.iterrows():
    for tup in row.get("Transfers (In, Out, Points gained)", []):
        x = tup[2]
        if x > max_value:
            max_value, max_transfers = x, [(idx, tup)]
        elif x == max_value:
            max_transfers.append((idx, tup))
        if x < min_value:
            min_value, min_transfers = x, [(idx, tup)]
        elif x == min_value:
            min_transfers.append((idx, tup))

for idx, tup in max_transfers:
    team_name = df_teams.iloc[idx]["Team name"]
    rule_based_metrics.append(
        f"{team_name} got {tup[0]} in and removed {tup[1]} smartly and saw a change of {max_value} points."
    )

for idx, tup in min_transfers:
    team_name = df_teams.iloc[idx]["Team name"]
    rule_based_metrics.append(
        f"{team_name} got {tup[0]} in and removed {tup[1]} unwisely and saw a change of {min_value} points."
    )

# 13 Highest price increase
idx_highest_increased_price = all_extremes(df_footballers["Price difference (in Millions £)"], "max")
all_highest_price_increase_footballers = [
    df_footballers.iloc[i]["Footballer name"]
    for i in idx_highest_increased_price
    if df_footballers.iloc[i]["Price difference (in Millions £)"] > 0
]
value_highest_increased_price = value_from_first_index(df_footballers, idx_highest_increased_price, "Price difference (in Millions £)")
if all_highest_price_increase_footballers and value_highest_increased_price is not None:
    rule_based_metrics.append(
        f"{all_highest_price_increase_footballers} had the highest price increase (£{value_highest_increased_price}M) this Gameweek."
    )

# 14 Highest priced players vs points
most_expensive_players = df_footballers.sort_values("Price (in Millions £)", ascending=False).head(5)
price_vs_points = "Highest priced footballers \n"
for _, row in most_expensive_players.iterrows():
    if players.get(row["Footballer ID"], {}).get("minutes", 0) == 0:
        continue
    price_vs_points += f"{row['Footballer name']} valued at {row['Price (in Millions £)']} scored {row['GW points']} points. \n"
rule_based_metrics.append(price_vs_points.strip())

# 15 Most chosen players vs points
most_chosen_players = df_footballers.sort_values("Times chosen in squad", ascending=False).head(5)
chosen_vs_points = "Most picked footballers \n"
for _, row in most_chosen_players.iterrows():
    if players.get(row["Footballer ID"], {}).get("minutes", 0) == 0:
        continue
    chosen_vs_points += f"{row['Footballer name']} chosen {row['Times chosen in squad']} times scored {row['GW points']} points. \n"
rule_based_metrics.append(chosen_vs_points.strip())

# 16 Most captained players vs points
most_captained_players = df_footballers.sort_values("Times captained", ascending=False).head(5)
captained_vs_points = "Most times picked as captains \n"
for _, row in most_captained_players.iterrows():
    if players.get(row["Footballer ID"], {}).get("minutes", 0) == 0:
        continue
    captained_vs_points += f"{row['Footballer name']} chosen captain {row['Times captained']} times scored {row['GW points']} points. \n"
rule_based_metrics.append(captained_vs_points.strip())

# 18 Chips usage
chips_usage = []
for _, row in df_teams.iterrows():
    if row.get("Chips used"):
        chips_usage.append(f"{row['Team name']} used the chip(s): {row['Chips used']}")
if chips_usage:
    rule_based_metrics.append(" ".join(chips_usage))

# 19 Rare players (least selected in starting XI across league)
all_playing_times_played = {}

# Collect all players that actually played (multiplier > 0)
for pick in all_picks:
    for player in pick.get("picks", []):
        if player.get("multiplier", 0) != 0:
            row = df_footballers.loc[df_footballers["Footballer ID"] == player["element"], "Times chosen in squad"]
            if not row.empty:
                all_playing_times_played[player["element"]] = int(row.squeeze())

if all_playing_times_played:
    min_selected = min(all_playing_times_played.values())
    # Keep only the players whose selection count equals the minimum
    least_selected_players = {
        pid: count for pid, count in all_playing_times_played.items() if count == min_selected
    }

    # Build {player name: score}
    least_selected_players_vs_scores = {}
    for pid in least_selected_players:
        name = players[pid]["web_name"]
        score = players[pid].get("event_points", 0)
        least_selected_players_vs_scores[name] = score

    rare_players_score = (
        f"Here is/are the least selected player(s) who started {min_selected} times "
        f"and their score(s): {least_selected_players_vs_scores}."
    )
    rule_based_metrics.append(rare_players_score)
else:
    rule_based_metrics.append("No rare players found this Gameweek.")
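Metrics 9 and 10 above sum footballer points per real team with one filtered lookup per team id. The same aggregation can be sketched with `groupby`; this is an alternative, with `extreme_teams` as a hypothetical name and a toy DataFrame in place of `df_footballers`. One difference to keep in mind: `groupby` only returns teams that have at least one row, whereas the dictionary comprehension yields 0 for a team id with no footballers.

```python
import pandas as pd


def extreme_teams(df, team_col="Real team name", points_col="GW points"):
    """Return ((best_teams, best_score), (worst_teams, worst_score)), keeping ties."""
    scores = df.groupby(team_col)[points_col].sum()
    best, worst = scores.max(), scores.min()
    return (
        (scores[scores == best].index.tolist(), int(best)),
        (scores[scores == worst].index.tolist(), int(worst)),
    )


# Toy data with a tie for the lowest team score
demo = pd.DataFrame({
    "Real team name": ["Arsenal", "Arsenal", "Brighton", "Chelsea", "Chelsea"],
    "GW points": [24, 6, 3, 1, 2],
})
print(extreme_teams(demo))
```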

In [182]:
# 20 Top-scoring rare captains
all_playing_XI_players_ids = list(all_playing_times_played)
all_playing_XI_players_frequencies = list(all_playing_times_played.values())

# Map each footballer in any playing XI to the teams that fielded them
who_played_me = defaultdict(list)
for _, team in df_teams.iterrows():
    for _name, pid in team["Playing XI with ID"]:
        if pid in all_playing_times_played:
            who_played_me[pid].append(team["Team name"])

all_playing_XI_players_df = pd.DataFrame(who_played_me.items(), columns= ["Footballer ID", "Played in teams"])
all_playing_XI_players_df = df_footballers.merge(all_playing_XI_players_df, on="Footballer ID")

In [183]:
non_zero_captains = df_footballers[df_footballers["Times captained"] > 0]
rare_captains_cutoff = int(non_zero_captains["Times captained"].nsmallest(1).iloc[-1])
top_scoring_rare_captains_df = all_playing_XI_players_df[all_playing_XI_players_df["Times captained"] == rare_captains_cutoff].sort_values("GW points", ascending=False)
top_captain_cutoff = int(top_scoring_rare_captains_df["GW points"].nlargest(1).iloc[-1])
top_scoring_rare_captains_df = top_scoring_rare_captains_df[top_scoring_rare_captains_df["GW points"] == top_captain_cutoff][["Footballer name", "GW points", "Times captained", "Played in teams"]]
top_scoring_rare_captains_str = top_scoring_rare_captains_df.to_string(index=False)

rule_based_metrics.append(f"Here are the top scoring player(s) chosen as captain only {rare_captains_cutoff} time(s). The table shows which team(s) chose them: {top_scoring_rare_captains_str}")

In [None]:
#21 

| | Footballer name | GW points | Times captained | Played in teams |
|---|---|---|---|---|
| 3 | Saka | 6 | 1 | [Arroz Blanco FC, TheatreOfMemes, Saka Potatoe... |
| 11 | Semenyo | 6 | 1 | [K8 the Gr8] |


In [None]:
#22 Played common players

In [185]:
rule_based_metrics

["['K8 the Gr8'] topped the league",
 "['People on the pitch'] are at the bottom of the league",
 "['Arroz Blanco FC'] showed the maximum change of points over the last week. Their change: 60 points from 43 to 103.",
 "['Arroz Blanco FC'] gained the most points (60) this Gameweek.",
 "['People on the pitch'] gained the least points (21) this Gameweek.",
 "['J.Timber'] gained the most points (24) this Gameweek.",
 "['Toti'] gained the least points (-2) this Gameweek.",
 "['Arsenal'] scored the most points (94).",
 "['Liverpool', 'Newcastle'] scored the least points (0).",
 'Saka Potatoes got Mateta in and removed Isak smartly and saw a change of 2 points.',
 'People on the pitch got Dorgu in and removed Wan-Bissaka smartly and saw a change of 2 points.',
 'Hustle & Flo got Reijnders in and removed Sarr unwisely and saw a change of -6 points.',
 'Ha-Cunha Mateta got Reijnders in and removed Caicedo unwisely and saw a change of -6 points.',
 "['Gabriel', 'Saliba', 'Calafiori', 'Semenyo', 

Preprocess dataframe and metrics list to feed to LLM

In [129]:
rule_based_metrics_text = "\n".join(f"- {metric}" for metric in rule_based_metrics)
df_teams_text = df_teams.to_string(index=False)

## Report Generation ##

Extra statistics to display on the roundup

1. Top 3, Bottom 3

In [130]:
top_3_df = df_teams.sort_values("Total points", ascending=False).head(3)[["Player name", "Team name", "Total points"]]
top_3_str = top_3_df.to_string(index=False)
bottom_3_df = df_teams.sort_values("Total points", ascending=False).tail(3)[["Player name", "Team name", "Total points"]]
bottom_3_str = bottom_3_df.to_string(index=False)

2. Best/Worst transfer making teams

In [131]:
# Best and worst transfer-making teams (net score)
team_transfer_scores = {}

for idx, row in df_teams.iterrows():
    transfers = row.get("Transfers (In, Out, Points gained)", [])
    total_score = sum(tup[2] for tup in transfers) if transfers else 0
    team_transfer_scores[idx] = total_score

best_transfer_str, worst_transfer_str = "", ""

if team_transfer_scores:  # safety check
    max_score = max(team_transfer_scores.values())
    min_score = min(team_transfer_scores.values())

    best_teams = [df_teams.iloc[idx]["Team name"] for idx, score in team_transfer_scores.items() if score == max_score]
    worst_teams = [df_teams.iloc[idx]["Team name"] for idx, score in team_transfer_scores.items() if score == min_score]

    # Create the required strings
    best_transfer_str = f"{', '.join(best_teams)} : {max_score}"
    worst_transfer_str = f"{', '.join(worst_teams)} : {min_score}"

3. League Name, Gameweek Number

In [132]:
league_name_text = f"League name: {league_name}"
gw_text = f"Gameweek Number: {current_gw}"  # use current_gw rather than a leftover loop variable

4. Biggest rank riser/fallers

In [133]:
fallers = []
risers = []

for _, row in df_teams.iterrows():
    team = row["Team name"]
    ranks = row["Rankings history"]
    current_rank = ranks[-1]

    # --- FALLER check ---
    if len(ranks) > 1:
        best_rank = min(ranks[:-1])
        gw_best = ranks.index(best_rank) + 1
        fall = current_rank - best_rank
        if fall > 0:
            fallers.append({
                "team": team,
                "change": fall,
                "from_rank": best_rank,
                "from_gw": gw_best,
                "to_rank": current_rank,
                "to_gw": current_gw
            })

    # --- RISER check ---
    if len(ranks) > 1:
        worst_rank = max(ranks[:-1])
        gw_worst = ranks.index(worst_rank) + 1
        rise = worst_rank - current_rank
        if rise > 0:
            risers.append({
                "team": team,
                "change": rise,
                "from_rank": worst_rank,
                "from_gw": gw_worst,
                "to_rank": current_rank,
                "to_gw": current_gw
            })

# --- Handle ties: report every team that shares the largest change ---
# Both results are plain strings so they print cleanly in the final report.
if fallers:
    max_fall_change = max(f["change"] for f in fallers)
    fall_strs = "\n".join(
        f'Team {f["team"]} fell {f["change"]} places '
        f'from GW{f["from_gw"]} (rank: {f["from_rank"]}) '
        f'to GW{f["to_gw"]} (rank: {f["to_rank"]}).'
        for f in fallers if f["change"] == max_fall_change
    )
else:
    fall_strs = "No rank fallers so far."

if risers:
    max_rise_change = max(r["change"] for r in risers)
    rise_strs = "\n".join(
        f'Team {r["team"]} rose {r["change"]} places '
        f'from GW{r["from_gw"]} (rank: {r["from_rank"]}) '
        f'to GW{r["to_gw"]} (rank: {r["to_rank"]}).'
        for r in risers if r["change"] == max_rise_change
    )
else:
    rise_strs = "No rank risers so far."
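
To make the riser/faller rule concrete, here is a minimal sketch on a single rank history. It assumes, as the cell above does, that the list starts at gameweek 1 and that the last item is the current rank. The function name `biggest_moves` is hypothetical.

```python
def biggest_moves(ranks):
    """Return (fall, rise) for one team's rank history.

    fall = current rank minus the best (lowest) earlier rank, if positive.
    rise = worst (highest) earlier rank minus the current rank, if positive.
    Each is reported as (places, from_gameweek), or None when there is no such move.
    """
    if len(ranks) < 2:
        return None, None
    current, earlier = ranks[-1], ranks[:-1]
    best, worst = min(earlier), max(earlier)
    # list.index returns the first occurrence; +1 turns it into a gameweek number,
    # assuming the history starts at gameweek 1.
    fall = (current - best, earlier.index(best) + 1) if current > best else None
    rise = (worst - current, earlier.index(worst) + 1) if worst > current else None
    return fall, rise


print(biggest_moves([2, 4, 3]))  # ((1, 1), (1, 2))
```

A team can therefore appear as both a faller and a riser: the history `[2, 4, 3]` is one place below its GW1 best and one place above its GW2 worst.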

5. Unique picks and their scores

In [205]:
# IDs of the least-selected players, assumed to be in the same order as
# the name -> score mapping below
least_selected_players_indices = list(least_selected_players)

names = list(least_selected_players_vs_scores.keys())
points = list(least_selected_players_vs_scores.values())

# Build a DataFrame and keep the three highest-scoring rare picks
df_rare_players = pd.DataFrame({
    "Player ID": least_selected_players_indices,
    "Player": names,
    "Score": points
})

df_rare_players = df_rare_players.sort_values("Score", ascending=False).head(3)

# For each shortlisted player, collect the teams that fielded them
top_rare_indices = list(df_rare_players["Player ID"])
selected_by = defaultdict(list)
for index in top_rare_indices:
    for _, row in df_teams.iterrows():
        team = row["Playing XI with ID"]
        for player in team:
            if player[1] == index:
                selected_by[index].append(row["Team name"])

# Look up by ID so the list stays aligned with df_rare_players,
# even when a player appears in no playing XI
rare_player_team = [selected_by[index] for index in top_rare_indices]

df_rare_players_shortlisted = df_rare_players.drop(columns="Player ID")
df_rare_players_shortlisted["Selected by"] = rare_player_team
top_unique_picks = df_rare_players_shortlisted.to_string(index=False)

6. Captaincy effectiveness chart

In [135]:
df_ratios = pd.DataFrame(list(captaincy_effectiveness_ratio.items()), columns=["Entry ID", "Ratio"])

# Merge with df_teams to get team names
df_ratios = df_ratios.merge(df_teams[["Entry ID", "Team name"]], on="Entry ID", how="right")

# Sort by ratio
df_ratios_sorted = df_ratios.sort_values(by="Ratio", ascending=False)

top_cutoff = df_ratios_sorted["Ratio"].nlargest(3).iloc[-1]
top_captaincy_ratio_teams = df_ratios_sorted[df_ratios_sorted["Ratio"] >= top_cutoff].reset_index(drop=True)
top_captaincy_ratio_teams = top_captaincy_ratio_teams.drop(columns="Entry ID")
top_captaincy_str = top_captaincy_ratio_teams.to_string(index=False)

bottom_cutoff = df_ratios_sorted["Ratio"].nsmallest(3).iloc[-1]
bottom_captaincy_ratio_teams = df_ratios_sorted[df_ratios_sorted["Ratio"] <= bottom_cutoff].reset_index(drop=True)
bottom_captaincy_ratio_teams = bottom_captaincy_ratio_teams.drop(columns="Entry ID")
worst_captaincy_str = bottom_captaincy_ratio_teams.to_string(index=False)
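
The cutoff pattern used here (take the third-largest value, then keep every row at or above it) is what lets ties through. The sketch below restates that idea on a plain dictionary so the tie handling is easy to see. It is dependency-free on purpose and is not the notebook's pandas code; the function name and the sample ratios are hypothetical.

```python
def top_n_with_ties(scores, n=3):
    """Return (name, value) pairs whose value reaches the n-th largest value.

    Mirrors the idea of `nlargest(n).iloc[-1]` followed by a `>=` filter:
    anything tied with the n-th value is kept, so the result can exceed n rows.
    """
    if not scores:
        return []
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    cutoff = ranked[min(n, len(ranked)) - 1][1]
    return [(name, value) for name, value in ranked if value >= cutoff]


# Hypothetical ratios, for illustration only.
ratios = {"Team A": 1.0, "Team B": 0.8, "Team C": 0.5, "Team D": 0.5, "Team E": 0.2}
print(top_n_with_ties(ratios))
```

With these sample values the "top 3" contains four teams, because "Team C" and "Team D" share the third-best ratio.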

7. Teams with the highest and lowest single-player reliance

In [136]:
all_top_1_contribution_records = []
for entry_id, tuples in all_top_3_contributors_ids.items():
    team_name = df_teams[df_teams["Entry ID"] == entry_id]["Team name"].iloc[0]
    for player, score in tuples:
        all_top_1_contribution_records.append((team_name, player, score))
        break

# Convert to DataFrame
df_contribution_records = pd.DataFrame(all_top_1_contribution_records, columns=["Team name", "Highest reliance player name", "Reliance %"])

# Find the cutoff for top 3 scores (handles ties)
top_3_reliance_cutoff = df_contribution_records["Reliance %"].nlargest(3).iloc[-1]
bottom_3_reliance_cutoff = df_contribution_records["Reliance %"].nsmallest(3).iloc[-1]

top_3_reliance_df = df_contribution_records[df_contribution_records["Reliance %"] >= top_3_reliance_cutoff].sort_values(by="Reliance %", ascending=False)
top_3_reliance_str = top_3_reliance_df.to_string(index=False)
bottom_3_reliance_df = df_contribution_records[df_contribution_records["Reliance %"] <= bottom_3_reliance_cutoff].sort_values(by="Reliance %", ascending=True)
bottom_3_reliance_str = bottom_3_reliance_df.to_string(index=False)

8. Chips usage effectiveness

In [138]:
# Aggregate chip effectiveness per entry (sum of chip impacts).
# NOTE: players[...]["event_points"] comes from bootstrap-static and holds
# current-gameweek points only, so chips played in earlier gameweeks are
# scored with the current gameweek's points.
chips_scores = {entry_id: 0 for entry_id in entries}

for entry_id in entries:
    for gw in range(1, current_gw + 1):
        picks_url = f"{BASE_URL}/entry/{entry_id}/event/{gw}/picks/"
        picks = requests.get(picks_url).json()

        if "picks" not in picks:  # skip if data not available
            continue

        chip_used = picks.get("active_chip")  # only 1 chip can be used per gameweek in FPL
        if not chip_used:
            continue

        if chip_used == "3xc":  # Triple captain: one extra multiple of the captain's score
            captain_id = next(p["element"] for p in picks["picks"] if p["is_captain"])
            chips_scores[entry_id] += players[captain_id]["event_points"]

        elif chip_used == "bboost":  # Bench boost: the bench players' points are added
            bench = [p["element"] for p in picks["picks"] if p["position"] > 11]
            chips_scores[entry_id] += sum(players[eid]["event_points"] for eid in bench)

        elif chip_used in ("freehit", "wildcard"):
            # Both chips are scored the same way: compare the squad actually
            # fielded with what the previous gameweek's squad would have scored.
            actual_points = sum(
                players[p["element"]]["event_points"] * p["multiplier"]
                for p in picks["picks"]
            )

            prev_url = f"{BASE_URL}/entry/{entry_id}/event/{gw-1}/picks/"
            prev_picks = requests.get(prev_url).json()

            if "picks" in prev_picks:
                hypothetical_points = sum(
                    players[p["element"]]["event_points"] * p["multiplier"]
                    for p in prev_picks["picks"]
                )
                chips_scores[entry_id] += actual_points - hypothetical_points

# Convert results to DataFrame with team names
results = []
for entry_id, score in chips_scores.items():
    results.append({"Entry ID": entry_id, "Chips added score": score})

df_chips = pd.DataFrame(results)
df_chips = df_chips.merge(df_teams, on="Entry ID")
df_chips = df_chips[["Team name", "Chips added score"]]
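
The chip loop reads `event_points` from the bootstrap data, which describes the current gameweek only. One possible way to score a chip in the gameweek it was played is to look points up per gameweek. The sketch below assumes the FPL API exposes `/event/{gw}/live/` and that its response has an `elements` list whose items carry `id` and `stats.total_points`. That payload shape is an assumption and is not confirmed anywhere in this notebook. The helper names are hypothetical, and `urllib` is used only to keep the sketch standard-library-only.

```python
import json
from functools import lru_cache
from urllib.request import urlopen

BASE_URL = "https://fantasy.premierleague.com/api"


def points_by_player(live_payload):
    """Map player_id -> points scored in one gameweek.

    Assumes a payload shaped like
    {"elements": [{"id": 1, "stats": {"total_points": 5}}, ...]}.
    """
    return {
        element["id"]: element["stats"]["total_points"]
        for element in live_payload.get("elements", [])
    }


@lru_cache(maxsize=None)
def fetch_gameweek_points(gw):
    """Fetch and cache per-player points for gameweek `gw` (assumed endpoint)."""
    with urlopen(f"{BASE_URL}/event/{gw}/live/") as resp:
        return points_by_player(json.load(resp))


def squad_points(picks, gw_points):
    """Score a picks list using each pick's multiplier (0 for benched players)."""
    return sum(gw_points.get(p["element"], 0) * p["multiplier"] for p in picks)


# Offline example with hypothetical data: a captain (x2) and one benched player (x0).
sample_live = {"elements": [
    {"id": 10, "stats": {"total_points": 6}},
    {"id": 11, "stats": {"total_points": 2}},
]}
sample_picks = [{"element": 10, "multiplier": 2}, {"element": 11, "multiplier": 0}]
print(squad_points(sample_picks, points_by_player(sample_live)))  # 12
```

If the endpoint behaves as assumed, `squad_points(picks["picks"], fetch_gameweek_points(gw))` could stand in for the `players[...]["event_points"]` sums, and the cache keeps this to one request per gameweek.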

In [139]:
top_3_chips_cutoff = df_chips["Chips added score"].nlargest(3).iloc[-1]
bottom_3_chips_cutoff = df_chips["Chips added score"].nsmallest(3).iloc[-1]

top_3_chips_df = df_chips[df_chips["Chips added score"] >= top_3_chips_cutoff].sort_values(by="Chips added score", ascending=False)
top_3_chips_str = top_3_chips_df.to_string(index=False)
bottom_3_chips_df = df_chips[df_chips["Chips added score"] <= bottom_3_chips_cutoff].sort_values(by="Chips added score", ascending=True)
bottom_3_chips_str = bottom_3_chips_df.to_string(index=False)

LLM Call to generate summary

In [140]:
load_dotenv()

api_key = os.getenv("OPENAI_API_KEY")
client = OpenAI(api_key=api_key)

LLM prompt

In [None]:
prompt = f"""You are the witty but fair commissioner of an FPL league. Voice: playful, insightful, never mean-spirited.
Write about Gameweek {current_gw} in 500-700 words. You also need to have knowledge of the Premier League footballing world and make
remarks that the fans can relate to, as a part of your sense of humor.

Here are some highlight-metrics that I could find from my data: {rule_based_metrics_text}

Mention most of these stats in the summary, depending on how interesting each is. MUST mention table topper, last position holder, highest scoring footballer every time.
Examples:
Looking at the player prices vs their points, you can comment on their performance and what was expected from them. Similarly for most chosen players, and most captained players.
You could comment on how the least selected players performed, and make sure to mention which team(s) chose them and benefitted from it. Find the team line-ups in the teams dataframe given later.
Comment on how well someone chose their captain (one should choose their highest scoring player as captain always). This information is present in the columns: 'Percentage contribution overall' and 'Captaincy effectiveness ratio overall'. A ratio of 1 means the best player chosen as captain. 0 means worst.

Here is the teams table. You must use it to comment on the structure of the standings table and on which parts of the table are most competitive: {df_teams_text}
Using the table, also look for unusual stuff, or blunders that the players might have made for example not setting up a captain or vice-captain.
In the table you would see player names as well, you can use them interchangeably with the team names for a more personal effect, but only sometimes.
You also have a favourite teams column for every player, feel free to trash talk a player if their favourite real team got bashed in the GW.
Transfer hits are the 4 point deductions a team has to face for every additional transfer than those they are allowed, use them to comment as well.
You also have the rankings history of every team in a list. Eg. [2, 4, 3] means a team ranked 2nd, 4th, and 3rd, after the first, second and third GW respectively. It suggests a team's rise/downfall/consistency.
MUST talk about the most common strategy across all the managers in our league (e.g., fielded 3 or more players almost everyone else had, captained someone who almost everyone had, used a chip that almost everyone used, etc.)
MUST give a shoutout to one or two managers who did something unique this week (e.g., fielded a player no one else had who scored big, captained someone no one else did and got lots of points from them, used a chip when no one else did, etc.)

More data you could utilize for more comments:
1. Unique player picks, their scores and the player name: {top_unique_picks}
2. Players who relied the most on a single player for their scores: {top_3_reliance_str}
3. Players who relied the least on a single player for their scores: {bottom_3_reliance_str}
4. Top chip users: {top_3_chips_str}
5. Worst chip users: {bottom_3_chips_str}
6. Column 'Playing XI with times chosen this GW' tells how many times each player was used by our managers. Use that to comment on a team's strategy, whether they're picking common players and playing safe, or using rare players.
7. Similarly you have the 'Captain with times captained this GW' column. Use that to comment on a manager's unique or common choice of captain.
8. Also comment on whether a manager chose a common chip, or was their pick unique that no one else picked. You have the 'Chips used' column for this.

Be smart, and extract as much insights as possible from the data given.
"""

LLM response

In [None]:
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a witty fantasy football commissioner."},
        {"role": "user", "content": prompt}
    ],
    temperature=0.7
)

print(response.choices[0].message.content)

Ah, Gameweek 2, the week when the Fantasy Premier League gods decided to toy with our emotions like a cat with a ball of yarn. For those of you who found yourselves on the rollercoaster of fantasy football highs and lows, buckle up as we dive into the drama, the glory, and the facepalms of the week.

First, let's tip our hats to Kait Wojtaszek, the mastermind behind 'K8 the Gr8,' who sits atop the league with a solid 127 points. Kait's strategy is as effective as Arsenal's title hopes (pre-injury crisis, of course). With Semenyo as captain delivering a respectable 6 points, the team is as steady as Gabriel Martinelli on a sunny day. No transfer hits, no wild chips, just good old-fashioned consistency.

On the flip side, we have 'People on the pitch,' managed by Paul Quinsee, who seems to be taking inspiration from West Ham’s infamous relegation battles. With a grand total of 61 points, Paul’s team looks like they’ve been out on the town with Big Sam the night before a game. The decisio

## 📝 Final Summary Output

In [285]:
# Assemble the final human-readable report
final_response = f"""
{league_name_text} \n
{gw_text} \n
Summary: \n\n
{response.choices[0].message.content} \n\n
Top 3 teams: \n{top_3_str} \n\n
Bottom 3 teams: \n{bottom_3_str} \n\n
Best Transfer Maker(s): \n{best_transfer_str} net points from transfers\n\n
Worst Transfer Maker(s): \n{worst_transfer_str} net points from transfers\n\n
Biggest rank riser(s): \n{rise_strs} \n\n
Biggest rank faller(s): \n{fall_strs} \n\n
Top scoring unique pick(s) chosen only {min_selected} times by managers in the league: \n {top_unique_picks} \n\n
Best captaincy effectiveness teams: \n{top_captaincy_str} \n\n
Worst captaincy effectiveness teams: \n{worst_captaincy_str} \n\n
Most single-player reliant teams: \n{top_3_reliance_str} \n\n
Least single-player reliant teams: \n{bottom_3_reliance_str} \n\n
Top chips score teams: \n{top_3_chips_str} \n\n
Bottom chips score teams: \n{bottom_3_chips_str} \n\n
"""


In [144]:
print(final_response)



League name: Ze Woorty Invitational Liga 

Gameweek Number: 2 

Summary: 


Ah, Gameweek 2, the week when the Fantasy Premier League gods decided to toy with our emotions like a cat with a ball of yarn. For those of you who found yourselves on the rollercoaster of fantasy football highs and lows, buckle up as we dive into the drama, the glory, and the facepalms of the week.

First, let's tip our hats to Kait Wojtaszek, the mastermind behind 'K8 the Gr8,' who sits atop the league with a solid 127 points. Kait's strategy is as effective as Arsenal's title hopes (pre-injury crisis, of course). With Semenyo as captain delivering a respectable 6 points, the team is as steady as Gabriel Martinelli on a sunny day. No transfer hits, no wild chips, just good old-fashioned consistency.

On the flip side, we have 'People on the pitch,' managed by Paul Quinsee, who seems to be taking inspiration from West Ham’s infamous relegation battles. With a grand total of 61 points, Paul’s team looks like t

## New LLM output format

In [274]:
df_teams_times_captained_strip = df_teams[["Player name", "Team name", "Captain with times captained this GW"]]
df_teams_times_captained_strip_str = df_teams_times_captained_strip.to_string(index=False)

df_teams_times_chosen_strip = df_teams[["Player name", "Team name", "Playing XI with times chosen this GW"]]
df_teams_times_chosen_strip_str = df_teams_times_chosen_strip.to_string(index=False)

chip_usage_stats = df_teams[["Player name", "Team name", "Chips used"]]
chip_usage_stats_str = chip_usage_stats.to_string(index=False)

favourite_teams_df = df_teams[["Player name", "Team name", "Favourite team"]]
favourite_teams_str = favourite_teams_df.to_string(index=False)

standings_df = df_teams[["Player name", "Team name", "Total points"]]
standings_str = standings_df.to_string(index=False)

Gameweek results

In [275]:
# build table with winner/loser
fixtures = requests.get(f"{BASE_URL}/fixtures/?event={current_gw}").json()
matches = []
for f in fixtures:
    if f["finished"]:
        home_team = teams_lookup[f["team_h"]]
        away_team = teams_lookup[f["team_a"]]
        home_score = f["team_h_score"]
        away_score = f["team_a_score"]

        # decide result
        if home_score > away_score:
            winner, loser = home_team, away_team
        elif away_score > home_score:
            winner, loser = away_team, home_team
        else:
            winner, loser = "Draw", "Draw"

        matches.append({
            "GW": f["event"],
            "Home": home_team,
            "Away": away_team,
            "Score": f"{home_score} - {away_score}",
            "Winner": winner,
            "Loser": loser
        })

real_gameweek_results_df = pd.DataFrame(matches)
real_gameweek_results_str = real_gameweek_results_df.to_string(index=False)

In [277]:
# Build one sentence per finished fixture.
# Single quotes inside the f-strings keep this valid on Python versions before 3.12.
real_gameweek_results_str = f"Here are the gameweek {current_gw} results: \n"
for _, result in real_gameweek_results_df.iterrows():
    if result["Winner"] == "Draw":
        real_gameweek_results_str += f"{result['Home']} drew against {result['Away']} with a {result['Score']} score. \n"
    else:
        real_gameweek_results_str += f"{result['Winner']} defeated {result['Loser']} with a {result['Score']} score. \n"

Stripped teams table

In [279]:
df_teams_stripped = df_teams.copy()
df_teams_stripped = df_teams_stripped.drop(columns=["Entry ID", "Playing XI with ID"])
df_teams_stripped_str = df_teams_stripped.to_string(index=False)

Rivals

In [280]:
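# Find the closest pair of adjacent teams within the top 5.
# Assumes "Adjacent points difference" compares each team with the one directly
# above it, and that df_teams has a default 0-based index in rank order.
min_gap = float("inf")
min_row = 1
for index, row in df_teams.head(5).iterrows():
    if index == 0:  # the leader has no team above it
        continue
    gap = abs(row["Adjacent points difference"])
    if gap < min_gap:
        min_gap = gap
        min_row = index

# Use min_row (not the leftover loop variable) to pick the two rivals
rivals_view = df_teams.drop(columns=["Entry ID", "Adjacent points difference"])
rival_1 = rivals_view.iloc[min_row - 1]  # higher-ranked team of the pair
rival_1_str = rival_1.to_string(index=False)
rival_2 = rivals_view.iloc[min_row]  # team directly below it
rival_2_str = rival_2.to_string(index=False)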
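
The rival selection boils down to finding the smallest gap between neighbouring teams near the top of the table. Here is a minimal sketch of that idea on a plain list of totals, assuming the list is already in rank order. The function name and the sample totals are hypothetical and not taken from the league data.

```python
def closest_adjacent_pair(points, top_n=5):
    """Return (upper_index, lower_index) of the neighbouring teams with the smallest gap.

    `points` is a list of total points in rank order (best first); only the
    first `top_n` teams are considered. Returns None with fewer than two teams.
    """
    top = points[:top_n]
    if len(top) < 2:
        return None
    gaps = [abs(top[i - 1] - top[i]) for i in range(1, len(top))]
    lower = gaps.index(min(gaps)) + 1  # the first smallest gap wins on ties
    return lower - 1, lower


# Hypothetical totals, for illustration only.
print(closest_adjacent_pair([127, 120, 118, 110, 109, 61]))  # (3, 4)
```

Taking the absolute gap makes the result independent of whether the difference is stored as a positive or a negative number.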

In [281]:
prompt = f"""You are the witty but fair commissioner of an FPL league. Voice: playful, insightful, never mean-spirited.
Write about Gameweek {current_gw} in 500-700 words. You also need to have knowledge of the Premier League footballing world and make
remarks that the fans can relate to, as a part of your sense of humor.

Write your commentary in this order:
1. Talk about most common captains chosen by the managers using this data: {df_teams_times_captained_strip_str}. Also talk about most commonly played players by every team using this data: {df_teams_times_chosen_strip_str}. Also talk about the chip usage statistics by each team: {chip_usage_stats_str}.
2. Give a shoutout to managers who played rarely chosen players and scored well. Use this data: {top_unique_picks}. Also, here are rarely chosen captains and their scores: {top_scoring_rare_captains_str}. Check {chip_usage_stats_str} to see if anyone stood out with chip usage this week.
3. Use the real gameweek results to trash talk about managers' favourite teams. Here are the results: {real_gameweek_results_str}. Here are managers and their favourite teams: {favourite_teams_str}.
4. Look at these two teams. 1: {rival_1_str}, 2: {rival_2_str}. These are close to each other in terms of points and standings. Create a rivalry out of them.
5. Looking at the rival teams, suggest what if scenarios. Eg, if team 2 didn't make their transfer, they would have had more points and be on top. Or if they played a player on their bench. Or if they captained the right player. You can also make these for the team 1, how they could have extended their lead in this rivalry.
6. Make a closing statement about the league and the gameweek. Here is a complete table, look for interesting stats to mention in your commentary: {df_teams_stripped_str}.

To add to your commentary, here are more stats and figures from FPL:
1. Unique player picks, their scores and the player name: {top_unique_picks}
2. Players who relied the most on a single player for their scores: {top_3_reliance_str}
3. Players who relied the least on a single player for their scores: {bottom_3_reliance_str}
4. Top chip users: {top_3_chips_str}
5. Worst chip users: {bottom_3_chips_str}
6. Comment on how well someone chose their captain (one should choose their highest scoring player as captain always). A ratio of 1 means the best player chosen as captain. 0 means worst. Check these tables: {top_captaincy_str}, {worst_captaincy_str}.

In the tables above you would see player names as well, you can use them interchangeably with the team names for a more personal effect, but only sometimes.

The 'Transfers' column holds each transfer pair in this format: (Player IN, Player OUT, points GAINED by the transfer). So if the 3rd item is a positive number, the transfer was successful.
*Transfer hits are the 4 point deductions a team has to face for every additional transfer than those they are allowed, use them to comment as well.
*You also have the rankings history of every team in a list. Eg. [2, 4, 3] means a team ranked 2nd, 4th, and 3rd, after the first, second and third GW respectively. It suggests a team's rise/downfall/consistency.
MUST talk about the most common strategy across all the managers in our league (e.g., fielded 3 or more players almost everyone else had, captained someone who almost everyone had, used a chip that almost everyone used, etc.)
MUST give a shoutout to one or two managers who did something unique this week (e.g., fielded a player no one else had who scored big, captained someone no one else did and got lots of points from them, used a chip when no one else did, etc.)
"""

In [282]:
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a witty fantasy football commissioner."},
        {"role": "user", "content": prompt}
    ],
    temperature=0.7
)

print(response.choices[0].message.content)

Greetings, fellow fantasy football magicians and managers extraordinaire! Welcome to the whimsical world of Gameweek 2, where the stakes are high, the banter is rife, and the fantasy football gods have blessed (or cursed) us with another round of Premier League drama.

**Captain's Log: The Usual Suspects and the Brave Mavericks**

Let’s start with our fearless captains. In Gameweek 2, the popular choice for captaincy was none other than Mohamed Salah, the Egyptian King. Managers like D'Arcy Williams, John Deutsch, Rick Thorley, and Paul Quinsee placed their faith in Salah, hoping he’d work his magic. Alas, he didn’t quite sail the Nile of points they hoped for, but his solid performance was a respectable one.

On the other end of the captaincy spectrum, with a daring spirit akin to a Gryffindor at heart, we had Kamal Logue and Mark Woort-Menker. Kamal went rogue with Ekitiké and Mark with Saka. While neither player set the pitch ablaze, we salute their audacity. And Kait Wojtaszek, und

In [283]:
# Assemble the final human-readable report
final_response = f"""
{league_name_text} \n
{gw_text} \n
Summary: \n\n
{response.choices[0].message.content} \n\n
Top 3 teams: \n{top_3_str} \n\n
Bottom 3 teams: \n{bottom_3_str} \n\n
Best Transfer Maker(s): \n{best_transfer_str} points earned \n\n
Worst Transfer Maker(s): \n{worst_transfer_str} points earned \n\n
Biggest rank riser(s): \n{rise_strs} \n\n
Biggest rank faller(s): \n{fall_strs} \n\n
Top scoring unique pick(s) chosen only {min_selected} times by managers in the league: \n {top_unique_picks} \n\n
Best captaincy effectiveness teams: \n{top_captaincy_str} \n\n
Worst captaincy effectiveness teams: \n{worst_captaincy_str} \n\n
Most single-player reliant teams: \n{top_3_reliance_str} \n\n
Least single-player reliant teams: \n{bottom_3_reliance_str} \n\n
Top chips score teams: \n{top_3_chips_str} \n\n
Bottom chips score teams: \n{bottom_3_chips_str} \n\n
"""

In [284]:
print(final_response)


League name: Ze Woorty Invitational Liga 

Gameweek Number: 2 

Summary: 


Greetings, fellow fantasy football magicians and managers extraordinaire! Welcome to the whimsical world of Gameweek 2, where the stakes are high, the banter is rife, and the fantasy football gods have blessed (or cursed) us with another round of Premier League drama.

**Captain's Log: The Usual Suspects and the Brave Mavericks**

Let’s start with our fearless captains. In Gameweek 2, the popular choice for captaincy was none other than Mohamed Salah, the Egyptian King. Managers like D'Arcy Williams, John Deutsch, Rick Thorley, and Paul Quinsee placed their faith in Salah, hoping he’d work his magic. Alas, he didn’t quite sail the Nile of points they hoped for, but his solid performance was a respectable one.

On the other end of the captaincy spectrum, with a daring spirit akin to a Gryffindor at heart, we had Kamal Logue and Mark Woort-Menker. Kamal went rogue with Ekitiké and Mark with Saka. While neither p