# 📊 Fantasy Premier League (FPL) - Complete Data Analysis & Strategy Tools

## 🎯 **Overview**
This notebook provides comprehensive analysis tools for Fantasy Premier League decision-making, including:
- **Data Exploration & Cleaning** - Understanding the dataset structure
- **Season Performance Analysis** - Player and team cumulative statistics  
- **Strategic Analysis Tools** - Fixture difficulty, player rankings, team strength
- **Actionable FPL Insights** - Real-world applications for transfers and team selection

## 📋 **Table of Contents**
1. [**Data Loading & Overview**](#data-loading--overview)
2. [**Data Cleaning & Processing**](#data-cleaning--processing)  
3. [**Exploratory Data Analysis**](#exploratory-data-analysis)
4. [**Season Statistics Calculation**](#season-statistics-calculation)
5. [**Player Performance Analysis**](#player-performance-analysis)
6. [**Strategic Analysis Tools**](#strategic-analysis-tools)
7. [**Fixture Analysis System**](#fixture-analysis-system)
8. [**Quick Reference & Usage Guide**](#quick-reference--usage-guide)

---

In [248]:
import pandas as pd 
df = pd.read_csv('fpl-data-stats.csv')
df.describe()

| | id | element_type | now_cost | selected_by_percent | gameweek | minutes | shots | SoT | SiB | xG | ... | defensive_contribution | xGI | npxGI | xP | total_points | PvsxP | touches | penalty_area_touches | carries_final_third | carries_penalty_area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 4261.0 | 4261.0 | 4261.0 | 4261.0 | 4261.0 | 4261.0 | 4261.0 | 4261.0 | 4261.0 | 4261.0 | ... | 4261.0 | 4261.0 | 4261.0 | 4261.0 | 4261.0 | 4261.0 | 1799.0 | 1799.0 | 4261.0 | 4261.0 |
| mean | 359.551044 | 2.545177 | 5.004764 | 2.07449 | 3.505985 | 27.229524 | 0.32199 | 0.101619 | 0.218728 | 0.034616 | ... | 2.062896 | 0.059446 | 0.056489 | 1.242563 | 1.258155 | 0.015592 | 37.840467 | 1.481934 | 0.307909 | 0.122741 |
| std | 208.728401 | 0.834209 | 1.10307 | 6.073265 | 1.687784 | 37.794209 | 0.807859 | 0.367391 | 0.636921 | 0.128099 | ... | 3.609688 | 0.172044 | 0.161734 | 2.04329 | 2.418339 | 1.421444 | 24.616561 | 1.910565 | 0.819943 | 0.522904 |
| min | 1.0 | 1.0 | 3.9 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | -2.0 | -3.0 | -11.4 | 0.0 | 0.0 | 0.0 | 0.0 |
| 25% | 178.0 | 2.0 | 4.4 | 0.1 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 18.0 | 0.0 | 0.0 | 0.0 |
| 50% | 360.0 | 3.0 | 4.8 | 0.2 | 4.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 35.0 | 1.0 | 0.0 | 0.0 |
| 75% | 538.0 | 3.0 | 5.4 | 1.0 | 5.0 | 70.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 3.0 | 0.0 | 0.0 | 2.086 | 1.0 | 0.0 | 53.5 | 2.0 | 0.0 | 0.0 |
| max | 742.0 | 4.0 | 14.5 | 66.6 | 6.0 | 90.0 | 7.0 | 4.0 | 7.0 | 2.0 | ... | 23.0 | 2.0 | 2.0 | 13.0 | 24.0 | 12.826 | 129.0 | 18.0 | 8.0 | 11.0 |


# 1️⃣ Data Loading & Overview {#data-loading--overview}

## 📂 Import Data and Initial Exploration
This section loads the FPL dataset and provides basic information about its structure.

In [249]:
# Dataset Overview and Structure
print("=== DATASET OVERVIEW ===")
print(f"Dataset Shape: {df.shape}")
print(f"Total Records: {df.shape[0]:,}")
print(f"Total Features: {df.shape[1]}")
print("\n=== COLUMN NAMES ===")
print(df.columns.tolist())

print("\n=== DATA TYPES ===")
print(df.dtypes)

print("\n=== BASIC INFO ===")
df.info()

=== DATASET OVERVIEW ===
Dataset Shape: (4261, 37)
Total Records: 4,261
Total Features: 37

=== COLUMN NAMES ===
['id', 'element_type', 'web_name', 'team_name', 'opponent_team_name', 'was_home', 'now_cost', 'selected_by_percent', 'gameweek', 'minutes', 'shots', 'SoT', 'SiB', 'xG', 'npxG', 'G', 'npG', 'key_passes', 'xA', 'A', 'xGC', 'GC', 'xCS', 'CS', 'clearances_blocks_interceptions', 'recoveries', 'tackles', 'defensive_contribution', 'xGI', 'npxGI', 'xP', 'total_points', 'PvsxP', 'touches', 'penalty_area_touches', 'carries_final_third', 'carries_penalty_area']

=== DATA TYPES ===
id                                   int64
element_type                         int64
web_name                            object
team_name                           object
opponent_team_name                  object
was_home                              bool
now_cost                           float64
selected_by_percent                float64
gameweek                             int64
minutes                  

In [250]:
# Missing Values Analysis
print("=== MISSING VALUES ANALYSIS ===")
missing_values = df.isnull().sum()
missing_percentage = (missing_values / len(df)) * 100

missing_df = pd.DataFrame({
    'Column': missing_values.index,
    'Missing Count': missing_values.values,
    'Missing Percentage': missing_percentage.values
}).sort_values('Missing Count', ascending=False)

# Display only columns with missing values
if missing_df['Missing Count'].sum() > 0:
    print(missing_df[missing_df['Missing Count'] > 0])
else:
    print("No missing values found in the dataset!")

print(f"\nTotal missing values in dataset: {missing_values.sum():,}")
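# Count rows with no missing value in any column; subtracting missing cells
# from the row count can go negative when several columns are missing per row
complete_rows = df.notna().all(axis=1).sum()
print(f"Percentage of complete records: {complete_rows / len(df) * 100:.2f}%")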

df = df.drop(columns=['penalty_area_touches', 'touches'])

=== MISSING VALUES ANALYSIS ===
                  Column  Missing Count  Missing Percentage
34  penalty_area_touches           2462           57.779864
33               touches           2462           57.779864

Total missing values in dataset: 4,924
Percentage of complete records: -15.56%
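
The -15.56% figure above arises when missing cells (4,924) are subtracted from the row count (4,261): (4,261 - 4,924) / 4,261 = -15.56%. A minimal sketch of the difference between counting missing cells and counting incomplete rows follows; the toy frame and its values are hypothetical and only illustrate the arithmetic.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: two columns are missing in the same three rows
toy = pd.DataFrame({
    "minutes": [90, 0, 45, 90],
    "touches": [52.0, np.nan, np.nan, np.nan],
    "penalty_area_touches": [3.0, np.nan, np.nan, np.nan],
})

missing_cells = int(toy.isnull().sum().sum())        # counts cells, not rows
rows_with_gap = int(toy.isnull().any(axis=1).sum())  # counts rows with any gap

# Cell-based formula: can drop below zero when a row has several gaps
cell_based_pct = (len(toy) - missing_cells) / len(toy) * 100
# Row-based formula: always between 0 and 100
row_based_pct = (len(toy) - rows_with_gap) / len(toy) * 100

print(f"Missing cells: {missing_cells}, rows with a gap: {rows_with_gap}")
print(f"Cell-based 'complete records': {cell_based_pct:.2f}%")
print(f"Row-based complete records: {row_based_pct:.2f}%")
```

The row-based figure is the one that answers "what share of records is fully populated".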


# 2️⃣ Data Cleaning & Processing {#data-cleaning--processing}

## 🧹 Data Quality Assessment and Cleaning
Analyzing missing values, data types, and performing necessary data cleaning operations.
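
As a possible alternative to hard-coding the column names to drop, a cleaning step could remove any column whose missing share exceeds a threshold. The helper name `drop_sparse_columns`, the 0.5 threshold, and the toy data below are assumptions for illustration, not part of the notebook.

```python
import numpy as np
import pandas as pd

def drop_sparse_columns(frame: pd.DataFrame, max_missing: float = 0.5) -> pd.DataFrame:
    """Return a copy without columns whose missing share exceeds max_missing."""
    missing_share = frame.isnull().mean()  # fraction of missing values per column
    sparse = missing_share[missing_share > max_missing].index.tolist()
    return frame.drop(columns=sparse)

# Hypothetical toy frame
toy = pd.DataFrame({
    "minutes": [90, 0, 45, 90],
    "touches": [52.0, np.nan, np.nan, np.nan],  # 75% missing
    "xG": [0.3, 0.0, np.nan, 0.1],              # 25% missing
})

cleaned = drop_sparse_columns(toy, max_missing=0.5)
print(cleaned.columns.tolist())
```

With a 0.5 threshold, the two columns reported as about 57.8% missing in this dataset would be dropped and the rest kept.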

In [251]:
# Separate Numerical and Categorical Variables
import numpy as np

# Identify numerical and categorical columns
numerical_cols = df.select_dtypes(include=[np.number]).columns.tolist()
categorical_cols = df.select_dtypes(include=['object', 'category']).columns.tolist()

print("=== VARIABLE TYPES ===")
print(f"Numerical variables ({len(numerical_cols)}): {numerical_cols}")
print(f"\nCategorical variables ({len(categorical_cols)}): {categorical_cols}")

# For categorical variables, show unique values
print("\n=== CATEGORICAL VARIABLES ANALYSIS ===")
for col in categorical_cols[:10]:  # Show first 10 categorical columns
    unique_count = df[col].nunique()
    print(f"\n{col}:")
    print(f"  - Unique values: {unique_count}")
    if unique_count <= 20:  # Show values if not too many
        print(f"  - Values: {sorted(df[col].unique())}")
    else:
        print(f"  - Top 10 values: {df[col].value_counts().head(10).index.tolist()}")

=== VARIABLE TYPES ===
Numerical variables (31): ['id', 'element_type', 'now_cost', 'selected_by_percent', 'gameweek', 'minutes', 'shots', 'SoT', 'SiB', 'xG', 'npxG', 'G', 'npG', 'key_passes', 'xA', 'A', 'xGC', 'GC', 'xCS', 'CS', 'clearances_blocks_interceptions', 'recoveries', 'tackles', 'defensive_contribution', 'xGI', 'npxGI', 'xP', 'total_points', 'PvsxP', 'carries_final_third', 'carries_penalty_area']

Categorical variables (3): ['web_name', 'team_name', 'opponent_team_name']

=== CATEGORICAL VARIABLES ANALYSIS ===

web_name:
  - Unique values: 721
  - Top 10 values: ['Patterson', 'Henderson', 'James', 'Roberts', 'White', 'Gomez', 'Onana', 'Barnes', 'Neto', 'Harrison']

team_name:
  - Unique values: 20
  - Values: ['Arsenal', 'Aston Villa', 'Bournemouth', 'Brentford', 'Brighton', 'Burnley', 'Chelsea', 'Crystal Palace', 'Everton', 'Fulham', 'Leeds', 'Liverpool', 'Man City', 'Man Utd', 'Newcastle', "Nott'm Forest", 'Spurs', 'Sunderland', 'West Ham', 'Wolves']

opponent_team_name:
  

In [252]:
# Filter useful numerical variables for FPL analysis
print("=== FILTERING USEFUL NUMERICAL VARIABLES ===")

# Define categories of useful variables
core_performance = ['total_points', 'minutes', 'now_cost', 'selected_by_percent']
attacking_metrics = ['G', 'A', 'xG', 'xA', 'shots', 'SoT', 'key_passes']
expected_metrics = ['xG', 'xA', 'xGI', 'npxG', 'npxGI', 'xP']
defensive_metrics = ['CS', 'xCS', 'GC', 'xGC', 'tackles', 'recoveries', 
                    'clearances_blocks_interceptions', 'defensive_contribution']
advanced_metrics = ['PvsxP', 'carries_final_third', 'carries_penalty_area']

# Combine into useful variables list
useful_numerical_vars = list(set(core_performance + attacking_metrics + 
                                expected_metrics + defensive_metrics + advanced_metrics))

# Filter only variables that exist in the dataset
useful_vars_available = [var for var in useful_numerical_vars if var in numerical_cols]

print(f"Original numerical variables: {len(numerical_cols)}")
print(f"Useful numerical variables: {len(useful_vars_available)}")
print(f"Variables removed: {len(numerical_cols) - len(useful_vars_available)}")

print(f"\n=== USEFUL VARIABLES BY CATEGORY ===")
print(f"Core Performance: {[v for v in core_performance if v in useful_vars_available]}")
print(f"Attacking Metrics: {[v for v in attacking_metrics if v in useful_vars_available]}")
print(f"Expected Stats: {[v for v in expected_metrics if v in useful_vars_available]}")
print(f"Defensive Metrics: {[v for v in defensive_metrics if v in useful_vars_available]}")
print(f"Advanced Metrics: {[v for v in advanced_metrics if v in useful_vars_available]}")

# Variables to exclude (less useful for FPL analysis)
excluded_vars = [var for var in numerical_cols if var not in useful_vars_available]
print(f"\n=== EXCLUDED VARIABLES ===")
print(f"Less useful for FPL: {excluded_vars}")

# Create filtered dataset with useful variables only
useful_numerical_df = df[useful_vars_available].copy()
print(f"\n=== FILTERED DATASET INFO ===")
print(f"Shape: {useful_numerical_df.shape}")
print(f"Useful numerical variables: {useful_vars_available}")

=== FILTERING USEFUL NUMERICAL VARIABLES ===
Original numerical variables: 31
Useful numerical variables: 26
Variables removed: 5

=== USEFUL VARIABLES BY CATEGORY ===
Core Performance: ['total_points', 'minutes', 'now_cost', 'selected_by_percent']
Attacking Metrics: ['G', 'A', 'xG', 'xA', 'shots', 'SoT', 'key_passes']
Expected Stats: ['xG', 'xA', 'xGI', 'npxG', 'npxGI', 'xP']
Defensive Metrics: ['CS', 'xCS', 'GC', 'xGC', 'tackles', 'recoveries', 'clearances_blocks_interceptions', 'defensive_contribution']
Advanced Metrics: ['PvsxP', 'carries_final_third', 'carries_penalty_area']

=== EXCLUDED VARIABLES ===
Less useful for FPL: ['id', 'element_type', 'gameweek', 'SiB', 'npG']

=== FILTERED DATASET INFO ===
Shape: (4261, 26)
Useful numerical variables: ['xCS', 'xGI', 'shots', 'tackles', 'total_points', 'GC', 'carries_final_third', 'SoT', 'defensive_contribution', 'A', 'carries_penalty_area', 'G', 'selected_by_percent', 'minutes', 'PvsxP', 'xA', 'now_cost', 'xG', 'npxG', 'CS', 'xGC', 'np

In [253]:
# Display the first 20 rows of the dataset
print("=== TOP 20 ROWS OF DATASET ===")
print(df.head(20))



=== TOP 20 ROWS OF DATASET ===
    id  element_type      web_name team_name opponent_team_name  was_home  \
0    1             1          Raya   Arsenal            Man Utd     False   
1    2             1  Arrizabalaga   Arsenal            Man Utd     False   
2    3             1          Hein   Arsenal            Man Utd     False   
3    4             1       Setford   Arsenal            Man Utd     False   
4    5             2       Gabriel   Arsenal            Man Utd     False   
5    6             2        Saliba   Arsenal            Man Utd     False   
6    7             2     Calafiori   Arsenal            Man Utd     False   
7    8             2      J.Timber   Arsenal            Man Utd     False   
8    9             2        Kiwior   Arsenal            Man Utd     False   
9   10             2  Lewis-Skelly   Arsenal            Man Utd     False   
10  11             2         White   Arsenal            Man Utd     False   
11  12             2     Zinchenko   Arsenal 

In [254]:
# Outlier Detection and Analysis
print("=== OUTLIER DETECTION ===")

def detect_outliers_iqr(df, column):
    """Detect outliers using IQR method"""
    Q1 = df[column].quantile(0.25)
    Q3 = df[column].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    
    outliers = df[(df[column] < lower_bound) | (df[column] > upper_bound)]
    return outliers, lower_bound, upper_bound

# Analyze outliers for key metrics
key_metrics = ['total_points', 'now_cost', 'selected_by_percent', 'minutes']

for metric in key_metrics:
    if metric in df.columns and df[metric].notna().sum() > 0:
        outliers, lower, upper = detect_outliers_iqr(df, metric)
        print(f"\n{metric.upper()}:")
        print(f"  Normal range: {lower:.2f} to {upper:.2f}")
        print(f"  Number of outliers: {len(outliers)}")
        print(f"  Percentage of outliers: {(len(outliers) / len(df)) * 100:.2f}%")
        
        if len(outliers) > 0 and len(outliers) <= 10:
            print("  Top outliers:")
            top_outliers = outliers.nlargest(10, metric)[['web_name', 'team_name', metric]]
            for _, player in top_outliers.iterrows():
                print(f"    {player['web_name']} ({player['team_name']}): {player[metric]}")

# Performance vs Expected Analysis
print("\n\n=== PERFORMANCE vs EXPECTED ANALYSIS ===")

# Single-gameweek rows where goals exceeded xG (each row is one player in one gameweek)
if 'xG' in df.columns and 'G' in df.columns:
    df['goal_overperformance'] = df['G'] - df['xG']
    top_goal_overperformers = df[df['goal_overperformance'] > 0].nlargest(10, 'goal_overperformance')
    print("\nTop Goal Overperformers:")
    for _, player in top_goal_overperformers.iterrows():
        print(f"  {player['web_name']} ({player['team_name']}): {player['G']:.1f} goals vs {player['xG']:.2f} xG (+{player['goal_overperformance']:.2f})")

# Single-gameweek rows where assists exceeded xA (each row is one player in one gameweek)
if 'xA' in df.columns and 'A' in df.columns:
    df['assist_overperformance'] = df['A'] - df['xA']
    top_assist_overperformers = df[df['assist_overperformance'] > 0].nlargest(10, 'assist_overperformance')
    print("\nTop Assist Overperformers:")
    for _, player in top_assist_overperformers.iterrows():
        print(f"  {player['web_name']} ({player['team_name']}): {player['A']:.1f} assists vs {player['xA']:.2f} xA (+{player['assist_overperformance']:.2f})")

=== OUTLIER DETECTION ===

TOTAL_POINTS:
  Normal range: -1.50 to 2.50
  Number of outliers: 675
  Percentage of outliers: 15.84%

NOW_COST:
  Normal range: 2.90 to 6.90
  Number of outliers: 227
  Percentage of outliers: 5.33%

SELECTED_BY_PERCENT:
  Normal range: -1.25 to 2.35
  Number of outliers: 695
  Percentage of outliers: 16.31%

MINUTES:
  Normal range: -105.00 to 175.00
  Number of outliers: 0
  Percentage of outliers: 0.00%


=== PERFORMANCE vs EXPECTED ANALYSIS ===

Top Goal Overperformers:
  Zubimendi (Arsenal): 2.0 goals vs 0.20 xG (+1.80)
  Thiago (Brentford): 2.0 goals vs 0.60 xG (+1.40)
  Richarlison (Spurs): 2.0 goals vs 0.70 xG (+1.30)
  J.Timber (Arsenal): 2.0 goals vs 0.70 xG (+1.30)
  Welbeck (Brighton): 2.0 goals vs 0.80 xG (+1.20)
  Semenyo (Bournemouth): 2.0 goals vs 0.90 xG (+1.10)
  Wood (Nott'm Forest): 2.0 goals vs 1.00 xG (+1.00)
  Isidor (Sunderland): 1.0 goals vs 0.00 xG (+1.00)
  Garner (Everton): 1.0 goals vs 0.00 xG (+1.00)
  Gravenberch (Liverpool): 

# 3️⃣ Exploratory Data Analysis {#exploratory-data-analysis}

## 🔍 Deep Dive into Data Patterns
Exploring data distributions, outliers, and relationships between variables.
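
The IQR fences reported earlier (for example -1.50 to 2.50 for `total_points`) are narrow because the median of `minutes` is 0, so more than half of the rows are non-appearances that score 0. A sketch of how restricting the calculation to rows with minutes played shifts the fences is shown below; the toy data and the helper name `iqr_bounds` are hypothetical.

```python
import pandas as pd

def iqr_bounds(values: pd.Series, k: float = 1.5) -> tuple:
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical toy data: half of the rows are players who did not feature
toy = pd.DataFrame({
    "minutes":      [0, 0, 0, 0, 0, 0, 90, 90, 90, 90, 90, 90],
    "total_points": [0, 0, 0, 0, 0, 0, 1, 2, 2, 3, 6, 12],
})

# Fences over all rows, zeros included
all_low, all_high = iqr_bounds(toy["total_points"])
n_flagged_all = int(((toy["total_points"] < all_low) | (toy["total_points"] > all_high)).sum())

# Fences over rows with playing time only
played = toy[toy["minutes"] > 0]
played_low, played_high = iqr_bounds(played["total_points"])
n_flagged_played = int(
    ((played["total_points"] < played_low) | (played["total_points"] > played_high)).sum()
)

print(f"All rows:    upper fence {all_high:.3f}, flagged {n_flagged_all}")
print(f"Played only: upper fence {played_high:.3f}, flagged {n_flagged_played}")
```

Whether to filter on minutes depends on the question being asked; for spotting unusual hauls among players who featured, the filtered fences are likely the more useful ones.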

In [255]:
# Positional and Team Analysis
print("=== POSITIONAL ANALYSIS ===")

# Position mapping
position_map = {1: 'Goalkeeper', 2: 'Defender', 3: 'Midfielder', 4: 'Forward'}
df['position_name'] = df['element_type'].map(position_map)

# Analysis by position
position_stats = df.groupby('position_name').agg({
    'total_points': ['count', 'mean', 'median', 'max'],
    'now_cost': ['mean', 'median'],
    'minutes': ['mean'],
    'selected_by_percent': ['mean'],
    'G': ['mean'],
    'A': ['mean']
}).round(2)

print("Position Statistics:")
print(position_stats)

print("\n=== TEAM ANALYSIS ===")

# Team performance analysis
team_stats = df.groupby('team_name').agg({
    'total_points': ['count', 'sum', 'mean'],
    'now_cost': ['mean'],
    'selected_by_percent': ['mean'],
    'G': ['sum'],
    'A': ['sum'],
    'minutes': ['sum']
}).round(2)

team_stats.columns = ['_'.join(col) for col in team_stats.columns]
team_stats = team_stats.sort_values('total_points_sum', ascending=False)

print("\nTop 10 Teams by Total Points:")
print(team_stats.head(10)[['total_points_sum', 'total_points_mean', 'now_cost_mean']])

print("\n=== VALUE ANALYSIS BY POSITION ===")
# Calculate points per million by position
df['points_per_million'] = df['total_points'] / df['now_cost']

value_by_position = df[df['total_points'] > 0].groupby('position_name')['points_per_million'].agg([
    'count', 'mean', 'median', 'max'
]).round(2)

print(value_by_position)

=== POSITIONAL ANALYSIS ===
Position Statistics:
              total_points                  now_cost        minutes  \
                     count  mean median max     mean median    mean   
position_name                                                         
Defender              1410  1.36    0.0  24     4.49    4.4   31.14   
Forward                460  1.29    0.0  16     5.80    5.5   22.17   
Goalkeeper             494  0.84    0.0  15     4.32    4.0   21.49   
Midfielder            1897  1.29    0.0  16     5.37    5.0   27.04   

              selected_by_percent     G     A  
                             mean  mean  mean  
position_name                                  
Defender                     2.11  0.01  0.02  
Forward                      3.85  0.11  0.02  
Goalkeeper                   2.34  0.00  0.00  
Midfielder                   1.55  0.04  0.04  

=== TEAM ANALYSIS ===

Top 10 Teams by Total Points:
                total_points_sum  total_points_mean  now_cost_m

# 5️⃣ Player Performance Analysis {#player-performance-analysis}

## 🏆 Feature 1: Season Leaders, Value Picks & Hidden Gems
Analysis of top performers using **cumulative season statistics** (not single gameweek data).
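
One point to watch when building cumulative totals: if a grouping key such as `now_cost` changes between gameweeks, pandas treats each distinct value as a separate group and a single player is split across several rows. Whether that happens in this dataset is an assumption, although a season-stats count of 758 players against a maximum `id` of 742 would be consistent with it. A minimal sketch of grouping on `id` instead, with hypothetical toy rows:

```python
import pandas as pd

# Hypothetical toy gameweek rows: player 1's price rises after gameweek 1
gw = pd.DataFrame({
    "id":           [1, 1, 2, 2],
    "web_name":     ["Alpha", "Alpha", "Beta", "Beta"],
    "team_name":    ["Team A", "Team A", "Team B", "Team B"],
    "now_cost":     [5.0, 5.1, 4.5, 4.5],
    "gameweek":     [1, 2, 1, 2],
    "total_points": [6, 2, 1, 9],
})

# Grouping on the price splits Alpha into two separate "players"
split = gw.groupby(["web_name", "team_name", "now_cost"])["total_points"].sum()

# Grouping on id keeps one row per player; descriptive fields take the latest value
per_player = (
    gw.sort_values("gameweek")
      .groupby("id")
      .agg(
          web_name=("web_name", "last"),
          team_name=("team_name", "last"),
          now_cost=("now_cost", "last"),
          season_points=("total_points", "sum"),
          games_played=("gameweek", "count"),
      )
      .reset_index()
)

print(len(split), len(per_player))
print(per_player)
```

This assumes `id` identifies a player uniquely across gameweeks, which the source does not state explicitly.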

In [256]:
# Calculate cumulative season statistics for each player
print("=== CALCULATING CUMULATIVE SEASON STATISTICS ===")

# Group by player and calculate season totals
season_stats = df.groupby(['web_name', 'team_name', 'element_type', 'now_cost', 'selected_by_percent']).agg({
    'total_points': 'sum',  # Sum of all gameweek points
    'minutes': 'sum',       # Total minutes played
    'G': 'sum',            # Total goals
    'A': 'sum',            # Total assists  
    'xG': 'sum',           # Total expected goals
    'xA': 'sum',           # Total expected assists
    'shots': 'sum',        # Total shots
    'SoT': 'sum',          # Total shots on target
    'key_passes': 'sum',   # Total key passes
    'CS': 'sum',           # Total clean sheets
    'xCS': 'sum',          # Total expected clean sheets
    'GC': 'sum',           # Total goals conceded
    'xGC': 'sum',          # Total expected goals conceded
    'gameweek': ['count', 'max'],  # Games played and latest gameweek
    'SiB': 'sum',          # Total shots in box
    'tackles': 'sum',      # Total tackles
    'recoveries': 'sum'    # Total recoveries
}).round(2)

print("Columns after aggregation:")
print(season_stats.columns.tolist())

# Flatten column names
season_stats.columns = ['_'.join(col) if col[1] else col[0] for col in season_stats.columns]
season_stats = season_stats.rename(columns={
    'gameweek_count': 'games_played',
    'gameweek_max': 'last_gameweek'
})

print("Columns after flattening:")
print(season_stats.columns.tolist())

# Reset index to make it a regular dataframe
season_stats = season_stats.reset_index()

# Add position names
position_map = {1: 'Goalkeeper', 2: 'Defender', 3: 'Midfielder', 4: 'Forward'}
season_stats['position_name'] = season_stats['element_type'].map(position_map)

# Calculate additional metrics using the correct column names
season_stats['points_per_million'] = season_stats['total_points_sum'] / season_stats['now_cost']
season_stats['points_per_game'] = season_stats['total_points_sum'] / season_stats['games_played']
season_stats['minutes_per_game'] = season_stats['minutes_sum'] / season_stats['games_played']
season_stats['goals_per_game'] = season_stats['G_sum'] / season_stats['games_played']
season_stats['assists_per_game'] = season_stats['A_sum'] / season_stats['games_played']

# Rename main columns for clarity
season_stats = season_stats.rename(columns={
    'total_points_sum': 'season_points',
    'minutes_sum': 'season_minutes',
    'G_sum': 'season_goals',
    'A_sum': 'season_assists',
    'xG_sum': 'season_xG',
    'xA_sum': 'season_xA',
    'shots_sum': 'season_shots',
    'SoT_sum': 'season_SoT',
    'key_passes_sum': 'season_key_passes',
    'CS_sum': 'season_CS',
    'xCS_sum': 'season_xCS',
    'GC_sum': 'season_GC',
    'xGC_sum': 'season_xGC',
    'SiB_sum': 'season_SiB',
    'tackles_sum': 'season_tackles',
    'recoveries_sum': 'season_recoveries'
})

# Round all numeric columns
numeric_cols = season_stats.select_dtypes(include=[np.number]).columns
season_stats[numeric_cols] = season_stats[numeric_cols].round(2)

print(f"Created season stats for {len(season_stats)} players")
print(f"Data covers gameweeks 1-{df['gameweek'].max()}")
print("\nSample of season stats:")
print(season_stats[['web_name', 'team_name', 'position_name', 'games_played', 'season_points', 'season_goals', 'season_assists', 'season_minutes']].head().to_string(index=False))

=== CALCULATING CUMULATIVE SEASON STATISTICS ===
Columns after aggregation:
[('total_points', 'sum'), ('minutes', 'sum'), ('G', 'sum'), ('A', 'sum'), ('xG', 'sum'), ('xA', 'sum'), ('shots', 'sum'), ('SoT', 'sum'), ('key_passes', 'sum'), ('CS', 'sum'), ('xCS', 'sum'), ('GC', 'sum'), ('xGC', 'sum'), ('gameweek', 'count'), ('gameweek', 'max'), ('SiB', 'sum'), ('tackles', 'sum'), ('recoveries', 'sum')]
Columns after flattening:
['total_points_sum', 'minutes_sum', 'G_sum', 'A_sum', 'xG_sum', 'xA_sum', 'shots_sum', 'SoT_sum', 'key_passes_sum', 'CS_sum', 'xCS_sum', 'GC_sum', 'xGC_sum', 'games_played', 'last_gameweek', 'SiB_sum', 'tackles_sum', 'recoveries_sum']
Created season stats for 758 players
Data covers gameweeks 1-6

Sample of season stats:
 web_name   team_name position_name  games_played  season_points  season_goals  season_assists  season_minutes
 A.Becker   Liverpool    Goalkeeper             6             20           0.0             0.0             540
 A.García Aston Villa      

In [257]:
# Top performers and hidden gems, based on cumulative season stats
print("=== TOP PERFORMERS ANALYSIS (SEASON TOTALS) ===")

# Top scorers by cumulative season points
top_scorers = season_stats.nlargest(10, 'season_points')[['web_name', 'team_name', 'position_name', 'season_points', 'now_cost', 'selected_by_percent', 'games_played']]
print("Top 10 Point Scorers (Season Total):")
for _, player in top_scorers.iterrows():
    ppg = player['season_points'] / player['games_played'] if player['games_played'] > 0 else 0
    print(f"  {player['web_name']} ({player['team_name']}, {player['position_name']}): {player['season_points']:.0f} pts in {player['games_played']} games ({ppg:.1f} ppg), £{player['now_cost']}m, {player['selected_by_percent']}% selected")

# Best value players (min 20 season points to filter out bench players)
print(f"\n=== BEST VALUE PLAYERS (Min 20 season points) ===")
value_players = season_stats[(season_stats['season_points'] >= 20) & (season_stats['points_per_million'] > 0)].nlargest(10, 'points_per_million')
print("Top 10 Value Players (Points per £m):")
for _, player in value_players.iterrows():
    print(f"  {player['web_name']} ({player['team_name']}, {player['position_name']}): {player['points_per_million']:.2f} pts/£m ({player['season_points']:.0f} pts in {player['games_played']} games, £{player['now_cost']}m)")

# Hidden gems analysis - players with strong underlying metrics but moderate total points
print(f"\n=== HIDDEN GEMS ANALYSIS (Season Stats) ===")

# Players with decent season points (30-60) but low ownership - potential for more points
hidden_gems = season_stats[(season_stats['season_points'] >= 30) & (season_stats['season_points'] <= 60) & 
                          (season_stats['selected_by_percent'] < 5) & (season_stats['selected_by_percent'] > 0) &
                          (season_stats['games_played'] >= 3)]  # Must have played at least 3 games

if len(hidden_gems) > 0:
    # Calculate underlying performance score based on expected stats
    hidden_gems = hidden_gems.copy()
    hidden_gems['underlying_score'] = (
        hidden_gems['season_xG'] * 0.3 + 
        hidden_gems['season_xA'] * 0.25 + 
        hidden_gems['season_xCS'] * 0.2 + 
        hidden_gems['season_key_passes'] * 0.1 + 
        hidden_gems['season_shots'] * 0.05 +
        (hidden_gems['season_minutes'] / (hidden_gems['games_played'] * 90)) * 0.1  # Minutes played per game ratio
    )
    
    hidden_gems_sorted = hidden_gems.nlargest(10, 'underlying_score')
    print("Players with Strong Underlying Stats but Moderate Points:")
    for _, player in hidden_gems_sorted.iterrows():
        print(f"  {player['web_name']} ({player['team_name']}, {player['position_name']}): {player['season_points']:.0f} pts, {player['selected_by_percent']}% selected")
        print(f"    Underlying: xG:{player['season_xG']:.2f}, xA:{player['season_xA']:.2f}, xCS:{player['season_xCS']:.2f}, Keys:{player['season_key_passes']:.1f} in {player['games_played']} games")
else:
    print("No hidden gems found with current criteria")

# Differential picks - low ownership but decent season points
print(f"\n=== DIFFERENTIAL PICKS (Low Ownership, Season Stats) ===")
differential_picks = season_stats[(season_stats['season_points'] >= 40) & 
                                 (season_stats['selected_by_percent'] < 3) & 
                                 (season_stats['selected_by_percent'] > 0) &
                                 (season_stats['games_played'] >= 4)]  # At least 4 games played

if len(differential_picks) > 0:
    differential_sorted = differential_picks.nlargest(10, 'season_points')
    print("High Season Points, Very Low Ownership (<3%):")
    for _, player in differential_sorted.iterrows():
        ppg = player['season_points'] / player['games_played']
        print(f"  {player['web_name']} ({player['team_name']}, {player['position_name']}): {player['season_points']:.0f} pts ({ppg:.1f} ppg), {player['selected_by_percent']:.1f}% owned, £{player['now_cost']}m")
else:
    print("No differential picks found with current criteria")

# Goal/Assist leaders with season totals
print(f"\n=== SEASON ATTACKING LEADERS ===")
goal_leaders = season_stats[season_stats['season_goals'] > 0].nlargest(8, 'season_goals')
print("Top Goal Scorers (Season Total):")
for _, player in goal_leaders.iterrows():
    gpg = player['season_goals'] / player['games_played']
    print(f"  {player['web_name']} ({player['team_name']}): {player['season_goals']:.0f} goals in {player['games_played']} games ({gpg:.2f} per game)")

assist_leaders = season_stats[season_stats['season_assists'] > 0].nlargest(8, 'season_assists')
print("\nTop Assist Providers (Season Total):")
for _, player in assist_leaders.iterrows():
    apg = player['season_assists'] / player['games_played']
    print(f"  {player['web_name']} ({player['team_name']}): {player['season_assists']:.0f} assists in {player['games_played']} games ({apg:.2f} per game)")

=== TOP PERFORMERS ANALYSIS (SEASON TOTALS) ===
Top 10 Point Scorers (Season Total):
  Haaland (Man City, Forward): 62 pts in 6 games (10.3 ppg), £14.4m, 52.6% selected
  Semenyo (Bournemouth, Midfielder): 48 pts in 6 games (8.0 ppg), £7.8m, 52.8% selected
  Senesi (Bournemouth, Defender): 44 pts in 6 games (7.3 ppg), £4.9m, 19.9% selected
  Guéhi (Crystal Palace, Defender): 43 pts in 6 games (7.2 ppg), £4.8m, 27.3% selected
  Anthony (Burnley, Midfielder): 40 pts in 6 games (6.7 ppg), £5.6m, 4.2% selected
  Alderete (Sunderland, Defender): 39 pts in 6 games (6.5 ppg), £4.0m, 3.6% selected
  Enzo (Chelsea, Midfielder): 39 pts in 6 games (6.5 ppg), £6.7m, 13.3% selected
  Roefs (Sunderland, Goalkeeper): 39 pts in 6 games (6.5 ppg), £4.5m, 3.2% selected
  Gabriel (Arsenal, Defender): 38 pts in 6 games (6.3 ppg), £6.2m, 24.5% selected
  J.Timber (Arsenal, Defender): 37 pts in 6 games (6.2 ppg), £5.8m, 14.4% selected

=== BEST VALUE PLAYERS (Min 20 season points) ===
Top 10 Value Players (

# 6️⃣ Strategic Analysis Tools {#strategic-analysis-tools}

## ⚔️ Advanced FPL Analysis Functions

This section defines reusable functions for Fantasy Premier League strategic analysis:

### 🔧 **Available Tools:**
1. **Defender Rankings** - Rank defenders by clean sheet potential and value
2. **Attacker Rankings** - Rank attacking players by goal/assist potential  
3. **Team Strength Analysis** - Calculate attacking and defensive strength for all teams
4. **Fixture Difficulty Calculator** - Score any specific matchup

### 📊 **Key Features:**
- Uses **cumulative season statistics** for accuracy
- Considers expected stats (xG, xA, xCS) for sustainability  
- Includes value scoring (points per £million)
- Accounts for consistency and minutes played
- Easily customizable parameters
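
To make the fixture scoring concrete, here is a small worked sketch of the weighted favorability score, using the same weights as the functions in this section (0.4/0.3/0.2/0.1 for attack, 0.6/0.4 for defensive weakness, 0.6/0.4 for the overall blend). The fixture names and per-game averages are hypothetical, and three rating bands are used only because the toy frame has three rows.

```python
import pandas as pd

# Hypothetical per-game team averages (illustrative numbers only)
fixtures = pd.DataFrame({
    "fixture":         ["A vs B", "C vs D", "E vs F"],
    "avg_xG_for":      [2.0, 1.0, 1.5],
    "avg_G_for":       [2.0, 1.0, 1.0],
    "avg_shots_for":   [15.0, 10.0, 12.0],
    "avg_SoT_for":     [5.0, 3.0, 4.0],
    "avg_xG_conceded": [1.5, 1.0, 2.0],
    "avg_G_conceded":  [2.0, 1.0, 1.5],
})

# Home side's attacking strength
fixtures["attacking_strength"] = (
    fixtures["avg_xG_for"] * 0.4
    + fixtures["avg_G_for"] * 0.3
    + fixtures["avg_shots_for"] * 0.2
    + fixtures["avg_SoT_for"] * 0.1
)

# Away side's defensive weakness
fixtures["defensive_weakness"] = (
    fixtures["avg_xG_conceded"] * 0.6 + fixtures["avg_G_conceded"] * 0.4
)

# Overall blend of the two components
fixtures["favorability_score"] = (
    fixtures["attacking_strength"] * 0.6 + fixtures["defensive_weakness"] * 0.4
)

# Equal-width bands over the observed score range
fixtures["rating"] = pd.cut(
    fixtures["favorability_score"], bins=3, labels=["Hard", "Medium", "Easy"]
)

print(fixtures[["fixture", "attacking_strength", "defensive_weakness",
                "favorability_score", "rating"]])
```

Two properties are worth noting. Shots are on a larger scale than xG or goals, so the shots term contributes most of the attacking score despite its smaller weight; rescaling the inputs first would change that. And because `pd.cut` with an integer `bins` splits the observed range into equal-width bands, the ratings are relative to the set of fixtures being scored rather than absolute.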

In [258]:
import pandas as pd
import numpy as np
from typing import Optional, List

def calculate_team_stats_corrected(season_data: pd.DataFrame) -> tuple:
    """
    Calculate attacking and defensive statistics for each team using cumulative season data.
    
    Args:
        season_data: DataFrame with cumulative season statistics per player
        
    Returns:
        tuple: (attacking_stats, defensive_stats) DataFrames
    """
    # Attacking stats by team (aggregate all players from each team)
    attacking_stats = season_data.groupby('team_name').agg({
        'season_xG': 'sum',      # Total team xG
        'season_goals': 'sum',   # Total team goals
        'season_shots': 'sum',   # Total team shots
        'season_SoT': 'sum',     # Total team shots on target
        'season_minutes': 'sum', # Total team minutes
        'games_played': 'mean'   # Average games played (should be similar for all players)
    }).round(3)
    
    # Convert totals to per-game averages
    attacking_stats['avg_xG_for'] = attacking_stats['season_xG'] / attacking_stats['games_played']
    attacking_stats['avg_G_for'] = attacking_stats['season_goals'] / attacking_stats['games_played']
    attacking_stats['avg_shots_for'] = attacking_stats['season_shots'] / attacking_stats['games_played']
    attacking_stats['avg_SoT_for'] = attacking_stats['season_SoT'] / attacking_stats['games_played']
    
    # A full defensive picture would need opponent data from the gameweek rows.
    # season_stats has no opponent information, so this uses a simplified proxy:
    # goals conceded as recorded for each team's defensive players (GK + DEF).
    defensive_players = season_data[season_data['element_type'].isin([1, 2])]  # GK and DEF
    
    defensive_stats = defensive_players.groupby('team_name').agg({
        'season_GC': 'mean',     # Average goals conceded per defensive player
        'season_xGC': 'mean',    # Average xGC per defensive player  
        'games_played': 'mean'   # Average games played
    }).round(3)
    
    # Convert to per-game averages (rename for consistency)
    defensive_stats['avg_G_conceded'] = defensive_stats['season_GC'] / defensive_stats['games_played']
    defensive_stats['avg_xG_conceded'] = defensive_stats['season_xGC'] / defensive_stats['games_played']
    
    return attacking_stats, defensive_stats

def rank_fixtures_corrected(season_data: pd.DataFrame, upcoming_gameweeks: Optional[List[int]] = None) -> pd.DataFrame:
    """
    Analyze and rank fixtures based on attacking strength vs defensive weakness using season data.
    
    Args:
        season_data: DataFrame with cumulative season statistics
        upcoming_gameweeks: List of gameweek numbers to analyze (if None, uses next 3 GWs)
    
    Returns:
        DataFrame with ranked fixtures showing favorability scores
    """
    if upcoming_gameweeks is None:
        current_gw = season_data['last_gameweek'].max()
        upcoming_gameweeks = [current_gw + 1, current_gw + 2, current_gw + 3]
    
    # Get team statistics
    attacking_stats, defensive_stats = calculate_team_stats_corrected(season_data)
    
    # Build every home/away pairing of teams for each gameweek
    # (all combinations, not a real fixture list)
    teams = season_data['team_name'].unique()
    fixtures = []
    
    for gw in upcoming_gameweeks:
        for home_team in teams:
            for away_team in teams:
                if home_team != away_team:
                    fixtures.append({
                        'gameweek': gw,
                        'home_team': home_team,
                        'away_team': away_team,
                        'fixture': f"{home_team} vs {away_team}"
                    })
    
    fixture_df = pd.DataFrame(fixtures)
    
    # Add attacking stats for home team
    fixture_df = fixture_df.merge(
        attacking_stats[['avg_xG_for', 'avg_G_for', 'avg_shots_for', 'avg_SoT_for']], 
        left_on='home_team', 
        right_index=True, 
        how='left'
    )
    
    # Add defensive stats for away team
    fixture_df = fixture_df.merge(
        defensive_stats[['avg_xG_conceded', 'avg_G_conceded']], 
        left_on='away_team', 
        right_index=True, 
        how='left'
    )
    
    # Calculate favorability scores
    fixture_df['attacking_strength'] = (
        fixture_df['avg_xG_for'] * 0.4 + 
        fixture_df['avg_G_for'] * 0.3 + 
        fixture_df['avg_shots_for'] * 0.2 + 
        fixture_df['avg_SoT_for'] * 0.1
    )
    
    fixture_df['defensive_weakness'] = (
        fixture_df['avg_xG_conceded'] * 0.6 + 
        fixture_df['avg_G_conceded'] * 0.4
    )
    
    # Overall favorability score
    fixture_df['favorability_score'] = (
        fixture_df['attacking_strength'] * 0.6 + 
        fixture_df['defensive_weakness'] * 0.4
    )
    
    # Add difficulty rating
    fixture_df['difficulty_rating'] = pd.cut(
        fixture_df['favorability_score'], 
        bins=5, 
        labels=['Very Hard', 'Hard', 'Medium', 'Easy', 'Very Easy']
    )
    
    # Sort by favorability
    result = fixture_df.sort_values(['gameweek', 'favorability_score'], ascending=[True, False])
    
    output_cols = [
        'gameweek', 'fixture', 'home_team', 'away_team', 'favorability_score', 
        'difficulty_rating', 'attacking_strength', 'defensive_weakness',
        'avg_xG_for', 'avg_G_for', 'avg_xG_conceded', 'avg_G_conceded'
    ]
    
    return result[output_cols].round(3)

print("=== CORRECTED FIXTURE ANALYSIS FUNCTION CREATED ===")
print("Function: rank_fixtures_corrected(season_data, upcoming_gameweeks=None)")
print("Purpose: Identifies favorable fixtures using CUMULATIVE season statistics")
print("\nKey Changes:")
print("- Uses season_stats dataframe instead of gameweek data")
print("- Calculates team attacking/defensive strength from cumulative player stats")
print("- More accurate representation of team form over the season")

=== CORRECTED FIXTURE ANALYSIS FUNCTION CREATED ===
Function: rank_fixtures_corrected(season_data, upcoming_gameweeks=None)
Purpose: Identifies favorable fixtures using CUMULATIVE season statistics

Key Changes:
- Uses season_stats dataframe instead of gameweek data
- Calculates team attacking/defensive strength from cumulative player stats
- More accurate representation of team form over the season
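
To make the weighting in `rank_fixtures_corrected` concrete, here is a minimal sketch that applies the same weights to two made-up teams. The team names and all numbers are invented for illustration; only the column names and weights come from the function above.

```python
import pandas as pd

# Toy per-team averages (invented values, not taken from the dataset)
attack = pd.DataFrame(
    {"avg_xG_for": [2.0, 1.0], "avg_G_for": [2.0, 1.0],
     "avg_shots_for": [15.0, 10.0], "avg_SoT_for": [5.0, 3.0]},
    index=["Team A", "Team B"],
)
defence = pd.DataFrame(
    {"avg_xG_conceded": [1.0, 2.0], "avg_G_conceded": [1.0, 2.0]},
    index=["Team A", "Team B"],
)

# One row per hypothetical pairing: the home side attacks the away side's defence
pairs = pd.DataFrame({"home_team": ["Team A", "Team B"],
                      "away_team": ["Team B", "Team A"]})
pairs = pairs.merge(attack, left_on="home_team", right_index=True, how="left")
pairs = pairs.merge(defence, left_on="away_team", right_index=True, how="left")

# Same weights as rank_fixtures_corrected
pairs["attacking_strength"] = (
    pairs["avg_xG_for"] * 0.4 + pairs["avg_G_for"] * 0.3
    + pairs["avg_shots_for"] * 0.2 + pairs["avg_SoT_for"] * 0.1
)
pairs["defensive_weakness"] = (
    pairs["avg_xG_conceded"] * 0.6 + pairs["avg_G_conceded"] * 0.4
)
pairs["favorability_score"] = (
    pairs["attacking_strength"] * 0.6 + pairs["defensive_weakness"] * 0.4
)

print(pairs[["home_team", "away_team", "attacking_strength",
             "defensive_weakness", "favorability_score"]])
```

In this toy data the shots term contributes 3.0 of Team A's 4.9 attacking strength, because shots per game sit on a much larger scale than xG or goals. If that is not intended, normalising each input before weighting may be worth considering.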


In [259]:
def filter_defenders_corrected(season_data: pd.DataFrame, min_games: int = 3, top_n: int = 20) -> pd.DataFrame:
    """
    Rank defenders by clean sheet potential using cumulative season data.
    
    Args:
        season_data: DataFrame with cumulative season statistics
        min_games: Minimum games played to be considered
        top_n: Number of top defenders to return
    
    Returns:
        DataFrame with ranked defenders based on season performance
    """
    # Filter for defenders only
    defenders = season_data[season_data['element_type'] == 2].copy()
    
    # Filter by minimum games played
    defenders = defenders[defenders['games_played'] >= min_games]
    
    if len(defenders) == 0:
        return pd.DataFrame()
    
    # Calculate performance metrics
    defenders['clean_sheet_rate'] = (defenders['season_CS'] / defenders['games_played']).fillna(0)
    defenders['xCS_per_game'] = (defenders['season_xCS'] / defenders['games_played']).fillna(0)
    defenders['goals_conceded_per_game'] = (defenders['season_GC'] / defenders['games_played']).fillna(0)
    defenders['minutes_per_game'] = defenders['season_minutes'] / defenders['games_played']
    defenders['consistency_score'] = np.minimum(defenders['minutes_per_game'] / 90, 1)
    
    # Clean sheet potential score
    defenders['clean_sheet_potential'] = (
        defenders['xCS_per_game'] * 0.4 +
        defenders['clean_sheet_rate'] * 0.35 +
        (1 / (defenders['goals_conceded_per_game'] + 0.1)) * 0.15 +  # Lower goals conceded = better
        defenders['consistency_score'] * 0.1
    )
    
    # Value score
    defenders['value_score'] = defenders['season_points'] / defenders['now_cost']
    
    # Overall defender score  
    defenders['defender_score'] = (
        defenders['clean_sheet_potential'] * 0.6 +
        defenders['value_score'] * 0.25 +
        defenders['consistency_score'] * 0.15
    )
    
    # Sort by defender score
    result = defenders.sort_values('defender_score', ascending=False)
    
    # Select key columns
    output_cols = [
        'web_name', 'team_name', 'now_cost', 'selected_by_percent',
        'defender_score', 'clean_sheet_potential', 'value_score', 'consistency_score',
        'games_played', 'clean_sheet_rate', 'xCS_per_game', 'goals_conceded_per_game',
        'season_points', 'season_minutes', 'season_CS', 'season_xCS'
    ]
    
    return result[output_cols].head(top_n).round(3)

def filter_attackers_corrected(season_data: pd.DataFrame, min_games: int = 3, top_n: int = 20, positions: Optional[List[int]] = None) -> pd.DataFrame:
    """
    Rank attackers using cumulative season data.
    
    Args:
        season_data: DataFrame with cumulative season statistics
        min_games: Minimum games played to be considered
        top_n: Number of top attackers to return
        positions: List of position types to include (3=Midfielder, 4=Forward); defaults to [3, 4]
    
    Returns:
        DataFrame with ranked attackers based on season performance
    """
    # Avoid a mutable default argument: fall back to midfielders and forwards
    if positions is None:
        positions = [3, 4]
    
    # Filter for attackers
    attackers = season_data[season_data['element_type'].isin(positions)].copy()
    
    # Filter by minimum games
    attackers = attackers[attackers['games_played'] >= min_games]
    
    if len(attackers) == 0:
        return pd.DataFrame()
    
    # Calculate performance metrics
    attackers['goals_per_game'] = (attackers['season_goals'] / attackers['games_played']).fillna(0)
    attackers['assists_per_game'] = (attackers['season_assists'] / attackers['games_played']).fillna(0)
    attackers['xG_per_game'] = (attackers['season_xG'] / attackers['games_played']).fillna(0)
    attackers['xA_per_game'] = (attackers['season_xA'] / attackers['games_played']).fillna(0)
    attackers['shots_per_game'] = (attackers['season_shots'] / attackers['games_played']).fillna(0)
    attackers['SoT_per_game'] = (attackers['season_SoT'] / attackers['games_played']).fillna(0)
    attackers['SiB_per_game'] = (attackers['season_SiB'] / attackers['games_played']).fillna(0)
    attackers['minutes_per_game'] = attackers['season_minutes'] / attackers['games_played']
    
    # Attacking threat score
    attackers['attacking_threat'] = (
        attackers['xG_per_game'] * 0.3 +
        attackers['xA_per_game'] * 0.25 +
        attackers['goals_per_game'] * 0.2 +
        attackers['assists_per_game'] * 0.15 +
        attackers['SoT_per_game'] * 0.05 +
        attackers['SiB_per_game'] * 0.05
    )
    
    # Consistency score
    attackers['consistency_score'] = np.minimum(attackers['minutes_per_game'] / 90, 1)
    
    # Value score
    attackers['value_score'] = attackers['season_points'] / attackers['now_cost']
    
    # Overall attacker score
    attackers['attacker_score'] = (
        attackers['attacking_threat'] * 0.6 +
        attackers['value_score'] * 0.25 +
        attackers['consistency_score'] * 0.15
    )
    
    # Sort by attacker score
    result = attackers.sort_values('attacker_score', ascending=False)
    
    # Select key columns
    output_cols = [
        'web_name', 'team_name', 'position_name', 'now_cost', 'selected_by_percent',
        'attacker_score', 'attacking_threat', 'value_score', 'consistency_score',
        'games_played', 'goals_per_game', 'assists_per_game', 'xG_per_game', 'xA_per_game',
        'shots_per_game', 'SoT_per_game', 'season_points', 'season_minutes'
    ]
    
    return result[output_cols].head(top_n).round(3)

print("=== CORRECTED DEFENDER & ATTACKER FILTERING FUNCTIONS CREATED ===")
print("Functions: filter_defenders_corrected() & filter_attackers_corrected()")
print("Purpose: Rank players using CUMULATIVE season statistics")
print("\nKey Changes:")
print("- Uses season_stats dataframe with cumulative data")
print("- Changed min_minutes to min_games for more intuitive filtering")
print("- Calculates per-game averages from season totals")
print("- More accurate player performance assessment")

=== CORRECTED DEFENDER & ATTACKER FILTERING FUNCTIONS CREATED ===
Functions: filter_defenders_corrected() & filter_attackers_corrected()
Purpose: Rank players using CUMULATIVE season statistics

Key Changes:
- Uses season_stats dataframe with cumulative data
- Changed min_minutes to min_games for more intuitive filtering
- Calculates per-game averages from season totals
- More accurate player performance assessment
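
The `clean_sheet_potential` formula mixes rates between 0 and 1 with an inverse goals-conceded term, so it can help to see the individual contributions. The sketch below reproduces the formula on two invented defenders; the names and numbers are hypothetical and `inverse_gc_term` is a helper column added only for this illustration.

```python
import numpy as np
import pandas as pd

# Two hypothetical defenders with invented season totals
toy = pd.DataFrame({
    "web_name": ["Defender X", "Defender Y"],
    "games_played": [4, 4],
    "season_CS": [2, 0],
    "season_xCS": [1.6, 0.8],
    "season_GC": [2, 8],
    "season_minutes": [360, 200],
})

toy["clean_sheet_rate"] = toy["season_CS"] / toy["games_played"]
toy["xCS_per_game"] = toy["season_xCS"] / toy["games_played"]
toy["goals_conceded_per_game"] = toy["season_GC"] / toy["games_played"]
toy["consistency_score"] = np.minimum(
    toy["season_minutes"] / toy["games_played"] / 90, 1
)

# Inverse term from filter_defenders_corrected, kept separate to inspect its size
toy["inverse_gc_term"] = 1 / (toy["goals_conceded_per_game"] + 0.1) * 0.15

toy["clean_sheet_potential"] = (
    toy["xCS_per_game"] * 0.4
    + toy["clean_sheet_rate"] * 0.35
    + toy["inverse_gc_term"]
    + toy["consistency_score"] * 0.1
)

print(toy[["web_name", "inverse_gc_term", "clean_sheet_potential"]].round(3))
```

For Defender X the inverse term (0.25) is the largest single contribution to the total of 0.685. Because `1 / (x + 0.1)` grows quickly as goals conceded approach zero, this term can reach 1.5 for a defender who has conceded nothing, which outweighs every other component.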



# 7️⃣ Feature 2: Ranking Leaderboard


In [260]:
# 🏆 SOLUTION 1: Team Strength Rankings
print("="*70)
print("📊 TEAM STRENGTH RANKINGS (Based on Season Performance)")
print("="*70)
print("💡 Use these rankings to assess fixture difficulty manually")

def create_team_strength_rankings(season_data: pd.DataFrame) -> pd.DataFrame:
    """
    Create team strength rankings based on season performance.
    Users can use this to manually assess fixture difficulty.
    """
    # Calculate team stats from player data
    attacking_stats = season_data.groupby('team_name').agg({
        'season_goals': 'sum',
        'season_xG': 'sum', 
        'season_shots': 'sum',
        'season_SoT': 'sum',
        'games_played': 'mean'
    }).round(2)
    
    # FIXED: Only include teams that have defensive players (GK/DEF) in the data
    defensive_players = season_data[season_data['element_type'].isin([1, 2])]
    
    if len(defensive_players) == 0:
        print("⚠️ Warning: No defensive players found in dataset")
        # Create dummy defensive stats if no defensive players
        defensive_stats = pd.DataFrame(index=attacking_stats.index)
        defensive_stats['season_CS'] = 0
        defensive_stats['season_xCS'] = 0  
        defensive_stats['season_GC'] = 2.0  # Assume average goals conceded
        defensive_stats['season_xGC'] = 2.0
        defensive_stats['games_played'] = attacking_stats['games_played']
    else:
        defensive_stats = defensive_players.groupby('team_name').agg({
            'season_CS': 'mean',
            'season_xCS': 'mean',
            'season_GC': 'mean',
            'season_xGC': 'mean',
            'games_played': 'mean'
        }).round(2)
    
    # Convert to per-game averages
    attacking_stats['goals_per_game'] = attacking_stats['season_goals'] / attacking_stats['games_played']
    attacking_stats['xG_per_game'] = attacking_stats['season_xG'] / attacking_stats['games_played']
    attacking_stats['shots_per_game'] = attacking_stats['season_shots'] / attacking_stats['games_played']
    
    defensive_stats['CS_rate'] = defensive_stats['season_CS'] / defensive_stats['games_played']
    defensive_stats['GC_per_game'] = defensive_stats['season_GC'] / defensive_stats['games_played']
    
    # Calculate strength scores
    attacking_stats['attack_strength'] = (
        attacking_stats['xG_per_game'] * 0.4 +
        attacking_stats['goals_per_game'] * 0.3 +
        attacking_stats['shots_per_game'] * 0.3
    )
    
    defensive_stats['defense_strength'] = (
        defensive_stats['CS_rate'] * 0.4 +
        (1 / (defensive_stats['GC_per_game'] + 0.1)) * 0.6  # Lower goals conceded = stronger
    )
    
    # FIXED: Left join keeps every team with attacking stats; missing defensive values are handled below
    team_rankings = attacking_stats[['attack_strength']].join(
        defensive_stats[['defense_strength']], how='left'  # Left join to keep all attacking teams
    )
    
    # IMPROVED: Handle missing defensive data more intelligently
    missing_defense_teams = team_rankings[team_rankings['defense_strength'].isna()].index
    if len(missing_defense_teams) > 0:
        print(f"⚠️ Teams without defensive data: {list(missing_defense_teams)}")
        # Instead of filling with 0, use the median defensive strength
        median_defense = team_rankings['defense_strength'].median()
        if pd.isna(median_defense):  # If all teams missing defensive data
            median_defense = 2.0  # Default moderate defensive strength
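        # Assign the result back instead of using inplace=True on a column selection,
        # which is deprecated in recent pandas and may silently leave the frame unchanged
        team_rankings['defense_strength'] = team_rankings['defense_strength'].fillna(median_defense)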
        print(f"📊 Filled missing defensive strength with median: {median_defense:.3f}")
    
    # Calculate overall strength
    team_rankings['overall_strength'] = (
        team_rankings['attack_strength'] * 0.6 + 
        team_rankings['defense_strength'] * 0.4
    )
    
    # Add rankings
    team_rankings['attack_rank'] = team_rankings['attack_strength'].rank(ascending=False, method='dense').astype(int)
    team_rankings['defense_rank'] = team_rankings['defense_strength'].rank(ascending=False, method='dense').astype(int)
    team_rankings['overall_rank'] = team_rankings['overall_strength'].rank(ascending=False, method='dense').astype(int)
    
    return team_rankings.round(3)

# Create team rankings
team_rankings = create_team_strength_rankings(season_stats)
team_rankings_sorted = team_rankings.sort_values('overall_rank')

print("🏆 TEAM STRENGTH RANKINGS (Season Performance)")
print("=" * 60)
print("Overall Team Rankings (All 20 Teams):")
print(team_rankings_sorted[['overall_rank', 'attack_rank', 'defense_rank', 'overall_strength', 'attack_strength', 'defense_strength']].to_string())

print(f"\n⚽ TOP 8 ATTACKING TEAMS:")
attack_rankings = team_rankings.sort_values('attack_rank').head(8)
for idx, (team, data) in enumerate(attack_rankings.iterrows(), 1):
    print(f"{int(data['attack_rank']):2d}. {team:<15} (Attack: {data['attack_strength']:.3f})")

print(f"\n🛡️ TOP 8 DEFENSIVE TEAMS:")
defense_rankings = team_rankings.sort_values('defense_rank').head(8)
for idx, (team, data) in enumerate(defense_rankings.iterrows(), 1):
    print(f"{int(data['defense_rank']):2d}. {team:<15} (Defense: {data['defense_strength']:.3f})")

📊 TEAM STRENGTH RANKINGS (Based on Season Performance)
💡 Use these rankings to assess fixture difficulty manually
🏆 TEAM STRENGTH RANKINGS (Season Performance)
Overall Team Rankings (All 20 Teams):
                overall_rank  attack_rank  defense_rank  overall_strength  attack_strength  defense_strength
team_name                                                                                                   
Arsenal                    1            3             1             4.004            5.872             1.201
Liverpool                  2            1            10             3.885            6.067             0.612
Man Utd                    3            2            18             3.754            6.052             0.308
Man City                   4            4             7             3.599            5.539             0.689
Crystal Palace             5            7             2             3.530            5.092             1.188
Chelsea                    6           
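
The rank columns above use `method='dense'`. A small sketch on invented strengths shows how that differs from `method='min'` when two teams tie; the team labels and values are hypothetical.

```python
import pandas as pd

# Invented strength values with a tie at the top
strength = pd.Series({"Team A": 5.0, "Team B": 5.0, "Team C": 4.0, "Team D": 3.0})

# Dense ranking: tied teams share a rank and the next rank is not skipped
dense = strength.rank(ascending=False, method="dense").astype(int)

# Min ranking: tied teams share a rank and the following ranks are skipped
minimum = strength.rank(ascending=False, method="min").astype(int)

print(pd.DataFrame({"strength": strength, "dense": dense, "min": minimum}))
```

With dense ranking the worst rank can be smaller than the number of teams (3 rather than 4 here). That is worth keeping in mind wherever rank differences are later scaled by the total team count.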

In [262]:
# SOLUTION 3: Player Recommendations for Team Matchups
print("\n" + "="*80)
print("=== SOLUTION 3: PLAYER RECOMMENDATIONS BY OPPONENT STRENGTH ===")
print("Find best players against weak defenses/attacks")

def get_players_for_matchup(team: str, opponent_type: str, season_data: pd.DataFrame, 
                           team_rankings: pd.DataFrame, top_n: int = 8) -> pd.DataFrame:
    """
    Get player recommendations based on opponent strength.
    
    Args:
        team: Team name to get players from
        opponent_type: 'weak_defense' for attackers, 'weak_attack' for defenders
        season_data: Player season statistics
        team_rankings: Team strength rankings
        top_n: Number of players to return
    """
    team_players = season_data[season_data['team_name'] == team].copy()
    
    if len(team_players) == 0:
        return pd.DataFrame()
    
    if opponent_type == 'weak_defense':
        # Get attacking players when facing weak defenses
        attackers = team_players[team_players['element_type'].isin([3, 4])]  # MID + FWD
        # Copy the filtered slice so the new column below does not trigger SettingWithCopyWarning
        attackers = attackers[attackers['games_played'] >= 3].copy()
        
        if len(attackers) == 0:
            return pd.DataFrame()
            
        # Score based on attacking output and value
        attackers['matchup_score'] = (
            attackers['goals_per_game'] * 3 +
            attackers['assists_per_game'] * 2 +
            attackers['points_per_game'] * 0.5 +
            attackers['points_per_million'] * 0.3
        )
        
        result = attackers.sort_values('matchup_score', ascending=False).head(top_n)
        return result[['web_name', 'position_name', 'season_points', 'now_cost', 
                      'goals_per_game', 'assists_per_game', 'points_per_game', 
                      'points_per_million', 'matchup_score']].round(2)
        
    else:  # weak_attack
        # Get defensive players when facing weak attacks
        defenders = team_players[team_players['element_type'].isin([1, 2])]  # GK + DEF
        # Copy the filtered slice so the new columns below do not trigger SettingWithCopyWarning
        defenders = defenders[defenders['games_played'] >= 3].copy()
        
        if len(defenders) == 0:
            return pd.DataFrame()
            
        # Score based on clean sheet potential and value
        defenders['clean_sheet_rate'] = defenders['season_CS'] / defenders['games_played']
        defenders['matchup_score'] = (
            defenders['clean_sheet_rate'] * 4 +
            defenders['points_per_game'] * 0.6 +
            defenders['points_per_million'] * 0.4
        )
        
        result = defenders.sort_values('matchup_score', ascending=False).head(top_n)
        return result[['web_name', 'position_name', 'season_points', 'now_cost',
                      'clean_sheet_rate', 'points_per_game', 'points_per_million', 
                      'matchup_score']].round(2)

# Find teams with weak defenses (good for attacking players)
weak_defenses = team_rankings.sort_values('defense_rank', ascending=False).head(8)
print("🎯 TEAMS WITH WEAK DEFENSES (Target for Attackers):")
print("=" * 55)
for team in weak_defenses.index:
    defense_rank = int(weak_defenses.loc[team, 'defense_rank'])
    defense_strength = weak_defenses.loc[team, 'defense_strength']
    print(f"{defense_rank:2d}. {team:<15} (Defense: {defense_strength:.3f})")

# Find teams with weak attacks (good for defenders)
weak_attacks = team_rankings.sort_values('attack_rank', ascending=False).head(8)
print(f"\n🛡️ TEAMS WITH WEAK ATTACKS (Good for Defenders):")
print("=" * 50)
for team in weak_attacks.index:
    attack_rank = int(weak_attacks.loc[team, 'attack_rank'])
    attack_strength = weak_attacks.loc[team, 'attack_strength']
    print(f"{attack_rank:2d}. {team:<15} (Attack: {attack_strength:.3f})")

# SHOW ALL TEAMS: Complete attacking rankings with player recommendations
print(f"\n⚽ ATTACKING PICKS FROM ALL TEAMS (Sorted by Attack Rank):")
print("=" * 60)
all_attacking_teams = team_rankings.sort_values('attack_rank')  # ALL teams sorted by attack rank

for idx, (team, data) in enumerate(all_attacking_teams.iterrows()):
    if team in season_stats['team_name'].values:
        attack_rank = int(data['attack_rank'])
        attack_strength = data['attack_strength']
        
        attackers = get_players_for_matchup(team, 'weak_defense', season_stats, team_rankings, 3)
        if not attackers.empty:
            print(f"\n🔴 {team} (#{attack_rank} Attack, Strength: {attack_strength:.3f}):")
            print(attackers[['web_name', 'position_name', 'now_cost', 'goals_per_game', 'assists_per_game', 'points_per_game']].to_string(index=False))
        else:
            print(f"\n🔴 {team} (#{attack_rank} Attack, Strength: {attack_strength:.3f}): No attacking players found")

# SHOW ALL TEAMS: Complete defensive rankings with player recommendations  
print(f"\n🛡️ DEFENSIVE PICKS FROM ALL TEAMS (Sorted by Defense Rank):")
print("=" * 60)
all_defensive_teams = team_rankings.sort_values('defense_rank')  # ALL teams sorted by defense rank

for idx, (team, data) in enumerate(all_defensive_teams.iterrows()):
    if team in season_stats['team_name'].values:
        defense_rank = int(data['defense_rank'])
        defense_strength = data['defense_strength']
        
        defenders = get_players_for_matchup(team, 'weak_attack', season_stats, team_rankings, 3)
        if not defenders.empty:
            print(f"\n🔵 {team} (#{defense_rank} Defense, Strength: {defense_strength:.3f}):")
            print(defenders[['web_name', 'position_name', 'now_cost', 'clean_sheet_rate', 'points_per_game', 'points_per_million']].to_string(index=False))
        else:
            print(f"\n🔵 {team} (#{defense_rank} Defense, Strength: {defense_strength:.3f}): No defensive players found")


=== SOLUTION 3: PLAYER RECOMMENDATIONS BY OPPONENT STRENGTH ===
Find best players against weak defenses/attacks
🎯 TEAMS WITH WEAK DEFENSES (Target for Attackers):
20. Wolves          (Defense: 0.266)
19. West Ham        (Defense: 0.299)
18. Man Utd         (Defense: 0.308)
17. Burnley         (Defense: 0.331)
16. Nott'm Forest   (Defense: 0.339)
15. Brighton        (Defense: 0.376)
14. Brentford       (Defense: 0.377)
13. Fulham          (Defense: 0.485)

🛡️ TEAMS WITH WEAK ATTACKS (Good for Defenders):
20. Burnley         (Attack: 3.153)
19. Brentford       (Attack: 3.674)
18. Wolves          (Attack: 3.726)
17. Newcastle       (Attack: 3.784)
16. Aston Villa     (Attack: 3.813)
15. Fulham          (Attack: 3.923)
14. West Ham        (Attack: 3.991)
13. Sunderland      (Attack: 4.003)

⚽ ATTACKING PICKS FROM ALL TEAMS (Sorted by Attack Rank):

🔴 Liverpool (#1 Attack, Strength: 6.067):
   web_name position_name  now_cost  goals_per_game  assists_per_game  points_per_game
Gravenberch  

# 🔮 ENHANCED FIXTURE ANALYZER - SEASON-WIDE ANALYSIS

**A season-wide fixture analysis system that provides:**
- ✅ **Complete season fixture analysis** (all gameweeks)
- ✅ **Visual difficulty heatmaps** 
- ✅ **Strategic transfer timing recommendations**
- ✅ **Advanced team matchup intelligence**
- ✅ **Position-specific fixture insights**

This system reuses the team rankings computed earlier, so its difficulty scores stay consistent with the existing analysis while covering the full fixture list for long-term FPL planning.

In [263]:
# =============================================================================
# 🔮 ENHANCED FIXTURE ANALYZER CLASS
# =============================================================================

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime

class EnhancedFixtureAnalyzer:
    """
    Advanced fixture analysis system for complete season planning
    
    Features:
    - Season-wide fixture difficulty analysis
    - Visual heatmaps and charts
    - Strategic transfer timing recommendations
    - Position-specific insights
    - Team matchup intelligence
    """
    
    def __init__(self, season_stats, team_rankings, fixtures_path='fixture_template.csv'):
        """Initialize with your existing data"""
        self.season_stats = season_stats
        self.team_rankings = team_rankings  # Use the SAME rankings as Basic System
        self.fixtures_df = pd.read_csv(fixtures_path)
        self._process_data()
        
    def _process_data(self):
        """Process the data and create team mappings"""
        # Map fixture team names to the team names used in season_stats / team_rankings
        self._map_team_names()
    
    def _map_team_names(self):
        """Map fixture team names to season_stats team names"""
        fixture_teams = set(self.fixtures_df['home_team'].unique()) | set(self.fixtures_df['away_team'].unique())
        season_teams = set(self.season_stats['team_name'].unique())
        
        self.team_mapping = {}
        
        for fixture_team in fixture_teams:
            # Try exact match first
            if fixture_team in season_teams:
                self.team_mapping[fixture_team] = fixture_team
                continue
            
            # Try partial matching
            best_match = None
            for season_team in season_teams:
                if (fixture_team.lower().replace(' ', '') in season_team.lower().replace(' ', '') or
                    season_team.lower().replace(' ', '') in fixture_team.lower().replace(' ', '')):
                    best_match = season_team
                    break
            
            if best_match:
                self.team_mapping[fixture_team] = best_match
            else:
                # Create a default mapping for missing teams
                self.team_mapping[fixture_team] = fixture_team
                print(f"⚠️ Could not match '{fixture_team}' - using default mapping")
                
    def get_fixture_difficulty_matrix(self, start_gw=None, end_gw=None, home_advantage=0.7):
        """Create fixture difficulty matrix using BASIC SYSTEM calculation method"""
        if start_gw is None:
            start_gw = self.fixtures_df['gameweek'].min()
        if end_gw is None:
            end_gw = self.fixtures_df['gameweek'].max()
            
        fixtures_period = self.fixtures_df[
            (self.fixtures_df['gameweek'] >= start_gw) & 
            (self.fixtures_df['gameweek'] <= end_gw)
        ].copy()
        
        # Add difficulty scores using BASIC SYSTEM method
        difficulties = []
        total_teams = len(self.team_rankings)
        
        for _, fixture in fixtures_period.iterrows():
            home_team = self.team_mapping.get(fixture['home_team'], fixture['home_team'])
            away_team = self.team_mapping.get(fixture['away_team'], fixture['away_team'])
            
            if home_team in self.team_rankings.index and away_team in self.team_rankings.index:
                # Get team stats
                home_stats = self.team_rankings.loc[home_team]
                away_stats = self.team_rankings.loc[away_team]
                
                # ATTACKING CALCULATION (same as Basic System)
                home_attack_rank = int(home_stats['attack_rank'])
                away_defense_rank = int(away_stats['defense_rank'])
                
                # Apply home advantage by improving the home rank
                # (default shift of 0.7, reduced from 1.0, and never better than rank 1)
                if home_advantage > 0 and home_attack_rank > 1:
                    home_attack_rank = max(1, home_attack_rank - home_advantage)
                
                # Calculate favorability with scaling (same as Basic System)
                attack_rank_difference = away_defense_rank - home_attack_rank
                attack_difficulty = attack_rank_difference / total_teams * 10
                
                # DEFENSIVE CALCULATION (same as Basic System)
                home_defense_rank = int(home_stats['defense_rank'])
                away_attack_rank = int(away_stats['attack_rank'])
                
                # Apply the same home-advantage shift to the home defensive rank
                if home_advantage > 0 and home_defense_rank > 1:
                    home_defense_rank = max(1, home_defense_rank - home_advantage)
                
                # Calculate favorability with scaling
                defense_rank_difference = away_attack_rank - home_defense_rank
                defense_difficulty = defense_rank_difference / total_teams * 10
                
                difficulties.append({
                    'gameweek': fixture['gameweek'],
                    'home_team': fixture['home_team'],
                    'away_team': fixture['away_team'],
                    'mapped_home': home_team,
                    'mapped_away': away_team,
                    'attack_difficulty': attack_difficulty,
                    'defense_difficulty': defense_difficulty,
                    'overall_difficulty': (attack_difficulty + defense_difficulty) / 2,
                    # Additional data for debugging
                    'home_attack_rank': home_attack_rank,
                    'away_defense_rank': away_defense_rank,
                    'home_defense_rank': home_defense_rank,
                    'away_attack_rank': away_attack_rank,
                    'attack_rank_diff': attack_rank_difference,
                    'defense_rank_diff': defense_rank_difference
                })
        
        return pd.DataFrame(difficulties)

print("✅ Enhanced Fixture Analyzer Class Loaded!")
print("📊 Ready to analyze complete season fixture data")

✅ Enhanced Fixture Analyzer Class Loaded!
📊 Ready to analyze complete season fixture data
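
The rank-difference scoring inside `get_fixture_difficulty_matrix` can be isolated in a few lines. The helper below is a hypothetical standalone sketch (the name `rank_difference_score` is not part of the class) that mirrors the same arithmetic: shift the home rank by the home advantage, subtract it from the opposing rank, and scale by the number of teams.

```python
def rank_difference_score(home_rank: float, away_rank: int,
                          total_teams: int = 20,
                          home_advantage: float = 0.7) -> float:
    """Hypothetical helper mirroring the class arithmetic.

    Positive scores mean the home side's unit is ranked better than the
    opposing unit it faces, i.e. a more favourable fixture.
    """
    # Improve the home rank by the home advantage, but never beyond rank 1
    if home_advantage > 0 and home_rank > 1:
        home_rank = max(1, home_rank - home_advantage)
    # Scale the rank gap to roughly the -10..+10 range
    return (away_rank - home_rank) / total_teams * 10


# 3rd-ranked attack at home against the 18th-ranked defence
print(round(rank_difference_score(3, 18), 2))

# 18th-ranked attack at home against the 3rd-ranked defence
print(round(rank_difference_score(18, 3), 2))
```

Although the columns are named `*_difficulty`, a higher value means an easier fixture for the home side, which matches the thresholds used later when labelling fixtures from "Very Easy" down to "Very Hard".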


In [264]:
# =============================================================================
# 🔮 ENHANCED ANALYZER METHODS - ANALYSIS FUNCTIONS
# =============================================================================

# Add analysis methods to the EnhancedFixtureAnalyzer class
def analyze_gameweek(self, gw):
    """Analyze specific gameweek with detailed insights - HOME & AWAY perspective"""
    gw_fixtures = self.fixtures_df[self.fixtures_df['gameweek'] == gw].copy()
    
    print(f"⚽ GAMEWEEK {gw} ENHANCED FIXTURE ANALYSIS")
    print("=" * 70)
    print(f"📅 {len(gw_fixtures)} fixtures scheduled")
    
    if gw_fixtures.empty:
        print("❌ No fixture data available for this gameweek")
        return
    
    # Get difficulty matrices for both home and away perspectives
    home_difficulty = self.get_fixture_difficulty_matrix(gw, gw)
    
    if home_difficulty.empty:
        print("❌ No difficulty data available for analysis")
        return
    
    # Categorize difficulty (using BASIC SYSTEM categories)
    def get_difficulty_text(score):
        if score >= 4.0: return "🟢 Very Easy"
        elif score >= 2.5: return "🟡 Easy" 
        elif score >= 1.0: return "🟠 Medium-Easy"
        elif score >= -0.5: return "⚪ Medium"
        elif score >= -2.0: return "🔴 Hard"
        else: return "⚫ Very Hard"
    
    def get_recommendation(score, team_name, position):
        if team_name:  # For quick picks section
            if score >= 2.5:
                return f"🔥 Target {team_name} {position}"
            elif score >= 1.0:
                return f"⭐ Consider {team_name} {position}"
            elif score <= -2.0:
                return f"❌ Avoid {team_name} {position}"
            else:
                return f"⚪ Average {team_name} {position}"
        else:  # For detailed display section
            if score >= 2.5:
                return "Strong Pick 🔥"
            elif score >= 1.0:
                return "Good Pick ⭐"
            elif score <= -2.0:
                return "Strong Avoid ❌"
            else:
                return "Average"
    
    for i, (_, fixture) in enumerate(home_difficulty.iterrows(), 1):
        home_team = fixture['home_team']
        away_team = fixture['away_team']
        
        # HOME TEAM perspective
        home_att_diff = fixture['attack_difficulty']
        home_def_diff = fixture['defense_difficulty']
        
        # AWAY TEAM perspective: recomputed from team rankings with the roles
        # swapped, because the matrix above only holds the home side's scores
        mapped_home = fixture['mapped_home']
        mapped_away = fixture['mapped_away']
        
        if mapped_away in self.team_rankings.index and mapped_home in self.team_rankings.index:
            # Calculate away team difficulty (as if they were home, but without home advantage)
            away_stats = self.team_rankings.loc[mapped_away]
            home_stats = self.team_rankings.loc[mapped_home]
            total_teams = len(self.team_rankings)
            
            # Away attacking (vs home defense)
            away_attack_rank = int(away_stats['attack_rank'])
            home_defense_rank = int(home_stats['defense_rank'])
            away_att_rank_diff = home_defense_rank - away_attack_rank
            away_att_diff = away_att_rank_diff / total_teams * 10
            
            # Away defending (vs home attack)
            away_defense_rank = int(away_stats['defense_rank'])
            home_attack_rank = int(home_stats['attack_rank'])
            away_def_rank_diff = home_attack_rank - away_defense_rank
            away_def_diff = away_def_rank_diff / total_teams * 10
        else:
            away_att_diff = 0
            away_def_diff = 0
        
        print(f"\n{i}. 🏟️ {home_team.upper()} vs {away_team.upper()}")
        print("-" * 60)
        
        # Display both teams side by side (like Basic System)
        print(f"🏠 {home_team.upper()[:12]:12} (HOME) |  ✈️  {away_team.upper()[:12]:12} (AWAY)")
        print("-" * 28 + "|" + "-" * 28)
        
        # Get team ranks for display
        if mapped_home in self.team_rankings.index and mapped_away in self.team_rankings.index:
            home_att_rank = int(self.team_rankings.loc[mapped_home, 'attack_rank'])
            home_def_rank = int(self.team_rankings.loc[mapped_home, 'defense_rank'])
            away_att_rank = int(self.team_rankings.loc[mapped_away, 'attack_rank'])
            away_def_rank = int(self.team_rankings.loc[mapped_away, 'defense_rank'])
            
            # Show attacking analysis with ranks (like Basic System)
            home_att_text = get_difficulty_text(home_att_diff)
            away_att_text = get_difficulty_text(away_att_diff)
            home_att_analysis = f"ATT#{home_att_rank} vs DEF#{away_def_rank}"
            away_att_analysis = f"ATT#{away_att_rank} vs DEF#{home_def_rank}"
            
            print(f"⚔️  {home_att_analysis:15} {home_att_text:10} | ⚔️  {away_att_analysis:15} {away_att_text:10}")
            print(f"   {get_recommendation(home_att_diff, '', 'attackers'):15} ({home_att_diff:+4.1f}) | {get_recommendation(away_att_diff, '', 'attackers'):15} ({away_att_diff:+4.1f})")
            
            # Show defensive analysis with ranks
            home_def_text = get_difficulty_text(home_def_diff)
            away_def_text = get_difficulty_text(away_def_diff)
            home_def_analysis = f"DEF#{home_def_rank} vs ATT#{away_att_rank}"
            away_def_analysis = f"DEF#{away_def_rank} vs ATT#{home_att_rank}"
            
            print(f"🛡️  {home_def_analysis:15} {home_def_text:10} | 🛡️  {away_def_analysis:15} {away_def_text:10}")
            print(f"   {get_recommendation(home_def_diff, '', 'defenders'):15} ({home_def_diff:+4.1f}) | {get_recommendation(away_def_diff, '', 'defenders'):15} ({away_def_diff:+4.1f})")
        else:
            # Fallback if ranks not available
            home_att_text = get_difficulty_text(home_att_diff)
            away_att_text = get_difficulty_text(away_att_diff)
            print(f"⚔️  {home_att_text:15} ({home_att_diff:+4.1f}) | ⚔️  {away_att_text:15} ({away_att_diff:+4.1f})")
            
            home_def_text = get_difficulty_text(home_def_diff)
            away_def_text = get_difficulty_text(away_def_diff)
            print(f"🛡️  {home_def_text:15} ({home_def_diff:+4.1f}) | 🛡️  {away_def_text:15} ({away_def_diff:+4.1f})")
        
        # Quick recommendations
        print(f"\n💡 QUICK PICKS:")
        
        # Best attacking opportunity
        if home_att_diff > away_att_diff:
            print(f"⚔️  ATTACK: {get_recommendation(home_att_diff, home_team, 'attackers')}")
        elif away_att_diff > home_att_diff:
            print(f"⚔️  ATTACK: {get_recommendation(away_att_diff, away_team, 'attackers')}")
        else:
            print(f"⚔️  ATTACK: Both teams similar - Average picks")
        
        # Best defensive opportunity
        if home_def_diff > away_def_diff:
            print(f"🛡️  DEFENSE: {get_recommendation(home_def_diff, home_team, 'defenders/GK')}")
        elif away_def_diff > home_def_diff:
            print(f"🛡️  DEFENSE: {get_recommendation(away_def_diff, away_team, 'defenders/GK')}")
        else:
            print(f"🛡️  DEFENSE: Both teams similar - Average picks")

def get_best_fixtures(self, position_type='attack', num_gameweeks=3):
    """Print the 10 best fixtures over the first num_gameweeks gameweeks in the fixture data"""
    current_gw = self.fixtures_df['gameweek'].min()
    end_gw = min(current_gw + num_gameweeks - 1, self.fixtures_df['gameweek'].max())
    
    difficulty_matrix = self.get_fixture_difficulty_matrix(current_gw, end_gw)
    
    if position_type == 'attack':
        best_fixtures = difficulty_matrix.nlargest(10, 'attack_difficulty')
        print(f"🎯 BEST ATTACKING FIXTURES (GW{current_gw}-{end_gw})")
        score_col = 'attack_difficulty'
    else:
        best_fixtures = difficulty_matrix.nlargest(10, 'defense_difficulty')
        print(f"🛡️ BEST DEFENSIVE FIXTURES (GW{current_gw}-{end_gw})")
        score_col = 'defense_difficulty'
    
    print("=" * 60)
    
    if best_fixtures.empty:
        print("❌ No fixture data available")
        return
        
    for i, (_, fixture) in enumerate(best_fixtures.iterrows(), 1):
        score = fixture[score_col]
        gw = fixture['gameweek']
        matchup = f"{fixture['home_team']} vs {fixture['away_team']}"
        
        # Add recommendation level (using BASIC SYSTEM thresholds)
        if score >= 4.0:
            level = "🔥 VERY EASY"
        elif score >= 2.5:
            level = "⭐ EASY"
        elif score >= 1.0:
            level = "✅ MEDIUM-EASY"
        elif score >= -0.5:
            level = "⚪ MEDIUM"
        elif score >= -2.0:
            level = "⚠️ HARD"
        else:
            level = "❌ VERY HARD"
            
        print(f"{i:2d}. GW{gw:2d}: {matchup:<30} | {level} ({score:+.1f})")

def create_team_difficulty_summary(self):
    """Create summary of difficulty for each team"""
    all_difficulties = self.get_fixture_difficulty_matrix()
    
    if all_difficulties.empty:
        print("❌ No fixture difficulty data available")
        return
    
    # Aggregate by home team only: each matrix row holds the home side's scores,
    # so away fixtures are not part of these averages
    team_summary = []
    
    fixture_teams = set(all_difficulties['home_team'].unique())
    
    for team in fixture_teams:
        home_fixtures = all_difficulties[all_difficulties['home_team'] == team]
        
        if len(home_fixtures) == 0:
            continue
            
        avg_attack_diff = home_fixtures['attack_difficulty'].mean()
        avg_defense_diff = home_fixtures['defense_difficulty'].mean()
        num_fixtures = len(home_fixtures)
        
        team_summary.append({
            'team': team,
            'avg_attack_difficulty': avg_attack_diff,
            'avg_defense_difficulty': avg_defense_diff,
            'num_fixtures': num_fixtures,
            'overall_difficulty': (avg_attack_diff + avg_defense_diff) / 2
        })
    
    if not team_summary:
        print("❌ No team summary data available")
        return
        
    summary_df = pd.DataFrame(team_summary).sort_values('overall_difficulty', ascending=False)
    
    print("🏆 TEAM FIXTURE DIFFICULTY SUMMARY")
    print("=" * 70)
    print("(Higher scores = easier fixtures, negative = harder fixtures)")
    print()
    
    for i, (_, team_data) in enumerate(summary_df.iterrows(), 1):
        team = team_data['team']
        att_diff = team_data['avg_attack_difficulty']
        def_diff = team_data['avg_defense_difficulty']
        overall = team_data['overall_difficulty']
        fixtures = int(team_data['num_fixtures'])
        
        # Add emoji indicators
        att_emoji = "🟢" if att_diff >= 1 else "🟡" if att_diff >= -1 else "🔴"
        def_emoji = "🟢" if def_diff >= 1 else "🟡" if def_diff >= -1 else "🔴"
        
        print(f"{i:2d}. {team:<20} | {att_emoji} ATT:{att_diff:+5.1f} | {def_emoji} DEF:{def_diff:+5.1f} | Overall:{overall:+5.1f} ({fixtures} fixtures)")

def get_transfer_recommendations(self, num_gameweeks=5):
    """Count easy and tough home fixtures per team to guide transfer timing"""
    current_gw = self.fixtures_df['gameweek'].min()
    end_gw = min(current_gw + num_gameweeks - 1, self.fixtures_df['gameweek'].max())
    
    difficulty_matrix = self.get_fixture_difficulty_matrix(current_gw, end_gw)
    
    if difficulty_matrix.empty:
        print("❌ No data available for transfer recommendations")
        return
    
    print(f"📈 STRATEGIC TRANSFER RECOMMENDATIONS (GW{current_gw}-{end_gw})")
    print("=" * 70)
    
    # Teams to target (good fixtures)
    good_attack_teams = difficulty_matrix[difficulty_matrix['attack_difficulty'] >= 2]['home_team'].value_counts()
    good_defense_teams = difficulty_matrix[difficulty_matrix['defense_difficulty'] >= 2]['home_team'].value_counts()
    
    # Teams to avoid (bad fixtures)  
    bad_attack_teams = difficulty_matrix[difficulty_matrix['attack_difficulty'] <= -2]['home_team'].value_counts()
    bad_defense_teams = difficulty_matrix[difficulty_matrix['defense_difficulty'] <= -2]['home_team'].value_counts()
    
    print(f"\n🎯 TARGET TEAMS FOR TRANSFERS:")
    print("🔥 Attacking Assets:")
    if len(good_attack_teams) > 0:
        for team, count in good_attack_teams.head(5).items():
            print(f"   • {team} ({count} good fixtures)")
    else:
        print("   • No standout attacking opportunities")
        
    print("🛡️ Defensive Assets:")
    if len(good_defense_teams) > 0:
        for team, count in good_defense_teams.head(5).items():
            print(f"   • {team} ({count} good fixtures)")
    else:
        print("   • No standout defensive opportunities")
    
    print(f"\n❌ AVOID THESE TEAMS:")
    print("⚔️ Poor Attacking Fixtures:")
    if len(bad_attack_teams) > 0:
        for team, count in bad_attack_teams.head(5).items():
            print(f"   • {team} ({count} tough fixtures)")
    else:
        print("   • No teams to specifically avoid for attacking")
        
    print("🏰 Poor Defensive Fixtures:")
    if len(bad_defense_teams) > 0:
        for team, count in bad_defense_teams.head(5).items():
            print(f"   • {team} ({count} tough fixtures)")
    else:
        print("   • No teams to specifically avoid for defending")

# Attach methods to the class
EnhancedFixtureAnalyzer.analyze_gameweek = analyze_gameweek
EnhancedFixtureAnalyzer.get_best_fixtures = get_best_fixtures
EnhancedFixtureAnalyzer.create_team_difficulty_summary = create_team_difficulty_summary
EnhancedFixtureAnalyzer.get_transfer_recommendations = get_transfer_recommendations

print("✅ Enhanced Analysis Methods Added!")
print("🔧 Methods: analyze_gameweek, get_best_fixtures, create_team_difficulty_summary, get_transfer_recommendations")

✅ Enhanced Analysis Methods Added!
🔧 Methods: analyze_gameweek, get_best_fixtures, create_team_difficulty_summary, get_transfer_recommendations
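
The away-side scores in `analyze_gameweek` come from a simple rank gap: the opponent's rank minus the team's own rank, divided by the number of teams and scaled by 10. The sketch below reproduces that arithmetic and the six score bands on their own. The function names `rank_gap_score` and `difficulty_label` are illustrative, not part of the class, and the home-side scores (which come from `get_fixture_difficulty_matrix` and may include a home adjustment) are not reproduced here.

```python
def rank_gap_score(own_rank, opponent_rank, total_teams=20):
    """Scale a rank gap to roughly -10..+10; positive favours the team being scored."""
    return (opponent_rank - own_rank) / total_teams * 10


def difficulty_label(score):
    """Map a score to the same six bands used by get_difficulty_text."""
    if score >= 4.0:
        return "Very Easy"
    if score >= 2.5:
        return "Easy"
    if score >= 1.0:
        return "Medium-Easy"
    if score >= -0.5:
        return "Medium"
    if score >= -2.0:
        return "Hard"
    return "Very Hard"


# Away attack ranked 15th facing a home defence ranked 8th
away_attack_score = rank_gap_score(own_rank=15, opponent_rank=8)
print(f"{away_attack_score:+.1f}", difficulty_label(away_attack_score))
```

A weak attack against a strong defence gives a negative score, which is why the band boundaries matter more than the raw number when reading the output.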


In [265]:
# =============================================================================
# 🚀 INITIALIZE ENHANCED FIXTURE ANALYZER
# =============================================================================

print("🔮 INITIALIZING ENHANCED FIXTURE ANALYZER...")
print("=" * 60)

try:
    # Initialize analyzer with your existing data (using SAME team_rankings as Basic System)
    analyzer = EnhancedFixtureAnalyzer(season_stats, team_rankings, 'fixture_template.csv')
    
    print("✅ Analyzer initialized successfully!")
    print(f"📊 Fixture data loaded: {len(analyzer.fixtures_df)} fixtures")
    print(f"📅 Gameweeks available: {analyzer.fixtures_df['gameweek'].min()} to {analyzer.fixtures_df['gameweek'].max()}")
    print(f"🏟️ Teams mapped: {len(analyzer.team_mapping)} teams")
    
    # Check for any mapping issues: flag every team whose mapped name has no ranking row
    missing_mappings = [team for team, mapped in analyzer.team_mapping.items()
                        if mapped not in analyzer.team_rankings.index]
    
    if missing_mappings:
        print(f"⚠️ Teams without ranking data: {', '.join(missing_mappings[:5])}")
        print("   (These teams will be skipped in analysis)")
    else:
        print("✅ All teams successfully mapped to ranking data")
    
    print("\n🎯 ENHANCED FIXTURE ANALYZER READY!")
    print("Available methods:")
    print("• analyzer.analyze_gameweek(gw) - Deep dive into specific gameweek")
    print("• analyzer.get_best_fixtures('attack'/'defense', num_gw) - Find best opportunities")
    print("• analyzer.create_team_difficulty_summary() - Complete team overview")
    print("• analyzer.get_transfer_recommendations(num_gw) - Strategic transfer advice")
    
except Exception as e:
    print(f"❌ Error initializing analyzer: {e}")
    print("Please check that 'fixture_template.csv' exists and has the correct format")
    import traceback
    traceback.print_exc()

🔮 INITIALIZING ENHANCED FIXTURE ANALYZER...
✅ Analyzer initialized successfully!
📊 Fixture data loaded: 90 fixtures
📅 Gameweeks available: 7 to 15
🏟️ Teams mapped: 20 teams
✅ All teams successfully mapped to ranking data

🎯 ENHANCED FIXTURE ANALYZER READY!
Available methods:
• analyzer.analyze_gameweek(gw) - Deep dive into specific gameweek
• analyzer.get_best_fixtures('attack'/'defense', num_gw) - Find best opportunities
• analyzer.create_team_difficulty_summary() - Complete team overview
• analyzer.get_transfer_recommendations(num_gw) - Strategic transfer advice
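
If initialization fails, the usual cause is the fixture file itself. Below is a minimal sketch of an early format check, assuming the file needs the `gameweek`, `home_team` and `away_team` columns (inferred from how `fixtures_df` is used above; the real loader may expect more). `load_fixtures` is a hypothetical helper, and the sample data is held in memory rather than read from `fixture_template.csv`.

```python
import io

import pandas as pd

# Assumed minimum set of columns, inferred from how fixtures_df is used
REQUIRED_COLUMNS = {"gameweek", "home_team", "away_team"}


def load_fixtures(source):
    """Read a fixture CSV and fail early with a clear message if columns are missing."""
    fixtures = pd.read_csv(source)
    missing = REQUIRED_COLUMNS - set(fixtures.columns)
    if missing:
        raise ValueError(f"Fixture file is missing columns: {sorted(missing)}")
    return fixtures


# Illustrative in-memory file; the notebook itself reads 'fixture_template.csv'
sample_csv = io.StringIO(
    "gameweek,home_team,away_team\n"
    "7,Bournemouth,Fulham\n"
    "7,Leeds,Spurs\n"
)
fixtures = load_fixtures(sample_csv)
print(fixtures.groupby("gameweek").size())
```

Running a check like this before constructing the analyzer turns a vague initialization error into a message that names the missing columns.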


In [266]:
# =============================================================================
# 🎯 GAMEWEEK ANALYSIS DEMO
# =============================================================================

print("🎯 ENHANCED FIXTURE ANALYSIS - GAMEWEEK DEMO")
print("=" * 60)

# Analyze the first available gameweek
if 'analyzer' in locals():
    first_gw = analyzer.fixtures_df['gameweek'].min()
    print(f"📊 Analyzing Gameweek {first_gw} as demonstration...")
    print()
    
    analyzer.analyze_gameweek(first_gw)
    
    print("\n" + "=" * 60)
    print("💡 How to interpret the results:")
    print("🟢 Very Easy (4.0+) = Excellent opportunity, definitely target")
    print("🟡 Easy (2.5 to 4.0) = Good opportunity, consider targeting")
    print("🟠 Medium-Easy (1.0 to 2.5) = Slightly favorable fixture")
    print("⚪ Medium (-0.5 to 1.0) = Average fixture, neutral")
    print("🔴 Hard (-2.0 to -0.5) = Difficult fixture, consider avoiding")
    print("⚫ Very Hard (below -2.0) = Very difficult, definitely avoid")
    
else:
    print("❌ Analyzer not initialized. Please run the initialization cell first.")

🎯 ENHANCED FIXTURE ANALYSIS - GAMEWEEK DEMO
📊 Analyzing Gameweek 7 as demonstration...

⚽ GAMEWEEK 7 ENHANCED FIXTURE ANALYSIS
📅 10 fixtures scheduled

1. 🏟️ BOURNEMOUTH vs FULHAM
------------------------------------------------------------
🏠 BOURNEMOUTH  (HOME) |  ✈️  FULHAM       (AWAY)
----------------------------|----------------------------
⚔️  ATT#6 vs DEF#13 🟡 Easy     | ⚔️  ATT#15 vs DEF#8 ⚫ Very Hard
   Strong Pick 🔥   (+3.9) | Strong Avoid ❌  (-3.5)
🛡️  DEF#8 vs ATT#15 🟡 Easy     | 🛡️  DEF#13 vs ATT#6 ⚫ Very Hard
   Strong Pick 🔥   (+3.9) | Strong Avoid ❌  (-3.5)

💡 QUICK PICKS:
⚔️  ATTACK: 🔥 Target Bournemouth attackers
🛡️  DEFENSE: 🔥 Target Bournemouth defenders/GK

2. 🏟️ LEEDS vs SPURS
------------------------------------------------------------
🏠 LEEDS        (HOME) |  ✈️  SPURS        (AWAY)
----------------------------|----------------------------
⚔️  ATT#12 vs DEF#4 ⚫ Very Hard | ⚔️  ATT#11 vs DEF#12 ⚪ Medium  
   Strong Avoid ❌  (-3.7) | Average         (+0.5)
🛡️  DEF

In [267]:
# =============================================================================
# 🔥 BEST FIXTURE OPPORTUNITIES
# =============================================================================

if 'analyzer' in locals():
    print("🔥 FINDING BEST FIXTURE OPPORTUNITIES")
    print("=" * 60)
    
    # Get best attacking fixtures for next 3 gameweeks
    print("⚔️ ATTACKING OPPORTUNITIES:")
    analyzer.get_best_fixtures('attack', 3)
    
    print(f"\n" + "="*60)
    
    # Get best defensive fixtures for next 3 gameweeks  
    print("🛡️ DEFENSIVE OPPORTUNITIES:")
    analyzer.get_best_fixtures('defense', 3)
    
    print(f"\n" + "="*60)
    print("💡 Strategy Tips:")
    print("• 🔥 VERY EASY fixtures (4.0+): Strongly consider transfers to these teams")
    print("• ⭐ EASY fixtures (2.5 to 4.0): Solid options, good for captaincy")
    print("• ⚠️ HARD or ❌ VERY HARD fixtures (below -0.5): Consider transferring out or benching")
    print("• Look for teams with multiple good fixtures for better value")
    
else:
    print("❌ Analyzer not initialized. Please run the initialization cell first.")

🔥 FINDING BEST FIXTURE OPPORTUNITIES
⚔️ ATTACKING OPPORTUNITIES:
🎯 BEST ATTACKING FIXTURES (GW7-9)
 1. GW 8: Liverpool vs Man Utd           | 🔥 VERY EASY (+8.5)
 2. GW 7: Arsenal vs West Ham            | 🔥 VERY EASY (+8.3)
 3. GW 9: Man Utd vs Brighton            | 🔥 VERY EASY (+6.8)
 4. GW 9: Bournemouth vs Nott'm Forest   | 🔥 VERY EASY (+5.3)
 5. GW 7: Bournemouth vs Fulham          | ⭐ EASY (+3.9)
 6. GW 8: Sunderland vs Wolves           | ⭐ EASY (+3.8)
 7. GW 9: Leeds vs West Ham              | ⭐ EASY (+3.8)
 8. GW 7: Chelsea vs Liverpool           | ⭐ EASY (+2.9)
 9. GW 8: Nott'm Forest vs Chelsea       | ✅ MEDIUM-EASY (+1.9)
10. GW 8: Man City vs Everton            | ✅ MEDIUM-EASY (+1.4)

🛡️ DEFENSIVE OPPORTUNITIES:
🛡️ BEST DEFENSIVE FIXTURES (GW7-9)
 1. GW 8: Sunderland vs Wolves           | 🔥 VERY EASY (+7.8)
 2. GW 7: Arsenal vs West Ham            | 🔥 VERY EASY (+6.5)
 3. GW 8: Spurs vs Aston Villa           | 🔥 VERY EASY (+6.3)
 4. GW 7: Aston Villa vs Burnley         | 🔥 VE

In [268]:
# =============================================================================
# 📊 TEAM DIFFICULTY OVERVIEW & TRANSFER STRATEGY
# =============================================================================

if 'analyzer' in locals():
    print("📊 COMPLETE TEAM FIXTURE DIFFICULTY OVERVIEW")
    print("=" * 70)
    
    # Show team difficulty summary
    analyzer.create_team_difficulty_summary()
    
    print(f"\n" + "="*70)
    print("📈 STRATEGIC TRANSFER RECOMMENDATIONS")
    print("=" * 70)
    
    # Get transfer recommendations for next 5 gameweeks
    analyzer.get_transfer_recommendations(5)
    
    print(f"\n" + "="*70)
    print("🎯 HOW TO USE THIS ENHANCED ANALYSIS:")
    print("=" * 70)
    print("1. 📊 Use Team Overview to identify teams with consistently good/bad fixtures")
    print("2. 🎯 Use Transfer Recommendations to time your moves strategically")
    print("3. ⚔️ Target teams with multiple good attacking fixtures for forwards/mids")
    print("4. 🛡️ Target teams with multiple good defensive fixtures for defenders/GKs")
    print("5. 📅 Plan 3-5 gameweeks ahead instead of just the next gameweek")
    print("6. 🔄 Update fixture data regularly as new gameweeks become available")
    
    print(f"\n🎉 ENHANCED FIXTURE ANALYSIS COMPLETE!")
    print("This system provides strategic insights beyond basic fixture difficulty!")
    
else:
    print("❌ Analyzer not initialized. Please run the initialization cell first.")

📊 COMPLETE TEAM FIXTURE DIFFICULTY OVERVIEW
🏆 TEAM FIXTURE DIFFICULTY SUMMARY
(Higher scores = easier fixtures, negative = harder fixtures)

 1. Arsenal              | 🟢 ATT: +3.7 | 🟢 DEF: +5.9 | Overall: +4.8 (4 fixtures)
 2. Crystal Palace       | 🟢 ATT: +3.7 | 🟢 DEF: +4.0 | Overall: +3.8 (4 fixtures)
 3. Liverpool            | 🟢 ATT: +5.2 | 🟡 DEF: +0.2 | Overall: +2.7 (4 fixtures)
 4. Spurs                | 🟢 ATT: +1.3 | 🟢 DEF: +4.0 | Overall: +2.7 (5 fixtures)
 5. Bournemouth          | 🟢 ATT: +3.9 | 🟢 DEF: +1.4 | Overall: +2.6 (5 fixtures)
 6. Man City             | 🟢 ATT: +2.2 | 🟡 DEF: +0.9 | Overall: +1.6 (5 fixtures)
 7. Everton              | 🟡 ATT: -0.2 | 🟢 DEF: +3.1 | Overall: +1.5 (5 fixtures)
 8. Brighton             | 🟢 ATT: +1.2 | 🟡 DEF: +0.6 | Overall: +0.9 (5 fixtures)
 9. Man Utd              | 🟢 ATT: +4.7 | 🔴 DEF: -2.9 | Overall: +0.9 (4 fixtures)
10. Sunderland           | 🔴 ATT: -1.8 | 🟢 DEF: +3.3 | Overall: +0.8 (4 fixtures)
11. Chelsea              | 🟢 ATT: +2.1 
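
The summary above averages home fixtures only, so a team's away matches do not affect its ranking. One possible way to fold them in is sketched below, using the same rank-gap arithmetic as the away-side calculation in `analyze_gameweek` (no home adjustment). The three-team league, the `side_scores` helper and the column names of the result are all hypothetical.

```python
import pandas as pd

# Hypothetical three-team league, for illustration only
rankings = pd.DataFrame(
    {"attack_rank": [1, 2, 3], "defense_rank": [2, 1, 3]},
    index=["A", "B", "C"],
)
fixtures = pd.DataFrame(
    {"gameweek": [1, 2], "home_team": ["A", "C"], "away_team": ["C", "B"]}
)


def side_scores(team, opponent, rankings):
    """Attack and defence scores for `team` against `opponent` (no home bonus)."""
    n = len(rankings)
    attack = (rankings.loc[opponent, "defense_rank"] - rankings.loc[team, "attack_rank"]) / n * 10
    defense = (rankings.loc[opponent, "attack_rank"] - rankings.loc[team, "defense_rank"]) / n * 10
    return attack, defense


# Emit one row per team per fixture, so home and away matches both count
rows = []
for fx in fixtures.itertuples(index=False):
    for team, opponent, venue in [
        (fx.home_team, fx.away_team, "H"),
        (fx.away_team, fx.home_team, "A"),
    ]:
        attack, defense = side_scores(team, opponent, rankings)
        rows.append({"team": team, "venue": venue, "attack": attack, "defense": defense})

per_team = (
    pd.DataFrame(rows)
    .groupby("team")[["attack", "defense"]]
    .mean()
    .assign(overall=lambda d: (d["attack"] + d["defense"]) / 2)
    .sort_values("overall", ascending=False)
)
print(per_team)
```

Team B has no home fixture in this toy schedule, yet it still appears in the result, which is the gap a home-only summary leaves.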

## 🚀 Next Steps: Web Interface & Advanced Features

Now that you have the Enhanced Fixture Analyzer integrated into your FPL notebook, here are the next steps:

### 📈 **Immediate Actions**
1. **Run the cells above** to test the Enhanced Fixture Analyzer
2. **Add more fixture data** to `fixture_template.csv` (extend beyond gameweek 15)
3. **Experiment with different gameweek ranges** using the analysis methods

### 🌐 **Web Interface Development**
To create a beautiful web interface for better visualization:

1. **Copy the Tempo.new prompt** from `tempo_prompt.md` in your folder
2. **Paste it into [Tempo.new](https://tempo.new)** to generate the web interface
3. **The web version will include:**
   - Interactive fixture difficulty heatmaps
   - Visual team strength comparisons  
   - Drag-and-drop transfer planning tools
   - Real-time gameweek analysis
   - Mobile-responsive design

### 🔧 **Advanced Features to Add**
- **Fixture Run Analysis**: Analyze 3-5 fixture sequences for transfer timing
- **Captain Recommendations**: Find best captaincy options by gameweek
- **Differential Picks**: Identify low-owned players with good fixtures
- **Blank and Double Gameweek Planning**: Handle blank gameweeks and double gameweeks
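
As a starting point for the fixture run idea, the sketch below averages each team's scores over a forward-looking window of three fixtures and picks the best starting gameweek. The scores are made-up numbers for a single placeholder team, and the `run_avg` column name is an assumption; in the notebook the input would come from the difficulty matrix.

```python
import pandas as pd

# Hypothetical per-fixture attack scores for one team (positive = easier)
scores = pd.DataFrame(
    {
        "team": ["Team A"] * 5,
        "gameweek": [7, 8, 9, 10, 11],
        "attack_difficulty": [1.0, -1.0, 2.5, 4.0, 3.0],
    }
)

window = 3  # number of consecutive fixtures in a run
scores = scores.sort_values(["team", "gameweek"])

# rolling() looks backwards, so shift the result to label each run by its first gameweek
scores["run_avg"] = scores.groupby("team")["attack_difficulty"].transform(
    lambda s: s.rolling(window).mean().shift(-(window - 1))
)

best = scores.loc[scores["run_avg"].idxmax()]
print(f"Best {window}-fixture run starts in GW{int(best['gameweek'])} (avg {best['run_avg']:+.1f})")
```

The last `window - 1` gameweeks have no complete run, so their `run_avg` is left as NaN rather than averaged over fewer fixtures.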

### 📊 **Data Enhancement**
- **Historical Performance**: Add head-to-head records
- **Form Analysis**: Weight recent performance more heavily
- **Injury Reports**: Factor in key player availability
- **Weather/Venue**: Consider external factors
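
For the form analysis item, one common option is an exponentially weighted mean, which counts recent gameweeks more than older ones. This is a minimal sketch with invented points for a single player; the choice of `span=3` (each step back in time halves the weight) is an assumption to tune, not a recommendation from the data.

```python
import pandas as pd

# Hypothetical points per gameweek for one player, oldest first
points = pd.Series([2, 2, 6, 9, 12], index=[3, 4, 5, 6, 7], name="total_points")

plain_avg = points.mean()

# span=3 gives alpha=0.5, so each older gameweek carries half the weight of the next
form = points.ewm(span=3, adjust=True).mean().iloc[-1]

print(f"Season average: {plain_avg:.2f}, form-weighted: {form:.2f}")
```

A player trending upwards scores higher on the weighted figure than on the plain average, which is the signal a form-based ranking would use.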

The Enhanced Fixture Analyzer now covers **per-gameweek matchups, best upcoming fixtures, team difficulty summaries, and transfer targets** in one place! 🏆