# Basketball Season: Predicting Coach Changes

**Project Context**
This notebook addresses **Task (b)** of the project description: "Set of teams that will change coaches". The goal is to utilize 10 years of historical data regarding players, teams, and games to predict which teams will replace their head coach during the test season (Year 11).

**Data Sources**
The analysis utilizes the following relational tables provided in the dataset:
* **`coaches`**: History of coaches, including stints and win/loss records.
* **`teams`**: Seasonal performance metrics for every team.
* **`players_teams`**: Performance metrics for players within specific teams.

In [None]:
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression

# 1. Define File Paths
# Assuming standard directory structure as per the problem description
initial_path = '../data/initial_data/'
test_path = '../data/test_data/'

def load_and_combine(filename):
    # Load historical data
    df = pd.read_csv(f"{initial_path}{filename}")
    
    # Try to load test data if it exists (it might not exist for all files)
    try:
        df_test = pd.read_csv(f"{test_path}{filename}")
        # Concatenate: this ensures Year 11 is at the end of the dataframe
        df = pd.concat([df, df_test], ignore_index=True)
    except FileNotFoundError:
        pass
    return df

# Load all relevant tables
coaches = load_and_combine('coaches.csv')
teams = load_and_combine('teams.csv')
players_teams = load_and_combine('players_teams.csv')

# Ensure data is sorted chronologically
teams = teams.sort_values(['year', 'tmID'])
coaches = coaches.sort_values(['year', 'tmID'])

display(coaches.head())
display(teams.head())
display(players_teams.head())

pd.set_option('display.width', 1000)
features = []

### 2. Defining the Target Variable
The objective is to identify mid-season coach changes. In the `coaches` dataset, a `stint` greater than 0 indicates that a coach took over after the season began or that multiple coaches managed the team in a single year.

* **Target (`CoachChange`)**: Boolean flag indicating if a team had a `stint > 0` in a given year.
* **Scope**: We define this for the training years (1-10) and aim to predict it for the test year (11).

In [None]:
target_df = coaches.groupby(['year', 'tmID'])['stint'].max().reset_index()
target_df['CoachChange'] = target_df['stint'] > 0

target_df = target_df[['year', 'tmID', 'CoachChange']]

print("\n--- Target Variable 'CoachChange' Created ---")
print(target_df[target_df['CoachChange'] == True])
print(f"\nTotal 'CoachChange = True' events: {int(target_df['CoachChange'].sum())}")

### 3. Feature Engineering: Coach Tenure & Stability
Hypothesis: A coach who has just started (low tenure) or has survived many years (high tenure) has a different risk profile than one in the middle of a contract.

We calculate:
* **`coach_tenure_years`**: The number of consecutive seasons a specific `coachID` has coached a specific `tmID`, counting the current season.
* **`coach_spell`**: A unique identifier for a specific contiguous period a coach spends with a team.
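
To make the two definitions concrete, here is a minimal sketch on an invented single-team history. The team and coach IDs are hypothetical and only the column names follow the `coaches` table; it is meant as an illustration of the spell and tenure logic, not as the notebook's exact pipeline.

```python
import pandas as pd

# Hypothetical coaching history for one team (IDs invented for illustration)
demo = pd.DataFrame({
    "tmID": ["AAA"] * 5,
    "year": [1, 2, 3, 4, 5],
    "coachID": ["x", "x", "y", "y", "y"],
})

# Flag seasons where the coach differs from the previous season's coach
prev_coach = demo.groupby("tmID")["coachID"].shift(1).fillna(demo["coachID"])
changed = (demo["coachID"] != prev_coach).astype(int)

# The spell ID increments on every change; tenure counts seasons within a spell
demo["coach_spell"] = changed.groupby(demo["tmID"]).cumsum()
demo["coach_tenure_years"] = demo.groupby(["tmID", "coach_spell"]).cumcount() + 1
print(demo)
```

Coach `x` forms spell 0 and coach `y` forms spell 1, so the tenure counter restarts at 1 when `y` takes over.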

In [None]:
# Keep one main coach per team-season: stint 0, or stint 1 when the team used several coaches
main_coaches = coaches[(coaches['stint'] == 0) | (coaches['stint'] == 1)].copy()

# Sort by team and year to ensure order
main_coaches = main_coaches.sort_values(['tmID', 'year'])

# Logic: Check if the coachID changed from the previous year
main_coaches['prev_coach'] = main_coaches.groupby('tmID')['coachID'].shift(1)
main_coaches['prev_coach'] = main_coaches['prev_coach'].fillna(main_coaches['coachID'])
main_coaches['coach_change'] = main_coaches['coachID'] != main_coaches['prev_coach']

# Create a "spell ID" that increments every time the coach changes
main_coaches['coach_spell'] = main_coaches.groupby('tmID')['coach_change'].cumsum()

# Count the seasons within each spell
# Add 1 so the coach's first season with the team counts as year 1
main_coaches['coach_tenure_years'] = main_coaches.groupby(['tmID', 'coach_spell']).cumcount() + 1

print(main_coaches[['coachID', 'year', 'tmID', 'stint', 'coach_spell', 'coach_tenure_years', 'coach_change']][main_coaches['year'] == 11].sort_values(['year', 'tmID']))

features.extend(['coach_tenure_years'])
print(f"Features: {features}")

In [None]:
# Calculate Win Percentage for every season
# Note: For the test year (Year 11), 'won' and 'lost' will be NaN or 0, which is fine
# because we only use Lagged values for prediction.
teams['win_pct'] = teams['won'] / (teams['won'] + teams['lost'])

# 1. Previous Season Win Pct
# Shift(1) grabs the value from Year t-1 and places it in the row for Year t
teams['previous_season_win_pct'] = teams.groupby('tmID')['win_pct'].shift(1)

# 2. Three Year Win Trend
# We want the slope of win_pct for [t-3, t-2, t-1].
# We apply a rolling window of 3 to the 'previous_season_win_pct' (which is already shifted).
def calculate_trend(y):
    # Only calculate if we have 3 data points
    if len(y) < 3 or pd.isna(y).any():
        return np.nan
    # Fit linear regression to find slope
    X = np.array([0, 1, 2]).reshape(-1, 1)
    model = LinearRegression().fit(X, y.values)
    return model.coef_[0]

teams['three_year_win_trend'] = teams.groupby('tmID')['previous_season_win_pct'] \
                                     .rolling(3) \
                                     .apply(calculate_trend) \
                                     .reset_index(0, drop=True)

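# Teams with no prior season fall back to a default of 0.4. Filling with the current
# season's win_pct would leak the outcome of the season being predicted.
teams['previous_season_win_pct'] = teams['previous_season_win_pct'].fillna(.4)
# Default to a slightly negative trend when fewer than 3 prior seasons exist
teams['three_year_win_trend'] = teams['three_year_win_trend'].fillna(-0.02)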

display(teams[teams['year'] == 11][['year', 'tmID', 'previous_season_win_pct', 'three_year_win_trend']])

features.extend(['previous_season_win_pct', 'three_year_win_trend'])
print(f"Features: {features}")

### 4. Feature Engineering: Team Performance Metrics
A coach's job security is heavily tied to team success. As described in the case, teams aim to achieve the greatest number of wins in the first part of the season.

We engineer lag features to prevent data leakage (using $t-1$ data to predict $t$):
* **`previous_season_win_pct`**: The win percentage ($\frac{won}{won + lost}$) from the prior year.
* **`three_year_win_trend`**: The slope of the win percentage over the last 3 years. A negative slope indicates a declining franchise, increasing the pressure to fire the coach.
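
As a small worked example of the trend feature, the sketch below fits a least-squares line through three invented win percentages placed at $x = 0, 1, 2$. For three equally spaced points the slope reduces to half the difference between the last and the first value. The helper name `three_point_slope` and the numbers are assumptions made for illustration.

```python
import numpy as np

def three_point_slope(values):
    """Least-squares slope of three values observed at x = 0, 1, 2."""
    x = np.array([0.0, 1.0, 2.0])
    y = np.asarray(values, dtype=float)
    slope, _intercept = np.polyfit(x, y, deg=1)
    return slope

# Hypothetical win percentages for seasons t-3, t-2, t-1 (not from the dataset)
declining = [0.60, 0.50, 0.35]
slope = three_point_slope(declining)

# Closed form for three equally spaced points: (last - first) / 2
shortcut = (declining[2] - declining[0]) / 2
print(round(slope, 4), round(shortcut, 4))
```

A team that slid from 0.60 to 0.35 over three seasons gets a negative slope, which is the "declining franchise" signal described above.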

In [None]:
# Convert Playoff 'N' (No) to 1, 'Y' to 0
# Handle NaN (Year 11) by treating it as 0 temporarily, but the shift handles the logic.
teams = teams.sort_values(['tmID', 'year'])
teams['missed_playoff'] = teams['playoff'].apply(lambda x: 1 if x == 'N' else 0)

def get_streak(series):
    # Shift to look at history only
    history = series.shift(1).fillna(0)
    streaks = []
    current_streak = 0
    for missed in history:
        if missed == 1:
            current_streak += 1
        else:
            current_streak = 0
        streaks.append(current_streak)
    return pd.Series(streaks, index=series.index)

teams['playoff_miss_streak'] = teams.groupby('tmID', group_keys=False)['missed_playoff'].apply(get_streak)

print(teams[teams['year'] == 11][['year', 'tmID', 'missed_playoff', 'playoff_miss_streak']])

features.extend(['playoff_miss_streak'])
print(f"Features: {features}")

### 5. Feature Engineering: Playoff Droughts
Qualifying for the playoffs is a primary measure of success. Repeated failures to qualify often lead to management changes.

* **`playoff_miss_streak`**: The consecutive number of years a team has failed to reach the playoffs prior to the current season.
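
The following sketch illustrates the streak definition on an invented playoff history. The helper `prior_miss_streak` is a hypothetical name; it records the streak *before* each season is played, so the current season's outcome never enters its own feature value.

```python
import pandas as pd

def prior_miss_streak(missed):
    """Consecutive missed playoffs immediately before each season."""
    streaks, run = [], 0
    for flag in missed:
        streaks.append(run)                 # streak known before this season is played
        run = run + 1 if flag == 1 else 0   # update with this season's outcome
    return streaks

# Hypothetical team history: 1 = missed the playoffs, 0 = qualified
missed = [1, 1, 0, 1, 1, 1]
demo = pd.DataFrame({"year": range(1, 7), "missed_playoff": missed})
demo["playoff_miss_streak"] = prior_miss_streak(missed)
print(demo)
```

The streak resets to 0 in the season after the team qualifies (Year 4 here) and then builds up again.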

In [None]:
# 1. Calculate Efficiency for all player-years
# Fill NaNs with 0 (important for test data where stats are empty)
cols_to_fix = ['points', 'rebounds', 'assists', 'steals', 'blocks', 
               'fgAttempted', 'fgMade', 'ftAttempted', 'ftMade', 'turnovers']
pt_stats = players_teams.copy()
pt_stats[cols_to_fix] = pt_stats[cols_to_fix].fillna(0)

pt_stats['efficiency'] = (pt_stats['points'] + pt_stats['rebounds'] + 
                          pt_stats['assists'] + pt_stats['steals'] + 
                          pt_stats['blocks'] - 
                          (pt_stats['fgAttempted'] - pt_stats['fgMade']) - 
                          (pt_stats['ftAttempted'] - pt_stats['ftMade']) - 
                          pt_stats['turnovers'])

# 2. Prepare Lookup Table: Player Efficiency in Year Y
eff_lookup = pt_stats[['playerID', 'year', 'efficiency']].copy()
eff_lookup['next_year'] = eff_lookup['year'] + 1  # We join this to the NEXT year
eff_lookup = eff_lookup[['playerID', 'next_year', 'efficiency']]

# 3. Merge Current Roster (Year T) with Efficiency (from Year T-1)
roster = players_teams[['playerID', 'year', 'tmID']]
roster_with_talent = roster.merge(eff_lookup, 
                                  left_on=['playerID', 'year'], 
                                  right_on=['playerID', 'next_year'], 
                                  how='left')

# 4. Aggregate by Team
talent_score = roster_with_talent.groupby(['tmID', 'year'])['efficiency'].sum().reset_index()
talent_score = talent_score.rename(columns={'efficiency': 'talent_score_aggregate'})

# 5. Year 1 has no prior season, so reuse each team's Year 2 score (built from Year 1 stats) as a proxy
year_2_data = talent_score[talent_score['year'] == 2].set_index('tmID')['talent_score_aggregate']
talent_score.loc[talent_score['year'] == 1, 'talent_score_aggregate'] = talent_score.loc[talent_score['year'] == 1, 'tmID'].map(year_2_data)

display(talent_score[talent_score['year'] == 11])

features.extend(['talent_score_aggregate'])
print(f"Features: {features}")

### 6. Feature Engineering: Roster Talent (Aggregate Efficiency)
A coach might be fired if they are underperforming relative to the talent available on the roster. We calculate player efficiency using the specific offensive and defensive statistics provided.

The Efficiency metric is derived from standard stats found in `players_teams`:
$$\text{Efficiency} = (\text{PTS} + \text{REB} + \text{AST} + \text{STL} + \text{BLK}) - (\text{Missed FG} + \text{Missed FT} + \text{TO})$$

We aggregate this for the **current** roster (Year $t$) using stats from Year $t-1$ to estimate the incoming "talent level" of the squad.
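
For a quick sanity check of the formula, here is a sketch that evaluates it on a single invented stat line. The function name `efficiency` and all numbers are hypothetical; the arguments mirror the `players_teams` columns used above.

```python
def efficiency(points, rebounds, assists, steals, blocks,
               fg_attempted, fg_made, ft_attempted, ft_made, turnovers):
    """Positive contributions minus missed shots and turnovers."""
    positives = points + rebounds + assists + steals + blocks
    negatives = (fg_attempted - fg_made) + (ft_attempted - ft_made) + turnovers
    return positives - negatives

# Hypothetical season totals for one player (not from the dataset)
eff = efficiency(points=400, rebounds=150, assists=80, steals=30, blocks=10,
                 fg_attempted=350, fg_made=160, ft_attempted=100, ft_made=75,
                 turnovers=60)
print(eff)
```

Here the positive contributions sum to 670 and the penalties (190 missed field goals, 25 missed free throws, 60 turnovers) sum to 275.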

In [None]:
# 1. Total Minutes played by each team in Year Y
team_minutes = players_teams.groupby(['tmID', 'year'])['minutes'].sum().reset_index()
team_minutes['next_year'] = team_minutes['year'] + 1 # Align to next year as denominator

# 2. Minutes of RETURNING players (Players in Team T at Year Y who were also in Team T at Year Y-1)
# We need their minutes from Y-1.
prev_player_mins = players_teams[['playerID', 'year', 'tmID', 'minutes']].copy()
prev_player_mins['next_year'] = prev_player_mins['year'] + 1
prev_player_mins = prev_player_mins[['playerID', 'next_year', 'tmID', 'minutes']]
# Join current roster with previous minutes on (Player, Team) match
returning_players = roster.merge(prev_player_mins, 
                                 left_on=['playerID', 'year', 'tmID'], 
                                 right_on=['playerID', 'next_year', 'tmID'], 
                                 how='inner')

returning_mins_sum = returning_players.groupby(['tmID', 'year'])['minutes'].sum().reset_index()
returning_mins_sum.rename(columns={'minutes': 'retained_minutes'}, inplace=True)

# 3. Calculate Index: Retained Minutes (from Y-1) / Total Minutes (from Y-1)
continuity = returning_mins_sum.merge(team_minutes[['tmID', 'next_year', 'minutes']], 
                                      left_on=['tmID', 'year'], 
                                      right_on=['tmID', 'next_year'], 
                                      how='left')

continuity['roster_continuity_index'] = continuity['retained_minutes'] / continuity['minutes']
continuity['roster_continuity_index'] = continuity['roster_continuity_index'].fillna(0)
continuity = continuity[['tmID', 'year', 'roster_continuity_index']]

display(continuity[continuity['year'] == 11])

### 7. Feature Engineering: Roster Continuity
This metric measures how much of the team carries over from the previous year. High player turnover might excuse a coach's poor results, while high continuity suggests the coach is the main variable left to change.

* **`roster_continuity_index`**: The share of the team's total minutes in Year $t-1$ that was played by players who are still on the roster in Year $t$.
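
The sketch below mirrors how the notebook's code computes the index: the Year $t-1$ minutes of returning players divided by the team's total Year $t-1$ minutes. The player IDs and minutes are invented for illustration.

```python
import pandas as pd

# Hypothetical minutes for one team in year t-1 (not from the dataset)
prev = pd.DataFrame({
    "playerID": ["a", "b", "c", "d"],
    "minutes": [900, 700, 300, 100],
})

# Hypothetical roster of the same team in year t ("e" is a newcomer)
curr_roster = {"a", "c", "e"}

# Minutes from year t-1 that belong to players who are still on the roster
retained = prev.loc[prev["playerID"].isin(curr_roster), "minutes"].sum()
total = prev["minutes"].sum()

roster_continuity_index = retained / total
print(roster_continuity_index)
```

Players `a` and `c` return with 1,200 of the previous season's 2,000 minutes, while the newcomer `e` contributes nothing to the numerator.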

In [None]:
# Sort by year to calculate cumulative sum correctly
all_coaches = coaches.copy().sort_values('year')

# Sum wins/losses per coach per year (in case of multiple stints)
coach_annual = all_coaches.groupby(['coachID', 'year'])[['won', 'lost']].sum().reset_index()

# Calculate expanding sum (cumulative history) shifted by 1 (exclude current year)
coach_annual['cum_won'] = coach_annual.groupby('coachID')['won'] \
                                      .transform(lambda x: x.shift(1).expanding().sum())
coach_annual['cum_lost'] = coach_annual.groupby('coachID')['lost'] \
                                       .transform(lambda x: x.shift(1).expanding().sum())

# Calculate lifetime win percentage
coach_annual['coach_lifetime_win_pct'] = (coach_annual['cum_won'] / (coach_annual['cum_won'] + coach_annual['cum_lost'])).fillna(.4) # Fill missing values with .4
coach_annual = coach_annual[['coachID', 'year', 'coach_lifetime_win_pct']]

display(coach_annual[coach_annual['year'] == 10])

### 8. Feature Engineering: Coach Lifetime Win Percentage
This feature captures a coach's historical performance across all teams they have managed. A "legendary" coach with a high lifetime win percentage typically has more job security than a novice.

* **`coach_lifetime_win_pct`**: Cumulative wins divided by cumulative games (wins + losses) for a `coachID` across their career up to Year $t-1$. Coaches with no prior record receive a default of 0.4.
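
A minimal sketch of the lagged career record, using an invented four-season history for one coach. Shifting the cumulative totals by one season ensures the current year's results are excluded from its own feature.

```python
import pandas as pd

# Hypothetical record for one coach (values invented for illustration)
record = pd.DataFrame({
    "year": [1, 2, 3, 4],
    "won":  [10, 20, 15, 25],
    "lost": [22, 12, 17, 9],
})

# Cumulative totals up to, but excluding, the current season
prior_won = record["won"].cumsum().shift(1)
prior_lost = record["lost"].cumsum().shift(1)

record["coach_lifetime_win_pct"] = prior_won / (prior_won + prior_lost)
print(record)
```

The first season has no history and stays missing in this sketch; the notebook replaces such gaps with a default value.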

In [None]:
# Start with main coaches (Year, Team, Coach)
final_df = main_coaches[['year', 'tmID', 'coachID', 'coach_tenure_years']]

# Merge Coach Lifetime Stats
final_df = final_df.merge(coach_annual[['coachID', 'year', 'coach_lifetime_win_pct']], 
                          on=['coachID', 'year'], how='left')

# Merge Team Stats (Win trends, Playoff streaks)
final_df = final_df.merge(teams[['year', 'tmID', 'previous_season_win_pct', 
                                 'three_year_win_trend', 'playoff_miss_streak']], 
                          on=['year', 'tmID'], how='left')

# Merge Roster Stats (Talent, Continuity)
final_df = final_df.merge(talent_score, on=['year', 'tmID'], how='left')
final_df = final_df.merge(continuity[['year', 'tmID', 'roster_continuity_index']], 
                          on=['year', 'tmID'], how='left')
final_df['roster_continuity_index'] = final_df['roster_continuity_index'].fillna(.5)

final_df = final_df.merge(target_df[['year', 'tmID', 'CoachChange']], 
                          on=['year', 'tmID'], how='left')

final_df = final_df.sort_values(by=['year', 'tmID']).reset_index(drop=True)

print(final_df.head())

X = final_df.drop(['year', 'tmID', 'coachID', 'CoachChange'], axis=1)
y = final_df['CoachChange']

print()
print(f"X.shape: {X.shape}")
display(X.head())
print(f"y.shape: {y.shape}")
display(y)

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

# --- C. Check Correlation ---

# Calculate the correlation matrix
corr_matrix = X.corr().abs()

# Create a heatmap
plt.figure(figsize=(12, 8))
sns.heatmap(corr_matrix, annot=False, cmap='Blues', fmt='.1f')
plt.title('Feature Correlation Matrix')
plt.show()

# You can also manually find high-correlation pairs
# Select upper triangle of correlation matrix
upper_tri = corr_matrix.where(np.triu(np.ones(corr_matrix.shape), k=1).astype(bool))

# Find features with correlation greater than 0.9
high_corr_features = [column for column in upper_tri.columns if any(upper_tri[column] > 0.9)]

if high_corr_features:
    print(f"\nWARNING!: High Correlation remaining in features: {high_corr_features}")
    print("Consider dropping one feature from each correlated pair.")
else:
    print("\nNo highly correlated (r > 0.9) features found. Ready for modeling.")

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, make_scorer, classification_report, matthews_corrcoef
from sklearn.utils.class_weight import compute_sample_weight
from sklearn.pipeline import Pipeline

### 9. Evaluation Strategy: Yearly Walk-Forward Validation
Because this is time-series data (Years 1 through 10), standard K-Fold cross-validation would introduce look-ahead bias (using future data to predict the past).

Instead, we use **Walk-Forward Validation**:
1.  **Train**: Years $1$ to $t-1$
2.  **Test**: Year $t$
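3.  **Repeat**: Increment $t$ from Year 3 up to the last year given to the splitter (Year 9 during hyperparameter tuning, because Year 10 is held out for the final model comparison).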
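
To show which rows each fold receives, here is a toy sketch on invented year labels (five seasons, two rows per season). It follows the same rule as the steps above, with the third season as the first test year; the variable names are hypothetical.

```python
import numpy as np

# Hypothetical year labels: two rows per season for seasons 1..5
years = np.repeat(np.arange(1, 6), 2)

folds = []
for test_year in np.unique(years)[2:]:      # the first test year is the third season
    train_idx = np.flatnonzero(years < test_year)    # everything strictly before
    test_idx = np.flatnonzero(years == test_year)    # only the test season
    folds.append((int(test_year), train_idx.tolist(), test_idx.tolist()))

for test_year, train_idx, test_idx in folds:
    print(test_year, train_idx, test_idx)
```

Each fold's training set grows by one season, and no test row ever precedes a training row in time.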

In [None]:
class YearlyWalkForwardSplit:
    """
    Perform walk-forward validation based on an external year series (list/array/column).
    
    Train: All years prior to the current test year.
    Test:  The specific current test year.
    """
    def __init__(self, year_series):
        self.year_series = np.array(year_series)
        self.unique_years = np.sort(np.unique(self.year_series))
        
    def get_n_splits(self, X=None, y=None, groups=None):
        return len(self.unique_years) - 2

    def split(self, X, y=None, groups=None):
        if len(X) != len(self.year_series):
            raise ValueError(f"Data length mismatch! X has {len(X)} rows, but year_series has {len(self.year_series)}.")

        for i in range(2, len(self.unique_years)):
            test_year = self.unique_years[i]
            
            # Train on everything strictly BEFORE the test year
            train_mask = self.year_series < test_year
            
            # Test on ONLY the current test year
            test_mask = self.year_series == test_year
            
            train_indices = np.flatnonzero(train_mask)
            test_indices = np.flatnonzero(test_mask)
            
            yield train_indices, test_indices

In [None]:
test_year = 10

X_train = X[final_df['year'] < test_year]
y_train = y[final_df['year'] < test_year]

X_test = X[final_df['year'] == test_year]
y_test = y[final_df['year'] == test_year]

walk_forward_cv = YearlyWalkForwardSplit(final_df[final_df['year'] < test_year]['year'])

f1_scorer = make_scorer(f1_score, pos_label=1, zero_division=0)
mcc_scorer = make_scorer(matthews_corrcoef)

In [None]:
lr = LogisticRegression(solver='liblinear', max_iter=1000, random_state=42)

lr_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('lr', lr)
])

lr_params = {
    'lr__C': [0.01, 0.1, 1, 10, 100],
    'lr__penalty': ['l1', 'l2'],
    'lr__class_weight': [None, 'balanced']
}

lr_grid = GridSearchCV(
    estimator=lr_pipeline,
    param_grid=lr_params,
    scoring=mcc_scorer,
    cv=walk_forward_cv,
    verbose=1,
    n_jobs=-1
)

lr_grid.fit(X_train, y_train)

# Report the best configuration found by the grid search
print(f"Best Hyperparameters: {lr_grid.best_params_}")
print(f"Best Cross-Validated MCC Score: {lr_grid.best_score_:.4f}")

In [None]:
rf = RandomForestClassifier(random_state=42)

rf_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('rf', rf)
])

rf_params = {
    'rf__n_estimators': [100, 200, 300],
    'rf__max_features': ['sqrt', 'log2'],
    'rf__max_depth': [10, 20, None],
    'rf__min_samples_split': [2, 5],
    'rf__min_samples_leaf': [1, 2],
    'rf__class_weight': [None, 'balanced']
}

rf_grid = GridSearchCV(
    estimator=rf_pipeline,
    param_grid=rf_params,
    scoring=mcc_scorer,
    cv=walk_forward_cv,
    verbose=1,
    n_jobs=-1
)

rf_grid.fit(X_train, y_train)

# Report the best configuration found by the grid search
print(f"Best Hyperparameters: {rf_grid.best_params_}")
print(f"Best Cross-Validated MCC Score: {rf_grid.best_score_:.4f}")

In [None]:
from xgboost import XGBClassifier

xgb_params = {
  'n_estimators' : [100, 200, 500],
  'learning_rate' : [0.01, 0.05, 0.1],
  'max_depth' : [3, 4, 5, 6],
  'subsample' : [0.6, 0.8, 1.0],
  'scale_pos_weight' : [1, 10, 25],
}

xgb_classifier = XGBClassifier(random_state=42)

xgb_grid = GridSearchCV(
    estimator=xgb_classifier,
    param_grid=xgb_params,
    scoring=mcc_scorer,
    cv=walk_forward_cv,
    verbose=1,
    n_jobs=-1
)

xgb_grid.fit(
  X_train,
  y_train
)

print(f"Best Hyperparameters: {xgb_grid.best_params_}")
print(f"Best Cross-Validated MCC Score: {xgb_grid.best_score_:.4f}")

In [None]:
# SVMs are sensitive to feature scale, so inputs are standardized (StandardScaler) first.
svc_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('svc', SVC(random_state=42, probability=True)) 
])

# C: Regularization strength. Lower C = softer margin, which can reduce overfitting.
# kernel: 'rbf' is standard for non-linear complex boundaries.
# class_weight: Crucial for imbalanced coach firing data.
svc_params = {
    'svc__C': [0.1, 1, 10, 50],
    'svc__kernel': ['rbf', 'poly', 'sigmoid'],
    'svc__gamma': ['scale', 'auto'],
    'svc__class_weight': ['balanced', {0:1, 1:5}, {0:1, 1:10}]
}

# Set up the grid search
svc_grid = GridSearchCV(
    estimator=svc_pipeline,
    param_grid=svc_params,
    scoring=mcc_scorer,
    cv=walk_forward_cv,
    verbose=1,
    n_jobs=-1
)

svc_grid.fit(
    X_train,
    y_train
)

print(f"Best Hyperparameters: {svc_grid.best_params_}")
print(f"Best Cross-Validated MCC Score: {svc_grid.best_score_:.4f}")

In [None]:
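# Balanced per-sample weights to compensate for the rarity of coach changes.
# Note: this relies on MLPClassifier.fit accepting sample_weight, which older
# scikit-learn releases do not support (they raise a TypeError).
sample_weights = compute_sample_weight(class_weight='balanced', y=y_train)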

mlp_params = {
  'mlp__hidden_layer_sizes': [(50,), (100,), (50, 50), (100, 50)],
  'mlp__activation': ['relu', 'tanh'],
  'mlp__alpha': [0.0001, 0.001],
  'mlp__learning_rate_init': [0.001, 0.01],
  'mlp__max_iter': [2000],
}

mlp_pipeline = Pipeline([
  ('scaler', StandardScaler()),
  ('mlp', MLPClassifier(random_state=42))
])

mlp_grid = GridSearchCV(
  mlp_pipeline, 
  mlp_params,
  cv=walk_forward_cv, 
  scoring=mcc_scorer, 
  verbose=1,
  n_jobs=-1,
)

mlp_grid.fit(
  X_train, 
  y_train,
  mlp__sample_weight=sample_weights
)

print(f"Best Hyperparameters: {mlp_grid.best_params_}")
print(f"Best Cross-Validated MCC Score: {mlp_grid.best_score_:.4f}")

In [None]:
import pandas as pd
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score, precision_recall_curve, auc, matthews_corrcoef

best_estimators = {'Logistic Regression': lr_grid.best_estimator_, 'Random Forest': rf_grid.best_estimator_, 'XGBClassifier': xgb_grid.best_estimator_, 'SVC': svc_grid.best_estimator_, 'MLP': mlp_grid.best_estimator_}

def get_metrics_dict(y_true, y_pred, model_name, y_proba=None):
    metrics = {
        'Model': model_name,
        'Accuracy': accuracy_score(y_true, y_pred),
        'Precision': precision_score(y_true, y_pred, zero_division=0),
        'Recall': recall_score(y_true, y_pred, zero_division=0),
        'F1-Score': f1_score(y_true, y_pred, zero_division=0),
        'MCC': matthews_corrcoef(y_true, y_pred),
    }
    if y_proba is not None:
        try:
            metrics['ROC AUC'] = roc_auc_score(y_true, y_proba)
        except ValueError:
            metrics['ROC AUC'] = None
        
        precision, recall, _ = precision_recall_curve(y_true, y_proba)
        metrics['PR AUC'] = auc(recall, precision)

    return metrics

all_metrics = []

for name, estimator in best_estimators.items():
    y_pred = estimator.predict(X_test)
    y_proba = None
    if hasattr(estimator, 'predict_proba'):
        y_proba = estimator.predict_proba(X_test)[:, 1]
    
    metrics = get_metrics_dict(y_test, y_pred, name, y_proba)
    all_metrics.append(metrics)

metrics_df = pd.DataFrame(all_metrics)
metrics_df = metrics_df.sort_values(by=['MCC', 'PR AUC', 'ROC AUC'], ascending=[False, False, False])

metrics_0_to_1 = ['Accuracy', 'Precision', 'Recall', 'F1-Score', 'ROC AUC', 'PR AUC']
metrics_mcc = ['MCC']

styled_metrics_df = metrics_df.style\
    .background_gradient(cmap='RdYlGn', subset=metrics_0_to_1, vmin=0, vmax=1)\
    .background_gradient(cmap='RdYlGn', subset=metrics_mcc, vmin=-1, vmax=1)\
    .format(precision=4)

display(styled_metrics_df)

### 10. Final Prediction for Test Season (Year 11)
Using the best-performing models from our cross-validation, we now retrain on the entire historical dataset (Years 1-10) and generate probabilities for Year 11.

This output satisfies the project requirement to identify the "Set of teams that will change coaches" for the test season.

In [None]:
X_final_train = X[final_df['year'] < 11]
y_final_train = y[final_df['year'] < 11]

X_final_test = X[final_df['year'] == 11]
results = {}

print("Final season (Season 11) mid-season coach change predictions: \n")

for name in metrics_df['Model'].values:
    estimator = best_estimators[name]

    # Refit on all historical seasons (Years 1-10)
    estimator.fit(X_final_train, y_final_train)
    y_pred = estimator.predict(X_final_test)
    y_proba = estimator.predict_proba(X_final_test)
    
    final_test_season = final_df[final_df['year'] == 11].copy()
    final_test_season['CoachChange'] = y_pred
    final_test_season['CoachChange_proba'] = y_proba[:, 1]
    final_test_season = final_test_season.sort_values(by='CoachChange_proba', ascending=False)
    results[name] = final_test_season[['tmID', 'coachID', 'CoachChange', 'CoachChange_proba']][final_test_season['CoachChange'] == True]

for name, result in results.items():
  print(f"{name}: {len(result)}")
  print(result)