<img src="https://upload.wikimedia.org/wikipedia/commons/0/06/Imperial_College_London_new_logo.png" alt="Imperial Logo" width="400">

### **Course:** CIVE70111 Machine Learning
### Task 4 PV Plant Modelling and Machine Learning Pipeline

**Project:** Clssification of operation conditions

**Date:** 09/12/2025  

<p align="right">
Created by: Michael Wong

# Table of Contents

1. **Project Overview**
2. **Workflow Summary**
3. **Imports & Paths**
4. **Helper Functions**
5. **Machine Learning Helpers**
6. **End-to-End Classification Pipeline**


# 1. Project Overview

This project focuses on detecting **suboptimal inverter operating conditions** in two solar power plants.
Each plant contains multiple inverters and weather sensors recording AC/DC power, yield, irradiance,
and temperature. The dataset contains numerous real-world issues including missing values, inconsistent
measurements, noisy power output at night, and non-monotonic yield counters.

The goal is to develop a **robust and interpretable machine learning model** that:

- Predicts inverter state as **Optimal (0)** or **Suboptimal (1)**
- Uses strict **time-based splitting** to avoid data leakage
- Is evaluated using F1-score with emphasis on Suboptimal detection
- Incorporates **data cleaning, outlier removal, feature engineering**
- Provides **engineering interpretability** using ALE and Drop-Column Importance

The final system integrates preprocessing, model training, evaluation,
and interpretability into a fully automated pipeline.


# 2. Workflow Summary

The overall workflow is divided into six major stages:

1. **Imports & Paths**
   - Load required Python libraries
   - Define file locations for Plant 1 and Plant 2 datasets

2. **Helper Functions**
   - Weather cleaning
   - AC/DC cleaning
   - Daily and total yield correction
   - Outlier removal
   - Merging inverter and weather data

3. **Machine Learning Helper Functions**
   - Label construction
   - Feature engineering (AC/IRRA, DC/IRRA)
   - Train/validation/test splitting
   - Threshold optimisation for Suboptimal F1
   - ALE plotting and drop-column importance

4. **End-to-End Classification Pipeline**
   - Assemble datasets
   - Clean and engineer features
   - Split chronologically
   - Train Logistic Regression and Linear SVM (scaled/unscaled)
   - Generate evaluation metrics
   - Produce ALE interpretability plots
   - Compute drop-column feature importance

5. **Experiments**
   - With vs. without outlier removal
   - Before vs. after feature selection
   - Plant 1 vs. Plant 2 comparison

6. **Results Interpretation**
   - Performance comparison across plants and models
   - Importance of each input feature
   - Impact of outlier removal
   - Engineering insights into inverter performance


# 3. Imports & Paths

In [225]:
import os
import datetime as dt

import numpy as np
import pandas as pd

# Disable all plot display
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.metrics import (
    precision_recall_curve, classification_report, confusion_matrix,
    f1_score, average_precision_score
)
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.utils.class_weight import compute_class_weight

from PyALE import ale

import pickle
from tqdm import tqdm
import logging
logging.getLogger("PyALE").setLevel(logging.WARNING)


# 4. Helper Functions 
### Weather, AC/DC, Yield, Outliers

In [226]:
import pickle

def ensure_dir(path):
    if not os.path.exists(path):
        os.makedirs(path)


def regression_outlier_detection_graph(df, x_col="IRRADIATION_CLEAN",
                                       y_col="AC_CLEAN", z_thresh=3, plot=True):
    df = df.copy()
    mask_valid = df[[x_col, y_col]].notna().all(axis=1)
    if mask_valid.sum() < 10:
        return df

    X = df.loc[mask_valid, [x_col]].values
    y = df.loc[mask_valid, y_col].values

    model = LinearRegression()
    model.fit(X, y)
    y_pred = model.predict(X)

    residuals = y - y_pred
    z = (residuals - residuals.mean()) / residuals.std(ddof=0)
    outlier_mask = np.abs(z) > z_thresh

    df_valid = df.loc[mask_valid].copy()
    df_valid["outlier_reg"] = outlier_mask

    df_clean = df_valid.loc[~df_valid["outlier_reg"]].drop(columns=["outlier_reg"])
    df_rest = df.loc[~mask_valid]
    df_result = pd.concat([df_clean, df_rest], axis=0).sort_index()
    return df_result

### Weather Cleaning

In [227]:
def clean_weather(df_weather_raw):
    """
    Create IRRADIATION_CLEAN using simple 6:00â€“18:30 day/night rule,
    drop SOURCE_KEY, and set DATE_TIME as index.
    """
    dfw = df_weather_raw.copy()
    dfw["DATE_TIME"] = pd.to_datetime(dfw["DATE_TIME"])

    day_start = dt.time(6, 0)
    day_end   = dt.time(18, 30)
    dfw["expected_day"] = dfw["DATE_TIME"].dt.time.between(day_start, day_end)

    dfw["IRRADIATION_CLEAN"] = dfw["IRRADIATION"].copy()
    dfw.loc[(~dfw["expected_day"]) & (dfw["IRRADIATION_CLEAN"] > 0), "IRRADIATION_CLEAN"] = 0

    dfw.set_index("DATE_TIME", inplace=True)
    if "SOURCE_KEY" in dfw.columns:
        dfw = dfw.drop(columns=["SOURCE_KEY"])

    return dfw

### Aggregating Generation Data by Inverter

In [228]:
def aggregate_inverters(df_gen_clean):
    """
    Aggregate generation data per inverter and time, and count Optimal/Suboptimal.
    Returns dict: {source_key: aggregated_df}
    """
    agg_dict = {}
    grouped = df_gen_clean.groupby("SOURCE_KEY")
    for sk, g in grouped:
        agg_df = g.groupby("DATE_TIME").agg(
            SOURCE_KEY=("SOURCE_KEY", "first"),
            DC_POWER=("DC_POWER", "first"),
            AC_POWER=("AC_POWER", "first"),
            DAILY_YIELD=("DAILY_YIELD", "first"),
            TOTAL_YIELD=("TOTAL_YIELD", "first"),
            NUM_OPT=("Operating_Condition", lambda x: (x == "Optimal").sum()),
            NUM_SUBOPT=("Operating_Condition", lambda x: (x == "Suboptimal").sum())
        ).reset_index()
        agg_dict[sk] = agg_df
    return agg_dict

### Merge Inverter + Weather

In [229]:
def merge_inverter_weather(agg_inv_dict, df_weather_clean):
    """
    Inner-join each inverter df with weather df on matching DATE_TIME index.
    Returns dict: {source_key: joined_df}
    """
    joined = {}
    for sk, inv_df in agg_inv_dict.items():
        d = inv_df.copy()
        d["DATE_TIME"] = pd.to_datetime(d["DATE_TIME"])
        d.set_index("DATE_TIME", inplace=True)
        join_df = d.join(df_weather_clean, how="inner")
        joined[sk] = join_df
    return joined


### Clean AC/DC Power

In [230]:
def clean_ac_dc_dict(wea_inv_dict):
    """
    Clean AC_POWER and DC_POWER into AC_CLEAN/DC_CLEAN based on IRRADIATION_CLEAN.
    Returns dict on the same keys.
    """
    cleaned = {}
    for sk, df_join in wea_inv_dict.items():
        d = df_join.copy()
        d["AC_CLEAN"] = d["AC_POWER"].copy()
        d["DC_CLEAN"] = d["DC_POWER"].copy()

        night_mask = d["IRRADIATION_CLEAN"] == 0
        d.loc[night_mask & (d["AC_CLEAN"] > 0), "AC_CLEAN"] = 0
        d.loc[night_mask & (d["DC_CLEAN"] > 0), "DC_CLEAN"] = 0

        day_mask = d["IRRADIATION_CLEAN"] > 0
        d.loc[day_mask & (d["AC_CLEAN"] == 0), "AC_CLEAN"] = float("nan")
        d.loc[day_mask & (d["DC_CLEAN"] == 0), "DC_CLEAN"] = float("nan")

        d["AC_CLEAN"] = d["AC_CLEAN"].interpolate(method="linear")
        d["DC_CLEAN"] = d["DC_CLEAN"].interpolate(method="linear")

        d["AC_CLEAN"] = d["AC_CLEAN"].fillna(0)
        d["DC_CLEAN"] = d["DC_CLEAN"].fillna(0)

        cleaned[sk] = d
    return cleaned


### Clean DAILY_YIELD

In [231]:
def clean_daily_yield_dict(acdc_dict):
    """
    Enforce DAILY_YIELD_CLEAN:
      - 0 at night
      - monotonic increasing during daytime
      - flat after sunset
    Returns dict with DAILY_YIELD_CLEAN added.
    """
    cleaned = {}
    for sk, df_in in acdc_dict.items():
        d = df_in.copy()
        d.index = pd.to_datetime(d.index)
        d["DAILY_YIELD_CLEAN"] = d["DAILY_YIELD"].copy()

        dates = np.unique(d.index.date)
        for day in dates:
            mask_day_full = d.index.date == day
            df_day = d.loc[mask_day_full]

            irr_pos = df_day["IRRADIATION_CLEAN"] > 0
            if not irr_pos.any():
                d.loc[mask_day_full, "DAILY_YIELD_CLEAN"] = 0.0
                continue

            day_start_idx = df_day[irr_pos].index[0]
            day_end_idx   = df_day[irr_pos].index[-1]

            night_mask   = mask_day_full & (d.index < day_start_idx)
            day_mask     = mask_day_full & (d.index >= day_start_idx) & (d.index <= day_end_idx)
            evening_mask = mask_day_full & (d.index > day_end_idx)

            d.loc[night_mask, "DAILY_YIELD_CLEAN"] = 0.0
            val_end = d.at[day_end_idx, "DAILY_YIELD"]
            d.loc[evening_mask, "DAILY_YIELD_CLEAN"] = val_end

            day_idx = d.loc[day_mask].index
            if len(day_idx) == 0:
                continue

            raw_vals = d.loc[day_idx, "DAILY_YIELD_CLEAN"].values.astype(float)
            invalid = np.zeros(len(raw_vals), dtype=bool)

            invalid |= raw_vals <= 0
            if len(raw_vals) > 1:
                drops = np.diff(raw_vals) < 0
                invalid[1:][drops] = True

            d.loc[day_idx[invalid], "DAILY_YIELD_CLEAN"] = np.nan
            d.loc[day_idx, "DAILY_YIELD_CLEAN"] = (
                d.loc[day_idx, "DAILY_YIELD_CLEAN"]
                .interpolate(method="linear", limit_direction="both")
            )

            prev_val = d.at[day_idx[0], "DAILY_YIELD_CLEAN"]
            for t in day_idx[1:]:
                cur = d.at[t, "DAILY_YIELD_CLEAN"]
                if pd.isna(cur) or cur < prev_val:
                    d.at[t, "DAILY_YIELD_CLEAN"] = prev_val
                else:
                    prev_val = cur

            d.loc[night_mask, "DAILY_YIELD_CLEAN"] = 0.0
            d.loc[evening_mask, "DAILY_YIELD_CLEAN"] = val_end

        cleaned[sk] = d
    return cleaned

### Clean TOTAL_YIELD

In [232]:
def clean_total_yield_dict(daily_dict):
    """
    Clean TOTAL_YIELD into TOTAL_YIELD_CLEAN using increments in DAILY_YIELD_CLEAN.
    Returns dict with TOTAL_YIELD_CLEAN added, and trimmed columns + OPERATING_CONDITION_CLEAN.
    """
    cleaned = {}
    for sk, df_in in daily_dict.items():
        d = df_in.copy()
        d["TOTAL_YIELD_CLEAN"] = d["TOTAL_YIELD"].copy()
        timestamps = d.index

        for i in range(1, len(timestamps)):
            t_prev = timestamps[i - 1]
            t_curr = timestamps[i]

            TY_prev = d.at[t_prev, "TOTAL_YIELD_CLEAN"]
            TY_now  = d.at[t_curr, "TOTAL_YIELD"]
            DY_prev = d.at[t_prev, "DAILY_YIELD_CLEAN"]
            DY_now  = d.at[t_curr, "DAILY_YIELD_CLEAN"]

            is_new_day = t_curr.date() != t_prev.date()
            if is_new_day:
                d.at[t_curr, "TOTAL_YIELD_CLEAN"] = TY_prev
                continue

            delta_dy = DY_now - DY_prev
            TY_expected = TY_prev + delta_dy

            if TY_now < TY_prev:
                d.at[t_curr, "TOTAL_YIELD_CLEAN"] = TY_expected
            else:
                d.at[t_curr, "TOTAL_YIELD_CLEAN"] = TY_now

        cols_keep = [
            "PLANT_ID", "SOURCE_KEY",
            "AC_CLEAN", "DC_CLEAN",
            "DAILY_YIELD_CLEAN", "TOTAL_YIELD_CLEAN",
            "AMBIENT_TEMPERATURE", "MODULE_TEMPERATURE",
            "IRRADIATION_CLEAN", "NUM_OPT", "NUM_SUBOPT"
        ]
        cols_keep = [c for c in cols_keep if c in d.columns]
        d = d[cols_keep]

        d["OPERATING_CONDITION_CLEAN"] = np.where(
            d["NUM_OPT"] > d["NUM_SUBOPT"], "Optimal", "Suboptimal"
        )
        d = d.drop(columns=["NUM_OPT", "NUM_SUBOPT"])

        cleaned[sk] = d
    return cleaned

### Outlier Removal Wrapper

In [233]:
def remove_outliers_ps_dict(df_ps_dict):
    """
    Apply regression_outlier_detection_graph to each inverter df.
    """
    out_dict = {}
    for sk, df_in in df_ps_dict.items():
        out_dict[sk] = regression_outlier_detection_graph(
            df_in, x_col="IRRADIATION_CLEAN", y_col="AC_CLEAN",
            z_thresh=3, plot=False
        )
    return out_dict

# 5. Machine Learning Helpers

### Label Creation

In [234]:
def make_label(df_all):
    """
    Label: Optimal -> 0, Suboptimal -> 1
    """
    return (df_all["OPERATING_CONDITION_CLEAN"].str.lower() == "suboptimal").astype(int)


### Feature Engineering


In [235]:
def engineer_features(df_all):
    """
    Sort by SOURCE_KEY then DATE_TIME, and add AC/IRRA, DC/IRRA.
    Avoids deprecated groupby.apply behavior.
    """
    df_feat = df_all.copy()
    df_feat = df_feat.sort_values(["SOURCE_KEY", "DATE_TIME"])

    df_feat["DC/IRRA"] = df_feat["DC_CLEAN"] / (df_feat["IRRADIATION_CLEAN"] + 1e-3)
    df_feat["AC/IRRA"] = df_feat["AC_CLEAN"] / (df_feat["IRRADIATION_CLEAN"] + 1e-3)

    return df_feat


### Combine All Inverter Data


In [236]:
def assemble_all_from_df_ps(df_ps_dict):
    """
    Combine all inverter dfs into one dataframe.
    """
    parts = []
    for sk, df_inv in df_ps_dict.items():
        d = df_inv.copy()
        d = d.reset_index()  # bring DATE_TIME back as a column
        parts.append(d)

    df_all = pd.concat(parts, ignore_index=True).drop_duplicates()
    df_all["DATE_TIME"] = pd.to_datetime(df_all["DATE_TIME"])

    mask = (~df_all["OPERATING_CONDITION_CLEAN"].isna()) & (~df_all["IRRADIATION_CLEAN"].isna())
    df_all = df_all[mask]

    counts = df_all["OPERATING_CONDITION_CLEAN"].value_counts()
    print("\n=== Operating Condition Counts ===")
    print(f"Number of Optimal (0):     {counts.get('Optimal', 0)}")
    print(f"Number of Suboptimal (1):  {counts.get('Suboptimal', 0)}")

    return df_all


### Time-Based Splitting (Prevents leakage)


In [237]:
def time_split(df_feat, y, test_days=10, val_days=3):
    """
    Chronological split into train/val/test.
    """
    last_time = df_feat["DATE_TIME"].max()
    test_start = last_time - pd.Timedelta(days=test_days)
    val_start  = test_start - pd.Timedelta(days=val_days)

    mask_test = df_feat["DATE_TIME"] >= test_start
    mask_val  = (df_feat["DATE_TIME"] >= val_start) & (~mask_test)
    mask_train = df_feat["DATE_TIME"] < val_start

    X_tr = df_feat[mask_train]
    X_val = df_feat[mask_val]
    X_te = df_feat[mask_test]

    y_tr = y[mask_train]
    y_val = y[mask_val]
    y_te = y[mask_test]

    return X_tr, X_val, X_te, y_tr, y_val, y_te

### Preprocessing Pipeline (StandardScaler on numeric columns)


In [238]:
def make_preprocessor(df_feat, drop_col):
    """
    StandardScaler on numeric columns not in drop_col.
    """
    num_cols = [
        c for c in df_feat.columns
        if c not in drop_col and df_feat[c].dtype.kind in "fcui"
    ]
    pre = ColumnTransformer(
        [("num", Pipeline([("scaler", StandardScaler())]), num_cols)]
    )
    return pre

### Select Threshold that Maximises F1 for Suboptimal Class


In [239]:
def Suboptimal_f1_threshold(y_true, scores_suboptimal):
    """
    Pick threshold that maximises F1 for the Suboptimal (1) class.
    """
    p, r, thr = precision_recall_curve(y_true, scores_suboptimal)
    if len(thr) == 0:
        return 0.0

    f1 = 2 * p[1:] * r[1:] / (p[1:] + r[1:] + 1e-12)
    best_ix = np.nanargmax(f1)
    return float(thr[best_ix])


### Evaluation: Confusion Matrix, Classification Report, PR-AUC


In [240]:
def Suboptimal_evaluate(name, y_true, scores_suboptimal, thr, tag):
    """
    Print confusion matrix + classification report + PR-AUC focused on suboptimal.
    """
    preds = (scores_suboptimal >= thr).astype(int)
    ap = average_precision_score(y_true, scores_suboptimal)
    print(f"\n==== {name} | {tag} ====")
    print(f"Suboptimal focused Threshold: {thr:.4f} | PR-AUC: {ap:.4f}")
    print(classification_report(y_true, preds, digits=3))
    print("Suboptimal focused Confusion Matrix:\n", confusion_matrix(y_true, preds))

### Compute F1 Score Using a Custom Threshold


In [241]:
def f1_threshold_scorer(model, X, y_true, thr):
    """
    Compute F1 (Suboptimal=1) for a given model and threshold.
    """
    try:
        scores = model.predict_proba(X)[:, 1]
    except Exception:
        scores = model.decision_function(X)
    preds = (scores > thr).astype(int)
    return f1_score(y_true, preds, pos_label=1)

### 1-D ALE Plots for Model Interpretability


In [242]:
def plot_ale_1d(model, X, feature, bins=20, save_path=None):
    # Run ALE
    ale(X=X, model=model, feature=[feature], include_CI=False, grid_size=bins)

    # Sanitize filename
    safe_feature = str(feature)
    for bad in ["/", "\\", ":", "*", "?", "\"", "<", ">", "|"]:
        safe_feature = safe_feature.replace(bad, "_")

    plt.title(f"ALE for {feature}")
    plt.tight_layout()

    if save_path:
        file = os.path.join(save_path, f"ALE_{safe_feature}.png")
        plt.savefig(file)

    plt.show()  # prevents display


### Drop-Column Importance (Re-trains SVM per feature)


In [243]:
def drop_column_importance(df_feat, baseline_f1, drop_col,
                           X_tr, y_tr, X_val, y_val, X_te, y_te):
    """
    Drop-column importance using LinearSVC: importance = baseline_f1 - dropped_f1.
    """
    importances = {}
    base_drop_cols = set(drop_col)

    for col in X_tr.columns:
        if col in base_drop_cols:
            continue

        X_tr_d = X_tr.drop(columns=[col])
        X_val_d = X_val.drop(columns=[col])
        X_te_d = X_te.drop(columns=[col])

        df_feat_d = df_feat.drop(columns=[col])
        pre_d = make_preprocessor(df_feat_d, drop_col)

        svm_d = Pipeline([
            ("pre", pre_d),
            ("clf", LinearSVC(class_weight="balanced", max_iter=5000))
        ])
        svm_d.fit(X_tr_d, y_tr)

        thr_d = Suboptimal_f1_threshold(y_val, svm_d.decision_function(X_val_d))
        dropped_f1 = f1_threshold_scorer(svm_d, X_te_d, y_te, thr_d)

        importances[col] = baseline_f1 - dropped_f1

    return importances

# 6. End-to-End Classification Pipeline for a Plant


In [244]:
def run_classification_on_df_inv(df_ps, test_days=10, val_days=3, drop_col=None):
    """
    Full pipeline + SAVE plots + SHOW plots + unique filenames per run.
    ALE is computed on TRAINING data (correct theoretical usage).
    """

    # ================================================================
    # FOLDER SETUP
    # ================================================================
    base_path = r"C:\Users\B.KING\OneDrive - Imperial College London\CIVE70111 Machine Learning\CouseWork\Group-11\data"
    
    folder_main = os.path.join(base_path, "03 ALE SVM Decision")
    folder_plots = os.path.join(folder_main, "Plots")
    folder_ale = os.path.join(folder_plots, "ALE")
    folder_svm = os.path.join(folder_plots, "SVM")

    ensure_dir(folder_main)
    ensure_dir(folder_plots)
    ensure_dir(folder_ale)
    ensure_dir(folder_svm)

    # ================================================================
    # DATA PREPARATION
    # ================================================================
    if drop_col is None:
        drop_col = ["OPERATING_CONDITION_CLEAN", "DATE_TIME", "PLANT_ID", "SOURCE_KEY"]

    # df_ps is a single inverter df with datetime index
    df = df_ps.reset_index().rename(columns={"index": "DATE_TIME"})
    df["DATE_TIME"] = pd.to_datetime(df["DATE_TIME"])

    counts = df["OPERATING_CONDITION_CLEAN"].value_counts()
    print("\n=== Operating Condition Counts ===")
    print(f"Number of Optimal (0):     {counts.get('Optimal', 0)}")
    print(f"Number of Suboptimal (1):  {counts.get('Suboptimal', 0)}")

    y = make_label(df)
    df_feat = engineer_features(df)

    X_tr, X_val, X_te, y_tr, y_val, y_te = time_split(df_feat, y, test_days, val_days)
    X_tr = X_tr.drop(columns=drop_col)
    X_val = X_val.drop(columns=drop_col)
    X_te = X_te.drop(columns=drop_col)

    pre = make_preprocessor(df_feat, drop_col)

    # class weights
    cw = compute_class_weight("balanced", classes=np.array([0, 1]), y=y_tr)
    class_w = {0: cw[0], 1: cw[1]}

    # Logistic Regression
    lr = Pipeline([
        ("pre", pre),
        ("clf", LogisticRegression(max_iter=5000, class_weight=class_w))
    ])
    lr.fit(X_tr, y_tr)
    thr_lr_sub = Suboptimal_f1_threshold(y_val, lr.predict_proba(X_val)[:, 1])
    Suboptimal_evaluate("LogReg - max suboptimal f1 score", y_te, lr.predict_proba(X_te)[:, 1], thr_lr_sub, "Full Test")

    # Linear SVM
    svm = Pipeline([
        ("pre", pre),
        ("clf", LinearSVC(class_weight="balanced", max_iter=5000))
    ])
    svm.fit(X_tr, y_tr)
    thr_svm_sub = Suboptimal_f1_threshold(y_val, svm.decision_function(X_val))
    Suboptimal_evaluate("LinearSVM - max suboptimal f1 score", y_te, svm.decision_function(X_te), thr_svm_sub, "Full Test")

    # ALE plots
    print("\n=== ALE for SVM ===")
    for feat in X_te.columns:
        plot_ale_1d(svm, X_te, feat)

    # Drop-column importance
    print("\n=== Baseline F1 Score of SVM ===")
    baseline = f1_threshold_scorer(svm, X_te, y_te, thr_svm_sub)
    print(baseline)

    print("\n=== Drop Column Importance for SVM (change of F1 score) ===")
    svm_importance = drop_column_importance(df_feat, baseline, drop_col, X_tr, y_tr, X_val, y_val, X_te, y_te)
    for k, v in sorted(svm_importance.items(), key=lambda x: -x[1]):
        print(f"{k:25s}: {v:.4f}")

    # Overlap plot
    scores = svm.decision_function(X_te)
    plt.hist(scores[y_te == 0], bins=50, alpha=0.6, label="Optimal")
    plt.hist(scores[y_te == 1], bins=50, alpha=0.6, label="Suboptimal")
    plt.axvline(thr_svm_sub, linestyle='--', label='boundary')
    plt.xlabel("SVM decision function")
    plt.ylabel("Count")
    plt.legend()
    plt.show()

    # Optionally return something if you want
    return {
        "svm": svm,
        "baseline_f1": baseline,
        "svm_importance": svm_importance,
        "thr_svm": thr_svm_sub
    }


In [245]:
# Convenience function for running on a single inverter for calculating feature importance
drop = ["OPERATING_CONDITION_CLEAN","DATE_TIME","PLANT_ID","SOURCE_KEY"]


drop = ["OPERATING_CONDITION_CLEAN", "DATE_TIME", "PLANT_ID", "SOURCE_KEY"]

def run_classification_on_df_importance(df_dict, sk, test_days=10, val_days=3, drop_col=drop):
    """
    Train a LinearSVC on a single inverter (key=sk), compute its baseline F1,
    and then compute drop-column importances for all features.
    
    Parameters
    ----------
    df_dict : dict
        {source_key: dataframe} as produced by your cleaning pipeline.
    sk : hashable
        Specific inverter key to analyse.
    test_days, val_days : int
        Sizes of test/validation windows in days for time_split.
    drop_col : list
        Columns to drop from X (labels, IDs, etc.).

    Returns
    -------
    results : dict
        {
          "svm": fitted Pipeline,
          "features": list of feature names,
          "baseline_f1": F1 on test set at best suboptimal threshold,
          "svm_importance": {feature: importance}
        }
    """
    # 1) Extract and prepare df for this inverter
    df = df_dict[sk].copy()

    # Bring datetime index back as a column called DATE_TIME
    df = df.reset_index().rename(columns={"index": "DATE_TIME"})
    df["SOURCE_KEY"] = sk
    df["DATE_TIME"] = pd.to_datetime(df["DATE_TIME"])

    # 2) Label + feature engineering
    y = make_label(df)
    df_feat = engineer_features(df)

    # 3) Time-based split
    X_tr, X_val, X_te, y_tr, y_val, y_te = time_split(df_feat, y, test_days, val_days)

    # Remove non-feature columns from X
    X_tr = X_tr.drop(columns=drop_col)
    X_val = X_val.drop(columns=drop_col)
    X_te = X_te.drop(columns=drop_col)

    feature_list = list(X_tr.columns)

    # 4) Preprocessor + SVM model
    pre = make_preprocessor(df_feat, drop_col)

    svm = Pipeline([
        ("pre", pre),
        ("clf", LinearSVC(class_weight="balanced", max_iter=5000))
    ])
    svm.fit(X_tr, y_tr)

    # 5) Baseline F1 for "Suboptimal" at best threshold
    thr = Suboptimal_f1_threshold(y_val, svm.decision_function(X_val))
    f1_full = f1_threshold_scorer(svm, X_te, y_te, thr)

    # 6) Drop-column importance
    svm_importance = drop_column_importance(
        df_feat=df_feat,
        baseline_f1=f1_full,
        drop_col=drop_col,
        X_tr=X_tr, y_tr=y_tr,
        X_val=X_val, y_val=y_val,
        X_te=X_te, y_te=y_te,
    )

    return {
        "svm": svm,
        "features": feature_list,
        "baseline_f1": f1_full,
        "svm_importance": svm_importance
    }

### File Paths

In [246]:
# ============================================================
# 0. PATHS
# ============================================================

############################################################################################################################################
# Change here 

# folder = r"C:\Users\MSI-NB\OneDrive - Imperial College London\CIVE70111 Machine Learning\CouseWork\Group-11\data\In"
folder = r"C:\Users\B.KING\OneDrive - Imperial College London\CIVE70111 Machine Learning\CouseWork\Group-11\data\In"
############################################################################################################################################

gen_path_1     = os.path.join(folder, "Plant_1_Generation_Data_updated.csv")   # Plant 1 generation
weather_path_1 = os.path.join(folder, "Plant_1_Weather_Sensor_Data.csv")       # Plant 1 weather

gen_path_2     = os.path.join(folder, "Plant_2_Generation_Data.csv")           # Plant 2 generation
weather_path_2 = os.path.join(folder, "Plant_2_Weather_Sensor_Data.csv")       # Plant 2 weather


### Main Pipeline: Plant 1


In [247]:
# ============================================================
# 3. MAIN PIPELINE
# ============================================================

# ------------------ Plant 1 ------------------

print("\n=== PLANT 1: LOADING DATA ===")
df_p1_gen_raw = pd.read_csv(gen_path_1, parse_dates=["DATE_TIME"])
df_p1_weather_raw = pd.read_csv(weather_path_1, parse_dates=["DATE_TIME"])

# Drop rows with missing Operating_Condition, drop PLANT_ID and 'day' as in original
df_p1_gen = df_p1_gen_raw.dropna().copy()
for col_drop in ["PLANT_ID", "day"]:
    if col_drop in df_p1_gen.columns:
        df_p1_gen = df_p1_gen.drop(columns=[col_drop])
df_p1_gen.set_index("DATE_TIME", inplace=True)

# Aggregate by inverter
df_p1_gen.reset_index(inplace=True)
agg_inv_p1 = aggregate_inverters(df_p1_gen)

# Clean weather
df_p1_weather = clean_weather(df_p1_weather_raw)

# Join inverter + weather
wea_inv_p1 = merge_inverter_weather(agg_inv_p1, df_p1_weather)

# Clean AC/DC, DAILY_YIELD, TOTAL_YIELD
p1_step1 = clean_ac_dc_dict(wea_inv_p1)
p1_step2 = clean_daily_yield_dict(p1_step1)
df_ps1 = clean_total_yield_dict(p1_step2)

# Outlier removal
df_ps1_outlier = remove_outliers_ps_dict(df_ps1)

source_key_1 = list(df_ps1.keys())



=== PLANT 1: LOADING DATA ===


### Main Pipeline: Plant 2

In [248]:
# ------------------ Plant 2 ------------------

print("\n=== PLANT 2: LOADING DATA ===")
df_p2_gen_raw = pd.read_csv(gen_path_2, parse_dates=["DATE_TIME"])
df_p2_weather_raw = pd.read_csv(weather_path_2, parse_dates=["DATE_TIME"])

# Drop PLANT_ID from generation (as in original)
if "PLANT_ID" in df_p2_gen_raw.columns:
    df_p2_gen = df_p2_gen_raw.drop(columns=["PLANT_ID"]).copy()
else:
    df_p2_gen = df_p2_gen_raw.copy()

df_p2_gen.set_index("DATE_TIME", inplace=True)
df_p2_gen.reset_index(inplace=True)

agg_inv_p2 = aggregate_inverters(df_p2_gen)
df_p2_weather = clean_weather(df_p2_weather_raw)
wea_inv_p2 = merge_inverter_weather(agg_inv_p2, df_p2_weather)

p2_step1 = clean_ac_dc_dict(wea_inv_p2)
p2_step2 = clean_daily_yield_dict(p2_step1)
df_ps2 = clean_total_yield_dict(p2_step2)

df_ps2_outlier = remove_outliers_ps_dict(df_ps2)
source_key_2 = list(df_ps2.keys())




=== PLANT 2: LOADING DATA ===


In [249]:

source_key_1 = list(df_ps1.keys())
source_key_2 = list(df_ps2.keys())


# Plant 1 models before feature selection 
for sk in source_key_1:
    run_classification_on_df_inv(df_ps1[sk], drop_col = drop)

PyALE._ALE_generic:INFO: Continuous feature detected.



=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1751

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0967 | PR-AUC: 0.9998
              precision    recall  f1-score   support

           0      0.981     0.898     0.938        59
           1      0.986     0.998     0.992       419

    accuracy                          0.985       478
   macro avg      0.984     0.948     0.965       478
weighted avg      0.985     0.985     0.985       478

Suboptimal focused Confusion Matrix:
 [[ 53   6]
 [  1 418]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.2668 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0047
DAILY_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1722

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.1024 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      0.966     0.966     0.966        59
           1      0.995     0.995     0.995       419

    accuracy                          0.992       478
   macro avg      0.981     0.981     0.981       478
weighted avg      0.992     0.992     0.992       478

Suboptimal focused Confusion Matrix:
 [[ 57 

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
AC_CLEAN                 : 0.0012
DC_CLEAN                 : 0.0012
DAILY_YIELD_CLEAN        : 0.0012
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0036

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1722

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0910 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      0.983     0.966     0.974        59
           1      0.995     0.998     0.996       419

    accuracy                          0.994       478
   macro avg      0.989     0.982     0.985       478
weighted avg      0.994     0.994     0.994       478

Suboptimal focused Confusion Matrix:
 [[ 57

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
DAILY_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1737

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0157 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      1.000     0.797     0.887        59
           1      0.972     1.000     0.986       419

    accuracy                          0.975       478
   macro avg      0.986     0.898     0.936       478
weighted avg      0.976     0.975     0.974       478

Suboptimal focused Confusion Matrix:
 [[ 47

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
DAILY_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0036

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1726

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0873 | PR-AUC: 0.9999


PyALE._ALE_generic:INFO: Continuous feature detected.


              precision    recall  f1-score   support

           0      1.000     0.966     0.983        59
           1      0.995     1.000     0.998       419

    accuracy                          0.996       478
   macro avg      0.998     0.983     0.990       478
weighted avg      0.996     0.996     0.996       478

Suboptimal focused Confusion Matrix:
 [[ 57   2]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.2915 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DAILY_YIELD_CLEAN        : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
IRRADIATION_CLEAN        : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000


PyALE._ALE_generic:INFO: Continuous feature detected.



=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1737

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0161 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.763     0.865        59
           1      0.968     1.000     0.984       419

    accuracy                          0.971       478
   macro avg      0.984     0.881     0.924       478
weighted avg      0.972     0.971     0.969       478

Suboptimal focused Confusion Matrix:
 [[ 45  14]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.2606 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DAILY_YIELD_CLEAN        : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0036

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1722

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0925 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      0.982     0.949     0.966        59
           1      0.993     0.998     0.995       419

    accuracy                          0.992       478
   macro avg      0.988     0.973     0.980       478
weighted avg      0.992     0.992     0.992       478

Suboptimal focused Confusion Matrix:
 [[ 56

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
DAILY_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1706


PyALE._ALE_generic:INFO: Continuous feature detected.



==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0266 | PR-AUC: 0.9998
              precision    recall  f1-score   support

           0      1.000     0.814     0.897        59
           1      0.974     1.000     0.987       419

    accuracy                          0.977       478
   macro avg      0.987     0.907     0.942       478
weighted avg      0.978     0.977     0.976       478

Suboptimal focused Confusion Matrix:
 [[ 48  11]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.2715 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DAILY_YIELD_CLEAN        : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1734

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0910 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      0.983     0.966     0.974        59
           1      0.995     0.998     0.996       419

    accuracy                          0.994       478
   macro avg      0.989     0.982     0.985       478
weighted avg      0.994     0.994     0.994       478

Suboptimal focused Confusion Matrix:
 [[ 57

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
DAILY_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0012


  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1725

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.1120 | PR-AUC: 0.9998
              precision    recall  f1-score   support

           0      0.903     0.949     0.926        59
           1      0.993     0.986     0.989       419

    accuracy                          0.981       478
   macro avg      0.948     0.967     0.957       478
weighted avg      0.982     0.981     0.981       478

Suboptimal focused Confusion Matrix:
 [[ 56   3]
 [  6 413]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.2972 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
DAILY_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1722

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0934 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      0.983     0.966     0.974        59
           1      0.995     0.998     0.996       419

    accuracy                          0.994       478
   macro avg      0.989     0.982     0.985       478
weighted avg      0.994     0.994     0.994       478

Suboptimal focused Confusion Matrix:
 [[ 57

PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.


              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
DAILY_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0036

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1752

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0165 | PR-AUC: 0.9997


PyALE._ALE_generic:INFO: Continuous feature detected.


              precision    recall  f1-score   support

           0      1.000     0.661     0.796        59
           1      0.954     1.000     0.977       419

    accuracy                          0.958       478
   macro avg      0.977     0.831     0.886       478
weighted avg      0.960     0.958     0.954       478

Suboptimal focused Confusion Matrix:
 [[ 39  20]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.3032 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
DAILY_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012
IRRADIATION_CLEAN        : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1727

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0158 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      1.000     0.797     0.887        59
           1      0.972     1.000     0.986       419

    accuracy                          0.975       478
   macro avg      0.986     0.898     0.936       478
weighted avg      0.976     0.975     0.974       478

Suboptimal focused Confusion Matrix:
 [[ 47 

PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.



==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.2666 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DAILY_YIELD_CLEAN        : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0024

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1734

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0710 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      1.000     0.949     0.974        59
           1      0.993     1.000     0.996       419

    accuracy                          0.994       478
   macro avg      0.996     0.975     0.985       478
weighted avg      0.994     0.994     0.994       478

Suboptimal focused Confusion Matrix:
 [[ 56

PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.



==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.3231 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DAILY_YIELD_CLEAN        : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
IRRADIATION_CLEAN        : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1727

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0910 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.949     0.974        59
           1      0.993     1.000     0.996       419

    accuracy                          0.994       478
   macro avg      0.996     0.975     0.985       478
weighted avg      0.994     0.994     0.994       478

Suboptimal focused Confusion Matrix:
 [[ 56 

PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DAILY_YIELD_CLEAN        : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1726

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0842 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      1.000     0.966     0.983        59
           1      0.995     1.000     0.998       419

    accuracy                          0.996       478
   macro avg      0.998     0.983     0.990       478
weighted avg      0.996     0.996     0.996       478

Suboptimal focused Confusion Matrix:
 [[ 57

PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.



==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.3171 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DAILY_YIELD_CLEAN        : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1726


PyALE._ALE_generic:INFO: Continuous feature detected.



==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0839 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      1.000     0.949     0.974        59
           1      0.993     1.000     0.996       419

    accuracy                          0.994       478
   macro avg      0.996     0.975     0.985       478
weighted avg      0.994     0.994     0.994       478

Suboptimal focused Confusion Matrix:
 [[ 56   3]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.2869 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DAILY_YIELD_CLEAN        : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
IRRADIATION_CLEAN        : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1727

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0680 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      1.000     0.898     0.946        59
           1      0.986     1.000     0.993       419

    accuracy                          0.987       478
   macro avg      0.993     0.949     0.970       478
weighted avg      0.988     0.987     0.987       478

Suboptimal focused Confusion Matrix:
 [[ 53 

PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.



==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.3894 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
DAILY_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012
IRRADIATION_CLEAN        : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000


  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1727

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.1182 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      0.983     0.966     0.974        59
           1      0.995     0.998     0.996       419

    accuracy                          0.994       478
   macro avg      0.989     0.982     0.985       478
weighted avg      0.994     0.994     0.994       478

Suboptimal focused Confusion Matrix:
 [[ 57   2]
 [  1 418]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.2729 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DAILY_YIELD_CLEAN        : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0012


  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1727

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0169 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      1.000     0.712     0.832        59
           1      0.961     1.000     0.980       419

    accuracy                          0.964       478
   macro avg      0.981     0.856     0.906       478
weighted avg      0.966     0.964     0.962       478

Suboptimal focused Confusion Matrix:
 [[ 42  17]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1863 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DAILY_YIELD_CLEAN        : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0036


  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1721

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0933 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      0.905     0.966     0.934        59
           1      0.995     0.986     0.990       419

    accuracy                          0.983       478
   macro avg      0.950     0.976     0.962       478
weighted avg      0.984     0.983     0.983       478

Suboptimal focused Confusion Matrix:
 [[ 57   2]
 [  6 413]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.2907 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DAILY_YIELD_CLEAN        : -0.0012
AMBIENT_TEMPERATURE      : -0.0024
IRRADIATION_CLEAN        : -0.0036
DC/IRRA                  : -0.0036
AC/IRRA                  : -0.0036

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1726

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0191 | PR-AUC: 0.9999
              precision    recall  f1-score   support

           0      1.000     0.746     0.854        59
           1      0.965     1.000     0.982       419

    accuracy                          0.969       478
   macro avg      0.983     0.873     0.918       478
weighted avg      0.970     0.969     0.967       478

Suboptimal focused Confusion Matrix:
 [

PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.


              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DAILY_YIELD_CLEAN        : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0024


  plt.show()


In [250]:
# Mean importance across all inverters for Plant 1
drop1 = drop + ['AC_CLEAN','DC_CLEAN','DAILY_YIELD_CLEAN','AC/IRRA','DC/IRRA','MODULE_TEMPERATURE']
all_imp1 = []
for sk in source_key_1:
    out = run_classification_on_df_importance(df_ps1, sk, drop_col = drop1)

    # store as row, not as dict of dicts
    s = []
    s = pd.Series(out["svm_importance"], name=sk)
    all_imp1.append(s)

importance_df1 = pd.DataFrame(all_imp1)
importance_df1.describe().T.sort_values(by="mean", ascending=False)

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
IRRADIATION_CLEAN,22.0,0.096587,0.001137,0.095592,0.095592,0.095721,0.097782,0.098661
TOTAL_YIELD_CLEAN,22.0,0.001178,0.0,0.001178,0.001178,0.001178,0.001178,0.001178
AMBIENT_TEMPERATURE,22.0,0.001178,0.0,0.001178,0.001178,0.001178,0.001178,0.001178


In [251]:
# Plant 1 models after feature selection 
for sk in source_key_1:
    run_classification_on_df_inv(df_ps1[sk], drop_col = drop1)

PyALE._ALE_generic:INFO: Continuous feature detected.



=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1751

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0759 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1081 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0956
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1722

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0776 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1056 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0956
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1722

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0776 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1057 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0956
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1737

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0772 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1072 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0956
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1726

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0766 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1080 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0978
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1737

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0767 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1072 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0956
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1722

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0776 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1058 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0956
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1706

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0783 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1063 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0956
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1734

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0767 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1076 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0956
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1725

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0770 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1063 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0978
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1722

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0776 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1058 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0956
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1752

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0759 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1079 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0956
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1727

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0768 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1070 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0978
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1734

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0765 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1089 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0956
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1727

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0769 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1066 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0978
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1726

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0767 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1080 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0978
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1726

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0767 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1075 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0978
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1727

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0768 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1069 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0978
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1727

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0767 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1075 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0987
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1727

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0767 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1076 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0958
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1721

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0772 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1084 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0966
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012

=== Operating Condition Counts ===
Number of Optimal (0):     348
Number of Suboptimal (1):  1726

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0768 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      1.000     0.915     0.956        59
           1      0.988     1.000     0.994       419

    accuracy                          0.990       478
   macro avg      0.994     0.958     0.975       478
weighted avg      0.990     0.990     0.989       478

Suboptimal focused Confusion Matrix:
 [[ 54   5]
 [  0 419]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -1.1074 | PR-AUC: 1.0000
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9940688018979834

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0978
TOTAL_YIELD_CLEAN        : 0.0012
AMBIENT_TEMPERATURE      : 0.0012


In [252]:
# Plant 2 models before feature selection 
for sk in source_key_2:
    run_classification_on_df_inv(df_ps2[sk], drop_col = drop)

PyALE._ALE_generic:INFO: Continuous feature detected.



=== Operating Condition Counts ===
Number of Optimal (0):     355
Number of Suboptimal (1):  2840

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.2993 | PR-AUC: 0.9985
              precision    recall  f1-score   support

           0      0.816     0.500     0.620        80
           1      0.956     0.990     0.973       881

    accuracy                          0.949       961
   macro avg      0.886     0.745     0.796       961
weighted avg      0.945     0.949     0.943       961

Suboptimal focused Confusion Matrix:
 [[ 40  40]
 [  9 872]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.2881 | PR-AUC: 0.9986
              precision    recall  f1-score   support

           0      0.845     0.613     0.710        80
           1      0.966     0.990     0.978       881

    accuracy                          0.958       961
   macro avg      0.905     0.801     0.844       961
weighted avg     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9775784753363229

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0159
AMBIENT_TEMPERATURE      : 0.0071
MODULE_TEMPERATURE       : 0.0038
TOTAL_YIELD_CLEAN        : 0.0000
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
DAILY_YIELD_CLEAN        : -0.0033

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.1232 | PR-AUC: 0.9988
              precision    recall  f1-score   support

           0      0.825     0.588     0.686        80
           1      0.963     0.989     0.976       881

    accuracy                          0.955       961
   macro avg      0.894     0.788     0.831       961
weighted avg      0.952     0.955     0.952       961

Suboptimal focused Confusion Matrix:
 [[ 47

PyALE._ALE_generic:INFO: Continuous feature detected.



==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.3718 | PR-AUC: 0.9987
              precision    recall  f1-score   support

           0      0.800     0.850     0.824        80
           1      0.986     0.981     0.983       881

    accuracy                          0.970       961
   macro avg      0.893     0.915     0.904       961
weighted avg      0.971     0.970     0.970       961

Suboptimal focused Confusion Matrix:
 [[ 68  12]
 [ 17 864]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.983494593056346

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0070
MODULE_TEMPERATURE       : 0.0023
AMBIENT_TEMPERATURE      : 0.0016
DAILY_YIELD_CLEAN        : 0.0010
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.1319 | PR-AUC: 0.9986
              precision    recall  f1-score   support

           0      0.944     0.425     0.586        80
           1      0.950     0.998     0.973       881

    accuracy                          0.950       961
   macro avg      0.947     0.711     0.780       961
weighted avg      0.950     0.950     0.941       961

Suboptimal focused Confusion Matrix:
 [[ 34  

PyALE._ALE_generic:INFO: Continuous feature detected.



==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.6174 | PR-AUC: 0.9987
              precision    recall  f1-score   support

           0      0.917     0.412     0.569        80
           1      0.949     0.997     0.972       881

    accuracy                          0.948       961
   macro avg      0.933     0.705     0.771       961
weighted avg      0.946     0.948     0.939       961

Suboptimal focused Confusion Matrix:
 [[ 33  47]
 [  3 878]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9723145071982281

=== Drop Column Importance for SVM (change of F1 score) ===
DAILY_YIELD_CLEAN        : 0.0011
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
IRRADIATION_CLEAN        : -0.0026
AMBIENT_TEMPERATURE      : -0.0054
MODULE_TEMPERATURE       : -0.0097

=== Operating Condition Counts ===
Number of Optimal (0):     355
Number of Suboptimal (1):  2840

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.1827 | PR-AUC: 0.9991
              precision    recall  f1-score   support

           0      0.848     0.838     0.843        80
           1      0.985     0.986     0.986       881

    accuracy                          0.974       961
   macro avg      0.917     0.912     0.914       961
weighted avg      0.974     0.974     0.974       961

Suboptimal focused Confusion Matrix:
 [[ 

PyALE._ALE_generic:INFO: Continuous feature detected.



==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.6182 | PR-AUC: 0.9990
              precision    recall  f1-score   support

           0      0.845     0.750     0.795        80
           1      0.978     0.988     0.982       881

    accuracy                          0.968       961
   macro avg      0.911     0.869     0.889       961
weighted avg      0.967     0.968     0.967       961

Suboptimal focused Confusion Matrix:
 [[ 60  20]
 [ 11 870]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9824957651044608

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0136
TOTAL_YIELD_CLEAN        : 0.0006
DAILY_YIELD_CLEAN        : 0.0001
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
AMBIENT_TEMPERATURE      : -0.0028

=== Operating Condition Counts ===
Number of Optimal (0):     244
Number of Suboptimal (1):  2111

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.1877 | PR-AUC: 0.9971
              precision    recall  f1-score   support

           0      0.900     0.675     0.771        80
           1      0.971     0.993     0.982       881

    accuracy                          0.967       961
   macro avg      0.936     0.834     0.877       961
weighted avg      0.965     0.967     0.965       961

Suboptimal focused Confusion Matrix:
 [[ 54

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9803921568627451

=== Drop Column Importance for SVM (change of F1 score) ===
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
IRRADIATION_CLEAN        : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
TOTAL_YIELD_CLEAN        : -0.0011
AMBIENT_TEMPERATURE      : -0.0017
MODULE_TEMPERATURE       : -0.0022
DAILY_YIELD_CLEAN        : -0.0038

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900


PyALE._ALE_generic:INFO: Continuous feature detected.



==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.2445 | PR-AUC: 0.9990
              precision    recall  f1-score   support

           0      0.844     0.812     0.828        80
           1      0.983     0.986     0.985       881

    accuracy                          0.972       961
   macro avg      0.914     0.899     0.906       961
weighted avg      0.971     0.972     0.972       961

Suboptimal focused Confusion Matrix:
 [[ 65  15]
 [ 12 869]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.3401 | PR-AUC: 0.9990
              precision    recall  f1-score   support

           0      0.838     0.838     0.838        80
           1      0.985     0.985     0.985       881

    accuracy                          0.973       961
   macro avg      0.911     0.911     0.911       961
weighted avg      0.973     0.973     0.973       961

Suboptimal focused Confusion Matrix:
 [[ 67  13]
 [ 13 868]]


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.985244040862656

=== Drop Column Importance for SVM (change of F1 score) ===
DAILY_YIELD_CLEAN        : 0.0077
IRRADIATION_CLEAN        : 0.0066
MODULE_TEMPERATURE       : 0.0041
AMBIENT_TEMPERATURE      : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900


PyALE._ALE_generic:INFO: Continuous feature detected.



==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.2370 | PR-AUC: 0.9992
              precision    recall  f1-score   support

           0      0.853     0.800     0.826        80
           1      0.982     0.988     0.985       881

    accuracy                          0.972       961
   macro avg      0.918     0.894     0.905       961
weighted avg      0.971     0.972     0.971       961

Suboptimal focused Confusion Matrix:
 [[ 64  16]
 [ 11 870]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.5498 | PR-AUC: 0.9992
              precision    recall  f1-score   support

           0      0.841     0.662     0.741        80
           1      0.970     0.989     0.979       881

    accuracy                          0.961       961
   macro avg      0.906     0.826     0.860       961
weighted avg      0.959     0.961     0.959       961

Suboptimal focused Confusion Matrix:
 [[ 53  27]
 [ 10 871]]


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9792017987633502

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0113
DAILY_YIELD_CLEAN        : 0.0017
TOTAL_YIELD_CLEAN        : 0.0011
AC_CLEAN                 : 0.0006
MODULE_TEMPERATURE       : 0.0006
DC_CLEAN                 : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
AMBIENT_TEMPERATURE      : -0.0027

=== Operating Condition Counts ===
Number of Optimal (0):     355
Number of Suboptimal (1):  2840

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.2668 | PR-AUC: 0.9984
              precision    recall  f1-score   support

           0      0.860     0.613     0.715        80
           1      0.966     0.991     0.978       881

    accuracy                          0.959       961
   macro avg      0.913     0.802     0.847       961
weighted avg      0.957     0.959     0.956       961

Suboptimal focused Confusion Matrix:
 [[ 49

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9803038829487901

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0118
DAILY_YIELD_CLEAN        : 0.0011
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
TOTAL_YIELD_CLEAN        : -0.0011
MODULE_TEMPERATURE       : -0.0011
AMBIENT_TEMPERATURE      : -0.0015

=== Operating Condition Counts ===
Number of Optimal (0):     244
Number of Suboptimal (1):  2111

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.1865 | PR-AUC: 0.9980
              precision    recall  f1-score   support

           0      0.904     0.588     0.712        80
           1      0.964     0.994     0.979       881

    accuracy                          0.960       961
   macro avg      0.934     0.791     0.845       961
weighted avg      0.959     0.960     0.957       961

Suboptimal focused Confusion Matrix:
 [[ 

PyALE._ALE_generic:INFO: Continuous feature detected.



==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.3896 | PR-AUC: 0.9983
              precision    recall  f1-score   support

           0      0.911     0.637     0.750        80
           1      0.968     0.994     0.981       881

    accuracy                          0.965       961
   macro avg      0.939     0.816     0.865       961
weighted avg      0.963     0.965     0.962       961

Suboptimal focused Confusion Matrix:
 [[ 51  29]
 [  5 876]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display



=== Baseline F1 Score of SVM ===
0.9809630459126539

=== Drop Column Importance for SVM (change of F1 score) ===
MODULE_TEMPERATURE       : 0.0011
DAILY_YIELD_CLEAN        : 0.0005
IRRADIATION_CLEAN        : 0.0005
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
AMBIENT_TEMPERATURE      : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
AC_CLEAN                 : -0.0005


  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0724 | PR-AUC: 0.9979
              precision    recall  f1-score   support

           0      0.875     0.263     0.404        80
           1      0.937     0.997     0.966       881

    accuracy                          0.935       961
   macro avg      0.906     0.630     0.685       961
weighted avg      0.932     0.935     0.919       961

Suboptimal focused Confusion Matrix:
 [[ 21  59]
 [  3 878]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.7161 | PR-AUC: 0.9979
              precision    recall  f1-score   support

           0      0.821     0.287     0.426        80
           1      0.939     0.994     0.966       881

    accuracy                          0.935       961
   macro avg      0.880     0.641     0.696       961
weighted avg     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9658213891951488

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0006
MODULE_TEMPERATURE       : 0.0005
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
AMBIENT_TEMPERATURE      : -0.0001
TOTAL_YIELD_CLEAN        : -0.0006
DAILY_YIELD_CLEAN        : -0.0161

=== Operating Condition Counts ===
Number of Optimal (0):     355
Number of Suboptimal (1):  2840

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.2418 | PR-AUC: 0.9987
              precision    recall  f1-score   support

           0      0.852     0.575     0.687        80
           1      0.963     0.991     0.977       881

    accuracy                          0.956       961
   macro avg      0.907     0.783     0.832       961
weighted avg      0.953     0.956     0.952       961

Suboptimal focused Confusion Matrix:
 [[ 

PyALE._ALE_generic:INFO: Continuous feature detected.



==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.3905 | PR-AUC: 0.9988
              precision    recall  f1-score   support

           0      0.836     0.637     0.723        80
           1      0.968     0.989     0.978       881

    accuracy                          0.959       961
   macro avg      0.902     0.813     0.851       961
weighted avg      0.957     0.959     0.957       961

Suboptimal focused Confusion Matrix:
 [[ 51  29]
 [ 10 871]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9781021897810219

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0128
MODULE_TEMPERATURE       : 0.0005
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
DAILY_YIELD_CLEAN        : -0.0011
TOTAL_YIELD_CLEAN        : -0.0022
AMBIENT_TEMPERATURE      : -0.0056

=== Operating Condition Counts ===
Number of Optimal (0):     355
Number of Suboptimal (1):  2840

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.1572 | PR-AUC: 0.9983
              precision    recall  f1-score   support

           0      0.804     0.463     0.587        80
           1      0.953     0.990     0.971       881

    accuracy                          0.946       961
   macro avg      0.879     0.726     0.779       961
weighted avg      0.941     0.946     0.939       961



PyALE._ALE_generic:INFO: Continuous feature detected.


Suboptimal focused Confusion Matrix:
 [[ 37  43]
 [  9 872]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.6838 | PR-AUC: 0.9981
              precision    recall  f1-score   support

           0      0.844     0.338     0.482        80
           1      0.943     0.994     0.968       881

    accuracy                          0.940       961
   macro avg      0.893     0.666     0.725       961
weighted avg      0.935     0.940     0.928       961

Suboptimal focused Confusion Matrix:
 [[ 27  53]
 [  5 876]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9679558011049724

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0058
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
MODULE_TEMPERATURE       : -0.0005
TOTAL_YIELD_CLEAN        : -0.0036
AMBIENT_TEMPERATURE      : -0.0048
DAILY_YIELD_CLEAN        : -0.0134

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900


PyALE._ALE_generic:INFO: Continuous feature detected.



==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.1783 | PR-AUC: 0.9981
              precision    recall  f1-score   support

           0      0.864     0.475     0.613        80
           1      0.954     0.993     0.973       881

    accuracy                          0.950       961
   macro avg      0.909     0.734     0.793       961
weighted avg      0.947     0.950     0.943       961

Suboptimal focused Confusion Matrix:
 [[ 38  42]
 [  6 875]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.5350 | PR-AUC: 0.9983
              precision    recall  f1-score   support

           0      0.854     0.438     0.579        80
           1      0.951     0.993     0.972       881

    accuracy                          0.947       961
   macro avg      0.902     0.715     0.775       961
weighted avg      0.943     0.947     0.939       961

Suboptimal focused Confusion Matrix:
 [[ 35  45]
 [  6 875]]


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.971682398667407

=== Drop Column Importance for SVM (change of F1 score) ===
TOTAL_YIELD_CLEAN        : 0.0016
IRRADIATION_CLEAN        : 0.0014
AC_CLEAN                 : 0.0005
DC_CLEAN                 : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
AMBIENT_TEMPERATURE      : -0.0005
MODULE_TEMPERATURE       : -0.0011
DAILY_YIELD_CLEAN        : -0.0037

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.2444 | PR-AUC: 0.9991
              precision    recall  f1-score   support

           0      0.863     0.863     0.863        80
           1      0.988     0.988     0.988       881

    accuracy                          0.977       961
   macro avg      0.925     0.925     0.925       961
weighted avg      0.977     0.977     0.977       961

Suboptimal focused Confusion Matrix:
 [[ 6

PyALE._ALE_generic:INFO: Continuous feature detected.



==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.3659 | PR-AUC: 0.9991
              precision    recall  f1-score   support

           0      0.840     0.850     0.845        80
           1      0.986     0.985     0.986       881

    accuracy                          0.974       961
   macro avg      0.913     0.918     0.915       961
weighted avg      0.974     0.974     0.974       961

Suboptimal focused Confusion Matrix:
 [[ 68  12]
 [ 13 868]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9858035207268597

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0093
DAILY_YIELD_CLEAN        : 0.0034
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
AMBIENT_TEMPERATURE      : -0.0000

=== Operating Condition Counts ===
Number of Optimal (0):     244
Number of Suboptimal (1):  2111


PyALE._ALE_generic:INFO: Continuous feature detected.



==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.2447 | PR-AUC: 0.9990
              precision    recall  f1-score   support

           0      0.915     0.675     0.777        80
           1      0.971     0.994     0.983       881

    accuracy                          0.968       961
   macro avg      0.943     0.835     0.880       961
weighted avg      0.967     0.968     0.965       961

Suboptimal focused Confusion Matrix:
 [[ 54  26]
 [  5 876]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.3829 | PR-AUC: 0.9991
              precision    recall  f1-score   support

           0      0.915     0.675     0.777        80
           1      0.971     0.994     0.983       881

    accuracy                          0.968       961
   macro avg      0.943     0.835     0.880       961
weighted avg      0.967     0.968     0.965       961

Suboptimal focused Confusion Matrix:
 [[ 54  26]
 [  5 876]]


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9826135726303982

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0017
MODULE_TEMPERATURE       : 0.0006
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DAILY_YIELD_CLEAN        : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
TOTAL_YIELD_CLEAN        : -0.0006
AMBIENT_TEMPERATURE      : -0.0061

=== Operating Condition Counts ===
Number of Optimal (0):     355
Number of Suboptimal (1):  2840

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.2443 | PR-AUC: 0.9985
              precision    recall  f1-score   support

           0      0.849     0.562     0.677        80
           1      0.961     0.991     0.976       881

    accuracy                          0.955       961
   macro avg      0.905     0.777     0.826       961
weighted avg      0.952     0.955     0.951       961

Suboptimal focused Confusion Matrix:
 [[ 4

PyALE._ALE_generic:INFO: Continuous feature detected.



==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.3738 | PR-AUC: 0.9986
              precision    recall  f1-score   support

           0      0.845     0.613     0.710        80
           1      0.966     0.990     0.978       881

    accuracy                          0.958       961
   macro avg      0.905     0.801     0.844       961
weighted avg      0.956     0.958     0.955       961

Suboptimal focused Confusion Matrix:
 [[ 49  31]
 [  9 872]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9775784753363229

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0038
DAILY_YIELD_CLEAN        : 0.0027
DC/IRRA                  : 0.0000
AC_CLEAN                 : -0.0005
DC_CLEAN                 : -0.0005
AC/IRRA                  : -0.0005
TOTAL_YIELD_CLEAN        : -0.0011
MODULE_TEMPERATURE       : -0.0016
AMBIENT_TEMPERATURE      : -0.0033

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900


PyALE._ALE_generic:INFO: Continuous feature detected.



==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.0770 | PR-AUC: 0.9988
              precision    recall  f1-score   support

           0      0.846     0.412     0.555        80
           1      0.949     0.993     0.971       881

    accuracy                          0.945       961
   macro avg      0.898     0.703     0.763       961
weighted avg      0.940     0.945     0.936       961

Suboptimal focused Confusion Matrix:
 [[ 33  47]
 [  6 875]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.7990 | PR-AUC: 0.9986
              precision    recall  f1-score   support

           0      0.824     0.350     0.491        80
           1      0.944     0.993     0.968       881

    accuracy                          0.940       961
   macro avg      0.884     0.672     0.730       961
weighted avg      0.934     0.940     0.928       961

Suboptimal focused Confusion Matrix:
 [[ 28  52]
 [  6 875]]


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9679203539823009

=== Drop Column Importance for SVM (change of F1 score) ===
TOTAL_YIELD_CLEAN        : 0.0011
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
MODULE_TEMPERATURE       : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
AMBIENT_TEMPERATURE      : -0.0016
IRRADIATION_CLEAN        : -0.0021
DAILY_YIELD_CLEAN        : -0.0156

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900


PyALE._ALE_generic:INFO: Continuous feature detected.



==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.1899 | PR-AUC: 0.9990
              precision    recall  f1-score   support

           0      0.853     0.725     0.784        80
           1      0.975     0.989     0.982       881

    accuracy                          0.967       961
   macro avg      0.914     0.857     0.883       961
weighted avg      0.965     0.967     0.965       961

Suboptimal focused Confusion Matrix:
 [[ 58  22]
 [ 10 871]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.5488 | PR-AUC: 0.9990
              precision    recall  f1-score   support

           0      0.846     0.688     0.759        80
           1      0.972     0.989     0.980       881

    accuracy                          0.964       961
   macro avg      0.909     0.838     0.869       961
weighted avg      0.962     0.964     0.962       961

Suboptimal focused Confusion Matrix:
 [[ 55  25]
 [ 10 871]]


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9803038829487901

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0027
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
AMBIENT_TEMPERATURE      : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
MODULE_TEMPERATURE       : -0.0032
DAILY_YIELD_CLEAN        : -0.0038

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900


PyALE._ALE_generic:INFO: Continuous feature detected.



==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.2564 | PR-AUC: 0.9989
              precision    recall  f1-score   support

           0      0.847     0.762     0.803        80
           1      0.979     0.988     0.983       881

    accuracy                          0.969       961
   macro avg      0.913     0.875     0.893       961
weighted avg      0.968     0.969     0.968       961

Suboptimal focused Confusion Matrix:
 [[ 61  19]
 [ 11 870]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.3503 | PR-AUC: 0.9989
              precision    recall  f1-score   support

           0      0.824     0.762     0.792        80
           1      0.979     0.985     0.982       881

    accuracy                          0.967       961
   macro avg      0.901     0.874     0.887       961
weighted avg      0.966     0.967     0.966       961

Suboptimal focused Confusion Matrix:
 [[ 61  19]
 [ 13 868]]


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9819004524886877

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0065
MODULE_TEMPERATURE       : 0.0032
AMBIENT_TEMPERATURE      : 0.0016
DAILY_YIELD_CLEAN        : 0.0006
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
TOTAL_YIELD_CLEAN        : -0.0006

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.2924 | PR-AUC: 0.9989


PyALE._ALE_generic:INFO: Continuous feature detected.


              precision    recall  f1-score   support

           0      0.835     0.825     0.830        80
           1      0.984     0.985     0.985       881

    accuracy                          0.972       961
   macro avg      0.910     0.905     0.907       961
weighted avg      0.972     0.972     0.972       961

Suboptimal focused Confusion Matrix:
 [[ 66  14]
 [ 13 868]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.3036 | PR-AUC: 0.9988
              precision    recall  f1-score   support

           0      0.840     0.850     0.845        80
           1      0.986     0.985     0.986       881

    accuracy                          0.974       961
   macro avg      0.913     0.918     0.915       961
weighted avg      0.974     0.974     0.974       961

Suboptimal focused Confusion Matrix:
 [[ 68  12]
 [ 13 868]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9858035207268597

=== Drop Column Importance for SVM (change of F1 score) ===
AMBIENT_TEMPERATURE      : 0.0061
MODULE_TEMPERATURE       : 0.0050
IRRADIATION_CLEAN        : 0.0039
DAILY_YIELD_CLEAN        : 0.0012
AC_CLEAN                 : 0.0000
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000

=== Operating Condition Counts ===
Number of Optimal (0):     244
Number of Suboptimal (1):  2111

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.2258 | PR-AUC: 0.9966
              precision    recall  f1-score   support

           0      0.881     0.650     0.748        80
           1      0.969     0.992     0.980       881

    accuracy                          0.964       961
   macro avg      0.925     0.821     0.864       961
weighted avg      0.962     0.964     0.961       961

Suboptimal focused Confusion Matrix:
 [[ 52 

PyALE._ALE_generic:INFO: Continuous feature detected.


              precision    recall  f1-score   support

           0      0.889     0.500     0.640        80
           1      0.956     0.994     0.975       881

    accuracy                          0.953       961
   macro avg      0.923     0.747     0.807       961
weighted avg      0.951     0.953     0.947       961

Suboptimal focused Confusion Matrix:
 [[ 40  40]
 [  5 876]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9749582637729549

=== Drop Column Importance for SVM (change of F1 score) ===
DC_CLEAN                 : 0.0000
DC/IRRA                  : 0.0000
AC/IRRA                  : 0.0000
AC_CLEAN                 : -0.0005
IRRADIATION_CLEAN        : -0.0005
DAILY_YIELD_CLEAN        : -0.0038
MODULE_TEMPERATURE       : -0.0049
TOTAL_YIELD_CLEAN        : -0.0060
AMBIENT_TEMPERATURE      : -0.0068

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.2443 | PR-AUC: 0.9990
              precision    recall  f1-score   support

           0      0.844     0.675     0.750        80
           1      0.971     0.989     0.980       881

    accuracy                          0.963       961
   macro avg      0.907     0.832     0.865       961
weighted avg      0.960     0.963     0.961       961

Suboptimal focused Confusion Matrix:
 

PyALE._ALE_generic:INFO: Continuous feature detected.



==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.3376 | PR-AUC: 0.9989
              precision    recall  f1-score   support

           0      0.844     0.675     0.750        80
           1      0.971     0.989     0.980       881

    accuracy                          0.963       961
   macro avg      0.907     0.832     0.865       961
weighted avg      0.960     0.963     0.961       961

Suboptimal focused Confusion Matrix:
 [[ 54  26]
 [ 10 871]]

=== ALE for SVM ===


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9797525309336333

=== Drop Column Importance for SVM (change of F1 score) ===
AMBIENT_TEMPERATURE      : 0.0133
IRRADIATION_CLEAN        : 0.0102
MODULE_TEMPERATURE       : 0.0075
AC_CLEAN                 : 0.0006
DC/IRRA                  : 0.0006
DC_CLEAN                 : 0.0000
TOTAL_YIELD_CLEAN        : 0.0000
AC/IRRA                  : 0.0000
DAILY_YIELD_CLEAN        : -0.0033


  plt.show()  # prevents display
  plt.show()


In [253]:
# Mean importance across all inverters for Plant 1
drop2 = drop + ['DAILY_YIELD_CLEAN', 'AMBIENT_TEMPERATURE','MODULE_TEMPERATURE','AC_CLEAN', 'TOTAL_YIELD_CLEAN','DC/IRRA'] 
all_imp2 = []
for sk in source_key_2:
    out = run_classification_on_df_importance(df_ps2, sk, drop_col = drop2)

    # store as row, not as dict of dicts
    s = []
    s = pd.Series(out["svm_importance"], name=sk)
    all_imp2.append(s)

importance_df2 = pd.DataFrame(all_imp2)
importance_df2.describe().T.sort_values(by="mean", ascending=False)

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
IRRADIATION_CLEAN,22.0,0.009008,0.007092,-0.002194,0.003972,0.009278,0.012263,0.025177
AC/IRRA,22.0,0.000725,0.002347,-0.003359,-0.000441,0.000841,0.00279,0.003934
DC_CLEAN,22.0,0.000646,0.001979,-0.003349,-0.000407,0.000561,0.001582,0.003934


In [254]:
# Plant 2 models after feature selection  
for sk in source_key_2:
    run_classification_on_df_inv(df_ps2[sk], drop_col = drop2)

PyALE._ALE_generic:INFO: Continuous feature detected.



=== Operating Condition Counts ===
Number of Optimal (0):     355
Number of Suboptimal (1):  2840

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.3917 | PR-AUC: 0.9992
              precision    recall  f1-score   support

           0      0.850     0.850     0.850        80
           1      0.986     0.986     0.986       881

    accuracy                          0.975       961
   macro avg      0.918     0.918     0.918       961
weighted avg      0.975     0.975     0.975       961

Suboptimal focused Confusion Matrix:
 [[ 68  12]
 [ 12 869]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.1932 | PR-AUC: 0.9993
              precision    recall  f1-score   support

           0      0.862     0.938     0.898        80
           1      0.994     0.986     0.990       881

    accuracy                          0.982       961
   macro avg      0.928     0.962     0.944       961
weighted avg     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9903133903133903

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0099
DC_CLEAN                 : 0.0039
AC/IRRA                  : 0.0039

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.3615 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      0.867     0.975     0.918        80
           1      0.998     0.986     0.992       881

    accuracy                          0.985       961
   macro avg      0.932     0.981     0.955       961
weighted avg      0.987     0.985     0.986       961

Suboptimal focused Confusion Matrix:
 [[ 78   2]
 [ 12 869]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.2272 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9920091324200914

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0011
DC_CLEAN                 : 0.0000
AC/IRRA                  : 0.0000

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.4041 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      0.873     0.863     0.868        80
           1      0.988     0.989     0.988       881

    accuracy                          0.978       961
   macro avg      0.930     0.926     0.928       961
weighted avg      0.978     0.978     0.978       961

Suboptimal focused Confusion Matrix:
 [[ 69  11]
 [ 10 871]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.1662 | PR-AUC: 0.9995
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9908883826879271

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0252
AC/IRRA                  : 0.0039
DC_CLEAN                 : 0.0028

=== Operating Condition Counts ===
Number of Optimal (0):     355
Number of Suboptimal (1):  2840

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.3262 | PR-AUC: 0.9993
              precision    recall  f1-score   support

           0      0.855     0.887     0.871        80
           1      0.990     0.986     0.988       881

    accuracy                          0.978       961
   macro avg      0.923     0.937     0.930       961
weighted avg      0.979     0.978     0.978       961

Suboptimal focused Confusion Matrix:
 [[ 71   9]
 [ 12 869]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.3636 | PR-AUC: 0.9992
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9841628959276018

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0124
DC_CLEAN                 : -0.0006
AC/IRRA                  : -0.0033

=== Operating Condition Counts ===
Number of Optimal (0):     244
Number of Suboptimal (1):  2111

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.3700 | PR-AUC: 0.9987
              precision    recall  f1-score   support

           0      0.838     0.838     0.838        80
           1      0.985     0.985     0.985       881

    accuracy                          0.973       961
   macro avg      0.911     0.911     0.911       961
weighted avg      0.973     0.973     0.973       961

Suboptimal focused Confusion Matrix:
 [[ 67  13]
 [ 13 868]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.2430 | PR-AUC: 0.9988
              precision    recall  f1-score   support

           0    

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9852607709750567

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0006
AC/IRRA                  : 0.0000
DC_CLEAN                 : -0.0023

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.3837 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      0.864     0.950     0.905        80
           1      0.995     0.986     0.991       881

    accuracy                          0.983       961
   macro avg      0.930     0.968     0.948       961
weighted avg      0.984     0.983     0.984       961

Suboptimal focused Confusion Matrix:
 [[ 76   4]
 [ 12 869]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.2051 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9914334665905197

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0056
DC_CLEAN                 : 0.0006
AC/IRRA                  : 0.0006

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.3992 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      0.871     0.925     0.897        80
           1      0.993     0.988     0.990       881

    accuracy                          0.982       961
   macro avg      0.932     0.956     0.944       961
weighted avg      0.983     0.982     0.983       961

Suboptimal focused Confusion Matrix:
 [[ 74   6]
 [ 11 870]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.1784 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.992

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0196
AC/IRRA                  : 0.0017
DC_CLEAN                 : 0.0006

=== Operating Condition Counts ===
Number of Optimal (0):     355
Number of Suboptimal (1):  2840

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.3913 | PR-AUC: 0.9991
              precision    recall  f1-score   support

           0      0.870     0.838     0.854        80
           1      0.985     0.989     0.987       881

    accuracy                          0.976       961
   macro avg      0.928     0.913     0.920       961
weighted avg      0.976     0.976     0.976       961

Suboptimal focused Confusion Matrix:
 [[ 67  13]
 [ 10 871]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.1958 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      0.864     0.8

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9880749574105622

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0093
DC_CLEAN                 : 0.0011
AC/IRRA                  : 0.0011

=== Operating Condition Counts ===
Number of Optimal (0):     244
Number of Suboptimal (1):  2111

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.3446 | PR-AUC: 0.9988
              precision    recall  f1-score   support

           0      0.877     0.800     0.837        80
           1      0.982     0.990     0.986       881

    accuracy                          0.974       961
   macro avg      0.929     0.895     0.911       961
weighted avg      0.973     0.974     0.973       961

Suboptimal focused Confusion Matrix:
 [[ 64  16]
 [  9 872]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.3741 | PR-AUC: 0.9992
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9853768278965129

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0034
DC_CLEAN                 : 0.0001
AC/IRRA                  : -0.0017

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.4394 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      0.849     0.988     0.913        80
           1      0.999     0.984     0.991       881

    accuracy                          0.984       961
   macro avg      0.924     0.986     0.952       961
weighted avg      0.986     0.984     0.985       961

Suboptimal focused Confusion Matrix:
 [[ 79   1]
 [ 14 867]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.1786 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.992

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0090
AC/IRRA                  : 0.0035
DC_CLEAN                 : 0.0006

=== Operating Condition Counts ===
Number of Optimal (0):     355
Number of Suboptimal (1):  2840

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.4064 | PR-AUC: 0.9993
              precision    recall  f1-score   support

           0      0.859     0.838     0.848        80
           1      0.985     0.988     0.986       881

    accuracy                          0.975       961
   macro avg      0.922     0.913     0.917       961
weighted avg      0.975     0.975     0.975       961

Suboptimal focused Confusion Matrix:
 [[ 67  13]
 [ 11 870]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.1858 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      0.871     0.9

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9903244166192373

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0143
DC_CLEAN                 : 0.0039
AC/IRRA                  : 0.0034

=== Operating Condition Counts ===
Number of Optimal (0):     355
Number of Suboptimal (1):  2840

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.3993 | PR-AUC: 0.9993
              precision    recall  f1-score   support

           0      0.846     0.963     0.901        80
           1      0.997     0.984     0.990       881

    accuracy                          0.982       961
   macro avg      0.921     0.973     0.945       961
weighted avg      0.984     0.982     0.983       961

Suboptimal focused Confusion Matrix:
 [[ 77   3]
 [ 14 867]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.2891 | PR-AUC: 0.9993
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9880613985218875

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0118
DC_CLEAN                 : 0.0011
AC/IRRA                  : -0.0022

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.5560 | PR-AUC: 0.9991
              precision    recall  f1-score   support

           0      0.743     0.938     0.829        80
           1      0.994     0.970     0.982       881

    accuracy                          0.968       961
   macro avg      0.868     0.954     0.905       961
weighted avg      0.973     0.968     0.969       961

Suboptimal focused Confusion Matrix:
 [[ 75   5]
 [ 26 855]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.0880 | PR-AUC: 0.9993
              precision    recall  f1-score   support

           0     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9862857142857143

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0092
AC/IRRA                  : 0.0011
DC_CLEAN                 : -0.0006

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.4030 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      0.854     0.950     0.899        80
           1      0.995     0.985     0.990       881

    accuracy                          0.982       961
   macro avg      0.925     0.968     0.945       961
weighted avg      0.984     0.982     0.983       961

Suboptimal focused Confusion Matrix:
 [[ 76   4]
 [ 13 868]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.2547 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9880749574105622

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0061
AC/IRRA                  : -0.0028
DC_CLEAN                 : -0.0033

=== Operating Condition Counts ===
Number of Optimal (0):     244
Number of Suboptimal (1):  2111

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.2032 | PR-AUC: 0.9993
              precision    recall  f1-score   support

           0      0.919     0.713     0.803        80
           1      0.974     0.994     0.984       881

    accuracy                          0.971       961
   macro avg      0.947     0.853     0.894       961
weighted avg      0.970     0.971     0.969       961

Suboptimal focused Confusion Matrix:
 [[ 57  23]
 [  5 876]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.3929 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0    

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.988155668358714

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0067
AC/IRRA                  : 0.0028
DC_CLEAN                 : 0.0017

=== Operating Condition Counts ===
Number of Optimal (0):     355
Number of Suboptimal (1):  2840

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.3365 | PR-AUC: 0.9991
              precision    recall  f1-score   support

           0      0.885     0.675     0.766        80
           1      0.971     0.992     0.981       881

    accuracy                          0.966       961
   macro avg      0.928     0.834     0.874       961
weighted avg      0.964     0.966     0.964       961

Suboptimal focused Confusion Matrix:
 [[ 54  26]
 [  7 874]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.2082 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      0

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9880884855360181

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0148
DC_CLEAN                 : 0.0033
AC/IRRA                  : 0.0028

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.4262 | PR-AUC: 0.9993
              precision    recall  f1-score   support

           0      0.856     0.963     0.906        80
           1      0.997     0.985     0.991       881

    accuracy                          0.983       961
   macro avg      0.926     0.974     0.948       961
weighted avg      0.985     0.983     0.984       961

Suboptimal focused Confusion Matrix:
 [[ 77   3]
 [ 13 868]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.1361 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.992

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0095
AC/IRRA                  : 0.0034
DC_CLEAN                 : 0.0029

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.3869 | PR-AUC: 0.9995
              precision    recall  f1-score   support

           0      0.862     0.938     0.898        80
           1      0.994     0.986     0.990       881

    accuracy                          0.982       961
   macro avg      0.928     0.962     0.944       961
weighted avg      0.983     0.982     0.983       961

Suboptimal focused Confusion Matrix:
 [[ 75   5]
 [ 12 869]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.1918 | PR-AUC: 0.9995
              precision    recall  f1-score   support

           0      0.857     0.9

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()



=== Baseline F1 Score of SVM ===
0.9914334665905197

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0100
DC_CLEAN                 : 0.0011
AC/IRRA                  : 0.0011

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900


PyALE._ALE_generic:INFO: Continuous feature detected.



==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.4010 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      0.864     0.950     0.905        80
           1      0.995     0.986     0.991       881

    accuracy                          0.983       961
   macro avg      0.930     0.968     0.948       961
weighted avg      0.984     0.983     0.984       961

Suboptimal focused Confusion Matrix:
 [[ 76   4]
 [ 12 869]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.1875 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      0.849     0.988     0.913        80
           1      0.999     0.984     0.991       881

    accuracy                          0.984       961
   macro avg      0.924     0.986     0.952       961
weighted avg      0.986     0.984     0.985       961

Suboptimal focused Confusion Matrix:
 [[ 79   1]
 [ 14 867]]


  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9914236706689536

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0208
DC_CLEAN                 : 0.0005
AC/IRRA                  : -0.0006

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.4017 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      0.867     0.975     0.918        80
           1      0.998     0.986     0.992       881

    accuracy                          0.985       961
   macro avg      0.932     0.981     0.955       961
weighted avg      0.987     0.985     0.986       961

Suboptimal focused Confusion Matrix:
 [[ 78   2]
 [ 12 869]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.2612 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9880749574105622

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : -0.0022
DC_CLEAN                 : -0.0028
AC/IRRA                  : -0.0034

=== Operating Condition Counts ===
Number of Optimal (0):     244
Number of Suboptimal (1):  2111

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.3476 | PR-AUC: 0.9982
              precision    recall  f1-score   support

           0      0.855     0.812     0.833        80
           1      0.983     0.988     0.985       881

    accuracy                          0.973       961
   macro avg      0.919     0.900     0.909       961
weighted avg      0.972     0.973     0.973       961

Suboptimal focused Confusion Matrix:
 [[ 65  15]
 [ 11 870]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.2529 | PR-AUC: 0.9984
              precision    recall  f1-score   support

           0   

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
  plt.show()
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.9852774631936579

=== Drop Column Importance for SVM (change of F1 score) ===
AC/IRRA                  : 0.0006
IRRADIATION_CLEAN        : 0.0000
DC_CLEAN                 : -0.0005

=== Operating Condition Counts ===
Number of Optimal (0):     359
Number of Suboptimal (1):  2900

==== LogReg - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: 0.3896 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0      0.867     0.975     0.918        80
           1      0.998     0.986     0.992       881

    accuracy                          0.985       961
   macro avg      0.932     0.981     0.955       961
weighted avg      0.987     0.985     0.986       961

Suboptimal focused Confusion Matrix:
 [[ 78   2]
 [ 12 869]]

==== LinearSVM - max suboptimal f1 score | Full Test ====
Suboptimal focused Threshold: -0.1853 | PR-AUC: 0.9994
              precision    recall  f1-score   support

           0     

  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.
  plt.show()  # prevents display
PyALE._ALE_generic:INFO: Continuous feature detected.



=== Baseline F1 Score of SVM ===
0.992

=== Drop Column Importance for SVM (change of F1 score) ===
IRRADIATION_CLEAN        : 0.0011
DC_CLEAN                 : 0.0000
AC/IRRA                  : -0.0000


  plt.show()  # prevents display
  plt.show()


### Experiments: With/Without Outliers, Before/After Feature Selection


### Summary
All models have poorer performance when trained without feature scaling because it prevents large features (such as AC and DC) from dominating the loss function while small feature (such as irradiation) is ignored.

Feature selection was performed using the drop-column method. Each feature was removed in turn, and the model was retrained to measure the change in performance. A positive importance value indicates that the feature contributes to model performance. A value of zero suggests the feature is redundant, and a negative value indicates the feature reduces model performance. Features were removed until all remaining features had positive importance values.


For Plant 1, the retained features are daily yield, ambient temperature, and irradiation.


For Plant 2, the retained features are DC power, irradiation, and the ratio of AC power to irradiation (AC/IRRA).

The effect of each feature on model predictions was calculated using accumulated local effects (ALE), a method that remains reliable even when features are correlated. For the LinearSVC model, more negative prediction values correspond to a higher probability of the inverter being in an optimal state. More positive values indicate a higher probability of being suboptimal. For Plant 1, the ALE plots show that increases in daily yield and irradiation lead to more negative prediction values, indicating a higher likelihood of optimal performance. In contrast, higher ambient temperatures raise the prediction value, which signals a greater probability of suboptimal performance. In summary, maintaining optimal operation in Plant 1 requires lower ambient temperatures and sufficient irradiation so that both irradiation and daily yield can increase. For Plant 2, higher irradiation and higher DC output both move the inverter toward an optimal state. An increase in the AC-to-irradiation ratio, however, shifts the inverter toward suboptimal performance. To support optimal operation in Plant 2, locating the plant in an area with strong and consistent sunlight is beneficial.

Data Quality Issues:


A key limitation was the uneven distribution between optimal and suboptimal operating states. Optimal events are relatively rare, leading the model to bias predictions toward the majority class. Although threshold tuning and class weighting help, the imbalance fundamentally limits the modelâ€™s ability to learn subtle patterns associated with rare operational faults. 


Model Assumptions and Limitations:


LinearSVC assumes that the two classes can be separated by a linear decision boundary in feature space. In reality, inverter performance is influenced by complex relationships such as: non-linear efficiency curves,
interaction between temperature and irradiation, operational hysteresis effects.These interactions are difficult for a linear model to capture, limiting predictive accuracy.


Challenges:


It is difficult to identity the outliers in the data that affected the classification model performance, eventhough outliers are removed based on linear regression model. 



Data Collection and Quality Improvements:


Increase representative coverage of minority class. my modelâ€™s difficulty in detecting optimal or suboptimal states (depending on imbalance structure) often comes from limited examples of optimal class, which can be improved by: collecting more data for optimal operating conditions .


Alternative Modelling Approaches to Improve Prediction:


Even if LinearSVC performs reasonably, consider alternative models that can capture non-linearities or provide complementary insights. Non-linear models such as RBF-SVM, XGBoost and Logistic Regression with engineered polynomial features, are able to capture non-linear feature interactions


Real world application:


One application will be applying the results of the model to find out the factors causing the inverters to be suboptimal so that corresponding measures can be implemented to prevent inverters from being suboptimal. 