# Binary Classification Metrics for OGLE Data

This notebook demonstrates how to use the reusable functions from `scripts/binary_metrics.py` to calculate and tabulate binary classification metrics (detached `det` vs. overcontact `over`) for the OGLE catalog predictions.

In [7]:
import sys

import pandas as pd
from IPython.display import display

# Make the shared scripts directory importable
sys.path.append('../scripts')
import binary_metrics
from binary_metrics import print_metrics_table

# Sanity checks: confirm which copy of the module was imported and inspect its label mapping
print('binary_metrics.py location:', binary_metrics.__file__)
print('map_binary_labels mapping:', binary_metrics.map_binary_labels.__code__.co_consts)

# Load OGLE classification data
ogle_df = pd.read_csv('../data/classification_OGLE.csv')
ogle_df.columns = ogle_df.columns.str.strip()  # Strip stray whitespace from column names
display(ogle_df.head())

binary_metrics.py location: /Users/wera/Max_astro/Slovakia/EBML_test/EBML/notebooks/../scripts/binary_metrics.py
map_binary_labels mapping: ("\nMap binary class labels to 0/1: 0 = det, 1 = over.\nAccepts: 'det', 'DET', 0 -> 0; 'over', 'OVER', 1 -> 1\n", 0, 1, ('det', 'DET', 0, 'over', 'OVER', 1))


| | Name | Gaia | orig_ogle_class | binary_I_Res | spot_I_Res | binary_I_ViT | spot_I_ViT | binary_gaia_Res | spot_Gaia_Res | binary_gaia_ViT | spot_Gaia_ViT |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | OGLE-BLG-ECL-002011 | 6028823779367951744 | det | det | s | det | s | det | n | det | n |
| 1 | OGLE-BLG-ECL-004840 | 4107331719835398656 | det | det | n | det | s | det | s | det | s |
| 2 | OGLE-BLG-ECL-005098 | 4059230147580348160 | det | det | s | det | s | det | s | det | s |
| 3 | OGLE-BLG-ECL-005728 | 4107530701320089728 | over | over | s | over | n | over | n | over | n |
| 4 | OGLE-BLG-ECL-010040 | 4109951241823489408 | det | det | n | over | n | det | n | over | n |
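
The docstring printed above describes the label convention (0 = det, 1 = over). As a minimal sketch of that convention, the function below maps the accepted spellings to 0/1. Note that `map_label` is a hypothetical stand-in written for illustration; it is not the actual code of `binary_metrics.map_binary_labels`, whose implementation is not shown here. The sample values are the `orig_ogle_class` entries from the five preview rows.

```python
def map_label(value):
    """Hypothetical stand-in for the mapping described in the docstring:
    'det', 'DET', 0 -> 0 and 'over', 'OVER', 1 -> 1."""
    if value in ('det', 'DET', 0):
        return 0
    if value in ('over', 'OVER', 1):
        return 1
    # Assumption: unknown labels are rejected rather than silently mapped
    raise ValueError(f"Unrecognized label: {value!r}")


# orig_ogle_class values from the five preview rows
preview_labels = ['det', 'det', 'det', 'over', 'det']
mapped = [map_label(v) for v in preview_labels]
print(mapped)  # [0, 0, 0, 1, 0]
```

Raising on unknown labels is a design choice of this sketch; the real function may handle them differently.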


In [8]:
# Define the prediction columns to evaluate and the ground-truth label column
# Adjust these column names to match your OGLE DataFrame
pred_cols = [
    'binary_I_Res', 'binary_I_ViT', 'binary_gaia_Res', 'binary_gaia_ViT'
]
label_col = 'orig_ogle_class'  # Adjust if your label column is named differently

# Show the metrics table (one row per prediction column)
print_metrics_table(ogle_df, label_col=label_col, pred_cols=pred_cols, class_name='OGLE')


Metrics for OGLE systems:


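| | Model/Passband | Accuracy | Precision | Recall | F1-score | TN | FP | FN | TP |
|---|---|---|---|---|---|---|---|---|---|
| 0 | binary_I_Res | 0.9 | 0.81 | 0.96 | 0.88 | 107 | 17 | 3 | 73 |
| 1 | binary_I_ViT | 0.87 | 0.77 | 0.95 | 0.85 | 102 | 22 | 4 | 72 |
| 2 | binary_gaia_Res | 0.97 | 0.95 | 0.97 | 0.96 | 120 | 4 | 2 | 74 |
| 3 | binary_gaia_ViT | 0.94 | 0.9 | 0.96 | 0.93 | 116 | 8 | 3 | 73 |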
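
The TN, FP, FN and TP columns are confusion-matrix counts. As a rough sketch of how such counts can be obtained, the snippet below tallies them for the five OGLE preview rows, comparing `orig_ogle_class` with `binary_I_ViT` after mapping 0 = det, 1 = over and treating `over` as the positive class. The helper `confusion_counts` is hypothetical; whether `print_metrics_table` computes its counts this way is an assumption.

```python
def confusion_counts(y_true, y_pred):
    """Count (TN, FP, FN, TP) for 0/1 labels, with 1 as the positive class."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tn, fp, fn, tp


# Five preview rows, mapped with 0 = det, 1 = over
y_true = [0, 0, 0, 1, 0]  # orig_ogle_class: det, det, det, over, det
y_pred = [0, 0, 0, 1, 1]  # binary_I_ViT:    det, det, det, over, over

print(confusion_counts(y_true, y_pred))  # (3, 1, 0, 1)
```

The single false positive is the last preview row (OGLE-BLG-ECL-010040), which is labelled `det` but predicted `over` by `binary_I_ViT`.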


The next cell loads and preprocesses the WUMaCat (overcontact) and DEBcat (detached) datasets, combines them, and evaluates binary classification metrics using the reusable functions from `binary_metrics.py`.

In [9]:
# Read WUMaCat (overcontact) and DEBcat (detached) CSVs
wuma_df = pd.read_csv('../data/classification_WUMaCat.csv')
debcat_df = pd.read_csv('../data/classification_DEBcat.csv')

# Strip surrounding whitespace from column names and from all string values
wuma_df.columns = wuma_df.columns.str.strip()
debcat_df.columns = debcat_df.columns.str.strip()
for df in [wuma_df, debcat_df]:
    for col in df.select_dtypes(include='object').columns:
        df[col] = df[col].str.strip()

# Add true_class column
wuma_df['true_class'] = 'over'
debcat_df['true_class'] = 'det'

# Select columns for binary classification
pred_cols = [
    'binary_tess_res', 'binary_tess_vit', 'binary_gaia_res', 'binary_gaia_vit'
]

# Concatenate both DataFrames
allcat_df = pd.concat([wuma_df, debcat_df], ignore_index=True)

# Show metrics for all systems (print_metrics_table was imported in the first cell)
print_metrics_table(allcat_df, label_col='true_class', pred_cols=pred_cols, class_name='WUMaCat+DEBcat')


Metrics for WUMaCat+DEBcat systems:


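| | Model/Passband | Accuracy | Precision | Recall | F1-score | TN | FP | FN | TP |
|---|---|---|---|---|---|---|---|---|---|
| 0 | binary_tess_res | 1.0 | 1.0 | 1.0 | 1.0 | 52 | 0 | 0 | 90 |
| 1 | binary_tess_vit | 1.0 | 1.0 | 1.0 | 1.0 | 52 | 0 | 0 | 90 |
| 2 | binary_gaia_res | 0.98 | 1.0 | 0.97 | 0.98 | 52 | 0 | 3 | 87 |
| 3 | binary_gaia_vit | 0.94 | 1.0 | 0.91 | 0.95 | 52 | 0 | 8 | 82 |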
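
The Accuracy, Precision, Recall and F1-score columns follow from the four confusion-matrix counts via the standard definitions. The sketch below recomputes them for the `binary_gaia_vit` row (TN = 52, FP = 0, FN = 8, TP = 82). The function `metrics_from_counts` is a hypothetical helper written for this check, not part of `binary_metrics.py`, and returning 0.0 for an undefined ratio is an assumption of the sketch.

```python
def metrics_from_counts(tn, fp, fn, tp):
    """Standard binary metrics from confusion-matrix counts.

    Assumption: a ratio with a zero denominator is reported as 0.0.
    """
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        'Accuracy': accuracy,
        'Precision': precision,
        'Recall': recall,
        'F1-score': f1,
    }


# Counts from the binary_gaia_vit row of the WUMaCat+DEBcat table
row = metrics_from_counts(tn=52, fp=0, fn=8, tp=82)
rounded = {name: round(value, 2) for name, value in row.items()}
print(rounded)
# {'Accuracy': 0.94, 'Precision': 1.0, 'Recall': 0.91, 'F1-score': 0.95}
```

With no false positives, precision is exactly 1.0, so the drop in accuracy and F1-score for this row comes entirely from the 8 false negatives.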
