[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openscilabs/isda/blob/main/dtlz.ipynb)

In [None]:
# Setup for Google Colab
import os

try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    print("Running in Colab. Setting up...")
    # Clone repository to access dtlz_lib.py
    !git clone https://github.com/openscilabs/isda.git
    %cd isda
    # Install misda from the cloned repository into the running kernel's environment
    %pip install .
else:
    print("Running locally.")

# MISDA Benchmark: DTLZ Suite

This notebook evaluates MISDA on standard multi-objective optimization benchmarks:
*   **DTLZ2**: Spherical Pareto front (non-degenerate). Dimensionality should be preserved.
*   **DTLZ5**: Degenerate Pareto front (a curve). Dimensionality should be reduced.
*   **DTLZ2 + Redundancy**: Tests whether noisy, redundant objectives are removed.

In [None]:
import numpy as np
import pandas as pd
import misda
import dtlz_lib

print("Libraries loaded.")

## 1. DTLZ2 (Irreducible)
M=3 objectives. The Pareto front is a 2D manifold (the positive octant of the unit sphere), but all 3 objectives conflict with one another, so all 3 are required to describe the front.
Expectation: MISDA should **retain all 3 objectives** (no reduction).
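
The geometry behind this expectation can be checked without `misda` or `dtlz_lib`. The sketch below samples the M=3 DTLZ2 front from the standard closed form with the distance function g set to 0. The helper name `dtlz2_front_sketch` is hypothetical, and its uniform sampling of the position variables is an assumption that may differ from what `dtlz_lib.generate_dtlz2` does.

```python
import numpy as np

def dtlz2_front_sketch(n_points, seed=0):
    """Hypothetical helper: sample the M=3 DTLZ2 Pareto front (g = 0)."""
    rng = np.random.default_rng(seed)
    # Two position variables in [0, 1) parameterize the 2D front.
    x = rng.uniform(0.0, 1.0, size=(n_points, 2))
    theta = 0.5 * np.pi * x
    f1 = np.cos(theta[:, 0]) * np.cos(theta[:, 1])
    f2 = np.cos(theta[:, 0]) * np.sin(theta[:, 1])
    f3 = np.sin(theta[:, 0])
    return np.column_stack([f1, f2, f3])

F2 = dtlz2_front_sketch(500)

# On the front, the squared objectives sum to 1 (unit sphere).
radius_sq = (F2 ** 2).sum(axis=1)
print("max |sum f_i^2 - 1|:", np.abs(radius_sq - 1.0).max())

# Pairwise Pearson correlations between the three objectives.
corr2 = np.corrcoef(F2, rowvar=False)
print(np.round(corr2, 2))
```

No pair of objectives is perfectly correlated in this sample, which is consistent with the expectation that none of them can be dropped.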

In [None]:
Y, _ = dtlz_lib.generate_dtlz2(N=500, M=3)
df = pd.DataFrame(Y, columns=['f1', 'f2', 'f3'])

res = misda.analyze(df, caution=0.5, run_ses=True, name="DTLZ2 (M=3)")
print(res.summary())
res.plot()

## 2. DTLZ5 (Degenerate)
M=3 objectives, but the Pareto front degenerates to a curve (1D manifold).
Expectation: MISDA should detect strong correlations or reducibility, though this is not guaranteed: DTLZ5 is tricky because the relationships between its objectives are nonlinear.
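
To see where the degeneracy comes from, the sketch below evaluates the standard M=3 DTLZ5 formulas on the Pareto front, where g = 0 and the second angle collapses to the constant π/4. The helper name `dtlz5_front_sketch` is hypothetical, and the points returned by `dtlz_lib.generate_dtlz5` may be sampled differently.

```python
import numpy as np

def dtlz5_front_sketch(n_points):
    """Hypothetical helper: trace the M=3 DTLZ5 Pareto front (g = 0)."""
    x1 = np.linspace(0.0, 1.0, n_points)
    theta1 = 0.5 * np.pi * x1
    # With g = 0, theta2 = pi / (4 * (1 + g)) * (1 + 2 * g * x2) = pi / 4.
    theta2 = np.full_like(theta1, 0.25 * np.pi)
    f1 = np.cos(theta1) * np.cos(theta2)
    f2 = np.cos(theta1) * np.sin(theta2)
    f3 = np.sin(theta1)
    return np.column_stack([f1, f2, f3])

F5 = dtlz5_front_sketch(500)

# f1 and f2 coincide on the front, so one of them is redundant.
r12 = np.corrcoef(F5[:, 0], F5[:, 1])[0, 1]
# f1 and f3 trade off along an arc, so their relation is not linear.
r13 = np.corrcoef(F5[:, 0], F5[:, 2])[0, 1]
print("corr(f1, f2):", r12)
print("corr(f1, f3):", r13)
```

On the front, f1 and f2 are identical, while f1 and f3 are tied together only through a nonlinear (circular) relation. A purely linear measure therefore sees part of the structure but not all of it.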

In [None]:
Y, _ = dtlz_lib.generate_dtlz5(N=500, M=3)
df = pd.DataFrame(Y, columns=['f1', 'f2', 'f3'])

res = misda.analyze(df, caution=0.5, run_ses=True, name="DTLZ5 (Degenerate)")
print(res.summary())
res.plot()

## 3. DTLZ2 with Linear Redundancy
We take DTLZ2 (3 objectives) and add 3 noisy copies of each (additive Gaussian noise with standard deviation 0.05), for a total of 12 objectives.
Expectation: MISDA should reduce it back to **3 objectives**.
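
As a rough sanity check on how redundant the copies are, the sketch below measures the Pearson correlation between each objective and one noisy copy of it, using the same noise level (0.05). The base data come from a hypothetical stand-in sampler for the DTLZ2 front rather than from `dtlz_lib`, so the exact values are illustrative only.

```python
import numpy as np

rng_demo = np.random.default_rng(0)

# Stand-in for DTLZ2 front samples (assumption: uniform position variables, g = 0).
theta_demo = 0.5 * np.pi * rng_demo.uniform(0.0, 1.0, size=(500, 2))
base = np.column_stack([
    np.cos(theta_demo[:, 0]) * np.cos(theta_demo[:, 1]),
    np.cos(theta_demo[:, 0]) * np.sin(theta_demo[:, 1]),
    np.sin(theta_demo[:, 0]),
])

# One noisy copy per objective, with the same noise scale as the notebook cell.
noisy = base + 0.05 * rng_demo.normal(size=base.shape)

# Correlation between each objective and its own noisy copy.
r_values = [np.corrcoef(base[:, i], noisy[:, i])[0, 1] for i in range(3)]
for i, r in enumerate(r_values):
    print(f"corr(f{i + 1}, noisy copy) = {r:.3f}")
```

Because the noise is small relative to the spread of each objective, every copy stays almost perfectly correlated with its original. That is the kind of redundancy the reduction step is expected to remove.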

In [None]:
Y_base, _ = dtlz_lib.generate_dtlz2(N=500, M=3)
rng = np.random.default_rng(42)

all_feats = []
names = []
for i in range(3):
    orig = Y_base[:, i]
    all_feats.append(orig)
    names.append(f"f{i+1}")
    # Add 3 noisy copies of this objective (Gaussian noise, sigma = 0.05)
    for k in range(3):
        noisy_copy = orig + 0.05 * rng.normal(size=len(orig))
        all_feats.append(noisy_copy)
        names.append(f"f{i+1}_copy{k+1}")

Y_redundant = np.column_stack(all_feats)
df_red = pd.DataFrame(Y_redundant, columns=names)

res = misda.analyze(df_red, caution=0.5, run_ses=True, name="DTLZ2 + Redundancy")
print(res.summary())
print("Selected:", res.best_mis['mis_labels'])
res.plot()