# Subset Finding Algorithm - Example Notebook

This notebook demonstrates the functionality of the subset-finding algorithm.

## Objective

Given a feature attribution model (e.g., SHAP), we aim to identify relevant subsets of features
that contribute to the explanation of a specific model output.

This example uses synthetic data and a simple model structure to illustrate the concept.
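
To make the objective concrete, here is a minimal brute-force sketch. It is not the implementation in `shapiq_student`. It assumes interactions are stored as a plain dictionary from coalition tuples to values, and that a subset is scored by summing every interaction fully contained in it. The names `subset_score` and `best_subset` and the toy data are hypothetical.

```python
from itertools import combinations


def subset_score(interactions: dict[tuple[int, ...], float], subset: tuple[int, ...]) -> float:
    """Sum the values of all interactions whose players all lie in `subset`."""
    members = set(subset)
    return sum(v for coal, v in interactions.items() if set(coal) <= members)


def best_subset(interactions, n_players: int, max_size: int) -> tuple[int, ...]:
    """Exhaustively search all subsets of exactly `max_size` players (assumed criterion)."""
    candidates = combinations(range(n_players), max_size)
    return max(candidates, key=lambda s: subset_score(interactions, s))


# Hand-made interaction values for 3 players (illustrative only).
toy = {(): 0.5, (0,): 1.0, (1,): -0.2, (2,): 0.3, (0, 2): 0.4, (1, 2): -0.6}
print(best_subset(toy, n_players=3, max_size=2))  # (0, 2)
```

Exhaustive search is only feasible for small player counts, since the number of candidate subsets grows combinatorially.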

Imports:

In [1]:
from __future__ import annotations

import os
import sys

notebook_dir = os.getcwd()
project_root = os.path.abspath(os.path.join(notebook_dir, ".."))
sys.path.insert(0, project_root)

In [2]:
from shapiq import ExactComputer
from shapiq.games.benchmark import SOUM

from shapiq_student.subset_finding import subset_finding


Initialize a game:

In [3]:
game = SOUM(n=8, n_basis_games=50)
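
SOUM is a sum of unanimity models: each basis game contributes a fixed coefficient once all players of its interaction set are present in the coalition. A toy version in plain Python, with hand-picked basis games instead of randomly sampled ones, might look like this. The function `toy_soum` and its data are illustrative assumptions, not the `shapiq` implementation.

```python
def toy_soum(coalition: set[int], basis_games: list[tuple[set[int], float]]) -> float:
    """Value of a coalition: sum of coefficients of all basis games it fully contains."""
    return sum(coef for players, coef in basis_games if players <= coalition)


# Two hand-picked basis games over 3 players (hypothetical data).
basis = [({0, 1}, 2.0), ({2}, -0.5)]

print(toy_soum({0}, basis))        # 0: no basis game is fully contained
print(toy_soum({0, 1}, basis))     # 2.0
print(toy_soum({0, 1, 2}, basis))  # 1.5
```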

In [4]:
computer = ExactComputer(n_players=game.n_players, game=game)
iv = computer(index="FSII", order=3)
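
With an order of 3, the computation covers coalitions of up to three players. As a sanity check on the expected size of the result, the count can be worked out with the standard library. This sketch assumes the empty coalition is counted as well, which may not hold for every index.

```python
from math import comb

n_players, max_order = 8, 3

# Number of coalitions of each size 0..max_order among n_players.
per_order = [comb(n_players, k) for k in range(max_order + 1)]
print(per_order)       # [1, 8, 28, 56]
print(sum(per_order))  # 93
```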

In [5]:
# iv.index is the index name ("FSII"), so len(iv.index) is always 4;
# count the stored interaction values instead.
print(f"Original interaction values: {len(iv.values)} coalitions")


In [6]:
iv_subset = subset_finding(interaction_values=iv, max_size=3)

In [7]:
# Count the stored interaction values rather than printing the index name.
print(f"Subset interaction values: {len(iv_subset.values)} coalitions")

Subset interaction values: 4 coalitions


In [8]:
print(iv_subset)

InteractionValues(
    index=FSII, max_order=3, min_order=0, estimated=True, estimation_budget=None,
    n_players=8, baseline_value=-0.6778843098211,
    Top 10 interactions:
        (1,): -0.07080412786109225
        (2,): -0.07080412786109225
        (): -0.6778843098211
        (0,): -1.0245539945426618
)


In [9]:
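print("All values after selection:")
# iv_subset.index is the index name ("FSII"); pair each coalition with its value instead.
lookup = sorted(iv_subset.interaction_lookup.items(), key=lambda item: item[1])
for coal, pos in lookup:
    print(coal, iv_subset.values[pos])


All values after selection:
() -0.6778843098211
(0,) -1.0245539945426618
(1,) -0.07080412786109225
(2,) -0.07080412786109225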


## Summary

- The subset-finding algorithm enables focused analysis of feature groups.
- It provides interpretable insights based on Shapley interaction values (here, FSII up to order 3).
- This notebook demonstrates the process from constructing a synthetic game to inspecting the selected interactions.

It is intended for inclusion in Sphinx documentation to illustrate the algorithm's application in a minimal example.
