---
execute:
  cache: false
  eval: true
  echo: true
  warning: false
jupyter: python3
title: Explainable AI with SpotPython and PyTorch
---

In [1]:
#| echo: false
#| label: imports
import warnings
warnings.filterwarnings("ignore")

In [2]:
#| label: configure_spot
from spotpython.data.diabetes import Diabetes
from spotpython.hyperdict.light_hyper_dict import LightHyperDict
from spotpython.fun.hyperlight import HyperLight
from spotpython.utils.init import (fun_control_init, surrogate_control_init, design_control_init)
from spotpython.utils.eda import gen_design_table
from spotpython.spot import spot
from spotpython.utils.file import get_experiment_filename
from spotpython.hyperparameters.values import set_hyperparameter
from math import inf

PREFIX="602_1"

data_set = Diabetes()

fun_control = fun_control_init(
    save_experiment=True,
    PREFIX=PREFIX,
    fun_evals=inf,
    max_time=60,
    data_set = data_set,
    core_model_name="light.regression.NNLinearRegressor",
    hyperdict=LightHyperDict,
    _L_in=10,
    _L_out=1)

fun = HyperLight().fun


set_hyperparameter(fun_control, "optimizer", [ "Adadelta", "Adam", "Adamax"])
set_hyperparameter(fun_control, "l1", [5,7])
set_hyperparameter(fun_control, "epochs", [10,12])
set_hyperparameter(fun_control, "batch_size", [4,11])
set_hyperparameter(fun_control, "dropout_prob", [0.0, 0.025])
set_hyperparameter(fun_control, "patience", [2,9])

design_control = design_control_init(init_size=20)

spot_tuner = spot.Spot(fun=fun,fun_control=fun_control, design_control=design_control)

Seed set to 123


Seed set to 123


module_name: light
submodule_name: regression
model_name: NNLinearRegressor


### Running the Hyperparameter Tuning or Loading the Existing Model


In [3]:
#| label: run_experiment
from spotpython.utils.file import get_experiment_filename, load_experiment
import os
overwrite = False
filename = get_experiment_filename(PREFIX)
if os.path.exists(filename) and not overwrite:
    (spot_tuner, fun_control, design_control,
    surrogate_control, optimizer_control) = load_experiment(filename)
else:
    print("File does not exist or overwrite is True. Starting new experiment.")
    res = spot_tuner.run()

Loaded experiment from spot_602_1_experiment.pickle


### Results from the Hyperparameter Tuning Experiment

* After the hyperparameter tuning is finished, the following information is available:
    * the `spot_tuner` object and
    * the associated `fun_control` dictionary.


In [4]:
#| label: print_results
res = spot_tuner.print_results(print_screen=True)

min y: 2853.099609375
l1: 5.0
epochs: 12.0
batch_size: 8.0
act_fn: 3.0
optimizer: 2.0
dropout_prob: 0.025
lr_mult: 1.9375352248449356
patience: 8.0
batch_norm: 0.0
initialization: 4.0


#### Getting the Best Model, i.e., the Tuned Architecture

* The method `get_tuned_architecture` [[DOC]](https://sequential-parameter-optimization.github.io/spotPython/reference/spotpython/hyperparameters/values/#spotpython.hyperparameters.values.get_tuned_architecture) returns the best model architecture found during the hyperparameter tuning.
* It returns the transformed values, i.e., `batch_size = 2^x` if the hyperparameter `batch_size` was transformed with the `transform_power_2_int` function.
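
The relationship between the tuner's internal values and the returned configuration can be checked by hand. The sketch below assumes that `l1`, `epochs`, `batch_size`, and `patience` all use a power-of-two transform. `power_2_int` is a hypothetical stand-in written for this illustration, not the library function itself.

```python
# Hypothetical stand-in for a power-of-two transform
# (an assumption for illustration, not the spotpython implementation).
def power_2_int(x: float) -> int:
    """Map an internal tuner value x to the integer 2**x."""
    return int(2 ** x)


# Internal values as printed by print_results above.
internal = {"l1": 5.0, "epochs": 12.0, "batch_size": 8.0, "patience": 8.0}

transformed = {name: power_2_int(value) for name, value in internal.items()}
print(transformed)
```

Under this assumption, the internal `batch_size` of 8 corresponds to 256 samples per batch and the internal `epochs` of 12 corresponds to 4096 epochs.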


In [5]:
#| label: get_tuned_architecture
from spotpython.hyperparameters.values import get_tuned_architecture
import pprint
config = get_tuned_architecture(spot_tuner, fun_control)
pprint.pprint(config)

{'act_fn': LeakyReLU(),
 'batch_norm': False,
 'batch_size': 256,
 'dropout_prob': 0.025,
 'epochs': 4096,
 'initialization': 'xavier_normal',
 'l1': 32,
 'lr_mult': 1.9375352248449356,
 'optimizer': 'Adamax',
 'patience': 256}


* Note: `get_tuned_architecture` has the option `force_minX`, which has no effect in this case.


In [6]:
#| label: get_tuned_architecture_force_minX
from spotpython.hyperparameters.values import get_tuned_architecture
config = get_tuned_architecture(spot_tuner, fun_control, force_minX=True)
pprint.pprint(config)

{'act_fn': LeakyReLU(),
 'batch_norm': False,
 'batch_size': 256,
 'dropout_prob': 0.025,
 'epochs': 4096,
 'initialization': 'xavier_normal',
 'l1': 32,
 'lr_mult': 1.9375352248449356,
 'optimizer': 'Adamax',
 'patience': 256}


### Training the Tuned Architecture on the Test Data

* Since we are interested in the explainability of the model, we will train the tuned architecture on the test data.
* `spotpython`'s `test_model` function [[DOC]](https://sequential-parameter-optimization.github.io/spotPython/reference/spotpython/light/testmodel/) is used to train the model on the test data.
* Note: So far, we have not used any information about the NN's weights and biases. Only the architecture, which is available as the `config`, is used.
* `spotpython` uses the TensorBoard logger to save the training process in the `./runs` directory. Therefore, we have to enable the TensorBoard logger in the `fun_control` dictionary. To get a clean start, we remove an existing `runs` folder.


In [7]:
#| label: test_model
from spotpython.light.testmodel import test_model
from spotpython.light.loadmodel import load_light_from_checkpoint
import os
import shutil
# if the directory "./runs" exists, delete it (portable alternative to `rm -r`)
if os.path.exists("./runs"):
    shutil.rmtree("./runs")
fun_control.update({"tensorboard_log": True})
test_model(config, fun_control)

GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name   | Type       | Params | Mode  | In sizes  | Out sizes
----------------------------------------------------------------------
0 | layers | Sequential | 1.6 K  | train | [256, 10] | [256, 1] 
----------------------------------------------------------------------
1.6 K     Trainable params
0         Non-trainable params
1.6 K     Total params
0.006     Total estimated model params size (MB)
31        Modules in train mode
0         Modules in eval mode


Restoring states from the checkpoint path at /Users/bartz/workspace/Hyperparameter-Tuning-Cookbook/runs/saved_models/32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TEST/last.ckpt


Loaded model weights from the checkpoint at /Users/bartz/workspace/Hyperparameter-Tuning-Cookbook/runs/saved_models/32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TEST/last.ckpt


test_model result: {'val_loss': 3528.567138671875, 'hp_metric': 3528.567138671875}


(3528.567138671875, 3528.567138671875)

In [8]:
#| label: load_model_from_chkpt
model = load_light_from_checkpoint(config, fun_control)

config: {'l1': 32, 'epochs': 4096, 'batch_size': 256, 'act_fn': LeakyReLU(), 'optimizer': 'Adamax', 'dropout_prob': 0.025, 'lr_mult': 1.9375352248449356, 'patience': 256, 'batch_norm': False, 'initialization': 'xavier_normal'}
Loading model with 32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TEST from runs/saved_models/32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TEST/last.ckpt
Model: NNLinearRegressor(
  (layers): Sequential(
    (0): Linear(in_features=10, out_features=32, bias=True)
    (1): LeakyReLU()
    (2): Dropout(p=0.025, inplace=False)
    (3): Linear(in_features=32, out_features=16, bias=True)
    (4): LeakyReLU()
    (5): Dropout(p=0.025, inplace=False)
    (6): Linear(in_features=16, out_features=16, bias=True)
    (7): LeakyReLU()
    (8): Dropout(p=0.025, inplace=False)
    (9): Linear(in_features=16, out_features=8, bias=True)
    (10): LeakyReLU()
    (11): Dropout(p=0.025, inplace=False)
    (12): Linear(in_features=8, out_feat

##### Details of the Training Process on the Test Data

* The `test_model` method initializes the model with the tuned architecture as follows:

```python
model = fun_control["core_model"](**config, _L_in=_L_in, _L_out=_L_out, _torchmetric=_torchmetric)
```

* Then, the Lightning Trainer is initialized with the `fun_control` dictionary and the model as follows:
    
    ```python
    trainer = L.Trainer(
        default_root_dir=os.path.join(fun_control["CHECKPOINT_PATH"], config_id),
        max_epochs=model.hparams.epochs,
        accelerator=fun_control["accelerator"],
        devices=fun_control["devices"],
        logger=TensorBoardLogger(
            save_dir=fun_control["TENSORBOARD_PATH"],
            version=config_id,
            default_hp_metric=True,
            log_graph=fun_control["log_graph"],
        ),
        callbacks=[
            EarlyStopping(monitor="val_loss", patience=config["patience"], mode="min", strict=False, verbose=False),
            ModelCheckpoint(
                dirpath=os.path.join(fun_control["CHECKPOINT_PATH"], config_id), save_last=True
            ), 
        ],
        enable_progress_bar=enable_progress_bar,
    )
    trainer.fit(model=model, datamodule=dm)    
    test_result = trainer.test(datamodule=dm, ckpt_path="last")
    ```

* As shown in the code above, the last checkpoint is saved.
* `spotpython`'s method `load_light_from_checkpoint` is used to load the last checkpoint and to get the model's weights and biases. As in the call above, it takes the `config` and the `fun_control` dictionary as input to find the correct checkpoint.
* Now, the model is trained and the weights and biases are available.
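
The `EarlyStopping` callback in the trainer code monitors `val_loss` with a `patience` taken from the configuration. The following is a minimal sketch of such a patience rule in plain Python. It is a simplification (for example, it ignores any minimum-improvement threshold). The function name `early_stopping_epoch` and the loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the index of the epoch at which training stops, or None.

    Simplified patience rule: stop once `patience` consecutive epochs
    have failed to improve on the best validation loss seen so far.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses for illustration only.
losses = [5.0, 4.0, 3.5, 3.6, 3.7, 3.4, 3.5, 3.6, 3.7]
stop_at = early_stopping_epoch(losses, patience=3)
print(stop_at)
```

With a large patience such as the tuned value of 256, the run tolerates long plateaus before stopping, so the `last.ckpt` checkpoint is not necessarily the one with the lowest validation loss.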


In [9]:
#| label: model_viz
from torchviz import make_dot
import torch
x = torch.randn(1, 10).requires_grad_(True)
x = x.to("mps")
output = model(x)
dot = make_dot(output, params=dict(model.named_parameters()), show_attrs=True, show_saved=True)
dot.render("model_architecture", format="png")

'model_architecture.png'

![architecture](./model_architecture.png)

## XAI Methods

* `spotpython` provides methods to explain the model's predictions. The following neural network elements can be analyzed: 

### Weights

* Weights are the parameters of the neural network that are learned from the data during training. They connect neurons between layers and determine the strength and direction of the signal sent from one neuron to another. The network adjusts the weights during training to minimize the error between the predicted output and the actual output.
* Interpretation of the weights: A weight with a large absolute value indicates a strong influence of the input neuron on the output. Positive weights suggest a positive correlation, whereas negative weights suggest an inverse relationship between neurons.

### Activations

* Activations are the outputs produced by neurons after applying an activation function to the weighted sum of inputs. The activation function (e.g., ReLU, sigmoid, tanh) adds non-linearity to the model, allowing it to learn more complex relationships.
* Interpretation of the activations: The value of activations indicates the intensity of the signal passed to the next layer. Certain activation patterns can highlight which features or parts of the data the network is focusing on.
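
To make the terms concrete, here is a minimal sketch of a single neuron in plain Python. It uses LeakyReLU because that is the activation in the tuned architecture; the negative slope of 0.01 is PyTorch's default for `LeakyReLU`. The inputs, weights, and function names are made up for illustration.

```python
def leaky_relu(z, negative_slope=0.01):
    """LeakyReLU: identity for positive inputs, a small slope for negative ones."""
    return z if z > 0 else negative_slope * z


def neuron_activation(inputs, weights, bias):
    """Activation of one neuron: LeakyReLU of the weighted sum plus bias."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return leaky_relu(z)


# Made-up numbers for illustration only.
inputs = [1.0, 2.0]
a_pos = neuron_activation(inputs, weights=[0.5, 0.25], bias=0.0)
a_neg = neuron_activation(inputs, weights=[-0.5, -0.25], bias=0.0)
print(a_pos, a_neg)
```

Flipping the sign of the weights turns a strong positive activation into a strongly damped negative one, which is the kind of pattern the activation plots make visible per layer.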

### Gradients

* Gradients are the partial derivatives of the loss function with respect to the parameters (weights and biases) of the network. Backpropagation computes these gradients, and an optimizer such as gradient descent uses them to update the parameters in the direction that reduces the loss.
* Interpretation of the gradients: The magnitude of the gradient indicates how sensitive the loss is to a parameter. A large gradient implies a steeper slope and a bigger update, while a small gradient suggests that the parameter is near a stationary point, ideally a minimum. If gradients are too small (vanishing gradient problem), the network may learn slowly or stop learning. If they are too large (exploding gradient problem), the updates may be unstable.
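
A single gradient-descent step can be worked through by hand. The sketch below assumes a one-parameter model `y_hat = w * x` with squared-error loss. The data point, the learning rate, and the function names are made up for illustration.

```python
def loss(w, x, y):
    """Squared error of the one-parameter model y_hat = w * x."""
    return (w * x - y) ** 2


def grad(w, x, y):
    """Analytical derivative of the loss with respect to w."""
    return 2.0 * (w * x - y) * x


x, y = 2.0, 6.0   # made-up data point; the loss is zero at w = 3
w = 1.0
lr = 0.1          # learning rate (step size)

g = grad(w, x, y)      # 2 * (2 - 6) * 2
w_new = w - lr * g     # step against the gradient

# Finite-difference check of the analytical gradient.
eps = 1e-6
g_numeric = (loss(w + eps, x, y) - loss(w - eps, x, y)) / (2.0 * eps)

print(g, g_numeric, w_new, loss(w, x, y), loss(w_new, x, y))
```

The negative gradient moves `w` towards 3, and the loss drops after the step. A gradient close to zero would leave `w` almost unchanged.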


In [10]:
#| label: import_xai
from spotpython.plot.xai import (get_activations, get_gradients, get_weights, visualize_weights, visualize_gradients, visualize_mean_activations, visualize_gradient_distributions, visualize_weights_distributions, visualize_activations_distributions)
batch_size = config["batch_size"]

### Getting the Weights


In [11]:
#| label: get_weights
from spotpython.plot.xai import sort_layers
weights = get_weights(model)
sort_layers(weights)

{'Layer 0': array([-7.16462612e-01, -3.67333621e-01,  9.30687606e-01,  2.68607646e-01,
        -9.38906491e-01, -6.10677361e-01, -6.56087816e-01, -5.93512356e-02,
         6.50186419e-01,  8.96262378e-03, -1.34425551e-01, -3.69364440e-01,
         7.18996942e-01,  5.25674284e-01, -3.29064786e-01,  1.12666309e-01,
        -6.35691762e-01,  5.67029826e-02,  5.56883752e-01, -9.66491550e-03,
        -1.26639381e-01, -1.82072327e-01,  5.12582004e-01,  3.81299615e-01,
        -3.02605510e-01,  1.38671339e-01, -3.21334690e-01,  2.06663862e-01,
         6.48850977e-01,  3.09361339e-01, -2.26798877e-01, -3.57937008e-01,
         1.17232597e+00,  8.88341308e-01, -5.78630567e-01, -3.86192292e-01,
        -6.57020152e-01,  9.64168459e-03,  7.52761960e-01,  4.36286598e-01,
         1.39316216e-01, -4.74126311e-04,  1.12638092e+00,  1.05096900e+00,
        -1.19841985e-01, -2.50326693e-01, -5.07486284e-01,  1.34136111e-01,
         8.01834345e-01,  2.69484609e-01,  6.14033453e-02, -7.43488908e-01,
 

In [12]:
#| label: visualize_weights
visualize_weights(model, absolute=True, cmap="GreenYellowRed", figsize=(6, 6))

320 values in Layer Layer 0.
4 padding values added.
324 values now in Layer Layer 0.


<Figure size 1800x1800 with 2 Axes>

512 values in Layer Layer 3.
17 padding values added.
529 values now in Layer Layer 3.


<Figure size 1800x1800 with 2 Axes>

256 values in Layer Layer 6.
256 values now in Layer Layer 6.


<Figure size 1800x1800 with 2 Axes>

128 values in Layer Layer 9.
16 padding values added.
144 values now in Layer Layer 9.


<Figure size 1800x1800 with 2 Axes>

64 values in Layer Layer 12.
64 values now in Layer Layer 12.


<Figure size 1800x1800 with 2 Axes>

64 values in Layer Layer 15.
64 values now in Layer Layer 15.


<Figure size 1800x1800 with 2 Axes>

32 values in Layer Layer 18.
4 padding values added.
36 values now in Layer Layer 18.


<Figure size 1800x1800 with 2 Axes>

16 values in Layer Layer 21.
16 values now in Layer Layer 21.


<Figure size 1800x1800 with 2 Axes>

16 values in Layer Layer 24.
16 values now in Layer Layer 24.


<Figure size 1800x1800 with 2 Axes>

16 values in Layer Layer 27.
16 values now in Layer Layer 27.


<Figure size 1800x1800 with 2 Axes>

8 values in Layer Layer 30.
1 padding values added.
9 values now in Layer Layer 30.


<Figure size 1800x1800 with 2 Axes>

4 values in Layer Layer 33.
4 values now in Layer Layer 33.


<Figure size 1800x1800 with 2 Axes>

4 values in Layer Layer 36.
4 values now in Layer Layer 36.


<Figure size 1800x1800 with 2 Axes>

4 values in Layer Layer 39.
4 values now in Layer Layer 39.


<Figure size 1800x1800 with 2 Axes>

2 values in Layer Layer 42.
2 padding values added.
4 values now in Layer Layer 42.


<Figure size 1800x1800 with 2 Axes>
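
The padding messages printed above are consistent with arranging each layer's values on the smallest square grid that can hold them. The sketch below reproduces the reported counts under that assumption. `padded_square_size` is a hypothetical helper, not a `spotpython` function.

```python
import math


def padded_square_size(n_values):
    """Return (side, padding) of the smallest square grid holding n_values."""
    side = math.isqrt(n_values)
    if side * side < n_values:
        side += 1
    return side, side * side - n_values


# Value counts taken from the output above.
for n in (320, 512, 256, 128, 64, 32, 16, 8, 4, 2):
    side, padding = padded_square_size(n)
    print(f"{n} values -> {side}x{side} grid, {padding} padding values")
```

For example, the 320 weights of Layer 0 fit on an 18x18 grid with 4 padding cells, which matches the 324 values reported. The padding cells carry no information about the model.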

In [13]:
#| label: visualize_weights_distributions
visualize_weights_distributions(model, color="C0", columns=4)

n:15


<Figure size 3240x3000 with 16 Axes>
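
The `n:15` in the output and the per-layer value counts printed earlier can be cross-checked against the `1.6 K` parameters reported in the model summary. The layer widths below are inferred from those counts and should be treated as an assumption, as should the presence of a bias in every linear layer.

```python
# Layer widths inferred from the printed value counts
# (10 inputs, 1 output); an assumption, not read from the model object.
widths = [10, 32, 16, 16, 8, 8, 8, 4, 4, 4, 4, 2, 2, 2, 2, 1]

# One weight per (input, output) pair of each linear layer.
n_weights = sum(n_in * n_out for n_in, n_out in zip(widths, widths[1:]))
# One bias per output neuron, assuming bias=True in every layer.
n_biases = sum(widths[1:])
n_layers = len(widths) - 1

print(n_layers, n_weights, n_biases, n_weights + n_biases)
```

Under these assumptions, the 15 linear layers hold 1446 weights and 113 biases, i.e., 1559 parameters in total, which rounds to the reported 1.6 K.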

### Getting the Activations


In [14]:
#| label: get_activations
from spotpython.plot.xai import get_activations
activations, mean_activations = get_activations(net=model, fun_control=fun_control, batch_size=batch_size, device="mps")

stage: test
full_sizefull_train_size: 0.6
full_sizeval_size: 0.24
full_sizetrain_size: 0.36
full_sizetest_size: 0.4
test_size: 0.4 used for test dataset.
LightDataModule.test_dataloader(). Test set size: 177


In [15]:
#| label: visualize_mean_activations
visualize_mean_activations(mean_activations, absolute=True, cmap="GreenYellowRed", figsize=(6, 6))

32 values in Layer 0.
4 padding values added.
36 values now in Layer 0.


<Figure size 1800x1800 with 2 Axes>

16 values in Layer 3.
16 values now in Layer 3.


<Figure size 1800x1800 with 2 Axes>

16 values in Layer 6.
16 values now in Layer 6.


<Figure size 1800x1800 with 2 Axes>

8 values in Layer 9.
1 padding values added.
9 values now in Layer 9.


<Figure size 1800x1800 with 2 Axes>

8 values in Layer 12.
1 padding values added.
9 values now in Layer 12.


<Figure size 1800x1800 with 2 Axes>

8 values in Layer 15.
1 padding values added.
9 values now in Layer 15.


<Figure size 1800x1800 with 2 Axes>

4 values in Layer 18.
4 values now in Layer 18.


<Figure size 1800x1800 with 2 Axes>

4 values in Layer 21.
4 values now in Layer 21.


<Figure size 1800x1800 with 2 Axes>

4 values in Layer 24.
4 values now in Layer 24.


<Figure size 1800x1800 with 2 Axes>

4 values in Layer 27.
4 values now in Layer 27.


<Figure size 1800x1800 with 2 Axes>

2 values in Layer 30.
2 padding values added.
4 values now in Layer 30.


<Figure size 1800x1800 with 2 Axes>

2 values in Layer 33.
2 padding values added.
4 values now in Layer 33.


<Figure size 1800x1800 with 2 Axes>

2 values in Layer 36.
2 padding values added.
4 values now in Layer 36.


<Figure size 1800x1800 with 2 Axes>

2 values in Layer 39.
2 padding values added.
4 values now in Layer 39.


<Figure size 1800x1800 with 2 Axes>

In [16]:
#| label: visualize_activations_distributions
visualize_activations_distributions(activations=activations,
                                    net=model, color="C0", columns=4)

<Figure size 3240x3000 with 16 Axes>

### Getting the Gradients


In [17]:
#| label: get_gradients
gradients = get_gradients(net=model, fun_control=fun_control, batch_size=batch_size, device = "mps")

stage: test
full_sizefull_train_size: 0.6
full_sizeval_size: 0.24
full_sizetrain_size: 0.36
full_sizetest_size: 0.4
test_size: 0.4 used for test dataset.
LightDataModule.test_dataloader(). Test set size: 177


In [18]:
#| label: visualize_gradients
visualize_gradients(model, fun_control, batch_size, absolute=True, cmap="GreenYellowRed", figsize=(6, 6), device="mps")

stage: test
full_sizefull_train_size: 0.6
full_sizeval_size: 0.24
full_sizetrain_size: 0.36
full_sizetest_size: 0.4
test_size: 0.4 used for test dataset.
LightDataModule.test_dataloader(). Test set size: 177
320 values in Layer layers.0.weight.
4 padding values added.
324 values now in Layer layers.0.weight.


<Figure size 1800x1800 with 2 Axes>

512 values in Layer layers.3.weight.
17 padding values added.
529 values now in Layer layers.3.weight.


<Figure size 1800x1800 with 2 Axes>

256 values in Layer layers.6.weight.
256 values now in Layer layers.6.weight.


<Figure size 1800x1800 with 2 Axes>

128 values in Layer layers.9.weight.
16 padding values added.
144 values now in Layer layers.9.weight.


<Figure size 1800x1800 with 2 Axes>

64 values in Layer layers.12.weight.
64 values now in Layer layers.12.weight.


<Figure size 1800x1800 with 2 Axes>

64 values in Layer layers.15.weight.
64 values now in Layer layers.15.weight.


<Figure size 1800x1800 with 2 Axes>

32 values in Layer layers.18.weight.
4 padding values added.
36 values now in Layer layers.18.weight.


<Figure size 1800x1800 with 2 Axes>

16 values in Layer layers.21.weight.
16 values now in Layer layers.21.weight.


<Figure size 1800x1800 with 2 Axes>

16 values in Layer layers.24.weight.
16 values now in Layer layers.24.weight.


<Figure size 1800x1800 with 2 Axes>

16 values in Layer layers.27.weight.
16 values now in Layer layers.27.weight.


<Figure size 1800x1800 with 2 Axes>

8 values in Layer layers.30.weight.
1 padding values added.
9 values now in Layer layers.30.weight.


<Figure size 1800x1800 with 2 Axes>

4 values in Layer layers.33.weight.
4 values now in Layer layers.33.weight.


<Figure size 1800x1800 with 2 Axes>

4 values in Layer layers.36.weight.
4 values now in Layer layers.36.weight.


<Figure size 1800x1800 with 2 Axes>

4 values in Layer layers.39.weight.
4 values now in Layer layers.39.weight.


<Figure size 1800x1800 with 2 Axes>

2 values in Layer layers.42.weight.
2 padding values added.
4 values now in Layer layers.42.weight.


<Figure size 1800x1800 with 2 Axes>

In [19]:
#| label: visualize_gradient_distributions
visualize_gradient_distributions(model, fun_control, batch_size=batch_size, color="C0", device="mps", columns=3)

stage: test
full_sizefull_train_size: 0.6
full_sizeval_size: 0.24
full_sizetrain_size: 0.36
full_sizetest_size: 0.4
test_size: 0.4 used for test dataset.
LightDataModule.test_dataloader(). Test set size: 177
n:15


<Figure size 2430x3750 with 15 Axes>

## Feature Attributions

### Integrated Gradients


In [20]:
#| label: get_attributions_xai
from spotpython.plot.xai import get_attributions, plot_attributions
df_att = get_attributions(spot_tuner, fun_control, attr_method="IntegratedGradients", n_rel=10)
plot_attributions(df_att, attr_method="IntegratedGradients")

GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name   | Type       | Params | Mode  | In sizes  | Out sizes
----------------------------------------------------------------------
0 | layers | Sequential | 1.6 K  | train | [256, 10] | [256, 1] 
----------------------------------------------------------------------
1.6 K     Trainable params
0         Non-trainable params
1.6 K     Total params
0.006     Total estimated model params size (MB)
31        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 2877.01708984375, 'hp_metric': 2877.01708984375}
config: {'l1': 32, 'epochs': 4096, 'batch_size': 256, 'act_fn': LeakyReLU(), 'optimizer': 'Adamax', 'dropout_prob': 0.025, 'lr_mult': 1.9375352248449356, 'patience': 256, 'batch_norm': False, 'initialization': 'xavier_normal'}
Loading model with 32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TRAIN from runs/saved_models/32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TRAIN/last.ckpt
Model: NNLinearRegressor(
  (layers): Sequential(
    (0): Linear(in_features=10, out_features=32, bias=True)
    (1): LeakyReLU()
    (2): Dropout(p=0.025, inplace=False)
    (3): Linear(in_features=32, out_features=16, bias=True)
    (4): LeakyReLU()
    (5): Dropout(p=0.025, inplace=False)
    (6): Linear(in_features=16, out_features=16, bias=True)
    (7): LeakyReLU()
    (8): Dropout(p=0.025, inplace=False)
    (9): Linear(in_features=16, out_features=8, bias=True)
    (10): LeakyReLU(

<Figure size 3000x1800 with 1 Axes>
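
Integrated gradients attribute a prediction to the input features by accumulating gradients along a straight path from a baseline to the input. The following is a minimal sketch on a toy function in plain Python. It is not the code used by `get_attributions`. The toy model `f`, the zero baseline, and the helper names are assumptions for illustration.

```python
def f(x):
    """Toy model standing in for the network: f(x) = x0**2 + 3*x1."""
    return x[0] ** 2 + 3.0 * x[1]


def numerical_gradient(func, x, eps=1e-6):
    """Central-difference approximation of the gradient of func at x."""
    grads = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grads.append((func(up) - func(down)) / (2.0 * eps))
    return grads


def integrated_gradients(func, x, baseline, steps=50):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(numerical_gradient(func, point)):
            total[i] += g
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


x = [2.0, 1.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(f, x, baseline)
print(attributions, sum(attributions), f(x) - f(baseline))
```

The attributions sum (up to numerical error) to the difference between the model output at the input and at the baseline. This completeness property is a useful sanity check for any integrated-gradients result.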

### DeepLift


In [21]:
#| label: get_attributions_deep_lift
df_lift = get_attributions(spot_tuner, fun_control, attr_method="DeepLift",n_rel=10)
print(df_lift)
plot_attributions(df_lift,  attr_method="DeepLift")

GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name   | Type       | Params | Mode  | In sizes  | Out sizes
----------------------------------------------------------------------
0 | layers | Sequential | 1.6 K  | train | [256, 10] | [256, 1] 
----------------------------------------------------------------------
1.6 K     Trainable params
0         Non-trainable params
1.6 K     Total params
0.006     Total estimated model params size (MB)
31        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 2870.75244140625, 'hp_metric': 2870.75244140625}
config: {'l1': 32, 'epochs': 4096, 'batch_size': 256, 'act_fn': LeakyReLU(), 'optimizer': 'Adamax', 'dropout_prob': 0.025, 'lr_mult': 1.9375352248449356, 'patience': 256, 'batch_norm': False, 'initialization': 'xavier_normal'}
Loading model with 32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TRAIN from runs/saved_models/32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TRAIN/last.ckpt
Model: NNLinearRegressor(
  (layers): Sequential(
    (0): Linear(in_features=10, out_features=32, bias=True)
    (1): LeakyReLU()
    (2): Dropout(p=0.025, inplace=False)
    (3): Linear(in_features=32, out_features=16, bias=True)
    (4): LeakyReLU()
    (5): Dropout(p=0.025, inplace=False)
    (6): Linear(in_features=16, out_features=16, bias=True)
    (7): LeakyReLU()
    (8): Dropout(p=0.025, inplace=False)
    (9): Linear(in_features=16, out_features=8, bias=True)
    (10): LeakyReLU(

<Figure size 3000x1800 with 1 Axes>

### Feature Ablation


In [22]:
#| label: get_attributions_feature_ablation
df_fl = get_attributions(spot_tuner, fun_control, attr_method="FeatureAblation",n_rel=10)

GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name   | Type       | Params | Mode  | In sizes  | Out sizes
----------------------------------------------------------------------
0 | layers | Sequential | 1.6 K  | train | [256, 10] | [256, 1] 
----------------------------------------------------------------------
1.6 K     Trainable params
0         Non-trainable params
1.6 K     Total params
0.006     Total estimated model params size (MB)
31        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 2883.1650390625, 'hp_metric': 2883.1650390625}
config: {'l1': 32, 'epochs': 4096, 'batch_size': 256, 'act_fn': LeakyReLU(), 'optimizer': 'Adamax', 'dropout_prob': 0.025, 'lr_mult': 1.9375352248449356, 'patience': 256, 'batch_norm': False, 'initialization': 'xavier_normal'}
Loading model with 32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TRAIN from runs/saved_models/32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TRAIN/last.ckpt
Model: NNLinearRegressor(
  (layers): Sequential(
    (0): Linear(in_features=10, out_features=32, bias=True)
    (1): LeakyReLU()
    (2): Dropout(p=0.025, inplace=False)
    (3): Linear(in_features=32, out_features=16, bias=True)
    (4): LeakyReLU()
    (5): Dropout(p=0.025, inplace=False)
    (6): Linear(in_features=16, out_features=16, bias=True)
    (7): LeakyReLU()
    (8): Dropout(p=0.025, inplace=False)
    (9): Linear(in_features=16, out_features=8, bias=True)
    (10): LeakyReLU()


In [23]:
#| label: plot_attributions_feature_ablation
print(df_fl)
plot_attributions(df_fl, attr_method="FeatureAblation")

   Feature Index Feature  FeatureAblationAttribution
0              2     bmi               -89691.906250
1              8  s5_ltg               -74743.070312
2              3      bp               -57792.734375
3              6  s3_hdl               -32182.736328
4              9  s6_glu               -16432.332031
5              5  s2_ldl                14808.999023
6              4   s1_tc                11206.945312
7              7  s4_tch                -9825.755859
8              1     sex                 8321.515625
9              0     age                 3099.384277


<Figure size 3000x1800 with 1 Axes>
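
Feature ablation measures how much the model output changes when one feature at a time is replaced by a baseline value. The sketch below shows the idea for a single input in plain Python. `toy_model`, the zero baseline, and the input values are made up for illustration, and the code is not what `get_attributions` runs internally.

```python
def feature_ablation(func, x, baseline):
    """Attribution of feature i: f(x) minus f(x with feature i set to its baseline)."""
    reference = func(x)
    attributions = []
    for i in range(len(x)):
        ablated = list(x)
        ablated[i] = baseline[i]
        attributions.append(reference - func(ablated))
    return attributions


def toy_model(x):
    """Linear toy model; the third feature has no influence."""
    return 4.0 * x[0] - 2.0 * x[1] + 0.0 * x[2]


x = [1.0, 3.0, 5.0]
baseline = [0.0, 0.0, 0.0]
ablation = feature_ablation(toy_model, x, baseline)
print(ablation)
```

The sign of an attribution tells whether the feature pushes the output up or down relative to the baseline, and a value of zero marks a feature the model ignores for this input.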

## Conductance


In [24]:
#| label: get_conductance
from spotpython.plot.xai import plot_conductance_last_layer, get_weights_conductance_last_layer
weights_last, layer_conductance_last = get_weights_conductance_last_layer(spot_tuner, fun_control)
plot_conductance_last_layer(weights_last, layer_conductance_last, figsize=(6, 6))

GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name   | Type       | Params | Mode  | In sizes  | Out sizes
----------------------------------------------------------------------
0 | layers | Sequential | 1.6 K  | train | [256, 10] | [256, 1] 
----------------------------------------------------------------------
1.6 K     Trainable params
0         Non-trainable params
1.6 K     Total params
0.006     Total estimated model params size (MB)
31        Modules in train mode
0         Modules in eval mode


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores


HPU available: False, using: 0 HPUs



  | Name   | Type       | Params | Mode  | In sizes  | Out sizes
----------------------------------------------------------------------
0 | layers | Sequential | 1.6 K  | train | [256, 10] | [256, 1] 
----------------------------------------------------------------------
1.6 K     Trainable params
0         Non-trainable params
1.6 K     Total params
0.006     Total estimated model params size (MB)
31        Modules in train mode
0         Modules in eval mode


train_model result: {'val_loss': 2994.139404296875, 'hp_metric': 2994.139404296875}
config: {'l1': 32, 'epochs': 4096, 'batch_size': 256, 'act_fn': LeakyReLU(), 'optimizer': 'Adamax', 'dropout_prob': 0.025, 'lr_mult': 1.9375352248449356, 'patience': 256, 'batch_norm': False, 'initialization': 'xavier_normal'}
Loading model with 32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TRAIN from runs/saved_models/32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TRAIN/last.ckpt
Model: NNLinearRegressor(
  (layers): Sequential(
    (0): Linear(in_features=10, out_features=32, bias=True)
    (1): LeakyReLU()
    (2): Dropout(p=0.025, inplace=False)
    (3): Linear(in_features=32, out_features=16, bias=True)
    (4): LeakyReLU()
    (5): Dropout(p=0.025, inplace=False)
    (6): Linear(in_features=16, out_features=16, bias=True)
    (7): LeakyReLU()
    (8): Dropout(p=0.025, inplace=False)
    (9): Linear(in_features=16, out_features=8, bias=True)
    (10): LeakyReL

train_model result: {'val_loss': 2892.18212890625, 'hp_metric': 2892.18212890625}
config: {'l1': 32, 'epochs': 4096, 'batch_size': 256, 'act_fn': LeakyReLU(), 'optimizer': 'Adamax', 'dropout_prob': 0.025, 'lr_mult': 1.9375352248449356, 'patience': 256, 'batch_norm': False, 'initialization': 'xavier_normal'}
Loading model with 32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TRAIN from runs/saved_models/32_4096_256_LeakyReLU_Adamax_0.025_1.9375_256_False_xavier_normal_TRAIN/last.ckpt
Model: NNLinearRegressor(
  (layers): Sequential(
    (0): Linear(in_features=10, out_features=32, bias=True)
    (1): LeakyReLU()
    (2): Dropout(p=0.025, inplace=False)
    (3): Linear(in_features=32, out_features=16, bias=True)
    (4): LeakyReLU()
    (5): Dropout(p=0.025, inplace=False)
    (6): Linear(in_features=16, out_features=16, bias=True)
    (7): LeakyReLU()
    (8): Dropout(p=0.025, inplace=False)
    (9): Linear(in_features=16, out_features=8, bias=True)
    (10): LeakyReLU(

<Figure size 1800x1800 with 1 Axes>