$\textbf{Conflicting Dataset}$

Initialise the default dataset

In [2]:
import torch
from xaiunits.datagenerator import ConflictingDataset

data = ConflictingDataset()

The main attributes are:
* `n_features` is the number of features
* `cancellation_features` is the list of indices of the features subject to cancellation
* `cancellation_outcomes` is a binary tensor indicating whether each feature in each sample is canceled
* `cancellation_samples` is the concatenation of the samples with their cancellation outcomes
* `cancellation_attributions` is the attribution of each feature, taking the cancellation into account

In [3]:
       
print(data.n_features)
print(data.cancellation_features)
print(data.cancellation_outcomes)
print(data.cancellation_samples)
print(data.cancellation_attributions)

2
[0, 1]
tensor([[0, 1],
        [1, 0],
        [0, 1],
        [0, 1],
        [1, 1],
        [0, 1],
        [0, 0],
        [0, 0],
        [1, 1],
        [0, 0]], dtype=torch.int32)
tensor([[-1.1258, -1.1524,  0.0000,  1.0000],
        [-0.2506, -0.4339,  1.0000,  0.0000],
        [ 0.5988, -1.5551,  0.0000,  1.0000],
        [-0.3414,  1.8530,  0.0000,  1.0000],
        [ 0.4681, -0.1577,  1.0000,  1.0000],
        [ 1.4437,  0.2660,  0.0000,  1.0000],
        [ 1.3894,  1.5863,  0.0000,  0.0000],
        [ 0.9463, -0.8437,  0.0000,  0.0000],
        [ 0.9318,  1.2590,  1.0000,  1.0000],
        [ 2.0050,  0.0537,  0.0000,  0.0000]])
tensor([[ 0.0000, -0.8934],
        [ 0.0467,  0.0000],
        [ 0.0000, -1.2057],
        [ 0.0000,  1.4366],
        [-0.0872, -0.1223],
        [ 0.0000,  0.2063],
        [ 0.0000,  0.0000],
        [ 0.0000,  0.0000],
        [-0.1736,  0.9761],
        [ 0.0000,  0.0000]])
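The printed tensors can be cross-checked by hand. The sketch below copies the first three rows of the output above into plain Python lists (the library itself returns tensors) and checks two things: that `cancellation_samples` is each sample followed by its outcomes, and that, in this output, a feature's attribution is non-zero exactly where its outcome is 1. The variable names are illustrative only.

```python
# First three rows copied from the printed output above.
samples = [[-1.1258, -1.1524], [-0.2506, -0.4339], [0.5988, -1.5551]]
outcomes = [[0, 1], [1, 0], [0, 1]]
attributions = [[0.0, -0.8934], [0.0467, 0.0], [0.0, -1.2057]]

# Append each sample's outcomes (as floats) to the sample itself.
cancellation_samples = [
    sample + [float(o) for o in outcome_row]
    for sample, outcome_row in zip(samples, outcomes)
]

# Check that an attribution is non-zero exactly where the outcome is 1.
mask_matches = all(
    (attr != 0.0) == bool(outcome)
    for attr_row, outcome_row in zip(attributions, outcomes)
    for attr, outcome in zip(attr_row, outcome_row)
)

print(cancellation_samples[0])
print(mask_matches)
```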


Every dataset exposes its attributes, and you can change them. Because some attributes are derived from others, changing one may require recomputing the rest. For example, after changing `cancellation_features`, you need to re-initialise the weights and update `cancellation_outcomes`, `cancellation_samples`, and `cancellation_attributions`, as shown below.

In [4]:
data.cancellation_features = [1,0]
data.weights = data._initialize_weights(data.weights, data.weight_range)[0]
data.cancellation_outcomes = data._get_cancellations()
data.cancellation_samples = data._get_cancellation_samples()
data.cancellation_attributions = data._get_cancellation_attributions()
print(data.cancellation_outcomes)
print(data.cancellation_samples)
print(data.cancellation_attributions)

tensor([[1, 1],
        [0, 0],
        [1, 1],
        [1, 1],
        [1, 1],
        [1, 0],
        [1, 0],
        [1, 0],
        [0, 1],
        [1, 0]], dtype=torch.int32)
tensor([[-1.1258, -1.1524,  1.0000,  1.0000],
        [-0.2506, -0.4339,  0.0000,  0.0000],
        [ 0.5988, -1.5551,  1.0000,  1.0000],
        [-0.3414,  1.8530,  1.0000,  1.0000],
        [ 0.4681, -0.1577,  1.0000,  1.0000],
        [ 1.4437,  0.2660,  1.0000,  0.0000],
        [ 1.3894,  1.5863,  1.0000,  0.0000],
        [ 0.9463, -0.8437,  1.0000,  0.0000],
        [ 0.9318,  1.2590,  0.0000,  1.0000],
        [ 2.0050,  0.0537,  1.0000,  0.0000]])
tensor([[ 0.2098, -0.8934],
        [ 0.0000,  0.0000],
        [-0.1116, -1.2057],
        [ 0.0636,  1.4366],
        [-0.0872, -0.1223],
        [-0.2690,  0.0000],
        [-0.2589,  0.0000],
        [-0.1763,  0.0000],
        [ 0.0000,  0.9761],
        [-0.3736,  0.0000]])
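Since these updates always go together, they can be grouped in a small helper. The function below is a sketch: its name, `refresh_cancellation_state`, is hypothetical, and it only wraps the calls used in the cell above, in the same order. It relies on the underscore-prefixed methods shown there, which are private and may change between library versions.

```python
def refresh_cancellation_state(data, cancellation_features):
    """Set new cancellation features and recompute the attributes derived from them.

    `data` is assumed to be a ConflictingDataset-like object that exposes the
    private methods used in the cell above.
    """
    data.cancellation_features = cancellation_features
    # Same order as in the notebook cell: weights first, then the derived tensors.
    data.weights = data._initialize_weights(data.weights, data.weight_range)[0]
    data.cancellation_outcomes = data._get_cancellations()
    data.cancellation_samples = data._get_cancellation_samples()
    data.cancellation_attributions = data._get_cancellation_attributions()
    return data
```

With the dataset from the cells above, the call would be `refresh_cancellation_state(data, [1, 0])`.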


Every dataset has a `generate_model` method, which generates the paired model.

In [5]:
model = data.generate_model()
print(type(model))

<class 'xaiunits.model.conflicting.ConflictingFeaturesNN'>


$\textbf{Pertinent Negative Dataset}$

Initialise the default dataset.

In [18]:
from xaiunits.datagenerator import PertinentNegativesDataset

data = PertinentNegativesDataset()

Again, here are the attributes for this dataset:
* `n_features` is the number of features
* `weights` are the weights of the model
* `pn_features` represents the indices of features to be considered as pertinent negatives
* `pn_weight_factor` is the factor representing the enhanced impact of pertinent negatives
* `pn_zero_likelihood` represents the likelihood of a pertinent negative feature being set to zero

In [19]:
print(data.n_features)
print(data.weights)
print(data.pn_features)
print(data.pn_weight_factor)
print(data.pn_zero_likelihood)

print(data.samples)

5
tensor([-0.1646, -0.4578,  0.3846, -0.5923,  0.3666])
[0]
10
0.5
tensor([[ 0.0000, -1.1524, -0.2506, -0.4339,  0.8487],
        [ 1.0000, -0.3160, -2.1152,  0.3223, -1.2633],
        [ 0.0000,  0.3081,  0.1198,  1.2377,  1.1168],
        [ 0.0000, -1.3527, -1.6959,  0.5667,  0.7935],
        [ 1.0000, -1.5551, -0.3414,  1.8530,  0.7502],
        [ 1.0000, -0.1734,  0.1835,  1.3894,  1.5863],
        [ 0.0000, -0.8437, -0.6136,  0.0316,  1.0554],
        [ 0.0000, -0.2303, -0.3918,  0.5433, -0.3952],
        [ 0.0000, -0.4503,  1.5210,  3.4105, -1.5312],
        [ 0.0000,  1.8197, -0.5515, -1.3253,  0.1886]])
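In the output above, the pertinent negative feature (column 0) takes only the values 0 and 1. The sketch below shows one way such a column could be drawn given a zero likelihood, in plain Python. The function name and the sampling procedure are assumptions for illustration, not the library's implementation.

```python
import random


def sample_pn_column(n_samples, zero_likelihood, seed=0):
    """Draw a binary column: each entry is 0.0 with probability `zero_likelihood`, else 1.0."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    return [0.0 if rng.random() < zero_likelihood else 1.0 for _ in range(n_samples)]


column = sample_pn_column(n_samples=10, zero_likelihood=0.5)
print(column)
```

A likelihood of 1.0 yields a column of zeros and a likelihood of 0.0 yields a column of ones, which is a quick sanity check on the direction of the parameter.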


If you change any of these attributes, you then need to call the following two methods so that the samples are regenerated.

In [20]:
data.pn_features = [0,1]
data.pn_weight_factor = 20

data._initialize_zeros_for_PN()
data._get_new_weighted_samples()


print(data.samples)


tensor([[ 0.0000,  0.0000, -0.2506, -0.4339,  0.8487],
        [ 1.0000,  1.0000, -2.1152,  0.3223, -1.2633],
        [ 1.0000,  0.0000,  0.1198,  1.2377,  1.1168],
        [ 1.0000,  1.0000, -1.6959,  0.5667,  0.7935],
        [ 1.0000,  1.0000, -0.3414,  1.8530,  0.7502],
        [ 1.0000,  1.0000,  0.1835,  1.3894,  1.5863],
        [ 0.0000,  0.0000, -0.6136,  0.0316,  1.0554],
        [ 1.0000,  0.0000, -0.3918,  0.5433, -0.3952],
        [ 1.0000,  1.0000,  1.5210,  3.4105, -1.5312],
        [ 1.0000,  1.0000, -0.5515, -1.3253,  0.1886]])


Generate the corresponding model

In [21]:
model = data.generate_model()
print(type(model))

<class 'xaiunits.model.pertinent_negative.PertinentNN'>


$\textbf{Interacting Features Dataset}$

In [39]:
from xaiunits.datagenerator import InteractingFeatureDataset

data = InteractingFeatureDataset()

The main attributes of the `InteractingFeatureDataset` include:
* `n_features` is the number of features
* `weights` represents the weights of the model
* `interacting_features` represents the pairs of indices where the first index is the feature whose weight is influenced by the second categorical feature

In [40]:
print(data.n_features)
print(data.weights)
print(data.interacting_features)

4
[(0.44000208377838135, 0.8909304141998291), 0.330818772315979, (0.9996763467788696, 0.5186629295349121), 0.6216483116149902]
[[1, 0], [3, 2]]
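In the output above, two of the entries in `weights` are pairs rather than single numbers. One way to read this, and it is an assumption rather than something the output confirms, is that a binary categorical feature selects which of the two weights applies. The helper below is hypothetical and only illustrates that idea; which element of a pair corresponds to which category value is also assumed.

```python
def effective_weight(weight, categorical_value):
    """Return the active weight: a pair is indexed by the categorical value, a scalar passes through."""
    if isinstance(weight, tuple):
        return weight[int(categorical_value)]
    return weight


# Values rounded from the printed output above.
weights = [(0.44, 0.89), 0.33, (1.00, 0.52), 0.62]

print(effective_weight(weights[0], 0))
print(effective_weight(weights[0], 1))
print(effective_weight(weights[1], 1))
```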


Generate the corresponding model.

In [41]:
model = data.generate_model()
print(type(model))

<class 'xaiunits.model.interaction_features.InteractingFeaturesNN'>


$\textbf{Uncertainty Aware Dataset}$

In [52]:
from xaiunits.datagenerator import UncertaintyAwareDataset

data = UncertaintyAwareDataset()

The main attributes of the `UncertaintyAwareDataset` include:
* `n_features` is the number of features
* `weights` corresponds to the weights of the model
* `common_features` represents the number of common features

In [53]:
print(data.n_features)
print(data.weights)
print(data.common_features)

5
tensor([[1., 0., 0., 0., 1.],
        [0., 1., 0., 0., 1.],
        [0., 0., 1., 0., 1.],
        [0., 0., 0., 1., 1.]])
1


If you change any of these attributes, call the `_create_weights` method, which adapts the weights to the number of common features. Pass `None` as the weights argument if you want the weights of the dataset to be regenerated. Otherwise, you can set the weights manually.

In [57]:
data.common_features = 3

data.weights = data._create_weights(data.n_features, None, data.common_features)

print(data.weights)
print(data.common_features)

tensor([[1., 0., 1., 1., 1.],
        [0., 1., 1., 1., 1.]])
3
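Both printed weight matrices follow the same pattern: one row per non-common feature, with a 1 on that feature and on every common feature. The sketch below reproduces that pattern with nested lists. It is inferred from the two outputs above and is not the library's `_create_weights` implementation; the function name is hypothetical.

```python
def create_weights_sketch(n_features, common_features):
    """Build the weight pattern seen above: an identity block followed by columns of ones."""
    n_distinct = n_features - common_features  # one row per non-common feature
    return [
        [1.0 if col == row or col >= n_distinct else 0.0 for col in range(n_features)]
        for row in range(n_distinct)
    ]


for row in create_weights_sketch(n_features=5, common_features=3):
    print(row)
```

Calling it with `common_features=1` gives four rows and with `common_features=3` gives two rows, matching the shapes printed above.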


In [58]:
model = data.generate_model()
print(type(model))

<class 'xaiunits.model.uncertainty_model.UncertaintyNN'>


$\textbf{Shattered Gradient Dataset}$

Initialise the Dataset

In [59]:
from xaiunits.datagenerator import ShatteredGradientsDataset

data = ShatteredGradientsDataset()

The main attributes of the `ShatteredGradientsDataset` include:
* `n_features` represents the number of features
* `discontinuity_ratios` is the ratio indicating feature discontinuity
* `bias` is the bias value of the model
* `act_fun` represents the activation function used in the model

In [60]:
print(data.n_features) 
print(data.discontinuity_ratios)
print(data.bias)
print(data.act_fun)

5
[-1, 4, -2, -5, -2]
0.5
ReLU()


In [61]:
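model = data.generate_model()
# Note: this prints the type of the dataset itself; use type(model) to inspect the generated model.
print(type(data))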

<class 'xaiunits.datagenerator.shattered_grad.ShatteredGradientsDataset'>


$\textbf{Boolean Formula Dataset}$

Initialise the default dataset

Here we also need to define the initial atoms as well as the Boolean formula.

In [62]:
from xaiunits.datagenerator import BooleanDataset

from sympy import symbols

x, y, z, a = symbols("x y z a")
k = (x & (y | ~z)) & (z | a)
data = BooleanDataset(k)

The main attributes are:
* `atoms` is the tuple of atoms (symbols) that appear in the formula
* `formula` is the Boolean formula provided

In [3]:
print(data.formula)
print(data.atoms)

x & (a | z) & (y | ~z)
(x, y, z, a)
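To see which inputs satisfy this formula, its truth table can be enumerated directly. The sketch below does so in plain Python, independently of `BooleanDataset` and `sympy`; the function `formula` simply restates the expression defined above.

```python
from itertools import product


def formula(x, y, z, a):
    """Plain-Python restatement of (x & (y | ~z)) & (z | a)."""
    return (x and (y or not z)) and (z or a)


# Enumerate all 16 assignments of (x, y, z, a) and keep the satisfying ones.
satisfying = [values for values in product([False, True], repeat=4) if formula(*values)]

for values in satisfying:
    print(dict(zip("xyza", values)))
```

Four of the sixteen assignments satisfy the formula, and all of them have `x` set to true.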


$\textbf{Balanced Image Dataset}$

Initialise the default dataset.

In [12]:
from xaiunits.datagenerator.image_generation import BalancedImageDataset

data = BalancedImageDataset()

The main attributes of `BalancedImageDataset` are:

* `backgrounds` is a list of specific backgrounds to use
* `shapes` is a list of specific shapes
* `shape_colors` is the default color(s) for shapes

In [7]:
print(data.backgrounds)
print(data.shapes)
print(data.shape_colors)
       

['blotchy_0083.jpg', 'lacelike_0065.jpg', 'lined_0086.jpg', 'stratified_0101.jpg', 'fibrous_0171.jpg']
['heptagon', 'hexagon', 'rectangle', 'decagon', 'triangle', 'octagon', 'ellipse', 'pentagon', 'square', 'circle']
[(0, 255, 0, 255)]


Here we can show an image given its tensor representation.

In [16]:
x, y_label, context = data[0]
data.show_image(x)
print(y_label)

6


Here is another image from the same dataset.

In [17]:
x, y_label, context = data[3]
data.show_image(x)
print(y_label)

3


$\textbf{Imbalanced Image Dataset}$

This dataset is very similar to the previous one. Here, imbalance means that users can specify the percentage of samples that use the dominant (background, foreground) pair versus the other pairs.


In [19]:
from xaiunits.datagenerator.image_generation import ImbalancedImageDataset

data = ImbalancedImageDataset()

In [20]:
print(data.backgrounds)
print(data.shapes)
print(data.shape_colors)

['blotchy_0083.jpg', 'lacelike_0065.jpg', 'lined_0086.jpg', 'stratified_0101.jpg', 'fibrous_0171.jpg']
['heptagon', 'hexagon', 'rectangle']
[(255, 0, 0, 255)]


In [21]:
x, y_label, context = data[0]
data.show_image(x)
print(y_label)

0
