<a href="https://colab.research.google.com/github/finardi/tutos/blob/master/Layer_Integrated_Gradients.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F

Reference paper: [Axiomatic Attribution for Deep Networks](https://arxiv.org/pdf/1703.01365.pdf)

<img src="https://drive.google.com/uc?id=1M5-5_CrvT9BTJyZR6NaycboFzoMAQv7I" alt="drawing" width="500"/>


In [None]:
class ToyModel_F(nn.Module):
    """
    Example toy model from the original paper (page 10)
    f(x1, x2) = ReLU(ReLU(x1) - 1 - ReLU(x2))
    """

    def __init__(self):
        super().__init__()

    def forward(self, x1, x2):
        relu_x1 = F.relu(x1)
        relu_x2 = F.relu(x2)
        return F.relu(relu_x1 - 1 - relu_x2)

> ### The code snippet below computes the attribution of the output with respect to the inputs. The `attribute` method of Captum's `IntegratedGradients` class returns input attributions with the same size and dimensionality as the inputs and, when `return_convergence_delta=True`, an approximation error computed from the completeness property of integrated gradients.

#### Completeness is one of the axioms that integrated gradients satisfies. *It states that the sum of the attributions must equal the difference between the output of the DNN function F at the inputs and its output at the corresponding baselines. The baselines have the same shape and dimensionality as the inputs; if none are provided, zero is used as the default value.*
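In symbols, following the notation of the paper linked above (with $x$ the input, $x'$ the baseline and $F$ the model), the attribution for the $i$-th input and the completeness property can be written as:

$$
\mathrm{IG}_i(x) = (x_i - x'_i) \int_0^1 \frac{\partial F\big(x' + \alpha (x - x')\big)}{\partial x_i}\, d\alpha,
\qquad
\sum_i \mathrm{IG}_i(x) = F(x) - F(x').
$$

The approximation error of a numerical implementation can then be read as the gap between the two sides of the second equation once the integral is replaced by a finite sum.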

In [None]:
!pip install -q captum


In [None]:
from captum.attr import IntegratedGradients
model = ToyModel_F()

# defining model input tensors
input1 = torch.tensor([3.0], requires_grad=True)
input2 = torch.tensor([1.0], requires_grad=True)

# defining baselines for each input tensor
baseline1 = torch.tensor([0.0])
baseline2 = torch.tensor([0.0])

# defining integrated gradients on ToyModel_F and applying it to the inputs
ig = IntegratedGradients(model)
attributions_F, approximation_error_F = ig.attribute(
    (input1, input2),
    baselines=(baseline1, baseline2),
    method='gausslegendre',
    return_convergence_delta=True,
    )

In [None]:
attributions_F

(tensor([1.5000], dtype=torch.float64, grad_fn=<MulBackward0>),
 tensor([-0.5000], dtype=torch.float64, grad_fn=<MulBackward0>))

In [None]:
approximation_error_F

tensor([0.], dtype=torch.float64)
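The attributions and the zero approximation error above can be sanity-checked without Captum. The following is a minimal pure-Python sketch, not Captum's implementation: it approximates the path integral with a midpoint Riemann sum instead of the Gauss-Legendre rule, and it assumes the derivative of ReLU at zero is 0. The helper names (`relu`, `step`, `grad_f`, `integrated_gradients`) are invented for this illustration.

```python
def relu(z):
    return max(0.0, z)


def step(z):
    # Derivative of ReLU; the value at the kink (z == 0) is assumed to be 0.
    return 1.0 if z > 0 else 0.0


def f(x1, x2):
    # Same function as ToyModel_F.forward, written with plain floats.
    return relu(relu(x1) - 1.0 - relu(x2))


def grad_f(x1, x2):
    # Analytic gradient of f with respect to (x1, x2) via the chain rule.
    outer = step(relu(x1) - 1.0 - relu(x2))
    return outer * step(x1), -outer * step(x2)


def integrated_gradients(x, baseline, n_steps=1000):
    # Midpoint Riemann sum over the straight path from baseline to x.
    totals = [0.0 for _ in x]
    for k in range(n_steps):
        alpha = (k + 0.5) / n_steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grads = grad_f(*point)
        totals = [t + g for t, g in zip(totals, grads)]
    return [(xi - b) * t / n_steps for xi, b, t in zip(x, baseline, totals)]


x, baseline = (3.0, 1.0), (0.0, 0.0)
attributions = integrated_gradients(x, baseline)

# Completeness: the attributions should sum to f(x) - f(baseline).
delta = sum(attributions) - (f(*x) - f(*baseline))
print("attributions:", attributions)
print("completeness gap:", delta)
```

Along the path the model reduces to `ReLU(2 * alpha - 1)`, so the gradients are nonzero only for `alpha > 0.5`. This is why the attributions come out as half of `3 * 1` and half of `1 * (-1)`.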

In [None]:
class ToyModel_G(nn.Module):
    """
    Second example toy model from the original paper (page 10)
    g(x1, x2) = ReLU(ReLU(x1 - 1) - ReLU(x2))
    """

    def __init__(self):
        super().__init__()

    def forward(self, x1, x2):
        relu_x1 = F.relu(x1 - 1)
        relu_x2 = F.relu(x2)

        return F.relu(relu_x1 - relu_x2)

In [None]:
model = ToyModel_G()

# defining model input tensors
input1 = torch.tensor([3.0], requires_grad=True)
input2 = torch.tensor([1.0], requires_grad=True)

# defining baselines for each input tensor
baseline1 = torch.tensor([0.0])
baseline2 = torch.tensor([0.0])

# defining integrated gradients on ToyModel_G and applying it to the inputs
ig = IntegratedGradients(model)
attributions_G, approximation_error_G = ig.attribute(
    (input1, input2),
    baselines=(baseline1, baseline2),
    method='gausslegendre',
    return_convergence_delta=True,
    )

In [None]:
attributions_G

(tensor([1.5000], dtype=torch.float64, grad_fn=<MulBackward0>),
 tensor([-0.5000], dtype=torch.float64, grad_fn=<MulBackward0>))

In [None]:
approximation_error_G

tensor([0.], dtype=torch.float64)
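Both toy models receive the same attributions at this input. One way to probe whether they also compute the same function is to compare their outputs on a grid. The sketch below re-implements the two `forward` methods with plain floats; the function names `f` and `g` and the grid range are choices made for this illustration, and agreement on a grid is only evidence of equivalence, not a proof.

```python
def relu(z):
    return max(0.0, z)


def f(x1, x2):
    # ToyModel_F.forward: ReLU(ReLU(x1) - 1 - ReLU(x2))
    return relu(relu(x1) - 1.0 - relu(x2))


def g(x1, x2):
    # ToyModel_G.forward: ReLU(ReLU(x1 - 1) - ReLU(x2))
    return relu(relu(x1 - 1.0) - relu(x2))


# Evaluate both functions on a coarse grid covering [-3, 3] x [-3, 3].
grid = [i * 0.25 for i in range(-12, 13)]
max_gap = max(abs(f(a, b) - g(a, b)) for a in grid for b in grid)
print("largest |f - g| on the grid:", max_gap)
```

If the two networks agree everywhere, identical integrated-gradients attributions are what one would hope for, since the method depends on the gradients of the function along the path rather than on how the network is wired internally.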

In [None]:
!pip install -q alibi


In [None]:
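from alibi.explainers import IntegratedGradients

# NOTE: alibi's IntegratedGradients expects a TensorFlow/Keras model, e.g.
# model = tf.keras.models.load_model("path_to_your_model")
# Here `model` is still the PyTorch ToyModel_G defined above, which is the
# likely cause of the AttributeError shown below.

ig  = IntegratedGradients(model,
                        #   layer=None,
                        #   target_fn=None,
                          method="gausslegendre",
                        #   n_steps=50,
                        #   internal_batch_size=100,
                          )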

AttributeError: ignored

output

...................

attributions: (tensor([1.5000], grad_fn=<MulBackward0>),
               tensor([-0.5000], grad_fn=<MulBackward0>))

approximation_error (aka delta): 1.1801719665527344e-05