
Feature request: Assign rule by layer index #76

Closed
rodrigobdz opened this issue Dec 10, 2021 · 2 comments

@rodrigobdz (Contributor) commented Dec 10, 2021

I'd like to assign LRP rules by layer index, as shown in the screenshots below.
Please correct me if this is already possible; I've taken a look at the code and the paper, but it seems rules can currently only be assigned by layer type.
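
For reference, rule assignment by type in zennit currently looks roughly like this (a minimal sketch; the concrete rule choices are just placeholders):

from torch.nn import Conv2d, Linear
from zennit.composites import LayerMapComposite
from zennit.rules import Gamma, Epsilon

# rules are matched purely by module type; there is no way to say "layers 17 to 30"
composite = LayerMapComposite(layer_map=[
    (Conv2d, Gamma(0.25)),
    (Linear, Epsilon(1e-9)),
])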

[screenshot: rule assignment by layer index in the tutorial code]

Source: gmontavon/lrp-tutorial

[screenshot: rule assignment by layer index in the paper]

Source: Layer-Wise Relevance Propagation: An Overview

Thanks for the great framework! I especially like its architecture.

@chr5tphr (Owner) commented Dec 10, 2021

A note about the code from the tutorial:
The last line, if l >= 31:, will actually never match for vgg16, since there are no AvgPool2d or Conv2d layers beyond index 30.
Also, the LRP-0 above is closer to the conventional LRP-Epsilon rule, while the LRP-Epsilon above uses a stabilizer that depends on the layer output and is therefore not really the conventional LRP-Epsilon rule.
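
For reference, the conventional rules from the overview chapter, with z_k = sum_j a_j w_jk (my paraphrase of the definitions):

LRP-0:        R_j = sum_k (a_j w_jk / z_k) R_k
LRP-Epsilon:  R_j = sum_k (a_j w_jk / (z_k + epsilon * sign(z_k))) R_k

The tutorial's rule instead divides by z_k + delta * sqrt(mean(z^2)), so its stabilizer scales with the magnitude of the layer output rather than being a fixed constant.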

While it would be straightforward to write a composite which takes the layer index into account, I would suggest going with the NameMapComposite here instead, since the rules are very model-dependent and a name map is much more transparent than any index-based approach.
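
For completeness, a minimal sketch of such an index-based composite, using the generic Composite with a module_map callback (the leaf-counting scheme and thresholds here are only for illustration):

from torch.nn import Conv2d, AvgPool2d
from zennit.core import Composite
from zennit.rules import Gamma, Epsilon

def module_map(ctx, name, module):
    # count leaf modules in definition order, mirroring collect_leaves;
    # ctx is a dict shared across all calls while registering the composite
    if next(module.children(), None) is not None:
        return None  # skip containers such as Sequential
    index = ctx.setdefault('leaf_index', 0)
    ctx['leaf_index'] = index + 1
    if not isinstance(module, (Conv2d, AvgPool2d)):
        return None
    return Gamma(0.25) if index <= 16 else Epsilon(1e-9)

# used like any other composite, e.g. with composite.context(model)
composite = Composite(module_map=module_map)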

If you want to automatically create a name map like the composite from the tutorial/paper, here's some code on how to do it:

Code
import torch
from torch.nn import Conv2d, AvgPool2d
from torchvision.models import vgg16

from zennit.composites import NameMapComposite
from zennit.core import BasicHook, collect_leaves, stabilize
from zennit.rules import Gamma, Epsilon


# the LRP-Epsilon from the tutorial
class GMontavonEpsilon(BasicHook):
    def __init__(self, epsilon=1e-6, delta=0.25):
        super().__init__(
            input_modifiers=[lambda input: input],
            param_modifiers=[lambda param, _: param],
            output_modifiers=[lambda output: output],
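            # denominator: z + delta * sqrt(mean(z ** 2)); stabilize() additionally
            # adds the small epsilon to guard against division by zero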
            gradient_mapper=(lambda out_grad, outputs: out_grad / stabilize(outputs[0] + delta * (outputs[0] ** 2).mean() ** .5, epsilon)),
            reducer=(lambda inputs, gradients: inputs[0] * gradients[0])
        )

model = vgg16()

# only these get rules, linear layers will be attributed by the gradient alone
target_types = (Conv2d, AvgPool2d)
# lookup module -> name
child_name = {module: name for name, module in model.named_modules()}
# the layers in sequential order without any containers etc.
layers = list(enumerate(collect_leaves(model)))

# list of tuples [([names..], rule)] as used by NameMapComposite
name_map = [
    ([child_name[module] for n, module in layers if n <= 16 and isinstance(module, target_types)], Gamma(0.25)),
    ([child_name[module] for n, module in layers if 17 <= n <= 30 and isinstance(module, target_types)], GMontavonEpsilon(1e-9, 0.25)),
    ([child_name[module] for n, module in layers if 30 <= n and isinstance(module, target_types)], Epsilon(1e-9)),
]
# look at the name_map and you will see that there is no layer for which the last condition holds
print(name_map)

# create the composite from the name map
composite = NameMapComposite(name_map)

with composite.context(model) as modified_model:
    # compute attribution
    data = torch.randn(1, 3, 224, 224, requires_grad=True)
    output = modified_model(data)
    output.backward(torch.eye(1000)[[0]])
    # print absolute sum of attribution
    print(data.grad.abs().sum().item())

Note that model.named_modules() alone will give you all modules, e.g. the Sequential containers, and thus will not count the layers correctly for vgg16. However, when constructing a name map manually, it will show you the names of all layers.
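
For example, to see both views side by side (illustrative snippet, reusing model, child_name and collect_leaves from above):

# all modules, including containers such as the top-level VGG and its Sequentials
for name, module in model.named_modules():
    print(repr(name), type(module).__name__)

# only the leaf layers, in the order used by the index conditions above
for n, module in enumerate(collect_leaves(model)):
    print(n, child_name[module], type(module).__name__)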

@rodrigobdz (Contributor, Author) commented Dec 13, 2021

@chr5tphr Awesome! Thank you for your time; the snippet has been of great help.


Clarification about the tutorial:

  • The case if l >= 31: is indeed evaluated, because the dense layers in the classifier are converted to convolutional layers in this line (see the sketch of toconv below):

    layers = list(model.features) + utils.toconv(list(model.classifier))
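
For context, utils.toconv casts each Linear layer of the classifier to an equivalent Conv2d by reshaping its weights, so the converted layers keep counting up the layer index. A rough sketch (a hypothetical reimplementation, not the tutorial's exact code):

import torch

def toconv(layers):
    # replace Linear layers with equivalent Conv2d layers
    newlayers = []
    for i, layer in enumerate(layers):
        if isinstance(layer, torch.nn.Linear):
            n_out, n_in = layer.weight.shape
            if i == 0:
                # the first classifier layer consumes the 512x7x7 feature map
                conv = torch.nn.Conv2d(512, n_out, 7)
                conv.weight = torch.nn.Parameter(layer.weight.reshape(n_out, 512, 7, 7))
            else:
                conv = torch.nn.Conv2d(n_in, n_out, 1)
                conv.weight = torch.nn.Parameter(layer.weight.reshape(n_out, n_in, 1, 1))
            conv.bias = torch.nn.Parameter(layer.bias)
            newlayers.append(conv)
        else:
            newlayers.append(layer)
    return newlayers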

With your help, I've been able to reproduce the results from the LRP tutorial; here are the changes I made:

  1. Set data to the castle image and plotted the heatmap, but it lacked the attributions from the classifier layers and the ZBox rule for the pixel layer.

    [image: castle.jpg input]

    Code diff

     import torch
    from torch.nn import Conv2d, AvgPool2d
    from torchvision.models import vgg16
    
    from zennit.composites import NameMapComposite
    from zennit.core import BasicHook, collect_leaves, stabilize
    from zennit.rules import Gamma, Epsilon
    
    +import cv2
    +import numpy
    +import utils
    
    # the LRP-Epsilon from the tutorial
    class GMontavonEpsilon(BasicHook):
        def __init__(self, epsilon=1e-6, delta=0.25):
            super().__init__(
                input_modifiers=[lambda input: input],
                param_modifiers=[lambda param, _: param],
                output_modifiers=[lambda output: output],
                gradient_mapper=(lambda out_grad, outputs: out_grad / stabilize(outputs[0] + delta * (outputs[0] ** 2).mean() ** .5, epsilon)),
                reducer=(lambda inputs, gradients: inputs[0] * gradients[0])
            )
    
    model = vgg16()
    
    +class BatchNormalize:
    +    def __init__(self, mean, std, device=None):
    +        self.mean = torch.tensor(mean, device=device)[None, :, None, None]
    +        self.std = torch.tensor(std, device=device)[None, :, None, None]
    +
    +    def __call__(self, tensor):
    +        return (tensor - self.mean) / self.std
    +
    +
    +# mean and std of ILSVRC2012 as computed for the torchvision models
    +norm_fn = BatchNormalize((0.485, 0.456, 0.406),
    +                         (0.229, 0.224, 0.225), device='cpu')
    +
    # only these get rules, linear layers will be attributed by the gradient alone
    target_types = (Conv2d, AvgPool2d)
    # lookup module -> name
    child_name = {module: name for name, module in model.named_modules()}
    # the layers in sequential order without any containers etc.
    layers = list(enumerate(collect_leaves(model)))
    
    # list of tuples [([names..], rule)] as used by NameMapComposite
    name_map = [
        ([child_name[module] for n, module in layers if n <= 16 and isinstance(module, target_types)], Gamma(0.25)),
        ([child_name[module] for n, module in layers if 17 <= n <= 30 and isinstance(module, target_types)], GMontavonEpsilon(1e-9, 0.25)),
        ([child_name[module] for n, module in layers if 30 <= n and isinstance(module, target_types)], Epsilon(1e-9)),
    ]
    # look at the name_map and you will see that there is no layer for which the last condition holds
    print(name_map)
    
    # create the composite from the name map
    composite = NameMapComposite(name_map)
    
    +R = None
    with composite.context(model) as modified_model:
        # compute attribution
    -    data = torch.randn(1, 3, 224, 224, requires_grad=True)
    +    # Returns a numpy array in BGR color space, not RGB
    +    img = cv2.imread('castle.jpg')
    +
    +    # Convert from BGR to RGB color space
    +    img = img[..., ::-1]
    +
    +    # img.shape is (224, 224, 3), where 3 corresponds to RGB channels
    +    # Divide by 255 (max. RGB value) to normalize pixel values to [0,1]
    +    img = img/255.0
    +
    +    data = norm_fn(
    +        torch.FloatTensor(
    +            img[numpy.newaxis].transpose([0, 3, 1, 2])*1
    +        )
    +    )
    +    data.requires_grad = True
    +
        output = modified_model(data)
    -    output.backward(torch.eye(1000)[[0]])
    +    output[0].max().backward()
    +
        # print absolute sum of attribution
        print(data.grad.abs().sum().item())
    +
    +    R = data.grad
    +
    +    utils.heatmap(R[0].sum(dim=0).detach().numpy(), 4,4)

    [image: heatmap missing the classifier attributions and the ZBox rule]

  2. Finally, added Linear to target_types, fixed the LRP-0 rule, and assigned the ZBox rule to the pixel layer.

    Code diff

     import torch
    -from torch.nn import Conv2d, AvgPool2d
    +from torch.nn import Conv2d, AvgPool2d, Linear
    from torchvision.models import vgg16
    
    from zennit.composites import NameMapComposite
    from zennit.core import BasicHook, collect_leaves, stabilize
    -from zennit.rules import Gamma, Epsilon
    +from zennit.rules import Gamma, Epsilon, ZBox
    
    import cv2
    import numpy
    import utils
    
    # the LRP-Epsilon from the tutorial
    class GMontavonEpsilon(BasicHook):
    -    def __init__(self, epsilon=1e-6, delta=0.25):
    +    def __init__(self, stabilize_epsilon=1e-6, epsilon=0.25):
            super().__init__(
                input_modifiers=[lambda input: input],
                param_modifiers=[lambda param, _: param],
                output_modifiers=[lambda output: output],
    -            gradient_mapper=(lambda out_grad, outputs: out_grad / stabilize(outputs[0] + delta * (outputs[0] ** 2).mean() ** .5, epsilon)),
    +            gradient_mapper=(lambda out_grad, outputs: out_grad / stabilize(
    +                outputs[0] + epsilon * (outputs[0] ** 2).mean() ** .5, stabilize_epsilon)),
                reducer=(lambda inputs, gradients: inputs[0] * gradients[0])
            )
    
    -model = vgg16()
    +
    +# use the gpu if requested and available, else use the cpu
    +device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    +
    
    class BatchNormalize:
        def __init__(self, mean, std, device=None):
            self.mean = torch.tensor(mean, device=device)[None, :, None, None]
            self.std = torch.tensor(std, device=device)[None, :, None, None]
    
        def __call__(self, tensor):
            return (tensor - self.mean) / self.std
    
    
    # mean and std of ILSVRC2012 as computed for the torchvision models
    norm_fn = BatchNormalize((0.485, 0.456, 0.406),
    -                         (0.229, 0.224, 0.225), device='cpu')
    +                         (0.229, 0.224, 0.225), device=device)
    +batch_size = 1
    +# the maximal input shape, needed for the ZBox rule
    +shape = (batch_size, 3, 224, 224)
    +
    +# the highest and lowest pixel values for the ZBox rule
    +low = norm_fn(torch.zeros(*shape, device=device))
    +high = norm_fn(torch.ones(*shape, device=device))
    +
    +
    +model = vgg16(pretrained=True)
    +model.eval()
    
    -# only these get rules, linear layers will be attributed by the gradient alone
    -target_types = (Conv2d, AvgPool2d)
    +# layers of these types get rules; Linear is now included so the classifier is attributed as well
    +target_types = (Conv2d, AvgPool2d, Linear)
    # lookup module -> name
    child_name = {module: name for name, module in model.named_modules()}
    # the layers in sequential order without any containers etc.
    layers = list(enumerate(collect_leaves(model)))
    
    # list of tuples [([names..], rule)] as used by NameMapComposite
    name_map = [
    -    ([child_name[module] for n, module in layers if n <= 16 and isinstance(module, target_types)], Gamma(0.25)),
    -    ([child_name[module] for n, module in layers if 17 <= n <= 30 and isinstance(module, target_types)], GMontavonEpsilon(1e-9, 0.25)),
    -    ([child_name[module] for n, module in layers if 30 <= n and isinstance(module, target_types)], Epsilon(1e-9)),
    +    ([child_name[module] for n, module in layers if n == 0 and isinstance(module, target_types)], ZBox(low=low, high=high)),
    +    ([child_name[module] for n, module in layers if 1 <= n <= 16 and isinstance(module, target_types)], Gamma(0.25)),
    +    ([child_name[module] for n, module in layers if 17 <= n <= 30 and isinstance(module, target_types)], GMontavonEpsilon(stabilize_epsilon=0, epsilon=0.25)),
    +    ([child_name[module] for n, module in layers if 31 <= n and isinstance(module, target_types)], Epsilon(0)),
    ]
    +
    -# look at the name_map and you will see that there is no layer for which the last condition holds
    +# with Linear in target_types, the classifier layers now match the last condition
    print(name_map)
    
    # create the composite from the name map
    composite = NameMapComposite(name_map)
    
    R = None
    with composite.context(model) as modified_model:
        # compute attribution
        # Returns a numpy array in BGR color space, not RGB
        img = cv2.imread('castle.jpg')
    
        # Convert from BGR to RGB color space
        img = img[..., ::-1]
    
        # img.shape is (224, 224, 3), where 3 corresponds to RGB channels
        # Divide by 255 (max. RGB value) to normalize pixel values to [0,1]
        img = img/255.0
    
        data = norm_fn(
            torch.FloatTensor(
                img[numpy.newaxis].transpose([0, 3, 1, 2])*1
            )
        )
        data.requires_grad = True
    
        output = modified_model(data)
        output[0].max().backward()
    
        # print absolute sum of attribution
        print(data.grad.abs().sum().item())
    
        R = data.grad
    
        utils.heatmap(R[0].sum(dim=0).detach().numpy(), 4,4)

    [image: reproduced heatmap]

    Python code

    import torch
    from torch.nn import Conv2d, AvgPool2d, Linear
    from torchvision.models import vgg16
    
    from zennit.composites import NameMapComposite
    from zennit.core import BasicHook, collect_leaves, stabilize
    from zennit.rules import Gamma, Epsilon, ZBox
    
    import cv2
    import numpy
    import utils
    
    # the LRP-Epsilon from the tutorial
    class GMontavonEpsilon(BasicHook):
        def __init__(self, stabilize_epsilon=1e-6, epsilon=0.25):
            super().__init__(
                input_modifiers=[lambda input: input],
                param_modifiers=[lambda param, _: param],
                output_modifiers=[lambda output: output],
                gradient_mapper=(lambda out_grad, outputs: out_grad / stabilize(
                    outputs[0] + epsilon * (outputs[0] ** 2).mean() ** .5, stabilize_epsilon)),
                reducer=(lambda inputs, gradients: inputs[0] * gradients[0])
            )
    
    # use the gpu if requested and available, else use the cpu
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    
    # Source: https://github.com/chr5tphr/zennit/blob/6251a9e17aa31c3381799de92f92b1d259b392b2/share/example/feed_forward.py#L32-L38
    class BatchNormalize:
        def __init__(self, mean, std, device=None):
            self.mean = torch.tensor(mean, device=device)[None, :, None, None]
            self.std = torch.tensor(std, device=device)[None, :, None, None]
    
        def __call__(self, tensor):
            return (tensor - self.mean) / self.std
    
    
    # mean and std of ILSVRC2012 as computed for the torchvision models
    norm_fn = BatchNormalize((0.485, 0.456, 0.406),
                            (0.229, 0.224, 0.225), device=device)
    batch_size = 1
    # the maximal input shape, needed for the ZBox rule
    shape = (batch_size, 3, 224, 224)
    
    # the highest and lowest pixel values for the ZBox rule
    low = norm_fn(torch.zeros(*shape, device=device))
    high = norm_fn(torch.ones(*shape, device=device))
    
    
    model = vgg16(pretrained=True)
    model.eval()
    
    # layers of these types get rules; Linear is now included so the classifier is attributed as well
    target_types = (Conv2d, AvgPool2d, Linear)
    # lookup module -> name
    child_name = {module: name for name, module in model.named_modules()}
    # the layers in sequential order without any containers etc.
    layers = list(enumerate(collect_leaves(model)))
    
    # list of tuples [([names..], rule)] as used by NameMapComposite
    name_map = [
        ([child_name[module] for n, module in layers if n == 0 and isinstance(module, target_types)], ZBox(low=low, high=high)),
        ([child_name[module] for n, module in layers if 1 <= n <= 16 and isinstance(module, target_types)], Gamma(0.25)),
        ([child_name[module] for n, module in layers if 17 <= n <= 30 and isinstance(module, target_types)], GMontavonEpsilon(stabilize_epsilon=0, epsilon=0.25)),
        ([child_name[module] for n, module in layers if 31 <= n and isinstance(module, target_types)], Epsilon(0)),
    ]
    
    # with Linear in target_types, the classifier layers now match the last condition
    print(name_map)
    
    # create the composite from the name map
    composite = NameMapComposite(name_map)
    
    R = None
    with composite.context(model) as modified_model:
        # compute attribution
        # Returns a numpy array in BGR color space, not RGB
        img = cv2.imread('castle.jpg')
    
        # Convert from BGR to RGB color space
        img = img[..., ::-1]
    
        # img.shape is (224, 224, 3), where 3 corresponds to RGB channels
        # Divide by 255 (max. RGB value) to normalize pixel values to [0,1]
        img = img/255.0
    
        data = norm_fn(
            torch.FloatTensor(
                img[numpy.newaxis].transpose([0, 3, 1, 2])*1
            )
        )
        data.requires_grad = True
    
        output = modified_model(data)
        output[0].max().backward()
    
        # print absolute sum of attribution
        print(data.grad.abs().sum().item())
    
        R = data.grad
    
        utils.heatmap(R[0].sum(dim=0).detach().numpy(), 4,4)
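
    Since utils here is the helper module from the LRP tutorial, a minimal stand-in for utils.heatmap could look like this (a sketch with an assumed robust color scale, not the tutorial's exact helper):

    import matplotlib.pyplot as plt
    import numpy

    def heatmap(R, sx, sy):
        # symmetric color range around zero so positive and negative relevance are comparable
        b = 10.0 * ((numpy.abs(R) ** 3.0).mean() ** (1.0 / 3.0))
        plt.figure(figsize=(sx, sy))
        plt.axis('off')
        plt.imshow(R, cmap='seismic', vmin=-b, vmax=b)
        plt.show()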

rodrigobdz added a commit to rodrigobdz/lrp that referenced this issue Dec 13, 2021
Reproduce gmontavon/lrp-tutorial with zennit framework.
Related issue chr5tphr/zennit#76.