docs: update , change link to relative, add black and white assets, add doc on operator and model's expectations
1 parent 82231f4, commit 1cc4ffd
Showing 23 changed files with 797 additions and 377 deletions.
@@ -0,0 +1,127 @@
# Model expectations: what should the provided model look like?

Even though we tried to cover a wide range of models, the XAI methods rely on a few assumptions about the model and its inputs, which are described here.

## The inputs have expected shape

As a reminder, any attribution method is instantiated with at least three parameters:

- `model`: the model from which we want to obtain attributions (e.g. InceptionV3, ResNet, ...)
- `batch_size`: an integer that sets how many samples are processed at once: either a batch of inputs (gradient-based methods) or a batch of perturbed samples of a single input (in which case inputs are processed one by one)
- `operator`: the function $g$ to explain, see the [Operator documentation](../operator/) for more details

An explainer is then called with the `explain` method, which takes the following parameters:

- `inputs`: One of the following: a `tf.data.Dataset` (in which case you should not provide `targets`), a `tf.Tensor` or a `np.ndarray`

- `targets`: One of the following: a `tf.Tensor` or a `np.ndarray`

!!!tip
    In general, if you are doing classification tasks, it is better not to include the final softmax layer in your model but to work with logits instead!

### General

In practice, we expect the `model` to be callable on the `inputs` parameter -- *i.e.* we can do `model(inputs)`. We expect this call to produce the `outputs` variable, that is, the predictions of the model on those inputs. As most attribution methods need to manipulate the `outputs` and/or link them to the `inputs`, we assume that the `inputs` follow the conventional shapes described in the sections below.

### Images data

If inputs are images, the expected shape of `inputs` is $(N, H, W, C)$, following TensorFlow's conventions, where:

- $N$ is the number of inputs
- $H$ is the height of the images
- $W$ is the width of the images
- $C$ is the number of channels (works for $C=3$ or $C=1$; other values might not work or might need further customization)

If `inputs` is a `tf.data.Dataset` of images, we expect each sample of the dataset to be a tuple `(image, target)`, with `image` having shape $(H, W, C)$ and `target` being a one-hot encoding of the output you want an explanation of.

!!!warning
    If your model does not follow the same conventions, it might lead to poor results or raise errors.

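As a rough illustration of the shape convention above, the following NumPy sketch builds a batch of images and the matching one-hot targets. The sizes and the helper name `to_one_hot` are made up for the example and are not part of Xplique.

```python
import numpy as np


def to_one_hot(labels, num_classes):
    """Turn integer labels of shape (N,) into one-hot targets of shape (N, num_classes)."""
    return np.eye(num_classes, dtype=np.float32)[labels]


# hypothetical sizes: 4 RGB images of 32x32 pixels, 10 classes
N, H, W, C, NUM_CLASSES = 4, 32, 32, 3, 10

images = np.random.rand(N, H, W, C).astype(np.float32)  # (N, H, W, C), channels last
labels = np.array([1, 0, 9, 3])                          # one integer label per image
targets = to_one_hot(labels, NUM_CLASSES)                # (N, NUM_CLASSES)

print(images.shape, targets.shape)
```

Arrays built this way can be passed as `inputs` and `targets`, since both parameters accept a `np.ndarray`.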
### Tabular data

If inputs are tabular data, the expected shape of `inputs` is $(N, W)$ where:

- $N$ is the number of inputs
- $W$ is the feature dimension of a single input

If `inputs` is a `tf.data.Dataset` of tabular data, we expect each sample of the dataset to be a tuple `(features, target)`, with `features` having shape $(W,)$ and `target` being a one-hot encoding of the output you want an explanation of.

!!!info
    Not all attribution methods work well with tabular data.

!!!tip
    Please refer to the [table](../../../#whats-included) to see which methods might work with tabular data.

### Time-Series data

If inputs are time series, the expected shape of `inputs` is $(N, T, W)$ where:

- $N$ is the number of inputs
- $T$ is the temporal dimension of a single input
- $W$ is the feature dimension of a single input

!!!note
    By default, `Lime` and `KernelShap` will treat such inputs as grey images. You will need to define a custom `map_to_interpret_space` when building such explainers.

## What if my inputs and/or my model do not follow those assumptions?

!!!warning
    In any case, when you are outside the scope of the original API, you should take a close look at the source code to make sure that your use case makes sense.

### My inputs follow a different shape convention

If you want to handle images or time series data that do not follow the previous conventions, it is recommended to rearrange the data into the expected shape so that the explainers (attribution methods) handle them correctly. Then, you can simply define a wrapper around your model so that the data is converted back to your model's convention when it is called.

For example, if you have a `model` that classifies images but expects them to be channel-first (*i.e.* with $(N, C, H, W)$ shape), then you should:

- Move the axes so that inputs are $(N, H, W, C)$ for the explainers
- Write the following wrapper for your model:

```python
import tensorflow as tf
from xplique.attributions import Saliency


class ModelWrapper(tf.keras.models.Model):
    def __init__(self, nchw_model):
        super().__init__()
        # keep a reference to the original channel-first model
        self.nchw_model = nchw_model

    def __call__(self, nhwc_inputs):
        # transform the NHWC inputs (wanted for the explainers) back to NCHW inputs
        nchw_inputs = self._transform_inputs(nhwc_inputs)
        # make predictions
        outputs = self.nchw_model(nchw_inputs)

        return outputs

    def _transform_inputs(self, nhwc_inputs):
        # include in this function all the transformations
        # needed for your model to work with NHWC inputs;
        # here, for example, we move the axes from channels last
        # to channels first (a TensorFlow op keeps the inputs
        # differentiable for gradient-based explainers)
        nchw_inputs = tf.transpose(nhwc_inputs, perm=[0, 3, 1, 2])

        return nchw_inputs


# `model` is your channel-first (NCHW) classifier
wrapped_model = ModelWrapper(model)
explainer = Saliency(wrapped_model)
# images should be (N, H, W, C) for the explain call
explanations = explainer.explain(images, labels)
```

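The first step of the list above, moving the axes of the data itself, can be sketched with NumPy as follows. The array name `nchw_images` and its sizes are hypothetical and only serve the illustration.

```python
import numpy as np

# hypothetical channel-first batch: 2 images, 3 channels, 4x5 pixels
nchw_images = np.arange(2 * 3 * 4 * 5, dtype=np.float32).reshape(2, 3, 4, 5)

# (N, C, H, W) -> (N, H, W, C): the layout expected by the explainers
nhwc_images = np.transpose(nchw_images, (0, 2, 3, 1))

# (N, H, W, C) -> (N, C, H, W): the inverse move, done before calling the model
restored = np.transpose(nhwc_images, (0, 3, 1, 2))

print(nhwc_images.shape)
```

Note that this is a transposition and not a call to `reshape`: the pixel values have to move along with their axes.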
### My inputs are a dictionary (e.g. Attention Model)

**Work In Progress**

### My model is neither for classification nor regression tasks

If you have an object detector, you should have a look at the [Object Detector documentation](../object_detector/).

If you want to do semantic/panoptic/binary segmentation or any other task, you should have a look at the [documentation for the operator parameter](../operator/).

!!!warning
    Using attribution methods on tasks other than their original ones might yield poor results. It is mainly an experimental feature and the relevance of the outcomes is not guaranteed at all.

### I have a PyTorch model

Then you should definitely have a look at the [dedicated documentation](../../../pytorch/)!

### I have a model that is neither a tf.keras.Model nor a torch.nn.Module

Then you should take a look at the [Callable documentation](../../../callable/)!
@@ -0,0 +1 @@
### Work In Progress
@@ -1 +1,191 @@
# Make an enumeration
# Operator

`operator` is one of the main parameters for both attribution methods and metrics. It defines the function $g$ that we want to explain. *E.g.*: for a classifier model, the function that we might want to explain is the one that, given a target, gives us the score of the model for that specific target -- *i.e.* $model(input)[target]$.

!!!note
    The `operator` parameter is a feature available for versions > $1.$. The default `operator` values reproduce the behavior used before the introduction of this new feature!

## Leitmotiv

The `operator` parameter was introduced to offer users a flexible way to adapt current attribution methods or metrics. It should help them to empirically tackle new use cases/new tasks. Broadly speaking, it should amplify the user's ability to experiment. However, this also implies that it is the user's responsibility to make sure that their derivations are within the scope of the original method and make sense.

## Operators' Signature

An `operator` is a function $g$ that we want to explain. This function takes $3$ parameters as input:

- `model`: the model under investigation
- `inputs`: One of the following: a `tf.data.Dataset` (in which case you should not provide `targets`), a `tf.Tensor` or a `np.ndarray`
- `targets`: One of the following: a `tf.Tensor` or a `np.ndarray`

!!!info
    More specifications concerning `model` or `inputs` can be found in the [model's documentation](../model/)

This function $g$ should return a **vector of scalar values** of shape $(N,)$, where $N$ is the number of inputs in `inputs` -- *i.e.* one scalar score per input.

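To make the signature concrete, here is a minimal NumPy sketch of a function that respects it. `toy_model` and `toy_operator` are hypothetical names, and NumPy is used as a stand-in for TensorFlow so that the example stays self-contained.

```python
import numpy as np


def toy_model(inputs):
    """Hypothetical stand-in for a model: maps (N, 3) inputs to (N, 2) outputs."""
    weights = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    return inputs @ weights


def toy_operator(model, inputs, targets):
    """Score of the model for the targeted output, one scalar per input."""
    return np.sum(model(inputs) * targets, axis=-1)


inputs = np.array([[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]])
targets = np.array([[1.0, 0.0], [0.0, 1.0]])  # one-hot targets

scores = toy_operator(toy_model, inputs, targets)
print(scores.shape)
```

Whatever happens inside the function, the important point is the output: one score per input, hence the shape $(N,)$.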
## Providing custom operator

If you provide a custom operator, you should be aware that:

- An assertion will be made to ensure that it respects the signature described in the previous section
- Your operator will go through the `get_gradient_of_operator` method if you use any white-box explainer

```python
import tensorflow as tf


def get_gradient_of_operator(operator):
    """
    Get the gradient of an operator.

    Parameters
    ----------
    operator
        Operator to compute the gradient of.

    Returns
    -------
    gradient
        Gradient of the operator.
    """
    @tf.function
    def gradient(model, inputs, targets):
        with tf.GradientTape() as tape:
            # inputs are not variables, so they must be watched explicitly
            tape.watch(inputs)
            scores = operator(model, inputs, targets)

        return tape.gradient(scores, inputs)

    return gradient
```

!!!tip
    Writing your operator with only TensorFlow functions should increase the chances that this method does not raise any errors.

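To give an intuition of what `get_gradient_of_operator` returns, here is a NumPy sketch that approximates the same quantity with central finite differences instead of `tf.GradientTape`. The linear `toy_model`, `toy_operator` and `numerical_gradient` are hypothetical names used only for this illustration. As with `tape.gradient` applied to a vector of scores, the scores are summed before differentiating, which is harmless here because each score depends only on its own input.

```python
import numpy as np

WEIGHTS = np.array([[1.0, -1.0], [2.0, 0.5], [0.0, 3.0]])  # (features, classes)


def toy_model(inputs):
    return inputs @ WEIGHTS


def toy_operator(model, inputs, targets):
    return np.sum(model(inputs) * targets, axis=-1)


def numerical_gradient(operator, model, inputs, targets, eps=1e-4):
    """Approximate d(sum of scores)/d(inputs) with central finite differences."""
    gradients = np.zeros_like(inputs)
    for index in np.ndindex(*inputs.shape):
        shifted_up, shifted_down = inputs.copy(), inputs.copy()
        shifted_up[index] += eps
        shifted_down[index] -= eps
        score_up = operator(model, shifted_up, targets).sum()
        score_down = operator(model, shifted_down, targets).sum()
        gradients[index] = (score_up - score_down) / (2 * eps)
    return gradients


inputs = np.array([[0.5, 1.0, -2.0], [3.0, 0.0, 1.0]])
targets = np.array([[1.0, 0.0], [0.0, 1.0]])

gradients = numerical_gradient(toy_operator, toy_model, inputs, targets)
# for this linear model the exact gradient is targets @ WEIGHTS.T
print(np.allclose(gradients, targets @ WEIGHTS.T, atol=1e-6))
```

The result has the same shape as `inputs`: one sensitivity value per input feature, which is what white-box explainers build upon.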
## How is the operator used in Xplique?

### Black-box attribution methods

Attribution approaches that do not require gradient computation mostly need to query the model. Thus, those methods need an inference function. If you provide an `operator`, it will be used as the inference function.

More concretely, this kind of approach compares the value of some function for an original input and for perturbed versions of it:

```python
import tensorflow as tf

# `operator`, `model`, `original_inputs`, `original_targets` and
# `perturbation_function` are assumed to be defined beforehand
original_scores = operator(model, original_inputs, original_targets)

# depending on the attribution method, this `perturbation_function` is different
perturbed_inputs, perturbed_targets = perturbation_function(original_inputs, original_targets)
perturbed_scores = operator(model, perturbed_inputs, perturbed_targets)

# example of a comparison of interest
diff_scores = tf.sqrt((original_scores - perturbed_scores) ** 2)
```

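To make the pseudo-code above concrete, here is a self-contained NumPy sketch of an occlusion-like comparison. Everything in it (`toy_model`, `toy_operator`, `occlusion_scores`, and the choice of zeroing one feature at a time) is an assumption made for illustration, not the way a specific Xplique method is implemented.

```python
import numpy as np

WEIGHTS = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # (features, classes)


def toy_model(inputs):
    return inputs @ WEIGHTS


def toy_operator(model, inputs, targets):
    return np.sum(model(inputs) * targets, axis=-1)


def occlusion_scores(operator, model, inputs, targets):
    """Score drop observed when each feature is set to zero, one feature at a time."""
    original_scores = operator(model, inputs, targets)  # shape (N,)
    attributions = np.zeros_like(inputs)
    for feature in range(inputs.shape[1]):
        perturbed_inputs = inputs.copy()
        perturbed_inputs[:, feature] = 0.0
        perturbed_scores = operator(model, perturbed_inputs, targets)
        attributions[:, feature] = original_scores - perturbed_scores
    return attributions


inputs = np.array([[1.0, 2.0, 3.0]])
targets = np.array([[1.0, 0.0]])  # explain the first output

attributions = occlusion_scores(toy_operator, toy_model, inputs, targets)
print(attributions)
```

Only the operator is queried: swapping `toy_operator` for another function changes what is explained without touching the perturbation loop.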
### White-box attribution methods

Those methods usually require some gradient computation. The gradients used are those of the operator function (see the `get_gradient_of_operator` method in the previous section).

## Default Behavior

### Attribution methods

A lot of attribution methods were initially intended for classification tasks. Thus, the default operator, `predictions_operator`, assumes such a setting:

```python
from typing import Callable

import tensorflow as tf


@tf.function
def predictions_operator(model: Callable,
                         inputs: tf.Tensor,
                         targets: tf.Tensor) -> tf.Tensor:
    """
    Compute predictions scores, only for the label class, for a batch of samples.

    Parameters
    ----------
    model
        Model used for computing predictions.
    inputs
        Input samples to be explained.
    targets
        One-hot encoded labels, one for each sample.

    Returns
    -------
    scores
        Predictions scores computed, only for the label class.
    """
    # the one-hot targets select the score of the label class for each sample
    scores = tf.reduce_sum(model(inputs) * targets, axis=-1)
    return scores
```

That is a setting where the variable `model(inputs)` is a tensor of shape $(N, C)$, where $N$ is the number of inputs and $C$ is the number of classes.

!!!info
    Explaining the logits amounts to explaining the class, while explaining the softmax amounts to explaining why this class is more likely. Thus, it is recommended to explain the logits and to exclude the softmax layer, if any.

### Metrics

When initializing a metric, it is recommended to use the same `operator` as the one used for the attribution methods. **HOWEVER**, it should be pointed out that the default behavior **adds a softmax**, as faithfulness metrics measure a "drop in probability". Indeed, since it is better to compute attributions on models whose final softmax layer has been "dropped", it is assumed that the softmax should be added back when using metric objects.

```python
from typing import Callable

import tensorflow as tf


def classif_metrics_operator(model: Callable,
                             inputs: tf.Tensor,
                             targets: tf.Tensor) -> tf.Tensor:
    """
    Compute predictions scores, only for the label class, for a batch of samples.
    However, this time softmax or sigmoid are needed to correctly compute metrics,
    while it was removed to compute attribution values, so we add it here.

    Parameters
    ----------
    model
        Model used for computing predictions.
    inputs
        Input samples to be explained.
    targets
        One-hot encoded labels or regression target (e.g {+1, -1}), one for each sample.

    Returns
    -------
    scores
        Probability scores computed, only for the label class.
    """
    # softmax turns the logits into probabilities before selecting the label class
    scores = tf.reduce_sum(tf.nn.softmax(model(inputs)) * targets, axis=-1)
    return scores
```

!!!warning
    For classification tasks, you should remove the final softmax layer of your model if you did not do it when computing attribution scores, as a softmax will be applied after the call of the model on the inputs!

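The reason behind this warning can be checked numerically. The NumPy sketch below (the `softmax` helper and the logits are made up for the example) compares the probability obtained from logits with the one obtained when the model already ends with a softmax, so that the softmax ends up being applied twice.

```python
import numpy as np


def softmax(values):
    """Numerically stable softmax over the last axis."""
    shifted = values - values.max(axis=-1, keepdims=True)
    exponentials = np.exp(shifted)
    return exponentials / exponentials.sum(axis=-1, keepdims=True)


logits = np.array([[4.0, 1.0, 0.0]])
targets = np.array([[1.0, 0.0, 0.0]])

# model without a final softmax: the softmax is applied once
single = np.sum(softmax(logits) * targets, axis=-1)

# model with a final softmax: the softmax ends up being applied twice
double = np.sum(softmax(softmax(logits)) * targets, axis=-1)

print(single, double)
```

In this example the second softmax flattens the distribution, so a "drop in probability" would be measured on distorted values.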
### Existing operators and how to use them

At present, there are 4 (+1) operators available in the library:

- The `predictions_operator` (name='CLASSIFICATION'), which is the default operator and the one designed for classification tasks
- (The `classif_metrics_operator` (name='CLASSIFICATION'), which is the operator for classification tasks for metric objects)
- The `regression_operator` (name='REGRESSION'), which computes the mean absolute error between the model's prediction and the target. The target should be the model's prediction on the non-perturbed input. This operator can be used to compute attributions for all outputs of a regression model.
- The `binary_segmentation_operator` (name='BINARY_SEGMENTATION'), which is an operator designed for binary segmentation tasks with images. **More details are to come**
- The `segmentation_operator` (name='SEGMENTATION'), which is an operator designed for segmentation tasks with images. **More details are to come**

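As an illustration of the `regression_operator` description above, a per-sample mean absolute error can be sketched in NumPy as follows. This is a stand-in written under the assumption that model outputs have shape $(N, D)$; it is not the library's implementation, and `toy_regressor` and `toy_regression_operator` are hypothetical names.

```python
import numpy as np


def toy_regressor(inputs):
    """Hypothetical regression model: (N, 2) inputs -> (N, 2) outputs."""
    return 2.0 * inputs + 1.0


def toy_regression_operator(model, inputs, targets):
    """Mean absolute error between predictions and targets, one scalar per input."""
    return np.mean(np.abs(model(inputs) - targets), axis=-1)


inputs = np.array([[0.0, 1.0], [2.0, 3.0]])
# the targets are the predictions on the non-perturbed inputs
targets = toy_regressor(inputs)

perturbed_inputs = inputs.copy()
perturbed_inputs[:, 0] += 0.5  # perturb the first feature only

original_scores = toy_regression_operator(toy_regressor, inputs, targets)
perturbed_scores = toy_regression_operator(toy_regressor, perturbed_inputs, targets)
print(original_scores, perturbed_scores)
```

With such targets the score is zero on the original inputs and grows as a perturbation moves the predictions away from them.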
You can build attribution methods with those operators in two ways:

- Explicitly importing them

```python
from xplique.attributions import Saliency
from xplique.metrics import Deletion
from xplique.commons.operators import binary_segmentation_operator

explainer = Saliency(model, operator=binary_segmentation_operator)
explanations = explainer(inputs, targets)
```

- Using their name

```python
from xplique.attributions import Saliency
from xplique.metrics import Deletion

explainer = Saliency(model, operator='BINARY_SEGMENTATION')
explanations = explainer(inputs, targets)
```

## Examples of applications

**WIP**