Extend IntegratedGradients target parameter selection to support rank > 2 model outputs #635
Codecov Report
@@ Coverage Diff @@
## master #635 +/- ##
==========================================
+ Coverage 82.06% 82.10% +0.03%
==========================================
Files 77 77
Lines 10519 10570 +51
==========================================
+ Hits 8632 8678 +46
- Misses 1887 1892 +5
raise ValueError(f"First dimension in target must be egual to nb of samples. "
                 f"Found target 1st dimension {target.shape[0]}; nb samples: {nb_samples}")
egual -> equal
dimension -> dimension:
if len(target.shape) > 2:
    raise ValueError("Targets must be 1-d or 2-d arrays. In 2-d arrays, each column must contain "
                     "the target index of the corresponding dimension in the model's output tensor.")
Slightly confused why `target` can be 2d, can you elaborate? We would probably need to update the `explain` docstring too, as it only talks about `None`, `int`, `List`, `np.ndarray` and no dimensions.
ok, I think I used the word dimension when I should have used the word rank.
With `gather_nd`, you can actually get the correct target for output tensors with rank higher than 2.
For example, if you have a classification problem with 10 classes, your output tensor will be a rank-2 tensor (a matrix) of dimension nb_samples X 10. In this case your target must be rank 1 (1-d), of length nb_samples, with values in the range 0-9, each value representing the column position in the output tensor.
If you have a tensor of higher rank as output, for example an autoencoder with a (nb_samples, 28, 28, 3) output, i.e. a rank-3 tensor ignoring the nb_samples dimension, you need a rank-2 target (a matrix) of dimension nb_samples X 3, where the first 2 columns have range 0-27 and the last column has range 0-2.
In other words, for output ranks > 1, if you have a rank-n tensor as output (not counting the nb_samples dimension), you need a rank-2 target with dimensions nb_samples X n. Each row represents the location of an element in the output tensor.
Not sure if it's clear enough
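The indexing scheme described in this comment can be sketched without any tensor library. The snippet below is only an illustration: `select_target` is a hypothetical helper (not the PR's `_select_target`), it works on nested Python lists instead of tensors, and it assumes the target has already been validated.

```python
from typing import Any, List, Sequence


def select_target(preds: Sequence[Any], target: Sequence[Any]) -> List[float]:
    """Pick one output element per sample, mimicking a per-sample gather lookup.

    Hypothetical helper for illustration only.
    `preds` is a nested list whose first level indexes the samples.
    `target` holds either one int per sample (rank-1 target) or one
    list of ints per sample (rank-2 target).
    """
    if len(preds) != len(target):
        raise ValueError("target must contain one entry per sample")

    selected = []
    for sample_output, index in zip(preds, target):
        # Wrap a bare int so that rank-1 and rank-2 targets share one code path.
        path = [index] if isinstance(index, int) else list(index)
        element = sample_output
        for i in path:
            # Descend one output dimension per target column.
            element = element[i]
        selected.append(element)
    return selected


# Classification: output of shape (nb_samples, 3), rank-1 target.
probs = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]]
print(select_target(probs, [1, 0]))  # [0.7, 0.8]

# Higher-rank output: shape (nb_samples, 2, 2), rank-2 target of shape (nb_samples, 2).
preds = [[[0.0, 0.1], [1.0, 1.1]], [[2.0, 2.1], [3.0, 4.1]]]
print(select_target(preds, [[0, 0], [1, 1]]))  # [0.0, 4.1]
```

Each row of the rank-2 target is read as a path into one sample's output, which is why its length has to equal the output rank without the nb_samples dimension.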
I think that makes sense, perhaps worth writing a bit about it in the docstrings (not necessary to be comprehensive as we don't currently have examples on outputs of rank > 2). captum describes it quite well: https://captum.ai/api/integrated_gradients.html
tmax, tmin = target.max(axis=0), target.min(axis=0)

if tmax > 1:
    raise ValueError(f"Targets values {tmax} out of range for output shape {output_shape[-1]} ")
Please check grammar here and elsewhere.
Also perhaps need to make the message less confusing in general (here and elsewhere), e.g. consider:
ValueError: Targets value 4 out of range for output shape 2
But `output_shape` is the whole tuple, not the length of the last dimension.
    raise ValueError("Targets must be 1-d or 2-d arrays. In 2-d arrays, each column must contain "
                     "the target index of the corresponding dimension in the model's output tensor.")

if len(output_shape) == 1:
This whole branch is only valid if the (implicit) task is classification and outputs are labels rather than probabilities (hence output shape is 1-d), right? Can you add a comment here to clarify?
It's valid when you have binary classification with a single squash output (the probability of class 1, but the model returns a tensor of shape (nb_samples) instead of (nb_samples, 1)).
Hmm remind me, is it valid to pass a model that outputs labels instead of probabilities? Or does the model always have to be probabilistic (in the classification case that is)?
Not sure about that.
{'preds': np.array([[[0.0, 0.1], [1.0, 1.1]],
                    [[2.0, 2.1], [3.0, 4.1]]]),
 'target': np.array([[0, 0],
                     [0, 0]]),
 'expected': np.array([[0.0],
                       [2.0]])}]
I see that this test case tests a 3-dimensional output and a 2-dimensional target, which partially answers my question above about why 2D targets are allowed. However, I thought we only support 2-dimensional outputs anyway?
Does this mean that this PR implicitly allows up to 3D output?
yes, that's right. Any rank is allowed for the output tensor now, not only 3D. Refer to the comment above.
For regression models whose output is a scalar, target should not be provided.
For classification models `target` can be either the true classes or the classes predicted by the model.
It must be provided if the model output dimension is higher than 1.
Make the language consistent so that it always talks about the rank of the output (instead of dimension)?
no, but in this case dimension doesn't refer to the rank of the tensor, it refers to the number of classes
Ok so in that case the sentence should be changed so it's clear it applies to classification and talks about classes (as opposed to the niche case of 2-class classification with 1-class output probability?).
For regression models whose output is a scalar, target should not be provided.
For classification models `target` can be either the true classes or the classes predicted by the model.
It must be provided if the model output dimension is higher than 1.
If the model's output is a rank-n tensor with n > 2,
the target must a rank-2 numpy array or a list of lists (a matrix) with dimensions nb_samples X (n-1) .
must -> must be
tmax, tmin = target.max(axis=0), target.min(axis=0)

if tmax > 1:
    raise ValueError(f"Target value {tmax} out of range for output shape = 1 ")
Should this be either `out of range for output shape {output_shape}` or `out of range for rank-1 output` instead?
ok, I think it's actually better to put `for output shape {output_shape}` everywhere
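As an illustration of the wording agreed on here, a range check that reports the whole `output_shape` tuple might look like the sketch below. `check_target_range` is a hypothetical name, the exact message text is an assumption, and the sketch assumes `output_shape` includes the leading nb_samples dimension and that the target is given as one row of indices per sample.

```python
from typing import Sequence, Tuple


def check_target_range(target_rows: Sequence[Sequence[int]],
                       output_shape: Tuple[int, ...]) -> None:
    """Raise ValueError if a target index falls outside its output dimension.

    Hypothetical helper for illustration only; not the PR's `_check_target`.
    """
    dims = output_shape[1:]  # drop the nb_samples (batch) dimension
    for row in target_rows:
        if len(row) != len(dims):
            raise ValueError(
                f"Target row {list(row)} has {len(row)} indices, "
                f"but output shape {output_shape} needs {len(dims)}."
            )
        for axis, (idx, size) in enumerate(zip(row, dims), start=1):
            if not 0 <= idx < size:
                # Report the full tuple so the reader can see which axis is too small.
                raise ValueError(
                    f"Target value {idx} is out of range for output shape {output_shape}: "
                    f"axis {axis} only accepts indices 0 to {size - 1}."
                )


check_target_range([[0, 27, 2]], (1, 28, 28, 3))  # passes silently

try:
    check_target_range([[4]], (5, 2))
except ValueError as err:
    print(err)
```

Printing the full tuple avoids the ambiguity raised earlier in the thread, where a message such as "out of range for output shape 2" could be read as either a rank or a dimension length.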
if out_rank != target_rank:
    raise ValueError(f"The last dimension of target must match the rank of the model's output tensor. "
                     f"Found target last dimension: {target_rank}; model's output rank: {out_rank}")
Should this be done as the very first check, as it could mess up everything immediately if not passed right?
Although I'm confused since it seems `target_rank` is not always `len(target.shape)`, as it's defined as `target.shape[-1]` for rank > 2 outputs...
The case with `len(output_shape) > 2` is different because the last dimension of the target must match the rank of the output tensor (excluding the batch dimension).
    raise ValueError(f"Target value {tmax} out of range for output shape = 1 ")

elif len(output_shape) == 2:
    out_rank, target_rank = 1, len(target.shape)
I feel like `out_rank` should be 2 here, but it seems you are using different conventions in different places, which makes it more confusing...
I.e. are we ignoring batch dimension in some cases but not others?
no, because out_rank doesn't include the dimension referring to the number of samples
the only case where we do not ignore it is the squash output (`len(output_shape) == 1`)
But if `len(output_shape) == 1` then `out_rank=1`, so we are not ignoring the batch dimension in this case, but are ignoring it in all other cases?
It's just confusing to read that `out_rank=1` in both `len(output_shape) == 1` and `len(output_shape) == 2`... (and similar for the `target_rank`). Is there a better way to code this up or at least comment in the code?
(I think supporting squash-output for 2-class classification is giving a lot of headache here and elsewhere...)
well it gave me a headache yesterday when I was writing these warnings and getting errors for squash output
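One way to make the batch-dimension convention discussed in this thread explicit is to derive the expected target shape from the output shape in a single place. The sketch below is not taken from the PR: `expected_target_shape` is a hypothetical helper, and it assumes `output_shape` always includes the leading nb_samples dimension and that squash outputs of shape (nb_samples,) take one integer per sample, as the thread suggests.

```python
from typing import Tuple


def expected_target_shape(output_shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Return the target shape expected for a given model output shape.

    Hypothetical helper for illustration only.
    """
    nb_samples = output_shape[0]
    # Number of output dimensions left once the batch dimension is removed.
    non_batch_rank = len(output_shape) - 1

    if non_batch_rank <= 1:
        # Squash output (nb_samples,) and class scores (nb_samples, n_classes)
        # both take a rank-1 target with one integer per sample.
        return (nb_samples,)

    # Higher-rank outputs take one row of `non_batch_rank` indices per sample.
    return (nb_samples, non_batch_rank)


print(expected_target_shape((5,)))            # (5,)
print(expected_target_shape((5, 10)))         # (5,)
print(expected_target_shape((5, 28, 28, 3)))  # (5, 3)
```

Naming the quantity `non_batch_rank` states once that the batch dimension is excluded, and the squash-output case becomes a documented branch instead of a second meaning of `out_rank`.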
This pull request addresses the issue related to target dimensionality in the IntegratedGradients class.
The present code behaves correctly for the most common use cases (classification and regression models with output dimensionality <= 2) when the targets passed are correctly formatted. However, the `_select_target` function returns incorrect outputs for output dimensions > 2 and in some cases doesn't raise errors when unusual targets are passed. This pull request should address this issue with the following changes:
`_check_target` ensuring that the target dimensionality is compatible with the model's output dimensionality