Allow multiple outputs for agg_mode=True in Feature Ablation #425
This pull request was exported from Phabricator. Differential Revision: D22416476
…#425)

Summary:
Pull Request resolved: pytorch#425

## Description

What is aggregation output mode? It can be defined as follows:

When there is no 1:1 correspondence between `num_examples` (`batch_size`) and the number of outputs your model produces, i.e. the model output does not grow as `batch_size` becomes larger.

This allows the `forward_func` to output an arbitrarily sized tensor for feature ablation.

---

## Implementation Details

We assume `aggregation_output_mode` if `perturbations_per_eval == 1` and [ `feature_mask is None` __or__ is of length 1 (i.e. associated to all inputs) ]. This is not perfect, but for feature ablation the underlying logic is the same when there is a 1:1 correspondence (i.e. the model has `batch_size` outputs) and when `agg_output_mode=True`.

If `agg_output_mode == True`:
- Feature ablation will output a tensor of shape `1xOxF`, where `O` is the number of output features and `F` is the number of input features under aggregation mode. Because the implementation treats the output as a 2D tensor, the user must reshape it if the model outputs a tensor with more than two dimensions. The implementation allows >2D outputs, but outputting only a 2D tensor is recommended.

If we are not in `agg_output_mode`, we must ensure the number of elements is `n` (`batch_size`). If it is not, we raise an error to the user. Here we could actually check whether the number of elements is at least `n`, but for simplicity I am not doing this.

## Tests

Added tests to check for:

`agg_mode=True`:
- Incorrect feature mask (i.e. where `fm.shape[0] > 1`)
- Output an `Fx1` tensor, where `F` is the number of features in the input
- The above, but for a feature mask with the first two features treated as one feature
- Output a `2x3x5` constant tensor (not associated to outputs); internally this will be interpreted as a `1x30` 2D tensor

`agg_mode=False`:
- Check that there are exactly `n` outputs, where `n == batch_size`; if not, check that we throw an exception (assertion error). **This already exists in `test_error_perturbations_per_eval_limit_batch_scalar`**

## Notes

I created a new function rather than modifying `_find_output_mode_and_verify`, as otherwise this breaks Shapley value sampling. Will have to fix this in a separate PR.

Differential Revision: D22416476

fbshipit-source-id: 786acb543c9249465e132f65713693ad3d89101d
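
The mode inference and shape rules described under Implementation Details can be illustrated with a minimal sketch. This is not the code from this PR: the helper names `infer_agg_output_mode`, `check_output` and `attribution_shape` are hypothetical, shapes are modelled as plain tuples instead of tensors, and the sketch only infers the mode rather than raising on an incorrect feature mask as the tests above do.

```python
from math import prod
from typing import Optional, Sequence, Tuple

Shape = Tuple[int, ...]


def infer_agg_output_mode(
    perturbations_per_eval: int,
    feature_mask_shapes: Optional[Sequence[Shape]],
) -> bool:
    """Hypothetical helper: True if the run is treated as aggregation mode."""
    if perturbations_per_eval != 1:
        return False
    if feature_mask_shapes is None:
        return True
    # A mask whose first dimension is 1 is associated with all inputs.
    return all(len(s) > 0 and s[0] == 1 for s in feature_mask_shapes)


def check_output(output_shape: Shape, batch_size: int, agg_output_mode: bool) -> int:
    """Return the number of output elements; validate it outside aggregation mode."""
    num_outputs = prod(output_shape)  # prod(()) == 1, so a scalar counts as one output
    if not agg_output_mode:
        # Outside aggregation mode there must be exactly one output per example.
        assert num_outputs == batch_size, (
            f"expected {batch_size} outputs, got {num_outputs}"
        )
    return num_outputs


def attribution_shape(output_shape: Shape, num_features: int) -> Shape:
    """Shape 1 x O x F, with the model output flattened to O elements."""
    return (1, prod(output_shape), num_features)


# A 2x3x5 output is flattened to 30 output features.
print(attribution_shape((2, 3, 5), num_features=3))  # (1, 30, 3)
print(infer_agg_output_mode(1, None))                # True
print(infer_agg_output_mode(1, [(2, 3)]))            # False
```

Working on shape tuples keeps the sketch independent of any tensor library; in real code the same checks would read the shapes from the model output and the feature mask tensors.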
This pull request has been merged in eb3e758.
Reviewed By: vivekmig