Attribution Classes refactoring, Aggregator for postponed score aggregation #130

Merged
merged 15 commits into main from full-attribution-output
Apr 4, 2022

Conversation

gsarti (Member) commented Mar 16, 2022

Description

Currently, attributions are aggregated at the token level during the step-by-step computation, providing only the aggregated per-token scores as output. Ideally, we want to make the aggregation step happen as late as possible (TBD) to enable various analyses (e.g. per-neuron attributions, different aggregation strategies) that are currently not supported.

The changes performed are the following:

  • Refactor the classes in batch.py to use a shared TensorWrapper backbone class for common operations.
  • Decouple gradient attribution-specific logic from the main attribution flow via inheritance, creating specific output subclasses containing e.g. convergence delta parameters. This will be useful to seamlessly add the per-layer logic of attention attribution.
  • Preserve the full attribution tensors until the end of the attribution process. This in turn involves:
    • Rethinking the structure of FeatureAttributionRawStepOutput and FeatureAttributionStepOutput and the transition method FeatureAttribution.make_attribution_output to account for preserving the attribution tensors.
    • Modifying FeatureAttributionSequenceOutput.from_step_attributions to take max_target_seq_len step attributions of shape (batch_size, max_source_seq_len, hidden_size) and produce source-target and target-target attributions of shapes (source_seq_len, target_seq_len, hidden_size) and (target_seq_len, target_seq_len, hidden_size) respectively, where the sequence lengths are variable for every generated object (i.e. end padding is removed by truncating to the token sequence length; see the sketch after this list).
  • Create an Aggregator abstract class and a SumNormAggregator (the current strategy, used as default). Add an aggregator field to FeatureAttributionSequenceOutput that is not picked up when saving and defaults to SumNormAggregator. The aggregator attached to the class provides the default behavior for aggregating attributions when printing the object, calling .show(), computing maximum, minimum, etc.
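
As a reference for the reshaping described above, here is a minimal sketch of the stacking and truncation performed by from_step_attributions. The helper name and exact indexing are illustrative assumptions, not the actual implementation:

```python
import torch

def from_step_attributions_sketch(step_attributions, source_len, target_len):
    """Hypothetical sketch: turn per-step attribution tensors into
    per-sequence source attributions with end padding removed.

    step_attributions: list of max_target_seq_len tensors, one per generation
        step, each of shape (batch_size, max_source_seq_len, hidden_size).
    source_len, target_len: true (unpadded) lengths of one sequence in the batch.
    """
    # Stack the steps along a new target dimension:
    # (batch_size, max_source_seq_len, max_target_seq_len, hidden_size)
    stacked = torch.stack(step_attributions, dim=2)
    # Select one sequence and truncate end padding on both token axes,
    # yielding (source_seq_len, target_seq_len, hidden_size).
    return stacked[0, :source_len, :target_len, :]
```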

Idea for the Aggregator design: Aggregator contains a mapping from FeatureAttributionSequenceOutput field names to the functions that need to be applied to them (sketched after the checklist below). Some checks need to be performed after the full aggregation process to ensure that the show method will work:

  • source_attributions have a shape corresponding to source_tokens x target_tokens
  • target_attributions have a shape corresponding to target_tokens x target_tokens
  • probabilities, if present, have a shape of target_tokens
  • all other per-step outputs have a shape of target_tokens (e.g. deltas)
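
A minimal Python sketch of this design idea, with hypothetical class internals; the exact normalization used by the real SumNormAggregator is an assumption here, sum-then-normalize is only the general idea:

```python
import torch

def sum_norm(t: torch.Tensor) -> torch.Tensor:
    # (source_tokens, target_tokens, hidden_size) -> (source_tokens, target_tokens):
    # sum scores over the hidden dimension, then normalize each generation step.
    summed = t.sum(dim=-1)
    return summed / summed.norm(dim=0, keepdim=True)

class Aggregator:
    """Sketch: a mapping from FeatureAttributionSequenceOutput field names
    to the functions applied to them during aggregation."""

    aggregate_fn: dict = {}  # field name -> aggregation function

    def aggregate(self, out):
        for field, fn in self.aggregate_fn.items():
            value = getattr(out, field, None)
            if value is not None:
                setattr(out, field, fn(value))
        self.check_output(out)
        return out

    def check_output(self, out):
        # Checks performed after the full aggregation so that show() works.
        n_src, n_tgt = len(out.source_tokens), len(out.target_tokens)
        assert out.source_attributions.shape == (n_src, n_tgt)
        if out.target_attributions is not None:
            assert out.target_attributions.shape == (n_tgt, n_tgt)
        if getattr(out, "probabilities", None) is not None:
            assert out.probabilities.shape == (n_tgt,)

class SumNormAggregator(Aggregator):
    aggregate_fn = {"source_attributions": sum_norm}
```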

gsarti (Member, Author) commented Mar 17, 2022

Final touches still missing: save/load on disk is broken because TokenWithId is not serializable.

gsarti (Member, Author) commented Mar 21, 2022

Adding ContiguousSpanAggregator directly here, allowing aggregation over predefined source or target spans. Aggregation works, but the aggregation_fn used for attributions & probabilities needs fixing. Save/load was moved to pickle, with only the info section saved as JSON to keep it user-readable, but it is still not working for now.

EDIT: thanks to json-tricks we are able to save the FeatureAttributionOutput object to JSON and preserve a user-readable info section at the top of the file!
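
For reference, a hypothetical usage sketch of span aggregation and JSON saving; the aggregate() signature and span format are assumptions for illustration, only the json_tricks call is the library's actual API:

```python
from json_tricks import dump

# `out` is assumed to be a FeatureAttributionSequenceOutput from a previous
# attribution call; ContiguousSpanAggregator is added in this PR (import omitted).
# Merge the scores of a multi-token source span (e.g. tokens 2-4) into one row.
span_aggregator = ContiguousSpanAggregator()
aggregated = out.aggregate(span_aggregator, source_spans=[(2, 4)])
aggregated.show()

# json-tricks serializes the nested objects and arrays to JSON while the
# info section at the top of the file stays human-readable.
with open("attribution.json", "w") as f:
    dump(out, f, indent=2)
```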

gsarti changed the title from "Postpone attribution aggregation and data classes refactoring" to "[WIP] Postpone attribution aggregation and data classes refactoring" on Mar 21, 2022
gsarti (Member, Author) commented Mar 23, 2022

Everything should be in place at this point; only some minimal testing for the two aggregators is still missing before the merge.

gsarti changed the title from "[WIP] Postpone attribution aggregation and data classes refactoring" to "Postpone attribution aggregation and data classes refactoring" on Apr 4, 2022
gsarti changed the title from "Postpone attribution aggregation and data classes refactoring" to "Attribution classes refactoring, Aggregator for postponed score aggregation" on Apr 4, 2022
gsarti (Member, Author) commented Apr 4, 2022

Finished tests, merging. Summary of the changes:

  • Centralized shared methods from tensor-holding classes (in inseq.data.batch and inseq.data.attribution) into a new abstract TensorWrapper class in inseq.data.data_utils.
  • Aggregation from full-scale gradient attribution (i.e. one score per weight) is no longer performed during attribution steps, but is instead postponed and delegated to classes of the newly added Aggregator family. Providing a default aggregator that is triggered when calling FeatureAttributionSequenceOutput.show() preserves the original two-step attribute-show procedure while enabling more flexibility. See examples of usage in test_aggregator.py and the sketch after this list.
  • Removed the intermediate FeatureAttributionRawStepOutput; the final FeatureAttributionStepOutput is now built from the start of the attribution process and progressively filled with content.
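
A minimal end-to-end sketch of the preserved two-step attribute-show flow; the model identifier and the aggregator keyword are illustrative assumptions, not taken from this PR (see test_aggregator.py for actual usage):

```python
import inseq

# Attribution now preserves the full granular tensors (one score per weight);
# aggregation only happens when the scores are visualized.
model = inseq.load_model("Helsinki-NLP/opus-mt-en-it", "integrated_gradients")
out = model.attribute("Hello world, here is a sentence!")

# show() triggers the default aggregator (SumNormAggregator) attached
# to each FeatureAttributionSequenceOutput...
out.sequence_attributions[0].show()

# ...while a different Aggregator can be supplied explicitly
# (hypothetical keyword; ContiguousSpanAggregator import omitted).
out.sequence_attributions[0].show(aggregator=ContiguousSpanAggregator())
```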

gsarti closed this on Apr 4, 2022
gsarti reopened this on Apr 4, 2022
gsarti changed the title from "Attribution classes refactoring, Aggregator for postponed score aggregation" to "Attribution Classes refactoring, Aggregator for postponed score aggregation" on Apr 4, 2022
gsarti merged commit 034a46d into main on Apr 4, 2022
gsarti deleted the full-attribution-output branch on Apr 4, 2022 at 14:57