Pytorch wrapper & Documentation update #132

Merged
merged 27 commits from lucas/torch into master on Sep 7, 2023

Conversation

@lucashervier (Collaborator) commented on Jul 18, 2023

New Features

A PyTorch wrapper

The main purpose of this PR is to provide, within Xplique, a convenient wrapper for PyTorch models (4788b94). The tests were built in 693e3fa and mainly check that it works for most attribution methods and is compatible with metrics. In addition, f5138ae provides the related documentation and points to a Getting Started tutorial. Finally, a test pipeline for cross-configuration between TensorFlow and PyTorch was added in 82231f4. This last commit also puts all the configuration files into a single setup.cfg. The commit e1a29b7 includes all the requested/recommended changes after the first round of reviews.
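
To give a sense of how the wrapper plugs into the existing API, here is a minimal sketch. The import path, the `TorchWrapper` name and its constructor arguments are assumptions based on this PR's description and should be double-checked against docs/pytorch.md:

```python
import numpy as np
import torch
import torchvision
from xplique.wrappers import TorchWrapper   # assumed import path
from xplique.attributions import Saliency

# A regular PyTorch model, set to evaluation mode before wrapping
torch_model = torchvision.models.resnet18(weights="DEFAULT").eval()
device = "cuda" if torch.cuda.is_available() else "cpu"

# Wrap the torch module so it behaves like a TF-compatible callable
wrapped_model = TorchWrapper(torch_model, device=device)

# Dummy channels-last inputs and one-hot targets, as usually expected by Xplique
inputs = np.random.rand(4, 224, 224, 3).astype(np.float32)
targets = np.eye(1000)[[0, 1, 2, 3]].astype(np.float32)

# From here, the usual Xplique workflow applies
explainer = Saliency(wrapped_model)
explanations = explainer(inputs, targets)
```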

Add an Enum for tasks

95ac086 and dc8484f add the Tasks enum, which includes the operators for classification and regression tasks. The possibility to choose an existing operator by its name was also added. These commits also fix the regression operator.
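
As an illustration, operator selection could then look like the sketch below (assuming the enum is importable as `xplique.Tasks`; `model` stands for any TF model or wrapped PyTorch model):

```python
from xplique import Tasks                 # assumed export location of the enum
from xplique.attributions import Saliency

# Pick the classification operator through the enum...
explainer = Saliency(model, operator=Tasks.CLASSIFICATION)

# ...or, as added in this PR, by its name
explainer = Saliency(model, operator="classification")
```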

Add an activation option for metrics

While we recommend using the logits to generate explanations, it might be more relevant to look at the probability of a prediction (after a softmax or sigmoid layer) when computing metrics, for instance when the metric measures 'a drop in probability for the classification of an input occluded in the most relevant parts'. Thus, e60a5ff introduces this option when you build a metric: activation can be either None, 'softmax' or 'sigmoid'. This commit also includes an update of the documentation and the corresponding tests.
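
For instance, building a fidelity metric with this option might look like the following sketch (assuming the `Deletion` metric and the new `activation` keyword introduced in e60a5ff; `model`, `inputs`, `targets` and `explanations` are placeholders):

```python
from xplique.metrics import Deletion

# Explanations are still computed on the logits, but the metric applies a
# softmax on top of the model output so the score reflects probability drops.
metric = Deletion(model, inputs, targets, activation="softmax")
score = metric(explanations)
```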

Warning: @fel-thomas this means the tutorials related to metrics need to be checked.

Documentation Enhancement

Some important documentation was missing. In particular, there was no clear description of the model parameter. Furthermore, since v1.x the operator concept was introduced but did not have a dedicated section. This was added in 1cc4ffd and further enhanced with the feedback on this PR in dc8484f. Additionally, this PR provides better assets for the dark theme, justifies content in the docs, and adds a collapsible section to both the README and the documentation homepage. These changes are meant to make those two pages easier to navigate quickly.

Continuous Integration

Docs workflow

7cad1c4 is dedicated to building a pipeline that automatically versions, builds and deploys the documentation on releases. Thus, it avoids "forgetting" to build the documentation, and versioning will allow mapping each release to its dedicated documentation.

Better issue templates

adc363b introduces improved issue templates.

@lucashervier added the documentation, enhancement, and continuous-integration labels on Jul 18, 2023
@lucashervier requested a review from dv-ai on Jul 18, 2023
@dv-ai (Collaborator) left a comment

Tests:
• Python version: The library was tested with Python 3.6 (python-lints). Any good reason to remove it?

README:
• In the table of attribution methods: Sobol and HSIC are tagged as compatible with "TF, PyTorch", but these methods should be compatible with any callable model. From my point of view, when a method is compatible with any callable model, we should replace the "Callable" tag by "TF, PyTorch, Callable". It will be more understandable for the user, because a dedicated wrapper has been developed for PyTorch.
• Why are metrics only compatible with TF and PyTorch and not with callable models?

api_attributions.md
• Gradient-based approaches: We should explain why the white-box methods are compatible or not: gradient-based white-box methods are supported when the method doesn't need to "modify" some part of the model.

model.md
• In the section "The inputs have expected shape", I recommend adding a section "Input format" at the same level as "General". In this new section, I would put the sub-sections "Images data", "Tabular data" and "Time-Series data".
• In the section "The inputs have expected shape", I would add a new section "Task" where, for each task, including segmentation and detection, there is a description of the output of the model and a description of the target parameter (cf. following comment).
• The targets parameter is always a source of confusion for the user. It should be better described in this section. We should explain how it is used by the library and how the user should fill it. It is done for the image case in the different task explanations, but I think we should have a dedicated paragraph to explain the targets parameter.
• "Time-Series data": The note is not clear enough for me on how to solve the issue for Lime and KernelShap.
• The "trick" that has been done to adapt Xplique to PyTorch should also work for other kinds of libraries like sklearn. Maybe we should mention it.
• We should explain the different kinds of enum values that the user can select for the operator parameter (the Tasks enum is not described).

operator.md
• It is not clear that the operator can be an enum value. We should explain and describe it (the Tasks enum is not described).

pytorch.py
• Why do we need outputs.cpu() in the call to backward?

@AntoninPoche (Collaborator) left a comment

Great work! The PyTorch wrapper works well, even for the segmentation task I tested. And the documentation update is more than welcome!

I have a few other comments on the tutorial:

  • Verbosity, in the beginning, could be improved by:
    • Adding a -q to !pip install xplique
    • Adding verbose=False or something like this to torch.hub.load
  • To get to the point, I would collapse the cells regarding downloading and preprocessing the images and model.
  • For explanations, I would highlight more that the only difference is the wrapping of the model. And maybe talk a bit about the expected shape of the images.
  • Regarding metrics, I do not understand why you used different images than the first ones. Is it to respect the guidelines? If so, you should say it. Otherwise, I think it makes the tutorial heavier.

Review threads on: README.md, docs/api/attributions/api_attributions.md, mkdocs.yml, docs/api/attributions/operator.md, docs/pytorch.md, docs/index.md, xplique/commons/operators.py, xplique/attributions/deconvnet.py, xplique/attributions/base.py, xplique/metrics/base.py
@fel-thomas (Member) left a comment

This is remarkable work, Lucas – undoubtedly the most impressive pull request in Xplique since the library's creation.

There's a lot to verify, and given the substantial changes, it's crucial to ensure a thorough review. To start, I suggest splitting the pull request into two parts: one for the PyTorch wrapper implementation and associated documentation.

Regarding the other aspects such as continuous integration (CI), the remaining documentation, setup configuration (setup.cfg), the updated documentation, and the CI templates – let's address those in a separate pull request.

I propose you start by addressing my first comments and splitting the PR; then we should meet to finish and merge the PyTorch part (this week?).

Again, super job! ;)

Review threads on: xplique/wrappers/pytorch.py, xplique/attributions/base.py, tests/wrappers/test_pytorch_wrapper.py, setup.py, setup.cfg, docs/pytorch.md
…. With this new getter we do not need to rechange the model when using metrics for classification
…ctionnality in the metric instead), fix regression_operator, update the tests and enhance the documentation
…oid) to the model when computing metrics (but not for generating explanations), add the corresponding tests and documentation
…TorchWrapper class without the need of installing pytorch, change methods name to more explicit ones, improve the grad method, assert evaluation mode of pytorch module, update the test for the refactoring of the wrapper and also following operator and metric refactoring, add more information in the documentation
@lucashervier (Collaborator, Author) commented:

@dv-ai, @AntoninPoche and @fel-thomas, the last push should address all your remarks. @fel-thomas and @AntoninPoche, I directly answered your comments and resolved all the conversations. Do not hesitate to unresolve them if you think an issue was not properly addressed.

The documentation was heavily changed so please read the new one carefully before approving :)

@AntoninPoche I took note of your feedback on the notebook. I will modify it ASAP to take your remarks into account!

@dv-ai, as you only added a general comment, I address your points below:

Tests:
• Python version: The library was tested with Python 3.6 (python-lints). Any good reason to remove it?

-> Python 3.6 has reached end of life, hence its removal.

README:
• In the table of attribution methods: Sobol and HSIC are tagged as compatible with "TF, PyTorch", but these methods should be compatible with any callable model. From my point of view, when a method is compatible with any callable model, we should replace the "Callable" tag by "TF, PyTorch, Callable". It will be more understandable for the user, because a dedicated wrapper has been developed for PyTorch.
-> Done

• Why are metrics only compatible with TF and PyTorch and not with callable models?
-> I think it is possible to use most metrics with any callable model. However, it is not straightforward and should be addressed in the future. I will open an issue once the PR is validated.

api_attributions.md
• Gradient-based approaches: We should explain why the white-box methods are compatible or not: gradient-based white-box methods are supported when the method doesn't need to "modify" some part of the model.
-> I added an !!! info box at the end

model.md
• In the section "The inputs have expected shape", I recommend adding a section "Input format" at the same level as "General". In this new section, I would put the sub-sections "Images data", "Tabular data" and "Time-Series data".
• In the section "The inputs have expected shape", I would add a new section "Task" where, for each task, including segmentation and detection, there is a description of the output of the model and a description of the target parameter (cf. following comment).
• The targets parameter is always a source of confusion for the user. It should be better described in this section. We should explain how it is used by the library and how the user should fill it. It is done for the image case in the different task explanations, but I think we should have a dedicated paragraph to explain the targets parameter.

-> For the 3 bullet points above: I changed a lot of the documentation for the model and the operator to address those points. Do not hesitate to tell me if it's better now.
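
As a quick illustration (with hypothetical values) of how targets is typically provided for classification, i.e. a one-hot encoding of the labels to explain, so the classification operator can select the score of the class of interest:

```python
import numpy as np

# Hypothetical example: explain class 207 for each image of a batch of 4,
# with a model that outputs 1000 logits.
labels = np.array([207, 207, 207, 207])
targets = np.eye(1000)[labels].astype(np.float32)   # shape (4, 1000), one-hot

# `explainer` and `inputs` are placeholders for an attribution method and a batch
explanations = explainer(inputs, targets)
```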

• "Time-Series data": The note is not clear enough for me on how to solve the issue for Lime and KernelShap.
-> I modified the note to: "By default Lime & KernelShap will treat such inputs as grey images. You will need to define a custom map_to_interpret_space function when instantiating those methods in order to create a meaningful mapping of Time-Series data into an interpretable space when building such explainers."
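
To make this more concrete, a custom mapping could for example group consecutive time steps into interpretable segments. This is only a hypothetical sketch: the exact contract of map_to_interpret_space (here assumed to take a single (time_steps, features) input and return an integer segment id for each position) should be checked against the Lime documentation.

```python
import tensorflow as tf
from xplique.attributions import Lime

def map_time_series(single_input):
    # single_input: (time_steps, features) tensor (assumed contract)
    # group every 4 consecutive time steps into one interpretable segment
    n_steps = tf.shape(single_input)[0]
    segment_ids = tf.range(n_steps) // 4
    # repeat the segment id across the feature axis so every value of a
    # time step belongs to the same segment
    return tf.tile(segment_ids[:, None], [1, tf.shape(single_input)[1]])

explainer = Lime(model, map_to_interpret_space=map_time_series)
```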

• The "trick" that has been done to adapt Xplique to PyTorch should also work for other kinds of libraries like sklearn. Maybe we should mention it.
-> I modified the last section of the model.md file into: "Then you should take a look on the Callable documentation or you could take inspiration on the PyTorch Wrapper to write a wrapper that will integrate your model into our API!"

• We should explain the different kinds of enum values that the user can select for the operator parameter (the Tasks enum is not described).
-> It's done in the new documentation of the operator

operator.md
• It is not clear that the operator can be an enum value. We should explain and describe it (the Tasks enum is not described).
-> Done in the new operator documentation

pytorch.py
• Why do we need outputs.cpu() in the call to backward?
-> I modified it so it is no longer necessary

@lucashervier (Collaborator, Author) commented:

The last two commits are for you @AntoninPoche! They are related to your feedback on the notebook. I took your remarks concerning brevity into account and split the tutorial in two (one Getting Started and another one for the Metrics). I also added those tutorials to the different places of interest in the documentation, README, etc.

@AntoninPoche (Collaborator) left a comment

Amazing work as always! Most of my comments are small documentation modifications.

But I want to highlight the comment on the metrics/base.py file about activations: their computation should only be done along the class axis, not the batch axis.
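
In other words, for an output of shape (batch_size, num_classes), the activation should presumably be applied per sample over the class axis only, e.g.:

```python
import tensorflow as tf

logits = tf.random.normal((8, 10))               # (batch_size, num_classes)
probabilities = tf.nn.softmax(logits, axis=-1)   # softmax over classes, per sample
```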

Review threads on: README.md, docs/api/attributions/api_attributions.md, docs/api/attributions/model.md, docs/api/attributions/operator.md, mkdocs.yml, tests/metrics/test_common.py, xplique/commons/operators.py, xplique/metrics/base.py, xplique/wrappers/pytorch.py
@fel-thomas (Member) left a comment

Overall review

Once again, congratulations, Lucas, you've done an amazing job. I believe the library will definitely level up after your pull request.

I'm sorry for all the comments, but it's better that we try to do things right now so we can continue to progress smoothly later on.

Once the comments are addressed, I'm okay with merging.

Excellent work once again. 👍

Review threads on: .bumpversion.cfg, .github/ISSUE_TEMPLATE/bug_report.yml, .github/ISSUE_TEMPLATE/feature_request.yml, xplique/commons/operators.py, xplique/metrics/base.py, xplique/metrics/fidelity.py, xplique/wrappers/pytorch.py
@lucashervier (Collaborator, Author) commented:

@fel-thomas, @AntoninPoche, thank you again for your numerous pieces of feedback. Every change concerning the docs is in commit 3852bcf. The other requested changes were addressed in dedicated commits which have explicit names, I believe. I answered all your questions and marked the conversations as resolved, but once again, if you are not satisfied with one of my answers, do not hesitate to re-open it!

Hope you like it!

@AntoninPoche (Collaborator) left a comment

Looks good to me! Really nice job Lucas!

It just seems that the remarks for pytorch.md were not applied in this PR. I can do it in my future PR if you want ;)

Review threads on: docs/pytorch.md
@lucashervier merged commit 4753517 into master on Sep 7, 2023 (15 checks passed)
@lucashervier deleted the lucas/torch branch on September 7, 2023