Pytorch wrapper & Documentation update #132
Tests:
• Python version: the library was tested with Python 3.6 (python-lints). Is there any good reason to remove it?
ReadMe :
• In the table of attribution methods: Sobol and HSIC are marked compatible with “TF, Pytorch”, but these methods should be compatible with any “Callable Model”. From my point of view, when a method is compatible with any “Callable Model”, we should replace the “Callable” tag with “TF, Pytorch, Callable”. It will be more understandable for the user, because a dedicated wrapper has been developed for Pytorch.
• Why are metrics only compatible with TF and Pytorch and not with callable models?
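For context on what “Callable Model” means in this discussion, here is a minimal sketch (purely illustrative, not Xplique's actual API): any function mapping a batch of inputs to a batch of scores qualifies, with no deep-learning framework attached.

```python
import numpy as np

# A "callable model" in the loosest sense: any function that maps a
# batch of inputs to a batch of output scores.
def callable_model(inputs: np.ndarray) -> np.ndarray:
    # toy linear scorer over 10 features and 3 classes, purely illustrative
    weights = np.ones((inputs.shape[-1], 3)) / inputs.shape[-1]
    return inputs @ weights

scores = callable_model(np.random.rand(4, 10))  # batch of 4 samples
```

Black-box attribution methods only need to evaluate such a function, which is why the tag question above matters for the compatibility table.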
Api_attributions.md
• Gradient-based approaches: we should explain why the white-box methods are or are not compatible: gradient-based white-box methods are supported when the method does not need to “modify” some part of the model.
Model.md
• In the section “The inputs have expected shape”, I recommend adding a section “Input format” at the same level as “General”. In this new section, I would put the sub-sections “Images data”, “Tabular data”, and “Time-Series data”.
• In the section “The inputs have expected shape”, I would add a new section “Task” where, for each task, including segmentation and detection, there is a description of the model's output and of the target parameter (cf. the following comment).
• The targets parameter is always a source of confusion for the user. It should be better described in this section. We should explain how it is used by the library and how the user should fill it. It is done for the image case in the different task explanations, but I think we should have a dedicated paragraph explaining the “target parameter”.
• “Time-Series data”: the note is not clear to me on how to solve the issue for “Lime” and “Kernel Shap”.
• The “trick” that has been done to adapt Xplique to Pytorch should also work for other kinds of libraries like sklearn. Maybe we should mention it.
• We should explain the different kinds of enum values that the user can select for the “operator parameter” (Task enum not described).
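Since the targets parameter comes up repeatedly in the points above, here is a hedged sketch of how a classification operator typically consumes it: targets act as one-hot masks selecting, per sample, which output neuron to explain. Names and shapes here are illustrative, not a normative description of Xplique's internals.

```python
import numpy as np

# Labels of the class to explain for each of the 3 samples.
labels = np.array([2, 0, 1])
num_classes = 3
targets = np.eye(num_classes)[labels]          # one-hot masks, shape (3, 3)

logits = np.array([[0.1, 0.2, 0.7],
                   [0.9, 0.05, 0.05],
                   [0.2, 0.6, 0.2]])

# A classification operator reduces model outputs to one scalar per sample:
# the score of the targeted class only.
scores = np.sum(logits * targets, axis=-1)     # -> [0.7, 0.9, 0.6]
```

A dedicated paragraph like this in Model.md would show the user exactly what filling in targets selects.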
Operator.md
• It is not clear that the operator could be an enum value. We should explain and describe it (Task enum not described).
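To show what describing the enum option could look like, here is a hypothetical sketch of an operator registry selectable by enum member, by name, or by passing a raw callable. This is not Xplique's actual definition, just an illustration of the pattern under discussion.

```python
from enum import Enum
import numpy as np

def classification_operator(model, inputs, targets):
    # score of the targeted class, one scalar per sample
    return np.sum(model(inputs) * targets, axis=-1)

def regression_operator(model, inputs, targets):
    # negative mean absolute error, one scalar per sample
    return -np.mean(np.abs(model(inputs) - targets), axis=-1)

class Tasks(Enum):
    CLASSIFICATION = "classification"
    REGRESSION = "regression"

OPERATORS = {Tasks.CLASSIFICATION: classification_operator,
             Tasks.REGRESSION: regression_operator}

def get_operator(operator):
    # accept a raw callable, an enum member, or a task name
    if callable(operator):
        return operator
    if isinstance(operator, str):
        operator = Tasks[operator.upper()]
    return OPERATORS[operator]
```

Documenting each member alongside the operator it resolves to would answer the "Task enum not described" remark.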
Pytorch.py
• Why do we need “outputs.cpu()” in the call to backward?
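The “trick” mentioned above for Pytorch (which should also work for libraries like sklearn) boils down to adapting a framework-specific model into a plain numpy-in/numpy-out callable. A generic, hedged sketch follows; the class and method names are hypothetical and not Xplique's implementation.

```python
import numpy as np

class CallableWrapper:
    """Adapt any predict-style model (sklearn-like, etc.) into the plain
    callable interface that black-box attribution methods expect."""
    def __init__(self, model, predict_fn="predict_proba"):
        self.predict = getattr(model, predict_fn)

    def __call__(self, inputs):
        # numpy in, numpy out: no framework leaks through the interface
        return np.asarray(self.predict(np.asarray(inputs)))

class ToyModel:
    """Stands in for e.g. a fitted sklearn classifier."""
    def predict_proba(self, X):
        e = np.exp(X - X.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

wrapped = CallableWrapper(ToyModel())
probs = wrapped(np.random.rand(2, 5))
```

Mentioning this generalization in the docs would address the sklearn point without promising support the library has not tested.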
Great work ! The pytorch wrapper works well even for the segmentation task I tested. Then the documentation update is more than welcomed !
I have a few other comments on the tutorial:
- Verbosity, in the beginning, could be improved by:
  - Adding a "-q" in "!pip install xplique"
  - Adding "verbose=False" or something like this in "torch.hub.load"
  - Adding a
- To get to the point, I would collapse the cells regarding downloading and preprocessing the images and the model.
- For explanations, I would highlight more that the only difference is the wrapping of the model. And maybe talk a bit about the expected shape of images.
- Regarding metrics, I do not understand why you used different images than the first ones. Is it to respect the guidelines? If so, you should say it. Otherwise, I think it makes the tutorial heavier.
This is remarkable work, Lucas – undoubtedly the most impressive pull request in Xplique since the library's creation.
There's a lot to verify, and given the substantial changes, it's crucial to ensure a thorough review. To start, I suggest splitting the pull request in two: a first part for the PyTorch wrapper implementation and its associated documentation.
Regarding the other aspects (continuous integration, the remaining documentation updates, the setup configuration in setup.cfg, and the CI templates), let's address those in a separate pull request.
I suggest you start by addressing my first comments and splitting the PR; then we should meet to finish and merge the Pytorch part (this week?).
Again, super job ! ;)
…. With this new getter we do not need to rechange the model when using metrics for classification
…setup and adapt workflows for the wrapper
…dd doc on operator and model's expectations
… a typo error in setup
…PyTorch for Callable methods, fix pep y
…ctionnality in the metric instead), fix regression_operator, update the tests and enhance the documentation
…oid) to the model when computing metrics (but not for generating explanations), add the corresponding tests and documentation
…TorchWrapper class without the need of installing pytorch, change methods name to more explicit ones, improve the grad method, assert evaluation mode of pytorch module, update the test for the refactoring of the wrapper and also following operator and metric refactoring, add more information in the documentation
@dv-ai, @AntoninPoche and @fel-thomas, the last push should address all your remarks. @fel-thomas and @AntoninPoche, I directly answered your comments and resolved all the conversations. Do not hesitate to unresolve them if you think an issue was not properly addressed. The documentation was heavily changed, so please read the new one carefully before approving :) @AntoninPoche, I took note of your feedback on the notebook. I will modify them ASAP to take it into account!

@dv-ai, as you only added a general comment, I address your points below:

Tests:
-> Python 3.6 started being deprecated.

ReadMe:
• "Why are metrics only compatible with TF and Pytorch and not callable models?"

Api_attributions.md / Model.md:
-> For the 3 bullet points above: I changed a lot of the documentation for the model and the operator for those points. Do not hesitate to tell me if it's better now.

• "Time-Series data": the note is not clear on how to solve the issue for "Lime" and "Kernel Shap"
• The "trick" done to adapt Xplique to Pytorch should also work for other kinds of libraries like sklearn. Maybe we should mention it.
• We should explain the different kinds of enum values the user can select for the "operator parameter" (Task enum not described)

Operator.md
Pytorch.py
The last two commits are for you, @AntoninPoche! They are related to your feedback on the notebook. I took your remarks concerning brevity into account and split the tutorial in two (one Getting Started and another one for the Metrics). I also added those tutorials to the different places of interest in the documentation, the README, etc.
Amazing work as always! Most of my comments are small documentation modifications.
But I want to highlight the comment on the metrics/base.py file about activations: their computation should only be done along the class axis, not the batch axis.
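To make the axis concern concrete, here is a small numpy sketch: applying softmax along the class axis normalizes each sample independently, while applying it along the batch axis wrongly couples the samples of a batch.

```python
import numpy as np

def softmax(x, axis):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

logits = np.array([[1.0, 2.0],
                   [3.0, 4.0]])          # shape (batch=2, classes=2)

per_class = softmax(logits, axis=-1)     # correct: each row sums to 1
per_batch = softmax(logits, axis=0)      # wrong: normalizes across samples
```

With the batch-axis version, a sample's "probabilities" change depending on which other samples happen to be in the batch, which would corrupt any metric built on them.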
Overall review
Once again, congratulations, Lucas, you've done an amazing job. I believe the library will definitely level up after your pull request.
I'm sorry for all the comments, but it's better that we try to do things right now so we can continue to progress smoothly later on.
Once the comments are addressed, I'm okay with merging.
Excellent work once again. 👍
… between zero and one
… on which axis the computation is done
@fel-thomas, @AntoninPoche, thank you again for your numerous feedback. Every change concerning the docs is in commit 3852bcf. The other requested changes were addressed in dedicated commits which have explicit names, I believe. I answered all your questions and marked the conversations as resolved, but once again, if you are not satisfied with one of my answers, do not hesitate to re-open it! Hope you like it!
Looks good to me! Really nice job, Lucas!
It just seems that the remarks for pytorch.md were not applied to the version in this PR. I can do it in my future PR if you want ;)
New Features
A PyTorch wrapper
The main object of this PR is to provide within Xplique a convenient wrapper for Pytorch models (4788b94). The tests were built in 693e3fa and mainly check that it works for most attribution methods and is compatible with metrics. In addition, f5138ae provides the related documentation and points to a Getting Started tutorial. Finally, a test pipeline for cross-configuration between tensorflow and pytorch was added in 82231f4. This last commit also puts all the configuration files into a single setup.cfg. The commit e1a29b7 includes all the requested/recommended changes after the first round of reviews.
Add an Enum for tasks
95ac086 and dc8484f add the Tasks enum, which includes the operators for classification and regression tasks. The possibility to choose among the existing operators by their name was added. It also fixes the regression operator.
Add an activation option for metrics
While we recommend using the logits to generate explanations, it might be more relevant to look at the probability (after a softmax or sigmoid layer) of a prediction when computing metrics, for instance if the metric measures "a drop in probability for the classification of an input occluded in the most relevant parts". Thus, e60a5ff introduces this option when you build a metric: activation can be either None, 'softmax', or 'sigmoid'. This commit also includes an update of the documentation and the corresponding tests.
Warning: @fel-thomas, this means checking the tutorials related to metrics.
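As an illustration of the idea (a hedged sketch, not the actual Xplique metric code), a fidelity-style metric can optionally apply an activation to the model's logits before measuring the probability drop on the targeted class:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical sketch: the metric applies an optional activation to the
# model's logits before measuring the drop for the targeted class.
def probability_drop(model, inputs, occluded, targets, activation=None):
    act = {None: (lambda x: x), "softmax": softmax}[activation]
    before = np.sum(act(model(inputs)) * targets, axis=-1)
    after = np.sum(act(model(occluded)) * targets, axis=-1)
    return before - after

identity_model = lambda x: x  # stands in for a model returning logits
drop = probability_drop(identity_model,
                        np.array([[2.0, 0.0]]),   # original input
                        np.array([[0.0, 0.0]]),   # occluded input
                        np.array([[1.0, 0.0]]),   # one-hot target
                        activation="softmax")
```

With activation=None, the same metric would measure a drop in raw logits instead, which is why the option only applies to metrics and not to explanation generation.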
Documentation Enhancement
There was some important documentation missing. In particular, there was no clear description of the model parameter. Furthermore, since v1.x the operator made its apparition but did not have a dedicated section. That was added in 1cc4ffd and further enhanced with the feedback on this PR in dc8484f. Additionally, this PR provides better assets for the dark theme, justifies the contents in the docs, and adds a collapsible section to both the README and the documentation homepage, to make those two pages easier to navigate quickly.
Continuous Integration
Docs workflow
7cad1c4 is dedicated to building a pipeline that automatically versions, builds, and deploys the documentation on releases. Thus, it avoids "forgetting" to build the documentation, and versioning will allow mapping each release to its dedicated documentation.
Better issue templates
adc363b