-
Notifications
You must be signed in to change notification settings - Fork 581
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FIX] make identifying region names more robust #4289
Conversation
👋 @Remi-Gau Thanks for creating a PR! Until this PR is ready for review, you can include the [WIP] tag in its title, or leave it as a github draft. Please make sure it is compliant with our contributing guidelines. In particular, be sure it checks the boxes listed below.
For new features:
For bug fixes:
We will review it as quick as possible, feel free to ping us with questions if needed. |
@@ -58,10 +59,12 @@ class NiftiLabelsMasker(BaseMasker, _utils.CacheMixin): | |||
Region definitions, as one image of labels. | |||
|
|||
labels : :obj:`list` of :obj:`str`, optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the input type of labels is never checked
this was not a big issue for version < 0.10.3 as those were only used when generating masker reports, but this can now cause trouble
we should probably update the doc string to say that we can accept sequence of strings
though this may not be enough as some nilearn atlases have their labels in nump arrays
tmp.py
Outdated
if "labels" in atlas: | ||
labels = atlas.labels | ||
elif "rsn_indices" in atlas: | ||
labels = atlas.rsn_indices | ||
elif f == fetch_atlas_basc_multiscale_2015: | ||
labels = range(64) | ||
elif f == fetch_atlas_yeo_2011: | ||
labels = range(17) | ||
|
||
if f == fetch_atlas_schaefer_2018: | ||
labels = np.insert(labels, 0, "Background") | ||
|
||
labels_img = ( | ||
atlas["thick_17"] if f == fetch_atlas_yeo_2011 else atlas.maps | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in a separate PR, we should probably further standardize what our atlases return to avoid forcing users to have to wrangle atlas outputs this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this has been a longstanding goal ; e.g., #2037 !
f"{len(labels_after_resampling)} labels " | ||
"(including background)." | ||
) | ||
self._resample_labels(imgs_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this refactoring is unrelated but this transform_single_imgs
was getting way too long...
@pytest.mark.parametrize( | ||
"with_background", | ||
[True, False], # In case the list of labels includes one for background | ||
) | ||
@pytest.mark.parametrize( | ||
"dtype", ["int32", "float32"] # In case regions are labelled with floats | ||
) | ||
@pytest.mark.parametrize( | ||
"affine_data", | ||
[ | ||
None, | ||
np.diag( | ||
(4, 4, 4, 4) | ||
), # region_names_ matches signals after resampling drops labels | ||
], | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this test parametrization should cover all the different use cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at least all of those that we have in our atlases and that fail
tmp.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file does not need to be kept in the end but was useful to make sure all our atlases can be used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file must be removed before merging, though it is potentially useful to keep around to make sure our atlases can be used with our maskers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing with @mtorabi59 we figured it may be good to keep this script "around" and run it at regular interval (but not on every PR).
Currently our tests:
- test the label maskers on dummy atlases
- test the fetchers with mocks
But we don't check (I think) that all Nilearn atlases work with our maskers, so this script basically, run the label maskers on the atlases provided by Nilearn.
The logic of the script also shows the kind of branchic logic hoops users have to go through because of the inconsistent structure of our atlases.
Why not run it on every PR? Because the fetchers may fail when the download request fail (the reason why we mock them during testing)
What do you think of the idea? If we think this is valuable to keep around, where should this go? nilearn/maint_tools
? in the sandbox repo? somewhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nilearn/maint_tools is probably the best thing to do. I'm just afraid of seeing this kind of thing growing.
What should actually be done with it should be documented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK with me if we put it in maint_tools: this would be done in a sepatate PR to make it easier to review and see how to best document it.
@Remi-Gau you can ignore my reviewing request :) |
Sorry, what is the status of this PR ? |
|
self.region_names_ = None | ||
if self.labels is not None: | ||
lower_case_labels = {x.lower() for x in self.labels} | ||
knwon_backgrounds = {"background"} | ||
background_in_labels = any( | ||
knwon_backgrounds.intersection(lower_case_labels) | ||
) | ||
offset = 1 if background_in_labels else 0 | ||
self.region_names_ = { | ||
key: self.labels[region_id] | ||
key: self.labels[key + offset] | ||
for key, region_id in region_ids.items() | ||
if region_id != self.background_label | ||
} | ||
else: | ||
self.region_names_ = None | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mtorabi59
this is refactored version of your suggestion
if self.labels is not None: | ||
lower_case_labels = {x.lower() for x in self.labels} | ||
knwon_backgrounds = {"background"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be expanded if we encounter atlases that use another "keyword" for the background (bckgrd, bg...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussed with @mtorabi59:
- to make our life easier we should probably require that list of labels passed to the constructors MUST include
background
orBackground
and that this should be the first item in the list. This cannot be done in this PR and should be part of some refactoring to standardize our atlases with deprecation cycle: won't break the API but will change some of the objects returned by the fetchers.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4289 +/- ##
==========================================
+ Coverage 91.85% 92.15% +0.29%
==========================================
Files 144 143 -1
Lines 16419 16496 +77
Branches 3434 3463 +29
==========================================
+ Hits 15082 15202 +120
+ Misses 792 749 -43
Partials 545 545
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
OK I don't think this specific issue with that atlas (or atlases with similar problems) will get resolved until we clean up what our atlases fetchers return so this should be tackled in another issue / PR. |
# Number of regions excluding the background | ||
number_of_regions = np.sum( | ||
np.unique(labels_image_data) != self.background_label | ||
) | ||
# Basic safety check to ensure we have as many labels as we | ||
# have regions (plus background). | ||
if ( | ||
self.labels is not None | ||
and len(self.labels) != number_of_regions + 1 | ||
): | ||
raise ValueError( | ||
"Mismatch between the number of provided labels " | ||
f"({len(self.labels)}) and the number of regions in " | ||
f"provided label image ({number_of_regions + 1})." | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extracted this in separate methods so they can be used for checking the labels
- the constructor
- and also when checking
self.labels_
in thetransform_single_imgs
method
@bthirion So I would suggest:
|
self.background_label = background_label | ||
self._original_region_ids = self._get_labels_values(self.labels_img) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
introducing a new private attribute to keep track of the labels of the image:
- this allows to check that the labels and number of regions match at instantiation
- should later allow to know which regions were dropped during resampling
failure of the pre-release workflow is unrelated |
Can you clarify what you mean with " some of the region names are still wrong for some of our atlases" ? How do you diagnose that ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes LGTM
Sorry that was a reference to a message above. Copying the important bit below. For the destrieux atlas, some regions are dropped (values 42 and 117 are dropped: see in However 3 regions that should be in
|
Actually was also checking what the reports would look with these "misnamed" regions and it turns out that we get failures to generate reports with several atlases.
|
OK just checked and the report generation problem is not due to the latest release but could reproduce with older versions (at least 0.10.1, have not checked further). Will open an issue for this one as well, even if it is related to problem mentioned above. |
Changes proposed in this pull request: