[WIP] test examples to replace ADHD with MAIN datasets #1887
Conversation
…ns like check_niimg. Adding copy_header to load_iimg fixes this
…I timeout - Increased CircleCI timeout to 7hrs due to recent increase in build time.
* 'master' of https://github.com/nilearn/nilearn: (138 commits)
  - Fixed: Full builds do not create a testenv before installation, causing failure
  - Reformatted commands for clarity after discussion with Gael
  - Removed junk text that was added for testing circleci cached builds
  - Added junk text for testing circleci cached builds, to be removed later
  - Added back the inadvertently removed command for installation of linux packages
  - Added back build for py2.7 without matplotlib after discussion with Kamalaker
  - Trivial change to make CircleCI reappear
  - Removed 2 of 3 Py2.7 builds (unnecessary), added Py3.6 build
  - Added comment to trigger change to make the disappeared CircleCI reappear in PR
  - Improved: CircleCI builds are more efficient
  - Bumped up the version to 0.5.0
  - Corrected example images; small tweak in whats_new.rst
  - Updated commits count for contributors
  - Removed the erring comma in view_img parameters, fixed 2 example images
  - Incorporated most of the feedback from Kamalaker's first review of this PR
  - Updated contributor commits count detail
  - improve warning vmin=0 when using threshold
  - add vmin to view_stat_map
  - ...
Codecov Report
@@ Coverage Diff @@
## master #1887 +/- ##
==========================================
- Coverage 95.23% 95.23% -0.01%
==========================================
Files 135 135
Lines 17320 17407 +87
==========================================
+ Hits 16495 16577 +82
- Misses 825 830 +5
Continue to review full report at Codecov.
Codecov Report
@@ Coverage Diff @@
## master #1887 +/- ##
==========================================
+ Coverage 95.23% 95.32% +0.08%
==========================================
Files 135 137 +2
Lines 17320 17856 +536
==========================================
+ Hits 16495 17021 +526
- Misses 825 835 +10
Continue to review full report at Codecov.
Overall, this seems to work: some examples probably require tweaks, but nothing seems fundamentally broken.
There are a few examples that are broken when I ran them on my box. Basically, we need to change n_subjects or tweak some parameters. That's why I left them unchanged, to get some opinions on the current push. I will push with all examples changed.
What would you like to change here? Probably also increase the number of subjects?
@@ -90,7 +90,7 @@
 from nilearn.image import index_img

 # Selecting specific maps to display: maps were manually chosen to be similar
-indices = {dict_learning: 25, canica: 33}
+indices = {dict_learning: 24, canica: 32}
 # We select relevant cut coordinates for displaying
 cut_component = index_img(components_imgs[0], indices[dict_learning])
 cut_coords = find_xyz_cut_coords(cut_component)
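For readers unfamiliar with how `find_xyz_cut_coords` picks display coordinates: it roughly amounts to taking the center of mass of the map's strong voxels and mapping it to mm space through the image affine (nilearn's actual implementation is more careful, e.g. it isolates the largest connected component first). A pure-NumPy sketch of that idea, with a made-up toy array and affine:

```python
import numpy as np

def rough_cut_coords(data, affine, threshold=0.0):
    """Approximate center-of-mass cut coordinates in mm.

    A simplified stand-in for nilearn's find_xyz_cut_coords:
    threshold the map, take the center of mass of the surviving
    voxels, and map voxel indices to mm with the affine.
    """
    mask = np.abs(data) > threshold
    if not mask.any():
        # Fall back to the volume center if nothing survives
        ijk = (np.array(data.shape) - 1) / 2.0
    else:
        ijk = np.array(np.nonzero(mask)).mean(axis=1)
    # Homogeneous voxel -> world (mm) coordinates
    return affine.dot(np.append(ijk, 1.0))[:3]

# Toy 3D "component map" with a single hot spot
data = np.zeros((10, 10, 10))
data[7, 2, 5] = 3.0
affine = np.eye(4) * 2.0  # 2 mm isotropic voxels
affine[3, 3] = 1.0
print(rough_cut_coords(data, affine, threshold=1.0))  # mm coords of the hot spot
```

This is only meant to show why the chosen component index matters: a different index gives a different hot spot, hence different cut coordinates for all the plots that follow.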
@KamalakerDadi , on this example, some of the visualizations of regions are broken: the contours are present, but not filled. Do you have an idea of why that might be the case?
https://3567-1235740-gh.circle-artifacts.com/0/home/circleci/project/doc/_build/html/auto_examples/03_connectivity/plot_compare_resting_state_decomposition.html#sphx-glr-auto-examples-03-connectivity-plot-compare-resting-state-decomposition-py
Well, yeah, I saw that; it needs to be investigated.
It is OK, but the CanICA component is probably not 32.
Also the images are extremely smooth.
Also the images are extremely smooth.
Is this a bad sign or a good sign?
Are the images downloaded smoothed? @emdupre @illdopejake , do you know?
AFAIK we did not additionally smooth them (please correct me if that's incorrect @illdopejake !), but they were smoothed prior to distributing as OpenNeuro derivatives. From the derivatives description:
All data were smoothed using a Gaussian filter (5mm kernel).
That's correct, we did not do any additional smoothing. It's possible that we could reach out to the Saxe lab to see if they would mind posting the unsmoothed data?
Any news on sharing the non-smoothed data?
I think it would be easiest to reprocess them ourselves, but not sure if @illdopejake is in contact with Saxe lab.
If we do reprocess them ourselves, I can run them through fMRIPrep on Compute Canada, but I need to set up an account. Not sure if you already have one we could submit through, Jake? I have an fMRIPrep singularity image...
adhd_dataset = datasets.fetch_adhd(n_subjects=20)
func_filenames = adhd_dataset.func
confounds = adhd_dataset.confounds
main_dataset = datasets.fetch_main(n_subjects=20)
On this example, I think that we could increase the number of components from 5 to 8.
Side note: I won't bother with fixing AppVeyor right now. My current aim is to get the examples working well.
Side note: I won't bother with fixing AppVeyor right now. My current aim is to get the examples working well.
Agreed!
This is a great PR. Thx!
Besides the small comments I added, I endorse those of Gael.
Notes
-----
This functional MRI datasets are used as part of teaching how to use
is used
----------
data_dir: str
Path of the data directory. Used to force data storage in a specified
location. If None is given, data is stored in home directory.
data are stored in the home directory
The original data is downloaded from OpenNeuro
https://openneuro.org/datasets/ds000228/versions/1.0.0
This fetcher downloads downsampled data which is uploaded to Open
... data that are available on Open...
Huh.
We can change the filenames on OSF -- it's from the original directory flattening. I don't develop in Windows (nor does anyone else who was in the MAIN tutorial, to my knowledge?), so we likely just missed this entirely! Would it be helpful for us to update the files?
@emdupre That will be best. Perhaps a hyphen?
Thanks @emdupre!
You guys already have a bunch of hyphens in there, so I will leave it to you. The following are reserved characters in Windows:
Source: https://docs.microsoft.com/en-us/windows/desktop/fileio/naming-a-file
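To illustrate the fix being discussed, here is a small self-contained sketch (not nilearn code; the function name is made up for illustration) that replaces the characters the linked Microsoft documentation lists as reserved in Windows filenames:

```python
import re

# Characters reserved in Windows filenames, per the Microsoft
# "Naming Files, Paths, and Namespaces" documentation: < > : " / \ | ? *
_WINDOWS_RESERVED = re.compile(r'[<>:"/\\|?*]')

def make_windows_safe(name, replacement="-"):
    """Replace Windows-reserved characters so a downloaded
    filename can be written on any platform."""
    return _WINDOWS_RESERVED.sub(replacement, name)

print(make_windows_safe("sub-01|task:main.nii.gz"))
# sub-01-task-main.nii.gz
```

Renaming on the server side (as done here) is still the better fix, since it spares every downstream user from having to sanitize anything.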
OK, I just kept the filenames and removed the (flattened) directory structure. Please let me know if there are other issues!
I want to try some things before changing the filenames.
@KamalakerDadi I think changing the filenames is the right call, so that other users running Windows who directly access and download the files do not face this complication.
Why is AppVeyor building?
I actually just updated them -- sorry for the miscommunication @KamalakerDadi! I have the old ones and can roll back, but I'll wait for an explicit green light one way or the other.
Ok. No worries.
@emdupre Thanks for being so quick, you are the coolest, and sorry for the inconvenience you have to bear.
@KamalakerDadi I triggered a rebuild to see if that is the only error.
Neither, but qualitatively it gives the impression of slightly too much smoothing of the data.
Agreed. Is the data that we are downloading already smoothed (ping @illdopejake)?
but they were smoothed prior to distributing as OpenNeuro derivatives. From the derivatives description:
All data were smoothed using a Gaussian filter (5mm kernel).
Ah! What a pity!
We find that it is very useful to have the data not smoothed, amongst
other things for didactic reasons. I, for one, believe that the data
should never be smoothed as part of preprocessing: it's something so easy
to do on the fly.
Any chance that we can replay the preprocessing without the smoothing
option? (it feels like a major amount of work, but if we want this
dataset to become nilearn's standard dataset, it's important).
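As a side note on why on-the-fly smoothing is "so easy": a Gaussian smooth is just a convolution, with sigma derived from the requested FWHM. A minimal pure-NumPy 1-D sketch of the idea (in practice one would call `nilearn.image.smooth_img`, which accepts an `fwhm` argument directly):

```python
import numpy as np

def fwhm_to_sigma(fwhm):
    # FWHM = 2 * sqrt(2 * ln 2) * sigma, i.e. FWHM ~ 2.3548 * sigma
    return fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))

def gaussian_smooth_1d(signal, fwhm_vox):
    """Smooth a 1-D signal with a Gaussian kernel whose width
    is given as FWHM in voxel units."""
    sigma = fwhm_to_sigma(fwhm_vox)
    radius = int(np.ceil(4 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()  # normalize so total intensity is preserved
    return np.convolve(signal, kernel, mode="same")

# A unit spike spreads out under smoothing but keeps its total mass
spike = np.zeros(21)
spike[10] = 1.0
smoothed = gaussian_smooth_1d(spike, fwhm_vox=5.0)
print(round(smoothed.sum(), 3))  # 1.0
```

This is exactly why smoothing at load time is cheap relative to re-running a preprocessing pipeline: it is a single separable convolution per volume.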
Ah, sorry, we have this conversation ongoing in two different places! Maybe we can consolidate here, @KamalakerDadi @illdopejake? Update from one of the comment threads:
That would be great. Thanks a lot! Once you are done, please let me know and I will update my PR.
I left a message on their OpenNeuro dataset, but it doesn't seem like they check it very often. I will try to reach out directly to Dr. Saxe.
Yes, I have an account we can use. I finally return home this weekend, so we can work on this next week if you want/have time, @emdupre?
Thanks to @KamalakerDadi for the reminder on this! All of the subjects have now been re-processed with fMRIPrep (version 1.3.0.post2). I applied the provided brainmask, downsampled to 4mm isotropic resolution, and re-cast to int8. The updated files are now available on OSF: https://osf.io/5hju4/files/ Hopefully this will work! Let me know if I should make any modifications.
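For a rough sense of why the int8 re-cast makes the dataset so much lighter: it is a 4x storage reduction versus float32, at the cost of quantizing each volume to 256 levels. The exact scaling used for the OSF files is not stated here, so the scheme below is purely illustrative:

```python
import numpy as np

def quantize_to_int8(data):
    """Linearly rescale data to fit int8 and cast.

    Returns the quantized array plus the scale factor needed
    to approximately recover the original values.
    """
    peak = np.abs(data).max()
    scale = peak / 127.0 if peak > 0 else 1.0
    return np.round(data / scale).astype(np.int8), scale

# Toy stand-in for a BOLD volume
rng = np.random.default_rng(0)
bold = rng.normal(size=(8, 8, 8)).astype(np.float32)

q, scale = quantize_to_int8(bold)
print(q.dtype, bold.nbytes // q.nbytes)  # int8 4
restored = q.astype(np.float32) * scale
# Rounding error is bounded by half a quantization step
print(float(np.max(np.abs(restored - bold))) < scale)  # True
```

For visualization and teaching examples, this quantization error is well below anything visible in the plots, which is what makes the trade-off attractive for a "lightweight" teaching dataset.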
Excellent! So, the data uploaded there is now unsmoothed? Thanks a lot! I will start working on this.
Right, fMRIPrep does not smooth. I'll add a README with the boilerplate text from fMRIPrep. EDIT: I've added a README, viewable here: https://osf.io/wjtyq/ |
[fix] update MAIN fetcher for new OSF filenames
Closing this in favour of #1953
This PR follows up on the discussion in issue #1864 to test using the lightweight MAIN datasets in all examples that used
fetch_adhd
. This PR helps to see whether the results look good with the MAIN data. If the results are good, the ADHD datasets will be replaced with MAIN. See issue #1864 for more details and the efforts made to make this data lightweight.
Still to do:
fetch_adhd