[WIP] test examples to replace ADHD with MAIN datasets #1887

KamalakerDadi · 2018-12-22T13:20:35Z

This PR follows up on the discussion in issue #1864 to test using the lightweight MAIN datasets in all examples that used fetch_adhd. It lets us check whether those examples give good results with the MAIN data. If the results are good, the ADHD datasets will be replaced with MAIN.

See issue #1864 for more details and the efforts made to make this data lightweight.

Still to do:

Replace a few more examples that use fetch_adhd

codecov · 2018-12-22T15:30:14Z

Codecov Report

Merging #1887 into master will decrease coverage by <.01%.
The diff coverage is 94.25%.

@@            Coverage Diff             @@
##           master    #1887      +/-   ##
==========================================
- Coverage   95.23%   95.23%   -0.01%     
==========================================
  Files         135      135              
  Lines       17320    17407      +87     
==========================================
+ Hits        16495    16577      +82     
- Misses        825      830       +5

Impacted Files	Coverage Δ
nilearn/datasets/__init__.py	`100% <ø> (ø)`	⬆️
nilearn/datasets/tests/test_func.py	`100% <100%> (ø)`	⬆️
nilearn/datasets/func.py	`89.12% <91.07%> (+0.23%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c2a5975...948ab37. Read the comment docs.

codecov · 2018-12-22T15:30:14Z

Codecov Report

Merging #1887 into master will increase coverage by 0.08%.
The diff coverage is 96.73%.

@@            Coverage Diff             @@
##           master    #1887      +/-   ##
==========================================
+ Coverage   95.23%   95.32%   +0.08%     
==========================================
  Files         135      137       +2     
  Lines       17320    17856     +536     
==========================================
+ Hits        16495    17021     +526     
- Misses        825      835      +10

Impacted Files	Coverage Δ
nilearn/datasets/__init__.py	`100% <ø> (ø)`	⬆️
nilearn/datasets/tests/test_func.py	`100% <100%> (ø)`	⬆️
nilearn/plotting/displays.py	`95.45% <100%> (+0.43%)`	⬆️
nilearn/datasets/func.py	`89.51% <94.73%> (+0.62%)`	⬆️
nilearn/tests/test_init.py	`94.44% <0%> (-5.56%)`	⬇️
nilearn/__init__.py	`90.9% <0%> (-1.1%)`	⬇️
nilearn/plotting/html_stat_map.py	`98.64% <0%> (-0.01%)`	⬇️
nilearn/plotting/tests/test_html_connectome.py	`100% <0%> (ø)`	⬆️
... and 21 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c2a5975...1025c84. Read the comment docs.
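
The figures in the two reports can be cross-checked with a few lines of arithmetic. This is a minimal sketch, assuming that coverage is simply hits divided by tracked lines and that the displayed values are floored to two decimals. The flooring rule is an observation that happens to match these numbers, not a documented Codecov behaviour, and the helper names are made up for illustration.

```python
import math


def coverage_percent(hits, lines):
    """Return line coverage as a percentage of tracked lines."""
    return 100.0 * hits / lines


def floor_to_2_decimals(value):
    """Floor to two decimals; an assumed display rule, not a documented one."""
    return math.floor(value * 100) / 100


# (hits, lines) pairs copied from the two reports.
master = (16495, 17320)
reports = {"first report": (16577, 17407), "second report": (17021, 17856)}

base = coverage_percent(*master)
for label, (hits, lines) in reports.items():
    pct = coverage_percent(hits, lines)
    print(label,
          "misses:", lines - hits,
          "coverage:", floor_to_2_decimals(pct),
          "delta:", floor_to_2_decimals(pct - base))
```

Under these assumptions the misses (830 and 835) and the coverage deltas (-0.01 and +0.08) agree with what both reports show.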

GaelVaroquaux · 2018-12-23T13:11:09Z

Overall, this seems to work: some examples probably require tweaks, but nothing seems fundamentally broken.

I would replace the name "main" by something else. Maybe "fetch_rsfmri_development"? What do you think @illdopejake?
The first EPI image of the data, visible on https://3567-1235740-gh.circle-artifacts.com/0/home/circleci/project/doc/_build/html/auto_examples/03_connectivity/plot_rest_parcellations.html#sphx-glr-auto-examples-03-connectivity-plot-rest-parcellations-py, looks very strange to me. On an EPI, I expect the ventricles to stand out as the regions with the most signal
The data is now probably of smaller resolution. As a result, the number of components can be increased in https://3567-1235740-gh.circle-artifacts.com/0/home/circleci/project/doc/_build/html/auto_examples/03_connectivity/plot_extract_regions_dictlearning_maps.html#sphx-glr-auto-examples-03-connectivity-plot-extract-regions-dictlearning-maps-py. Maybe from 5 to 8?

KamalakerDadi · 2018-12-23T13:16:06Z

Overall, this seems to work: some examples probably require tweaks, but nothing seems fundamentally broken.

A few examples broke when I ran them on my box. Basically, we need to change n_subjects or tweak some parameters. That's why I left them unchanged, to get some opinions on the current push. I will push again with all examples changed.

The first EPI image of the data, visible on https://3567-1235740-gh.circle-artifacts.com/0/home/circleci/project/doc/_build/html/auto_examples/03_connectivity/plot_rest_parcellations.html#sphx-glr-auto-examples-03-connectivity-plot-rest-parcellations-py, looks very strange to me. On an EPI, I expect the ventricles to stand out as the regions with the most signal

What would you like to change here? Perhaps also increase the number of subjects?

GaelVaroquaux · 2018-12-23T13:17:21Z

examples/03_connectivity/plot_compare_resting_state_decomposition.py

@@ -90,7 +90,7 @@
 from nilearn.image import index_img

 # Selecting specific maps to display: maps were manually chosen to be similar
-indices = {dict_learning: 25, canica: 33}
+indices = {dict_learning: 24, canica: 32}
 # We select relevant cut coordinates for displaying
 cut_component = index_img(components_imgs[0], indices[dict_learning])
 cut_coords = find_xyz_cut_coords(cut_component)


@KamalakerDadi , on this example, some of the visualizations of regions are broken: the contours are present, but not filled. Do you have an idea of why that might be the case?
https://3567-1235740-gh.circle-artifacts.com/0/home/circleci/project/doc/_build/html/auto_examples/03_connectivity/plot_compare_resting_state_decomposition.html#sphx-glr-auto-examples-03-connectivity-plot-compare-resting-state-decomposition-py

Well, yeah, I saw that; it needs to be investigated.

Does this look good now?

https://3645-1235740-gh.circle-artifacts.com/0/home/circleci/project/doc/_build/html/auto_examples/03_connectivity/plot_compare_resting_state_decomposition.html#sphx-glr-auto-examples-03-connectivity-plot-compare-resting-state-decomposition-py

It is OK, but the CanICA component is probably not 32.
Also the images are extremely smooth.

Also the images are extremely smooth.

Is this a bad sign or a good sign?

Are the images downloaded smoothed? @emdupre @illdopejake , do you know?

AFAIK we did not additionally smooth them (please correct me if that's incorrect @illdopejake !), but they were smoothed prior to distributing as OpenNeuro derivatives. From the derivatives description:

All data were smoothed using a Gaussian filter (5mm kernel).

That's correct, we did not do any additional smoothing. Perhaps we could reach out to the Saxe lab to see if they would mind posting the unsmoothed data?

AFAIK we did not additionally smooth them (please correct me if that's incorrect @illdopejake !)

Any news on sharing the non-smoothed data?

I think it would be easiest to reprocess them ourselves, but not sure if @illdopejake is in contact with Saxe lab.

If we do reprocess them ourselves I can run them through fMRIPrep on Compute Canada, but I need to set up an account. Not sure if you already have one we could submit through, Jake ? I have an fMRIPrep singularity image....

GaelVaroquaux · 2018-12-23T13:18:10Z

examples/03_connectivity/plot_extract_regions_dictlearning_maps.py

-adhd_dataset = datasets.fetch_adhd(n_subjects=20)
-func_filenames = adhd_dataset.func
-confounds = adhd_dataset.confounds
+main_dataset = datasets.fetch_main(n_subjects=20)


On this example, I think that we could increase the number of components from 5 to 8.

KamalakerDadi · 2018-12-23T13:19:38Z

Side note: I won't bother about fixing AppVeyor right now. My current aim is to get the examples working well.

GaelVaroquaux · 2018-12-23T20:04:14Z

Side note: I won't bother about fixing AppVeyor right now. My current aim is to get the examples working well.

Agreed!

bthirion

This is a great PR. Thx !
Besides the small comments I added, I endorse those of Gael.

bthirion · 2018-12-27T17:36:37Z

nilearn/datasets/description/main.rst

+
+Notes
+-----
+This functional MRI datasets are used as part of teaching how to use


bthirion · 2018-12-27T17:37:35Z

nilearn/datasets/func.py

+    ----------
+    data_dir: str
+        Path of the data directory. Used to force data storage in a specified
+        location. If None is given, data is stored in home directory.


data are stored in the home directory

bthirion · 2018-12-27T17:40:00Z

nilearn/datasets/func.py

+    The original data is downloaded from OpenNeuro
+    https://openneuro.org/datasets/ds000228/versions/1.0.0
+
+    This fetcher downloads downsampled data which is uploaded to Open


... data that are available on Open...

kchawla-pi · 2019-02-20T17:40:05Z

Huh.
Do these files work on their windows systems?
Any other datasets with colons in the filename that have not produced an error in AppveyorCI?

emdupre · 2019-02-20T18:54:49Z

We can change the filenames on OSF -- it's from the original directory flattening. I don't develop in Windows (nor does anyone else who was in the MAIN tutorial, to my knowledge ?), so we likely just missed this entirely !

Would it be helpful for us to update the files ?

kchawla-pi · 2019-02-20T18:56:30Z

@emdupre That will be best. Perhaps a hyphen?

kchawla-pi · 2019-02-20T18:56:59Z

Thanks @emdupre !

kchawla-pi · 2019-02-20T19:05:24Z

You guys already have a bunch of hyphens in there, so I will leave it to you.

The following are reserved characters in windows:

    < (less than)
    > (greater than)
    : (colon)
    " (double quote)
    / (forward slash)
    \ (backslash)
    | (vertical bar or pipe)
    ? (question mark)
    * (asterisk)

Source: https://docs.microsoft.com/en-us/windows/desktop/fileio/naming-a-file
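
A quick way to spot and rename offending files is sketched below. It uses only the reserved characters listed above; the function names and the sample filename are hypothetical and are not the actual OSF filenames. It is meant for a single filename, not a full path, since slashes are replaced too.

```python
# Reserved characters on Windows, as listed above.
WINDOWS_RESERVED = '<>:"/\\|?*'


def reserved_chars_in(name):
    """Return the sorted reserved characters that occur in a filename."""
    return sorted({ch for ch in name if ch in WINDOWS_RESERVED})


def sanitize_filename(name, replacement="-"):
    """Replace every reserved character in a single filename component."""
    if any(ch in WINDOWS_RESERVED for ch in replacement):
        raise ValueError("replacement must not contain a reserved character")
    return "".join(replacement if ch in WINDOWS_RESERVED else ch
                   for ch in name)


# Hypothetical flattened name, for illustration only.
example = "derivatives:sub-01:func.nii.gz"
print(reserved_chars_in(example))
print(sanitize_filename(example))
```

Running the check over a directory listing before uploading would catch colons introduced by directory flattening without needing a Windows machine.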

emdupre · 2019-02-20T19:19:28Z

OK, I just kept the filename and removed the (flattened) directory structure. Please let me know if there are other issues !

KamalakerDadi · 2019-02-20T19:26:43Z

We can change the filenames on OSF

I want to try some things before changing the filenames.

kchawla-pi · 2019-02-20T19:28:30Z

@KamalakerDadi I think changing the filenames is the right call, so that other users running windows who directly access and download the files do not face this complication.

KamalakerDadi · 2019-02-20T19:28:30Z

Why is AppVeyor building?

emdupre · 2019-02-20T19:28:32Z

I actually just updated them -- sorry for the miscommunication @KamalakerDadi ! I have the old ones and can roll back, but I'll wait for an explicit green light one way or the other.

KamalakerDadi · 2019-02-20T19:29:14Z

I actually just updated them -- sorry for the miscommunication @KamalakerDadi

Ok. No worries.

kchawla-pi · 2019-02-20T19:30:04Z

@emdupre Thanks for being so quick, you are the coolest, and sorry for the inconvenience you have to bear.

kchawla-pi · 2019-02-20T19:30:37Z

@KamalakerDadi I triggered a rebuild to see if that is the only error.

GaelVaroquaux · 2019-02-20T21:08:36Z

Neither, but qualitatively it gives the impression of slightly too much smoothing on the data.

Agreed. Is the data that we are downloading already smoothed (ping @illdopejake)?

GaelVaroquaux · 2019-02-21T05:54:11Z

but they were smoothed prior to distributing as OpenNeuro derivatives. From the derivatives description: All data were smoothed using a Gaussian filter (5mm kernel).

Ah! What a pity! We find that it is very useful to have the data not smoothed, amongst other things for didactic reasons. I, for one, believe that the data should never be smoothed as part of preprocessing: it's something so easy to do on the fly. Any chance that we can replay the preprocessing without the smoothing option? (it feels like a major amount of work, but if we want this dataset to become nilearn's standard dataset, it's important).
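
To illustrate why smoothing is cheap to apply on the fly, here is a minimal pure-Python sketch of Gaussian smoothing along one axis. It assumes the usual conversion between FWHM and standard deviation, sigma = FWHM / (2 * sqrt(2 * ln 2)). The 2 mm voxel size and the toy signal are illustrative assumptions, not properties of this dataset. In practice nilearn offers this through `nilearn.image.smooth_img` and the `smoothing_fwhm` parameter of its maskers; the code below does not use them, so that it stays self-contained.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian full width at half maximum to a standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_kernel(fwhm_mm, voxel_size_mm, truncate=4.0):
    """Build a normalized 1D Gaussian kernel expressed in voxel units."""
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_size_mm
    radius = int(math.ceil(truncate * sigma_vox))
    weights = [math.exp(-0.5 * (i / sigma_vox) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_1d(signal, kernel):
    """Convolve a signal with the kernel, repeating edge values at the borders."""
    radius = len(kernel) // 2
    n = len(signal)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        smoothed.append(acc)
    return smoothed


# Toy example: a single spike smoothed with a 5 mm FWHM kernel,
# assuming 2 mm voxels (illustrative value only).
kernel = gaussian_kernel(fwhm_mm=5.0, voxel_size_mm=2.0)
spike = [0, 0, 0, 0, 10, 0, 0, 0, 0]
print(smooth_1d(spike, kernel))
```

Because the operation is this small, distributing unsmoothed data costs users nothing: anyone who wants a 5 mm kernel can apply it at load time, whereas smoothing baked into the files cannot be undone.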

emdupre · 2019-02-26T16:22:18Z

Ah, sorry, we have this conversation ongoing in two different places ! Maybe we can consolidate here, @KamalakerDadi @illdopejake ?

Update from one of the comment threads:

I think it would be easiest to reprocess them ourselves, but not sure if @illdopejake is in contact with Saxe lab.

If we do reprocess them ourselves I can run them through fMRIPrep on Compute Canada, but I need to set up an account. Not sure if you already have one we could submit through, Jake ? I have an fMRIPrep singularity image....

KamalakerDadi · 2019-02-26T18:35:10Z

If we do reprocess them ourselves I can run them through fMRIPrep on Compute Canada, but I need to set up an account. Not sure if you already have one we could submit through, Jake ? I have an fMRIPrep singularity image....

That would be great. Thanks a lot!

Once you are done, please let me know. I will update my PR.

illdopejake · 2019-02-26T20:55:07Z

I think it would be easiest to reprocess them ourselves, but not sure if @illdopejake is in contact with Saxe lab.

I left a message on their openneuro dataset, but it doesn't seem like they check it very often. I will try to reach out directly to Dr. Saxe.

If we do reprocess them ourselves I can run them through fMRIPrep on Compute Canada, but I need to set up an account. Not sure if you already have one we could submit through, Jake ? I have an fMRIPrep singularity image....

Yes I have an account we can use. I finally return home this weekend so we can work on this next week if you want/have time, @emdupre ?

emdupre · 2019-03-18T19:47:29Z

Thanks to @KamalakerDadi for the reminder on this !

All of the subjects have now been re-processed with fMRIPrep (version 1.3.0.post2). I applied the provided brainmask, downsampled to 4mm isotropic resolution, and re-cast to int8. The updated files are now available on OSF: https://osf.io/5hju4/files/

Hopefully this will work ! Let me know if I should make any modifications.

KamalakerDadi · 2019-03-18T20:02:16Z

The updated files are now available on OSF: https://osf.io/5hju4/files/

Excellent!

So, the data uploaded there is now unsmoothed?
It would be great if you could add a simple README.txt there describing the preprocessing steps.

Thanks a lot! I will start working on this.

emdupre · 2019-03-18T20:04:52Z

Right, fMRIPrep does not smooth. I'll add a README with the boilerplate text from fMRIPrep.

EDIT: I've added a README, viewable here: https://osf.io/wjtyq/

KamalakerDadi · 2019-03-30T12:17:53Z

Closing this in favour of #1953

kchawla-pi and others added 14 commits November 5, 2018 19:33

Merge branch 'master' of https://github.com/nilearn/nilearn

55e253a

New OrthoSlicer Class

f8d7c55

Added two tests for display class TiledSlicer

2ede117

added example

50dc979

If dtype!=None then information in the header gets deleted in functions like check_niimg. Adding copy_header to load_iimg fixes this

e26a66d

Add call to update header's dtype

68385f5

Added a DeprecationWarning for Python 3.4

d49290b

Added tests for DeprecationWarnings for Python 3.4; increased CircleCI timeout - Increased CircleCI timeout to 7hrs due to recent increase in build time.

18b3a93

adressing requested changes

e52b5d6

improved docstring and formatting

bb2a460

fixed issue: not all cuts displayed

3887e83

added option 'display_mode = tiled' to docstring of plotting functions

c8ac52b

[WIP] test examples to replace ADHD with MAIN datasets

70ee303

KamalakerDadi mentioned this pull request Dec 22, 2018

Adding a new machine learning tutorial to Nilearn #1864

Closed

tests in AppVeyor

948ab37

GaelVaroquaux reviewed Dec 23, 2018

View reviewed changes

KamalakerDadi added 4 commits December 24, 2018 17:38

Changed fetch_adhd to main functional datasets in all examples

7f0659c

DOC: fix pattern not found in masker_objects.rst

9f7b14b

DOC: Added in modules reference.rst

43feabb

DOC: Fix title underline too short issue

fb9dd79

bthirion reviewed Dec 27, 2018

View reviewed changes

added demo for TiledSlicer

b625164

emdupre and others added 5 commits March 29, 2019 17:18

[fix] update MAIN fetcher for new OSF filenames

b1d2007

[temp] fake sub63

ec407a3

Merge pull request #9 from emdupre/fix/update-osf

b5aab12

[fix] update MAIN fetcher for new OSF filenames

confl

4eccb80

confl mess

1025c84

KamalakerDadi mentioned this pull request Mar 30, 2019

Update ADHD dataset in examples to MAIN #1953

Merged

KamalakerDadi closed this Mar 30, 2019

KamalakerDadi deleted the downloader_main branch April 13, 2019 20:11
