Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] test examples to replace ADHD with MAIN datasets #1887

Closed
wants to merge 130 commits into from

Conversation

KamalakerDadi
Copy link
Contributor

This PR follows up based on discussion happened on issue #1864 to test using light weight MAIN datasets in all examples which used fetch_adhd. This PR helps to see if they results good with this MAIN data. If the results are good then ADHD datasets will be replaced with MAIN.

See issue #1864 for more details and efforts made to make this data light weight.

Still to do:

  • Replace few more examples which uses fetch_adhd

kchawla-pi and others added 14 commits November 5, 2018 19:33
…ns like check_niimg. Adding copy_header to load_iimg fixes this
…I timeout

 - Increased CircleCI timeout to 7hrs due to recent increase in build time.
* 'master' of https://github.com/nilearn/nilearn: (138 commits)
  Fixed: Full builds do not create a testenv before installation, causing failure
  Reformatted commands for clarity after discussion with Gael
  Removed junk text that was added for testing circleci cached builds
  Added junk text for testing circleci cached builds, to be removed later
  Added back the inadvertently removed command for installation of linux packages
  Added back build for py2.7 without matplotlib after discussion wit Kamalaker
  Trivial change to make CircleCI reappear
  Removed 2 of 3 Py2.7 builds (unnecessary), added Py3.6 build
  Added comment to trigger change to make the disappeared CircleCI reappear in PR
  Improved: CircleCI builds are more efficient
  Bumped up the version to 0.5.0
  Corrected example images; small tweak in whats_new.rst
  Updated commits count for contributors
  Removed the erring comma in view_img parameters, fixed 2 example images
  Incorporated most of the feedback from Kamalaker's first review of this PR
  Updated contributor commits count
  detail
  improve warning
  vmin=0 when using threshold
  add vmin to view_stat_map
  ...
@codecov
Copy link

codecov bot commented Dec 22, 2018

Codecov Report

Merging #1887 into master will decrease coverage by <.01%.
The diff coverage is 94.25%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1887      +/-   ##
==========================================
- Coverage   95.23%   95.23%   -0.01%     
==========================================
  Files         135      135              
  Lines       17320    17407      +87     
==========================================
+ Hits        16495    16577      +82     
- Misses        825      830       +5
Impacted Files Coverage Δ
nilearn/datasets/__init__.py 100% <ø> (ø) ⬆️
nilearn/datasets/tests/test_func.py 100% <100%> (ø) ⬆️
nilearn/datasets/func.py 89.12% <91.07%> (+0.23%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c2a5975...948ab37. Read the comment docs.

@codecov
Copy link

codecov bot commented Dec 22, 2018

Codecov Report

Merging #1887 into master will increase coverage by 0.08%.
The diff coverage is 96.73%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1887      +/-   ##
==========================================
+ Coverage   95.23%   95.32%   +0.08%     
==========================================
  Files         135      137       +2     
  Lines       17320    17856     +536     
==========================================
+ Hits        16495    17021     +526     
- Misses        825      835      +10
Impacted Files Coverage Δ
nilearn/datasets/__init__.py 100% <ø> (ø) ⬆️
nilearn/datasets/tests/test_func.py 100% <100%> (ø) ⬆️
nilearn/plotting/displays.py 95.45% <100%> (+0.43%) ⬆️
nilearn/datasets/func.py 89.51% <94.73%> (+0.62%) ⬆️
nilearn/tests/test_init.py 94.44% <0%> (-5.56%) ⬇️
nilearn/__init__.py 90.9% <0%> (-1.1%) ⬇️
nilearn/plotting/html_stat_map.py 98.64% <0%> (-0.01%) ⬇️
nilearn/plotting/tests/test_html_connectome.py 100% <0%> (ø) ⬆️
... and 21 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c2a5975...1025c84. Read the comment docs.

@GaelVaroquaux
Copy link
Member

Overall, this seems to work: some examples probably require tweaks, but nothing seems fundamentally broken.

@KamalakerDadi
Copy link
Contributor Author

Overall, this seems to work: some examples probably require tweaks, but nothing seems fundamentally broken.

There are few examples which are broken when I ran them on my box. Basically, we need to change n_subjects or tweak some parameters. Thats why I left them unchanged to get some opinion on current push. I will push with all examples changed.

The first EPI image of the data, visible on https://3567-1235740-gh.circle-artifacts.com/0/home/circleci/project/doc/_build/html/auto_examples/03_connectivity/plot_rest_parcellations.html#sphx-glr-auto-examples-03-connectivity-plot-rest-parcellations-py, looks very strange to me. On an EPI, I expect the ventricles to stand out as the regions with the most signal

What would you like to change here ? Probably also increase more subjects ?

@@ -90,7 +90,7 @@
from nilearn.image import index_img

# Selecting specific maps to display: maps were manually chosen to be similar
indices = {dict_learning: 25, canica: 33}
indices = {dict_learning: 24, canica: 32}
# We select relevant cut coordinates for displaying
cut_component = index_img(components_imgs[0], indices[dict_learning])
cut_coords = find_xyz_cut_coords(cut_component)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KamalakerDadi , on this example, some of the visualizations of regions are broken: the contours are present, but not filled. Do you have an idea of why that might be the case?
https://3567-1235740-gh.circle-artifacts.com/0/home/circleci/project/doc/_build/html/auto_examples/03_connectivity/plot_compare_resting_state_decomposition.html#sphx-glr-auto-examples-03-connectivity-plot-compare-resting-state-decomposition-py
Do you have

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well yeah I saw that and needs to be investigated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is OK, but the CanICA components is probably not 32.
Also the images are extremely smooth.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the images are extremely smooth.

Is this a bad sign or good sign ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the images downloaded smoothed? @emdupre @illdopejake , do you know?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK we did not additionally smooth them (please correct me if that's incorrect @illdopejake !), but they were smoothed prior to distributing as OpenNeuro derivatives. From the derivatives description:

All data were smoothed using a Gaussian filter (5mm kernel).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct, we did not do any additional smoothing. It's possible that we could reach out to the Saxe lab to see if they would mind posted the unsmoothed data?

AFAIK we did not additionally smooth them (please correct me if that's incorrect @illdopejake !)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any news on this non-smoothed data sharing aspect ?

Copy link
Member

@emdupre emdupre Feb 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be easiest to reprocess them ourselves, but not sure if @illdopejake is in contact with Saxe lab.

If we do reprocess them ourselves I can run them through fMRIPrep on Compute Canada, but I need to set up an account. Not sure if you already have one we could submit through, Jake ? I have an fMRIPrep singularity image....

adhd_dataset = datasets.fetch_adhd(n_subjects=20)
func_filenames = adhd_dataset.func
confounds = adhd_dataset.confounds
main_dataset = datasets.fetch_main(n_subjects=20)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On this example, I think that we could increase the number of components from 5 to 8.

@KamalakerDadi
Copy link
Contributor Author

Side note: I won't bother about fixing AppVeyor right now. My current aim is to make examples going good.

@GaelVaroquaux
Copy link
Member

GaelVaroquaux commented Dec 23, 2018 via email

Copy link
Member

@bthirion bthirion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great PR. Thx !
Besides the small comments I added, I endorse those of Gael.


Notes
-----
This functional MRI datasets are used as part of teaching how to use
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is used

----------
data_dir: str
Path of the data directory. Used to force data storage in a specified
location. If None is given, data is stored in home directory.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data are stored in the home directory

The original data is downloaded from OpenNeuro
https://openneuro.org/datasets/ds000228/versions/1.0.0

This fetcher downloads downsampled data which is uploaded to Open
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... data that are available on Open...

@kchawla-pi
Copy link
Collaborator

kchawla-pi commented Feb 20, 2019

Huh.
Do these files work on their windows systems?
Any other datasets with colons in the filename that have not produced an error in AppveyorCI?

@emdupre
Copy link
Member

emdupre commented Feb 20, 2019

We can change the filenames on OSF -- it's from the original directory flattening. I don't develop in Windows (nor does anyone else who was in the MAIN tutorial, to my knowledge ?), so we likely just missed this entirely !

Would it be helpful for us to update the files ?

@kchawla-pi
Copy link
Collaborator

@emdupre That will be best. Perhaps a hyphen?

@kchawla-pi
Copy link
Collaborator

Thanks @emdupre !

@kchawla-pi
Copy link
Collaborator

kchawla-pi commented Feb 20, 2019

You guys already have a bunch of hyphens in there, so I will leave it to you.

The following are reserved characters in windows:

    < (less than)
    > (greater than)
    : (colon)
    " (double quote)
    / (forward slash)
    \ (backslash)
    | (vertical bar or pipe)
    ? (question mark)
    * (asterisk)

Source: https://docs.microsoft.com/en-us/windows/desktop/fileio/naming-a-file

@emdupre
Copy link
Member

emdupre commented Feb 20, 2019

OK, I just kept the filename and removed the (flattened) directory structure. Please let me know if there are other issues !

@KamalakerDadi
Copy link
Contributor Author

We can change the filenames on OSF

I want to try some things before changing the filenames.

@kchawla-pi
Copy link
Collaborator

@KamalakerDadi I think changing the filenames is the right call, so that other users running windows who directly access and download the files do not face this complication.

@KamalakerDadi
Copy link
Contributor Author

Why AppVeyor is building ?

@emdupre
Copy link
Member

emdupre commented Feb 20, 2019

I actually just updated them -- sorry for the miscommunication @KamalakerDadi ! I have the old ones and can roll back, but I'll wait for an explicit green light one way or the other.

@KamalakerDadi
Copy link
Contributor Author

I actually just updated them -- sorry for the miscommunication @KamalakerDadi

Ok. No worries.

@kchawla-pi
Copy link
Collaborator

@emdupre Thanks for being so quick, you are the coolest, and sorry for the inconvenience you have to bear.

@kchawla-pi
Copy link
Collaborator

@KamalakerDadi I triggerred a rebuild to see if that is the only error.

@GaelVaroquaux
Copy link
Member

GaelVaroquaux commented Feb 20, 2019 via email

@GaelVaroquaux
Copy link
Member

GaelVaroquaux commented Feb 21, 2019 via email

@emdupre
Copy link
Member

emdupre commented Feb 26, 2019

Ah, sorry, we have this conversation ongoing in two different places ! Maybe we can consolidate here, @KamalakerDadi @illdopejake ?

Update from one of the comment threads:

I think it would be easiest to reprocess them ourselves, but not sure if @illdopejake is in contact with Saxe lab.

If we do reprocess them ourselves I can run them through fMRIPrep on Compute Canada, but I need to set up an account. Not sure if you already have one we could submit through, Jake ? I have an fMRIPrep singularity image....

@KamalakerDadi
Copy link
Contributor Author

If we do reprocess them ourselves I can run them through fMRIPrep on Compute Canada, but I need to set up an account. Not sure if you already have one we could submit through, Jake ? I have an fMRIPrep singularity image....

That would be great. Thanks a lot!

Once you are done. Please let me know. I will update my PR.

@illdopejake
Copy link
Contributor

I think it would be easiest to reprocess them ourselves, but not sure if @illdopejake is in contact with Saxe lab.

I left a message on their openneuro dataset, but it doesn't seem like they check it very often. I will try to reach out directly to Dr. Saxe.

If we do reprocess them ourselves I can run them through fMRIPrep on Compute Canada, but I need to set up an account. Not sure if you already have one we could submit through, Jake ? I have an fMRIPrep singularity image....

Yes I have an account we can use. I finally return home this weekend so we can work on this next week if you want/have time, @emdupre ?

@emdupre
Copy link
Member

emdupre commented Mar 18, 2019

Thanks to @KamalakerDadi for the reminder on this !

All of the subjects have now been re-processed with fMRIPrep (version 1.3.0.post2). I applied the provided brainmask, downsampled to 4mm isotropic resolution, and re-cast to int8. The updated files are now available on OSF: https://osf.io/5hju4/files/

Hopefully this will work ! Let me know if I should make any modifications.

@KamalakerDadi
Copy link
Contributor Author

The updated files are now available on OSF: https://osf.io/5hju4/files/

Excellent!

So, the data uploaded there is now non-smooth ?
It would be great if you could add simple README.txt there to read about pre-processing steps.

Thanks a lot! I will start working on this.

@emdupre
Copy link
Member

emdupre commented Mar 18, 2019

Right, fMRIPrep does not smooth. I'll add a README with the boilerplate text from fMRIPrep.

EDIT: I've added a README, viewable here: https://osf.io/wjtyq/

@KamalakerDadi
Copy link
Contributor Author

Closing this in favour of #1953

@KamalakerDadi KamalakerDadi deleted the downloader_main branch April 13, 2019 20:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet