Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Modify fetch_development_fmri to fetch adults or children #2035

Merged
merged 33 commits into from
Nov 9, 2019

Conversation

illdopejake
Copy link
Contributor

This updates the fetch_development_fmri to do the following:

  • Added an argument adult_or_child that lets the user decide if they want to pull only adults or only children
  • Made it so the first subject(s) pulled are adults, unless adult_or_child=='child'

@emdupre if you want to look over it. I have not made any project on the n<155 issue
@gkiar I did my first test if you want to have a look.

@illdopejake illdopejake changed the title Sprint modify fetch dev Modify fetch_development_fmri to fetch adults or children Apr 19, 2019
nilearn/datasets/func.py Outdated Show resolved Hide resolved
illdopejake and others added 2 commits April 19, 2019 17:33
Co-Authored-By: illdopejake <jacobwvogel@gmail.com>
Copy link
Member

@bthirion bthirion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise.

nilearn/datasets/func.py Outdated Show resolved Hide resolved
Copy link
Member

@emdupre emdupre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Great catch on this. I'll look into the 155 thing, too !

nilearn/datasets/func.py Outdated Show resolved Hide resolved
nilearn/datasets/func.py Outdated Show resolved Hide resolved
@@ -2076,35 +2083,44 @@ def fetch_development_fmri(n_subjects=None, reduce_confounds=True,
# Participants data: ids, demographics, etc
participants = _fetch_development_fmri_participants(data_dir=data_dir,
url=None,
verbose=verbose)
verbose=verbose
)[::-1] # so adults come first
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do this given L2121 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah -- unfortunately the next line does not actually pay attention to the order of the hstack. When I was doing it locally, it wasn't actually changing the order

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's odd.... maybe I can dig into that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the "problem" comes in at np.in1d. It returns a sorted array, and since children have the earlier IDs, they're returned first.One idea, then would be:

participants = participants[np.in1d(participants['participant_id'], ids)][ids.argsort()]

Since we'd be passing the indices of the unsorted array, this would return the list with the ids in the ordering generated from np.hstack.

WDYT ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm into it! I'll make the adjustment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry should've been more specific. I did:

ds = datasets.fetch_development_fmri(n_subjects=5)

and I got

array([('sub-pixar002',  4.8569473 , '4yo', 'child', 'F', 'R'),
       ('sub-pixar003',  4.15331964, '4yo', 'child', 'F', 'R'),
       ('sub-pixar004',  4.47364819, '4yo', 'child', 'F', 'R'),
       ('sub-pixar123', 27.06      , 'Adult', 'adult', 'F', 'R'),
       ('sub-pixar001',  4.77481177, '4yo', 'child', 'M', 'R')],
      dtype=[('participant_id', '<U12'), ('Age', '<f8'), ('AgeGroup', '<U6'), ('Child_Adult', '<U5'), ('Gender', '<U4'), ('Handedness', '<U4')])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. That doesn't match what I'm seeing...

But this is really small so I don't want to hold this PR up over it !

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔
I'm too jetlagged to figure this out. i'll try again in the morning perhaps?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this question get resolved? (and are y'all over your jetlag? 😄 )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emdupre and I met and solved it! I think this one is all set @GaelVaroquaux !

@codecov
Copy link

codecov bot commented Apr 22, 2019

Codecov Report

Merging #2035 into master will decrease coverage by 0.11%.
The diff coverage is 93.05%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2035      +/-   ##
==========================================
- Coverage   92.46%   92.35%   -0.12%     
==========================================
  Files         149      149              
  Lines       18945    18991      +46     
  Branches     2301     2307       +6     
==========================================
+ Hits        17518    17539      +21     
- Misses        911      932      +21     
- Partials      516      520       +4
Impacted Files Coverage Δ
nilearn/datasets/tests/test_func.py 99.44% <100%> (+0.03%) ⬆️
nilearn/datasets/func.py 82.46% <90.19%> (+1%) ⬆️
nilearn/reporting/rm_file.py 26.66% <0%> (-53.34%) ⬇️
nilearn/_utils/testing.py 78.76% <0%> (-4.8%) ⬇️
nilearn/plotting/__init__.py 95.65% <0%> (-4.35%) ⬇️
nilearn/plotting/tests/test_js_plotting_utils.py 97.6% <0%> (-1.6%) ⬇️
nilearn/datasets/tests/test_neurovault.py 95.84% <0%> (-0.98%) ⬇️
nilearn/image/image.py 95.63% <0%> (-0.73%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3c117b1...0c58906. Read the comment docs.

@emdupre
Copy link
Member

emdupre commented Apr 22, 2019

@illdopejake I just ran this fetcher and am getting all 155 subjects -- not sure why it wouldn't have worked for you previously... Can you confirm it still isn't working on your machine with the master branch ?

nilearn/datasets/func.py Outdated Show resolved Hide resolved
@illdopejake
Copy link
Contributor Author

Can you confirm it still isn't working on your machine with the master branch ?

Still getting the 83, even in master. However, I passed a different directory to data_dir, and it downloaded all 155 to that directory...

@kchawla-pi
Copy link
Collaborator

I passed a different directory to data_dir, and it downloaded all 155 to that directory...

@illdopejake @emdupre
Could it be that the fetcher sees some (already downloaded) files in the default directory and decides that the dataset is already downloaded? Could you move/rename the existing data in the default download directory and try re-fetching and check?

@illdopejake
Copy link
Contributor Author

Could it be that the fetcher sees some (already downloaded) files in the default directory and decides that the dataset is already downloaded?

I don't think so. I tried deleting a few files in the directory. Nilearn found the directory and downloaded the missing files.

Could you move/rename the existing data in the default download directory and try re-fetching and check?

So to make matters more odd, I went to the directory that nilearn found, and in fact, all 155 images are there. This is really bizarre but I don't know what to tell you to be able to reproduce this. I think we can assume its just idiosyncratic to something I'm doing / have done locally, and may not be worth looking into..

data = func.fetch_development_fmri(n_subjects=1, reduce_confounds=False,
verbose=1)
age_group = data.phenotypic['Child_Adult'][0]
assert_equal(age_group,'adult')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the right test!!

@GaelVaroquaux
Copy link
Member

Can you merge master, so that circle runs.
Thanks!

@illdopejake
Copy link
Contributor Author

Can you merge master, so that circle runs.
Thanks!

Hmm.. CircleCI still failing..

@GaelVaroquaux
Copy link
Member

Circle is passing. Azure pipelines is failing.

@kchawla-pi
Copy link
Collaborator

kchawla-pi commented Apr 26, 2019 via email

@emdupre
Copy link
Member

emdupre commented Oct 17, 2019

there is a bug here: when n_subjects == 1 it always returns an adult even if adults_or_children == "adults"there is a bug here: when n_subjects == 1 it always returns an adult even if adults_or_children == "adults"

This is now fixed with 18eb907 !

@illdopejake also caught another corner case -- when n_subjects=2 and adults_or_children=both, only two children were returned since the ratio of children to adults is so lopsided (even though a user might reasonably expect one of each when asking for both age groups and fetching only two subjects). He fixed this and added some relevant documentation in 7470f67 🚀

Even still, I think this is on hold until #2177 is merged because of the numerical instability in the example @jeromedockes and @kchawla-pi identified above.

But, one question for the group: I really don't love the adults_or_children parameter -- I find it's quite long and I often mistype the children value as child when working with it. WDYT of updating to age_group with possible values ['adult', 'child', 'both'] ?

@kchawla-pi
Copy link
Collaborator

WDYT of updating to age_group with possible values ['adult', 'child', 'both'] ?

+1 I like this. We are not a stickler for 'adults', I don't see why we have to be for 'children'.

kchawla-pi and others added 5 commits October 17, 2019 14:42
…nt_modify_fetch_dev

* 'master' of https://github.com/nilearn/nilearn:
  Removed sub-example due to unfit for lasso dataset - unstable float values (nilearn#2177)
  Fix Flake8 errors overlooked when merging PR nilearn#2028 (Plot connectome strength) (nilearn#2174)
[sty] rename adults_or_children to age_group, add test
Comment on lines 2088 to 2091
if age_group not in ['both', 'child', 'adult']:
raise ValueError("Wrong value for age_group={0}. "
"Valid arguments are 'adult', 'child' "
"or 'both'".format(age_group))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if age_group not in ['both', 'child', 'adult']:
raise ValueError("Wrong value for age_group={0}. "
"Valid arguments are 'adult', 'child' "
"or 'both'".format(age_group))
valid_groups = ['both', 'child', 'adult']
if age_group not in valid_groups:
raise ValueError("Wrong value for age_group={0}. "
"Valid arguments are {1}".format(age_group. valid_groups))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on this !

.format(n_subjects, max_subjects))
"value (for age_group={1}) will be used instead: "
"n_subjects={2}"
.format(n_subjects, age_group, max_subjects))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add a test where this warning is actually raised & verified.

@emdupre emdupre added the Blocker label Nov 4, 2019
@emdupre
Copy link
Member

emdupre commented Nov 4, 2019

I think this is blocking the beta release, as it will change the behavior of the development_fmri fetcher.

... but, it seems that we're just missing tests, here ! Would you be able to look into these some time this week, @illdopejake ?

…nt_modify_fetch_dev

* 'master' of https://github.com/nilearn/nilearn:
  Verbose doc building to ease tracking of progress & diagnose stalls (nilearn#2203)
  New conda env is created once conda path has been activated
  Conda environment is created for full-builds
  Refactor CircleCI config for reduced redundancy (nilearn#2204)
  Installation should fail on Python < 3.5 (nilearn#2198)
  [MRG] Add get data function (nilearn#2172)
  fix error when colorscale given boolean array (nilearn#2193)
  remove is_valid filter (nilearn#2169)
  Moved new entries to next release
  FIX: marker size issue in plot_connectome nilearn#2185 (nilearn#2186)
  Updated Appveyor status badge
  Run tests on local Windows machines & Azure Pipelines (nilearn#2191)
  Updated requirements list for devs (nilearn#2190)
  Update azure-pipelines.yml for Azure Pipelines
  Release Nilearn 0.6.0 alpha (nilearn#2164)
  Making fetch_localizer_button_task backwards compatibile (nilearn#2182)
  [DOC] Update whats_new to reference nilearn#2013 (Merging of several examples) (nilearn#2183)
@kchawla-pi kchawla-pi merged commit 019e372 into nilearn:master Nov 9, 2019
kchawla-pi added a commit to kchawla-pi/nilearn that referenced this pull request Nov 9, 2019
kchawla-pi added a commit to kchawla-pi/nilearn that referenced this pull request Nov 9, 2019
@kchawla-pi
Copy link
Collaborator

Thanks for the phenomenal work you did for Niearn @illdopejake !
Thanks @emdupre !

@bthirion
Copy link
Member

This is awesome work indeed. Thx !

kchawla-pi added a commit to kchawla-pi/nilearn that referenced this pull request Nov 19, 2019
* 'master' of github.com:nilearn/nilearn:
  Fixed the redundant & missing test case in merged PR nilearn#2035 (nilearn#2205)
  Modify fetch_development_fmri to fetch adults or children (nilearn#2035)
  Verbose doc building to ease tracking of progress & diagnose stalls (nilearn#2203)
  New conda env is created once conda path has been activated
  Conda environment is created for full-builds
  Refactor CircleCI config for reduced redundancy (nilearn#2204)
kchawla-pi added a commit to kchawla-pi/nilearn that referenced this pull request Nov 20, 2019
* 'master' of github.com:nilearn/nilearn:
  Rel 060b0 (nilearn#2208)
  Nilearn 0.6.0b0 release (nilearn#2206)
  Fixed the redundant & missing test case in merged PR nilearn#2035 (nilearn#2205)
  Modify fetch_development_fmri to fetch adults or children (nilearn#2035)
  Verbose doc building to ease tracking of progress & diagnose stalls (nilearn#2203)
  New conda env is created once conda path has been activated
kchawla-pi added a commit to pausz/nilearn that referenced this pull request Nov 21, 2019
…smooth-image

* 'master' of https://github.com/nilearn/nilearn: (25 commits)
  Add testing for Nilearn setup & install & Fix the broken dependency installation (nilearn#2201)
  Fix uniform ball cloud test for sklearn >= 0.22 (nilearn#2175)
  MAINT: sklearn Deprecations (nilearn#2219)
  DOC: title in bold
  [DOC] Add note about decreasing memory usage (nilearn#2223)
  Rel 060b0 (nilearn#2208)
  Nilearn 0.6.0b0 release (nilearn#2206)
  Fixed the redundant & missing test case in merged PR nilearn#2035 (nilearn#2205)
  Modify fetch_development_fmri to fetch adults or children (nilearn#2035)
  Verbose doc building to ease tracking of progress & diagnose stalls (nilearn#2203)
  New conda env is created once conda path has been activated
  Conda environment is created for full-builds
  Refactor CircleCI config for reduced redundancy (nilearn#2204)
  Installation should fail on Python < 3.5 (nilearn#2198)
  [MRG] Add get data function (nilearn#2172)
  fix error when colorscale given boolean array (nilearn#2193)
  remove is_valid filter (nilearn#2169)
  Moved new entries to next release
  FIX: marker size issue in plot_connectome nilearn#2185 (nilearn#2186)
  Updated Appveyor status badge
  ...
kchawla-pi added a commit to darya-chyzhyk/nilearn that referenced this pull request Nov 24, 2019
…s-craddock

* 'master' of https://github.com/nilearn/nilearn: (620 commits)
  CI: Test pre-release versions of dependencies (nilearn#2224)
  Fix:  nilearn.image.smooth_img fails if fwhm is an ndarray (nilearn#2107)
  Update Python version warnings for Py35 deprecation (nilearn#2214)
  Fixing failure in minimum build of TravisCI (nilearn#2227)
  Updated missing whats_new entry
  Add testing for Nilearn setup & install & Fix the broken dependency installation (nilearn#2201)
  Fix uniform ball cloud test for sklearn >= 0.22 (nilearn#2175)
  MAINT: sklearn Deprecations (nilearn#2219)
  DOC: title in bold
  [DOC] Add note about decreasing memory usage (nilearn#2223)
  Rel 060b0 (nilearn#2208)
  Nilearn 0.6.0b0 release (nilearn#2206)
  Fixed the redundant & missing test case in merged PR nilearn#2035 (nilearn#2205)
  Modify fetch_development_fmri to fetch adults or children (nilearn#2035)
  Verbose doc building to ease tracking of progress & diagnose stalls (nilearn#2203)
  New conda env is created once conda path has been activated
  Conda environment is created for full-builds
  Refactor CircleCI config for reduced redundancy (nilearn#2204)
  Installation should fail on Python < 3.5 (nilearn#2198)
  [MRG] Add get data function (nilearn#2172)
  ...

# Conflicts:
#	nilearn/datasets/atlas.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants