[FIX] Replace numpy with pandas in data loaders #2829

achamma723 · 2021-05-07T17:24:17Z

This PR replaces NumPy with pandas in the data loaders.
Closes #2621

codecov · 2021-05-07T17:30:16Z

Codecov Report

Merging #2829 (65fe185) into main (d91545d) will decrease coverage by 0.01%.
The diff coverage is 90.32%.

@@            Coverage Diff             @@
##             main    #2829      +/-   ##
==========================================
- Coverage   90.45%   90.44%   -0.02%     
==========================================
  Files         122      122              
  Lines       14562    14593      +31     
  Branches     2973     2982       +9     
==========================================
+ Hits        13172    13198      +26     
  Misses        824      824              
- Partials      566      571       +5

Impacted Files	Coverage Δ
nilearn/datasets/__init__.py	`100.00% <ø> (ø)`
nilearn/datasets/atlas.py	`92.08% <87.17%> (-0.75%)`	⬇️
nilearn/datasets/func.py	`76.99% <93.33%> (+0.05%)`	⬆️
nilearn/_utils/docs.py	`91.93% <100.00%> (+0.06%)`	⬆️
nilearn/datasets/struct.py	`91.54% <100.00%> (+0.26%)`	⬆️
nilearn/decoding/space_net.py	`88.18% <0.00%> (-0.55%)`	⬇️
nilearn/reporting/_get_clusters_table.py	`100.00% <0.00%> (+3.12%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

NicolasGensollen · 2021-05-07T17:43:42Z

Thanks @achamma723 !
FYI: this change to pandas is a breaking change and has been scheduled for release 0.9 (see also #2655)

bthirion

LGTM, thx !

NicolasGensollen · 2021-12-07T15:09:29Z

Hi @achamma723
Could you rebase this PR, as we are approaching release 0.9.0?

achamma723 · 2021-12-16T16:05:37Z

Hello @NicolasGensollen , Is there an error with test_fetch_atlas_destrieux_2009?

jeromedockes · 2021-12-16T16:54:44Z

nilearn/datasets/atlas.py

@@ -237,7 +238,7 @@ def fetch_atlas_destrieux_2009(lateralized=True, data_dir=None, url=None,
    files_ = _fetch_files(data_dir, files, resume=resume,
                          verbose=verbose)

-    params = dict(maps=files_[1], labels=np.recfromcsv(files_[0]))
+    params = dict(maps=files_[1], labels=pd.read_csv(files_[0]))


This would need to be something like

params = dict(maps=files_[1], labels=pd.read_csv(files_[0], index_col=0).to_records())

jeromedockes · 2021-12-16T16:56:47Z

> Hello @NicolasGensollen , Is there an error with test_fetch_atlas_destrieux_2009?

there isn't an error with the test; the PR changes the output of the function: it is now a pd DataFrame with 2 columns (one repeating the index) rather than a 1d np recarray
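To make the difference concrete, here is a minimal sketch using a made-up two-row label file. The CSV content is an assumption for illustration, not the real Destrieux labels file; only `pd.read_csv` and `DataFrame.to_records` are used.

```python
import io

import pandas as pd

# Hypothetical stand-in for the labels CSV (not the real Destrieux file).
CSV_TEXT = "index,name\n0,Background\n1,L G_and_S_frontomargin\n"

# What the PR returns at this point: a DataFrame with two columns,
# one of which just repeats the row index.
labels_df = pd.read_csv(io.StringIO(CSV_TEXT))

# Suggested form: use the first column as the index, then convert to a
# 1-D record array so callers get a recarray again.
labels_rec = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0).to_records()

print(type(labels_df).__name__, labels_df.shape)
print(type(labels_rec).__name__, labels_rec.shape, labels_rec.dtype.names)
```

Whether the field dtypes match what `np.recfromcsv` produced is not checked here; the sketch only shows the change of container type and shape.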

jeromedockes · 2021-12-16T17:19:25Z

.ipynb_checkpoints/Untitled-checkpoint.ipynb

@@ -0,0 +1,6 @@
+{
+ "cells": [],


could you remove the checkpoints and the notebook? thanks!

jeromedockes · 2021-12-16T17:33:49Z

@NicolasGensollen I couldn't tell from the issue description: did we just want to start using pandas internally, or do we want to start returning DataFrames or Series instead of recarrays wherever they were used? In the latter case we need a deprecation cycle

NicolasGensollen · 2021-12-16T17:38:42Z

@jeromedockes there is already a deprecation cycle running, initiated with #2663, and supposed to end with release 0.9.

NicolasGensollen · 2021-12-16T17:40:59Z

Also linking related #2655

jeromedockes · 2021-12-16T17:47:16Z

> @jeromedockes there is already a deprecation cycle running, initiated with #2663, and supposed to end with release 0.9.

Ok thanks! in that case we need to track down those that still return recarrays, such as fetch_atlas_difumo
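One possible way to track those down is a small check over a fetcher's return value. This is only a sketch: `find_recarray_fields` is a hypothetical helper, not part of nilearn, and it assumes the fetcher returns a dict-like object (nilearn fetchers return a `Bunch`, which is a dict subclass). The toy result below is made up.

```python
import numpy as np
import pandas as pd


def find_recarray_fields(result):
    """Return the keys of a dict-like result whose values are record arrays."""
    return sorted(
        key for key, value in result.items() if isinstance(value, np.recarray)
    )


# Toy stand-in for a fetcher's return value (hypothetical content).
fake_result = {
    "maps": "atlas.nii.gz",
    "labels": pd.DataFrame({"name": ["a", "b"]}).to_records(index=False),
    "phenotypic": pd.DataFrame({"age": [10, 12]}),
}

print(find_recarray_fields(fake_result))
```

Running such a check in the tests of each fetcher would flag the ones that still need converting.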

NicolasGensollen · 2022-01-10T11:26:58Z

@achamma723 @jeromedockes I might be missing something, but I don't understand why we would sometimes return pandas dataframes and sometimes rec arrays... 🤔
My understanding of the goal of this PR was to get rid of rec arrays in all the fetchers, and use (and return) pandas dataframes instead, right?
I see even a few fetchers where we load through Pandas to get a dataframe, and then convert to records.

jeromedockes · 2022-01-10T12:44:50Z

> My understanding of the goal of this PR was to get rid of rec arrays in all the fetchers, and use (and return) pandas dataframes instead, right?

yes, IIUC it will always be dataframes or series

NicolasGensollen

@achamma723 Could you rebase or merge master to get the doc changes of #3083 ?

NicolasGensollen

Thanks for merging master @achamma723
In the last meeting, we decided to extend the deprecation cycle for rec arrays as we realized the current warning might have been a little light for such a change.
The idea would be to add an optional parameter to affected fetchers (return_dataframe or something alike) defaulting to the current behavior of returning rec arrays for backward compatibility.
Let me know if you want to take care of this, otherwise I can commit to your branch.

jeromedockes · 2022-01-17T16:25:08Z

so many fetchers may have a part that roughly looks like:

if legacy_format:
    result["confounds"] = result["confounds"].to_records()
    ...
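A slightly fuller sketch of that pattern, under stated assumptions: `fetch_toy_dataset` is a hypothetical fetcher (not a nilearn function), the warning text is invented, and `to_records(index=False)` is a choice made here to keep only the data columns, whereas the snippet above calls `to_records()` with its default.

```python
import warnings

import pandas as pd


def fetch_toy_dataset(legacy_format=True):
    """Hypothetical fetcher illustrating the opt-in switch."""
    result = {
        "confounds": pd.DataFrame({"motion": [0.1, 0.2], "csf": [1.0, 1.1]}),
    }
    if legacy_format:
        # Default keeps the old behavior but tells users it will change.
        warnings.warn(
            "Returning record arrays is deprecated; pass legacy_format=False "
            "to get pandas DataFrames instead.",
            FutureWarning,
            stacklevel=2,
        )
        # index=False keeps only the data columns in the record array.
        result["confounds"] = result["confounds"].to_records(index=False)
    return result


# New behavior, opted into explicitly: values stay DataFrames.
new_style = fetch_toy_dataset(legacy_format=False)
print(type(new_style["confounds"]).__name__)
```

Defaulting to the legacy output keeps existing user code working for the extended deprecation period, while the warning points to the switch.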

NicolasGensollen · 2022-01-26T12:04:59Z

Thanks @achamma723 !
@jeromedockes do you want to take another look at it before we merge?

jeromedockes

thanks @achamma723 and @NicolasGensollen
@NicolasGensollen if you had another look at the fetchers and all the changes in column names have been fixed I think we're good!

my only remaining question is about the whatsnew, not sure if there was an issue with the rebase due to #3049

bthirion · 2022-01-27T21:01:27Z

requirements-min.txt

@@ -1,8 +1,8 @@
-numpy==1.16


Hm, this should not be here...

@achamma723 could you rebase to clean this? Thanks!

bthirion · 2022-01-27T21:02:10Z

besides the requirement thing, this looks OK.

NicolasGensollen · 2022-01-28T11:59:06Z

Thanks @achamma723 but it looks like there are still unrelated changes. I'm not sure how you rebase, but you should be able to drop the unrelated commits.
If you cannot clean it, I'll merge the PR anyway since it is not a big deal, it just makes the diff more complicated than it should be...

achamma723 · 2022-01-28T12:01:52Z

> Thanks @achamma723 but it looks like there are still unrelated changes. I'm not sure how you rebase, but you should be able to drop the unrelated commits. If you cannot clean it, I'll merge the PR anyway since it is not a big deal, it just makes the diff more complicated than it should be...

Hello @NicolasGensollen , I'm pulling from main, rebasing and then pushing again. Should I drop something unrelated?

NicolasGensollen · 2022-01-28T12:14:16Z

> Hello @NicolasGensollen , I'm pulling from main, rebasing and then pushing again. Should I drop something unrelated?

Hmm, it's weird that you get some commits from main in here. You can try to manually drop them when rebasing. I think the following commits shouldn't be there:

This reverts commit 26de3fe.

This reverts commit 27e00f0.

This reverts commit 304e782.

NicolasGensollen

LGTM, thx!

NicolasGensollen · 2022-01-28T13:16:48Z

Thanks everyone! Merging...

Repalce numpy with pandas

1d04072

NicolasGensollen added this to the 0.9 milestone May 7, 2021

bthirion approved these changes May 7, 2021

tsalo changed the title ~~[FIX]: Repalce numpy with pandas~~ [FIX] Replace numpy with pandas in data loaders Jul 5, 2021

achamma723 and others added 3 commits December 16, 2021 15:30

rebase

4fae766

[MAINT] Fix new typos found by codespell (nilearn#3101)

58ff41c

Fixes c87847b / nilearn#3003.

Merge branch 'main' of https://github.com/nilearn/nilearn into test_5

68e19e5

jeromedockes reviewed Dec 16, 2021

Fix pep8 + destrieux

03e1f58

jeromedockes reviewed Dec 16, 2021

Fix pep8

1bac472

NicolasGensollen reviewed Jan 10, 2022

achamma723 added 2 commits January 12, 2022 11:47

Merge branch 'main' of https://github.com/nilearn/nilearn into test_5

382e6a4

Merge docs

aeaf3ef

NicolasGensollen reviewed Jan 17, 2022

NicolasGensollen added 3 commits January 19, 2022 11:47

continue work (unfinished)

24024b1

Iter

539ab52

Fix test

c74c356

NicolasGensollen and others added 7 commits January 26, 2022 12:13

[FIX] FirstLevelModel signal_scaling (nilearn#3135)

737c986

* Add failing test * get rid of scaling_axis attribute * deprecate scaling_axis * [circle full] add whatsnew entry * Fix

[DOC] Fix wrong whats_new entry (nilearn#3142)

da18fdc

Merge branch 'main' of https://github.com/nilearn/nilearn into test_5

e1657b1

Rebase

0d2bc7f

jeromedockes reviewed Jan 26, 2022

NicolasGensollen added 4 commits January 27, 2022 10:18

[circle full] fix whats_new bug

26e92ff

[MAINT] Bump dependencies for release 0.9.0 (nilearn#3143)

26de3fe

* bump dependencies * [circle full] Add whatsnew.

Jerome's review

55db108

remove warning

fb5c56b

bthirion mentioned this pull request Jan 27, 2022

Release 0.9.0 #3147

Merged

bthirion reviewed Jan 27, 2022

NicolasGensollen added 2 commits January 28, 2022 12:27

[INFRA] Add tests min requirements with Matplotlib (nilearn#3144)

27e00f0

* Add tests min req with matplotlib * remove useless variable * [circle full] update README

2021 -> 2022 (nilearn#3146)

304e782

achamma723 added 3 commits January 28, 2022 13:18

Revert "[MAINT] Bump dependencies for release 0.9.0 (nilearn#3143)"

1ba2487

This reverts commit 26de3fe.

Revert "[INFRA] Add tests min requirements with Matplotlib (nilearn#3144)"

f1dd5d8

This reverts commit 27e00f0.

Revert "2021 -> 2022 (nilearn#3146)"

65fe185

This reverts commit 304e782.

NicolasGensollen approved these changes Jan 28, 2022

NicolasGensollen merged commit 7566d3d into nilearn:main Jan 28, 2022

NicolasGensollen mentioned this pull request Jan 28, 2022

Fetchers return byte strings #1181

Closed

jeromedockes mentioned this pull request Feb 8, 2022

fetch_abide_pcp returns empty results #3160

Closed

achamma723 deleted the test_5 branch July 8, 2022 09:50
