[BUG] cannot use dataframe as input for second level GLM #3871

Closed
3 of 9 tasks
gurraburra opened this issue Aug 1, 2023 · 7 comments · Fixed by #3879
Labels
Bug for bug reports

Comments

@gurraburra

Is there an existing issue for this?

  • I have searched the existing issues

Operating system

  • Linux
  • Mac
  • Windows

Operating system version

  • Mac OS Version 13.4.1 "ventura"

Python version

  • 3.11
  • 3.10
  • 3.9
  • 3.8
  • 3.7

nilearn version

  • 0.10.1

Expected behavior

Using nilearn.glm.second_level.SecondLevelModel with a pandas.DataFrame as input, I get an error when computing the contrast: the code checks the first element of second_level_input to see whether it is a FirstLevelModel, but when second_level_input is a DataFrame, indexing it with [0] is a column lookup and raises a KeyError. See below:

Current behavior & error messages

This is what I got:

KeyError                                  Traceback (most recent call last)
File /opt/homebrew/Caskroom/miniforge/base/envs/rs-analysis/lib/python3.11/site-packages/pandas/core/indexes/base.py:3652, in Index.get_loc(self, key)
   3651 try:
-> 3652     return self._engine.get_loc(casted_key)
   3653 except KeyError as err:

File /opt/homebrew/Caskroom/miniforge/base/envs/rs-analysis/lib/python3.11/site-packages/pandas/_libs/index.pyx:147, in pandas._libs.index.IndexEngine.get_loc()

File /opt/homebrew/Caskroom/miniforge/base/envs/rs-analysis/lib/python3.11/site-packages/pandas/_libs/index.pyx:176, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:7080, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:7088, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 0

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In[169], line 1
----> 1 slm.compute_contrast(first_level_contrast="insula-l")

File /opt/homebrew/Caskroom/miniforge/base/envs/rs-analysis/lib/python3.11/site-packages/nilearn/glm/second_level/second_level.py:543, in SecondLevelModel.compute_contrast(self, second_level_contrast, first_level_contrast, second_level_stat_type, output_type)
    540     raise ValueError("The model has not been fit yet")
    542 # check first_level_contrast
--> 543 _check_first_level_contrast(
    544     self.second_level_input_, first_level_contrast
    545 )
    547 # check contrast and obtain con_val
    548 con_val = _get_con_val(second_level_contrast, self.design_matrix_)

File /opt/homebrew/Caskroom/miniforge/base/envs/rs-analysis/lib/python3.11/site-packages/nilearn/glm/second_level/second_level.py:181, in _check_first_level_contrast(second_level_input, first_level_contrast)
    180 def _check_first_level_contrast(second_level_input, first_level_contrast):
--> 181     if isinstance(second_level_input[0], FirstLevelModel):
    182         if first_level_contrast is None:
    183             raise ValueError(
    184                 "If second_level_input was a list of "
    185                 "FirstLevelModel, then first_level_contrast "
   (...)
    188                 "compute_contrast method of FirstLevelModel"
    189             )

File /opt/homebrew/Caskroom/miniforge/base/envs/rs-analysis/lib/python3.11/site-packages/pandas/core/frame.py:3761, in DataFrame.__getitem__(self, key)
   3759 if self.columns.nlevels > 1:
   3760     return self._getitem_multilevel(key)
-> 3761 indexer = self.columns.get_loc(key)
   3762 if is_integer(indexer):
   3763     indexer = [indexer]

File /opt/homebrew/Caskroom/miniforge/base/envs/rs-analysis/lib/python3.11/site-packages/pandas/core/indexes/base.py:3654, in Index.get_loc(self, key)
   3652     return self._engine.get_loc(casted_key)
   3653 except KeyError as err:
-> 3654     raise KeyError(key) from err
   3655 except TypeError:
   3656     # If we have a listlike key, _check_indexing_error will raise
   3657     #  InvalidIndexError. Otherwise we fall through and re-raise
   3658     #  the TypeError.
   3659     self._check_indexing_error(key)

KeyError: 0
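
The last frames of the traceback above can be reproduced with pandas alone, without nilearn or any NIfTI file. This is only a minimal sketch: the DataFrame uses the same three column names as the report, and the file path is a placeholder string that is never opened.

```python
import pandas as pd

# Same column layout as the second-level input in this report.
# The path is a placeholder; nothing is read from disk here.
df = pd.DataFrame(
    {
        "subject_label": ["sub1"],
        "map_name": ["tmp"],
        "effects_map_path": ["tmp.nii.gz"],
    }
)

# On a DataFrame, df[0] is a column lookup, and no column is named 0.
try:
    df[0]
    raised = False
except KeyError as err:
    raised = True
    print(f"KeyError: {err}")

# Positional access to the first row goes through .iloc instead.
first_row = df.iloc[0]
print(first_row["subject_label"])
```

So `second_level_input[0]` behaves as "first element" for a list but as "column named 0" for a DataFrame, which is why the check fails only for DataFrame input.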

Steps and code to reproduce bug

import pandas as pd
from nilearn.glm.second_level import SecondLevelModel

# need a random file tmp.nii.gz
df = pd.DataFrame(data={"effects_map_path" : ["tmp.nii.gz"], "map_name" : ["tmp"], "subject_label" : ["sub1"]})
slm = SecondLevelModel().fit(second_level_input=df)
slm.compute_contrast()
@gurraburra gurraburra added the Bug for bug reports label Aug 1, 2023
@Remi-Gau Remi-Gau changed the title [BUG] [BUG] cannot use dataframe as input for second level GLM Aug 1, 2023
@Remi-Gau
Collaborator

Remi-Gau commented Aug 1, 2023

I think this reproduces the bug

import pandas as pd
from nilearn.glm.second_level import SecondLevelModel
from nilearn._utils.data_gen import write_fake_fmri_data_and_design

# need a random file fmri_run0.nii
write_fake_fmri_data_and_design(shapes=((7, 8, 7, 15), (7, 8, 7, 16)))
df = pd.DataFrame(data={"effects_map_path" : ["fmri_run0.nii"], "map_name" : ["tmp"], "subject_label" : ["sub1"]})
slm = SecondLevelModel().fit(second_level_input=df)
slm.compute_contrast()
Traceback (most recent call last):
  File "/home/remi/github/nilearn/env/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3653, in get_loc
    return self._engine.get_loc(casted_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pandas/_libs/index.pyx", line 147, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 176, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7080, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7088, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/remi/github/nilearn/tmp.py", line 10, in <module>
    slm.compute_contrast()
  File "/home/remi/github/nilearn/nilearn/glm/second_level/second_level.py", line 544, in compute_contrast
    _check_first_level_contrast(
  File "/home/remi/github/nilearn/nilearn/glm/second_level/second_level.py", line 182, in _check_first_level_contrast
    if isinstance(second_level_input[0], FirstLevelModel):
                  ~~~~~~~~~~~~~~~~~~^^^
  File "/home/remi/github/nilearn/env/lib/python3.11/site-packages/pandas/core/frame.py", line 3761, in __getitem__
    indexer = self.columns.get_loc(key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/remi/github/nilearn/env/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3655, in get_loc
    raise KeyError(key) from err
KeyError: 0

@Remi-Gau
Collaborator

Remi-Gau commented Aug 1, 2023

Went and looked into our test suite to see how we test this and here is something that works

import pandas as pd
from nilearn.glm.second_level import SecondLevelModel
from nilearn._utils.data_gen import write_fake_fmri_data_and_design

shapes = ((7, 8, 9, 10),)
_, FUNCFILE, _ = write_fake_fmri_data_and_design(shapes)
FUNCFILE = FUNCFILE[0]

dfcols = ["subject_label", "map_name", "effects_map_path"]
dfrows = [
    ["01", "a", FUNCFILE],
    ["02", "a", FUNCFILE],
    ["03", "a", FUNCFILE],
]
niidf = pd.DataFrame(dfrows, columns=dfcols)

# dataframes as input
SecondLevelModel().fit(niidf)

@Remi-Gau
Collaborator

Remi-Gau commented Aug 1, 2023

extracted from here:

def test_fmri_inputs():

@Remi-Gau
Collaborator

Remi-Gau commented Aug 1, 2023

My bad: the snippet in my earlier comment only calls fit. It does not work once you compute a contrast on the model.

@bthirion
Member

bthirion commented Aug 2, 2023

Thanks for reporting.
Do you know how to fix that?

@gurraburra
Author

It seems that the bug comes from the call to _check_first_level_contrast. The function checks whether the first element of second_level_input is an instance of FirstLevelModel, since in that case the first_level_contrast argument is needed to compute the contrast. An easy fix would be to first check whether second_level_input is a list:
if isinstance(second_level_input, list) and isinstance(second_level_input[0], FirstLevelModel):
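
To illustrate the guard proposed above, here is a self-contained sketch. It is not nilearn's actual code: `FirstLevelModel` below is a hypothetical empty stand-in class, `check_first_level_contrast` is an illustrative name, and the error message is my own wording rather than the library's.

```python
import pandas as pd


class FirstLevelModel:
    """Hypothetical stand-in for nilearn's FirstLevelModel (illustration only)."""


def check_first_level_contrast(second_level_input, first_level_contrast):
    """Sketch of the guarded check; assumption, not the library's code."""
    # Index into the input only when it is a list, so a DataFrame
    # never reaches second_level_input[0].
    if isinstance(second_level_input, list) and isinstance(
        second_level_input[0], FirstLevelModel
    ):
        if first_level_contrast is None:
            raise ValueError(
                "first_level_contrast is required when second_level_input "
                "is a list of FirstLevelModel objects"
            )


# DataFrame input: the guard short-circuits, so no KeyError is raised.
df = pd.DataFrame(
    {
        "subject_label": ["sub1"],
        "map_name": ["tmp"],
        "effects_map_path": ["tmp.nii.gz"],
    }
)
check_first_level_contrast(df, None)
dataframe_accepted = True

# List of first-level models without a contrast: still rejected.
try:
    check_first_level_contrast([FirstLevelModel()], None)
    list_rejected = False
except ValueError:
    list_rejected = True
```

Because `and` short-circuits, `second_level_input[0]` is evaluated only for lists. An empty list would still raise an IndexError here, so this sketch assumes the input was already validated as non-empty earlier, for example during fit.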

@bthirion
Member

bthirion commented Aug 3, 2023

OK. Do you want to contribute it?
