
7th tutorial - Nilearn multilevel (two-level) glm #34

Merged
65 commits merged into nipype:master from nilearn-glm on Oct 7, 2022

Conversation

yibeichan
Collaborator

Hello team!
Finally, I was able to create the PR for this 7th tutorial. It is a two-level nilearn GLM using data from the Balloon Analog Risk-taking Task, which has 16 subjects; I randomly chose 5 for our analysis. More details are in the notebook.

Here is a summary of the issues we encountered over the past month.
After endless debugging with @djarecka, we found that pydra is strict about its task input types:

  1. If the input is a basic type such as list, dict, str, or int, it works fine.
  2. If the input is a pydra-defined class such as File, it works too.
  3. If the input is a pandas.DataFrame or a FirstLevelModel and we declare it as ty.Any, it is still okay.
  4. If the input is a nested structure of basic types such as list[dict] or list[str], it is still okay.
  5. HOWEVER, if the input is a list of types pydra cannot identify, such as a list of pandas.DataFrame or FirstLevelModel objects, it causes connection problems in the workflow: the output can't be collected or passed to the next node (see the sketch after this list).
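A minimal sketch of the contrast between cases 3 and 5 (the task and variable names are illustrative, not the notebook's actual code; only pydra.mark.task and the ty.Any annotation are pydra's real API):

```python
import typing as ty

import pydra

# Case 3: a *single* pandas.DataFrame declared as ty.Any passes
# through a pydra task without trouble.
@pydra.mark.task
def count_rows(events: ty.Any) -> int:
    return len(events)

# Case 5: a *list* of DataFrames (or FirstLevelModels) as one input is
# where connections break -- the node's output can't be collected or
# passed on, even with the same ty.Any annotation.
@pydra.mark.task
def count_all_rows(event_list: ty.Any) -> int:
    return sum(len(df) for df in event_list)
```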

In this two-level GLM, the inputs to the second-level GLM are lists. For example, the problematic node secondlevel_estimation needs a list of FirstLevelModel objects as input. More details about this node are in issue #33.

I don't know how to solve this problem yet, but I found a way to work around it. When I need a list of pandas.DataFrame objects as input, I save the dataframes to files and pass the list of file paths (list[str]) instead. Likewise for FirstLevelModel: I save the first-level z-maps to files and then read those file paths.
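Roughly, the workaround looks like this (the task and file names here are illustrative placeholders, not the notebook's actual code):

```python
import os
import typing as ty

import pandas as pd
import pydra

# Instead of passing a list of DataFrames between nodes, persist each
# one to disk and pass a plain list[str] of file paths.
@pydra.mark.task
def save_events(events: ty.Any, out_dir: str) -> str:
    path = os.path.join(out_dir, "events.tsv")
    events.to_csv(path, sep="\t", index=False)  # DataFrame -> file
    return path  # downstream nodes receive a basic str

@pydra.mark.task
def load_all_events(paths: ty.Any) -> ty.Any:
    # a list[str] of paths connects fine; rebuild the DataFrames inside
    return [pd.read_csv(p, sep="\t") for p in paths]
```

The same trick works for the first-level results: z_map.to_filename(path) writes each z-map as a NIfTI file, and the second-level node receives the path strings.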

This 7th tutorial notebook runs successfully on my laptop; I hope it passes the tests here.
Thanks to @djarecka and @htwangtw for their help!!

@satra
Contributor

satra commented Sep 1, 2022

HOWEVER, if the input is a list of types pydra cannot identify, such as a list of pandas.DataFrame or FirstLevelModel objects, it causes connection problems in the workflow: the output can't be collected or passed to the next node.

please open an issue in pydra regarding this. in pydra-ml we pass other non-standard objects (for example scikit-learn pipelines: https://github.com/nipype/pydra-ml/blob/master/pydra_ml/tasks.py#L103 - hence there must be something specific that's at work here)
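The general pattern being pointed to can be illustrated like this (an illustration only, not pydra-ml's actual task from the linked file): a single non-standard object such as a scikit-learn pipeline does pass through a task declared as ty.Any.

```python
import typing as ty

import pydra
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# A single sklearn Pipeline declared as ty.Any moves through a pydra
# task fine; the failures reported above involve *lists* of objects.
@pydra.mark.task
def fit_clf(pipeline: ty.Any, X: ty.Any, y: ty.Any) -> ty.Any:
    pipeline.fit(X, y)
    return pipeline

pipe = make_pipeline(StandardScaler(), LogisticRegression())
```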

@djarecka
Contributor

djarecka commented Sep 1, 2022

I think that adding an extra elif block for inputs that are data frames helped, at least in the simple example I was using for testing. I will check the entire workflow later.
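(The exact helper isn't shown in this thread, so the following is only a hypothetical sketch of the kind of change meant: a type-dispatch branch that gives DataFrames a stable, content-based hash so pydra can identify them.)

```python
import hashlib

import pandas as pd

def hash_value(value):
    """Hypothetical stand-in for pydra's input-hashing helper."""
    if isinstance(value, (list, tuple)):
        return [hash_value(v) for v in value]
    # the extra elif: hash a DataFrame by its content instead of
    # failing to identify the type
    elif isinstance(value, pd.DataFrame):
        content = pd.util.hash_pandas_object(value, index=True).values
        return hashlib.sha256(content.tobytes()).hexdigest()
    return value
```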

I still have no idea why I saw some issues only when running in a Jupyter notebook...

@djarecka djarecka mentioned this pull request Sep 2, 2022
@djarecka
Contributor

djarecka commented Sep 4, 2022

@yibeichan - have you had a chance to check these failing tests? It looks like GHA doesn't like the workflow for any version of Python... :(

@yibeichan
Collaborator Author

@djarecka yes, the problem here is with datalad, which needs git-annex. So @effigies and I created PR #36 to try to fix the workflow by switching the setup from python to miniconda.

@djarecka
Contributor

djarecka commented Sep 5, 2022

ok, I see it now; let's see if that helps. I'm still not able to run the workflow, but I'd be interested in seeing whether it works on GHA.

@yibeichan
Collaborator Author

hello @djarecka @effigies, I rewrote the first-level GLM as we discussed, separating the first-level model and the fixed-effects step. The notebook works fine on my Mac (Python 3.7 & 3.10). However, it still fails on GHA. Maybe it's not a memory issue?
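For context, the separated fixed-effects step can be sketched with nilearn's compute_fixed_effects (the per-run file names are hypothetical placeholders, not the notebook's actual outputs):

```python
from nilearn.glm.contrasts import compute_fixed_effects

# Hypothetical per-run effect/variance maps written by the first-level
# model step; compute_fixed_effects combines them for one subject.
contrast_imgs = [f"run-{i}_effect_size.nii.gz" for i in (1, 2, 3)]
variance_imgs = [f"run-{i}_effect_variance.nii.gz" for i in (1, 2, 3)]

fixed_fx_contrast, fixed_fx_variance, fixed_fx_stat = compute_fixed_effects(
    contrast_imgs, variance_imgs
)
```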

@djarecka
Contributor

Thank you! I've just run the workflow on my laptop and saw that memory usage was sometimes around 7 GB, so it definitely could still be a memory issue... :( Perhaps we should just downsample??

@yibeichan
Collaborator Author

hello team! Finally, after one month, this notebook passed GHA. Yes, it was the memory issue. Thanks to Dorota for helping me find the downsampling solution, which Chris had provided on Neurostars :). @effigies, once you have time, please check and merge. Thank you!
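For reference, the kind of downsampling meant here can be sketched with nilearn's resample_img and a coarser target affine (the file name and the 4 mm voxel size are illustrative assumptions, not necessarily what the notebook uses):

```python
import numpy as np
from nilearn import image

# Resampling the BOLD series to coarser voxels cuts memory use enough
# for the GHA runners; 4 mm here is only an example value.
bold_img = image.load_img("sub-01_task-balloonanalogrisktask_bold.nii.gz")
bold_small = image.resample_img(bold_img, target_affine=np.eye(3) * 4.0)
```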

@yibeichan
Collaborator Author

I made a few formatting/structural changes to make the Jupyter Book look prettier.
Then I added one unthresholded plot for the second-level GLM in the 7th notebook, to show that nothing surviving in the other plots is due to the small sample size (n=5), not to anything being wrong with the workflow.

@djarecka djarecka merged commit c918a4a into nipype:master Oct 7, 2022
@djarecka
Contributor

djarecka commented Oct 7, 2022

Thank you! Great job!

@djarecka
Contributor

djarecka commented Oct 7, 2022

@effigies @htwangtw - I've merged it, but if you have any comments, you can open an issue and we can fix it in a separate PR!

@yibeichan yibeichan deleted the nilearn-glm branch October 12, 2022 01:28