Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Save model level output (r-squared) #224

Merged
merged 20 commits into from Apr 6, 2020
Merged

ENH: Save model level output (r-squared) #224

merged 20 commits into from Apr 6, 2020

Conversation

adelavega
Copy link
Collaborator

@adelavega adelavega commented Mar 26, 2020

  • Saves r-squared from model level outputs.

Closes #221

@adelavega
Copy link
Collaborator Author

@tyarkoni are r_squared and residuals maps what you for the BERT analysis?

Those are the two maps exposed by FirstLevelModel

@pep8speaks
Copy link

pep8speaks commented Mar 26, 2020

Hello @adelavega, Thank you for updating!

Line 141:17: E123 closing bracket does not match indentation of opening bracket's line

To test for issues locally, pip install flake8 and then run flake8 fitlins.

Comment last updated at 2020-04-06 12:59:55 UTC

@adelavega adelavega requested a review from effigies March 28, 2020 18:33
@adelavega
Copy link
Collaborator Author

hey @jdkent, you mind taking a look? should be fairly straight forward, but looks like you contributed adding r-squared and residuals to the FirstLevelModel object to nistats, so let me know what you think!

@adelavega
Copy link
Collaborator Author

Looks like tests are failing because of a MemoryError!

Unfortunately, the new images are surprisingly memory intensive to produce.

flm = level1.FirstLevelModel(
minimize_memory=False,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's this line that's killing us?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and no.

Yes because that argument must be set to true for those images to be saved to that object. Otherwise, because they are not super necessary they are deleted along with a bunch of other stuff.

@jdkent maybe there's another way to access these images without triggering a bunch of other stuff to be kept around?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell, that argument saves the SimpleRegressionResult object, and without it, you can't get the other model outputs.

nistats is also archived, so proposing an alternative for now is not possible.

Maybe we can just make this optional in fitlins? I'll raise an issue over there and see what they say.

@adelavega
Copy link
Collaborator Author

When I test locally, my docker container only hits around 400mb of memory usage. Looks like Circle CI's memory is probably quite low, and creating the residuals map (which is 4D) is momentarily very memory intensive.

I don't actually need the residuals, just the r-squared, so removing for now to see if tests pass. We could make some or all of these optional by user CLI flags.

@jdkent
Copy link
Contributor

jdkent commented Apr 5, 2020

popping in to say that I do not know of a currently builtin way to save only what we want, and delete other pieces of information, sorry! (working on filtering my notifications so I can see mentions sooner).

I'll take a look and see if I have anything helpful to add.

@adelavega
Copy link
Collaborator Author

Thanks @jdkent

Well, the important tests are passing now! So I think just computing the residuals is the most memory intensive part. For now, I'll just output r-squared as that's what I care about, but we could add a CLI option for the others.

@codecov-io
Copy link

codecov-io commented Apr 5, 2020

Codecov Report

Merging #224 into master will increase coverage by 0.29%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #224      +/-   ##
==========================================
+ Coverage   77.32%   77.62%   +0.29%     
==========================================
  Files          18       18              
  Lines        1045     1059      +14     
  Branches      188      189       +1     
==========================================
+ Hits          808      822      +14     
  Misses        149      149              
  Partials       88       88              
Flag Coverage Δ
#ds003 77.62% <100.00%> (+0.29%) ⬆️
Impacted Files Coverage Δ
fitlins/interfaces/abstract.py 100.00% <100.00%> (ø)
fitlins/interfaces/nistats.py 83.95% <100.00%> (+0.94%) ⬆️
fitlins/workflows/base.py 64.13% <100.00%> (+1.20%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2d2fc55...29a2a0d. Read the comment docs.

@adelavega adelavega changed the title Save model level outputs ENH: Save model level outputs Apr 5, 2020
@adelavega adelavega changed the title ENH: Save model level outputs ENH: Save model level output (r-squared) Apr 5, 2020
@adelavega
Copy link
Collaborator Author

adelavega commented Apr 5, 2020

@effigies this is what the output file looks like. look reasonable? it's at the run-level for the whole model:

sub-01_task-SherlockMovie_space-MNI152NLin2009cAsym_stat-rSquare_statmap.nii.gz

arguably, it should just be rsquared

@jdkent
Copy link
Contributor

jdkent commented Apr 5, 2020

How you are getting r_square from nistats first level outputs looks reasonable to me.

To fix the memory issues, a more flexible interface within FirstLevelModel is needed. (Perhaps having users specify what model outputs they want when they call fit).

Similar to how effigies suggested the 'all' output for computing contrasts, I could see a parameter that saves r_square, residuals, predicted, or all.

@effigies
Copy link
Collaborator

effigies commented Apr 5, 2020

Either is fine with me.

@adelavega
Copy link
Collaborator Author

Looks like there's an issue with a duecredit dependency: brechtm/citeproc-py#94

May want to remove that for the time being.

@effigies effigies merged commit abe803d into master Apr 6, 2020
@effigies effigies deleted the enh/model_ouputs branch April 6, 2020 17:59
@adelavega
Copy link
Collaborator Author

hey @jdkent do you know if I can access other model attributes such as logL? Is there any reason that is not available at the FirstModelResult level?

I'm going to open another PR trying to get that file to be output as well.

@effigies Do you think these should always be output, or by CLI option? I can work that into the next PR as well.

@effigies
Copy link
Collaborator

effigies commented Apr 6, 2020

Are we running up against space constraints with FitLins outputs? Residuals will take up a ton, but if this is just one volume per run (as opposed to one volume per regressor) I'm not worried.

@adelavega
Copy link
Collaborator Author

Yeah it's just one volume per run. I suppose the main downside is we have to run with the save memory False for nistats. I'm not concerned about the output size too much.

@jdkent
Copy link
Contributor

jdkent commented Apr 6, 2020

you should be able to access LogL with the private method _get_voxelwise_model_attribute
like flm._get_voxelwise_model_attribute("LogL", result_as_time_series=False)

(LogL is just a 3-d output right?)

@adelavega
Copy link
Collaborator Author

yep, thanks! looked at your code and figured that out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Output model level outputs
5 participants