Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

isc: how should we handle NaNs when pairwise=False? #398

Closed
snastase opened this issue Jan 7, 2019 · 3 comments

Comments

@snastase
Copy link
Contributor

commented Jan 7, 2019

Currently the leave-one-out (pairwise=False) ISC and ISFC functions use np.mean to average across N–1 subjects prior to computing correlation with the left-out subjects. This means a NaN for any single subject will kill the mean time series. NaNs may occur during z-scoring prior to ISC computation because some voxels have zero variance—especially at edges of the brain due to limited FoV for EPI scans and susceptibility dropout (especially bad for old datasets). See attached figure where the colored values indicate how many subjects (out of 20 total) have NaNs for a given voxel. Currently all colored voxels in this figure would yield a NaN after ISC computation. I propose using np.nanmean rather than np.mean when computing the average time series across N–1 subjects so that NaNs can be accommodated. For example, in the case corresponding to the attached figure, I may opt to completely exclude all voxels where 50% or 90% or more subjects have NaNs, but I would still like to include voxels where only a couple subjects have NaNs. This would require some special care by the user, so I wanted to run this past others. Not sure how accepting of NaNs we should be... I suppose we could also have a keyword argument like allow_nans=True for flexibility (or even allow_nans=.9). Any thoughts on this? @mihaic @qihongl @manojneuro

image

@snastase

This comment has been minimized.

Copy link
Contributor Author

commented Jan 7, 2019

After some offline conversations, seems like Hasson Lab convention is to threshold to keep only voxels where e.g. >= 80% of subjects have non-NaN values and use nanmean. There may be other reasons or methods for thresholding, so I'm inclined to leave that up to user prior to ISC computation and not build this sort of thresholding into the function. So I guess this comes down to either: (a) simply change np.mean to np.nanmean and warn the user in documentation; or (b) include a keyword argument allow_nans that is False by default but can be toggled on to accommodate NaNs. Let me know what you think and I can work up a PR.

@manojneuro

This comment has been minimized.

Copy link

commented Jan 7, 2019

@mihaic

This comment has been minimized.

Copy link
Contributor

commented Jan 7, 2019

I think the default should follow the literature. If thresholding is not compute intensive, I would add a threshold parameter with a default value of 80% and an option to turn it off by setting it to None. This would allow alternative thresholding by the user. I would add allow_nans and set it to True by default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.