
Always specify subject level in BIDS Model #641

Closed
adelavega opened this issue Aug 23, 2019 · 39 comments · Fixed by #648

@adelavega
Collaborator

Need to check if this is true, but currently, if there is only one run per subject, the BIDS model doesn't specify a subject level (even if it's just a copy), and the first level gets passed to the group level.

Let's make it so the subject level is always written. This is more explicit, so you can always find the subject-level maps.

Also need to check whether this is how I construct models with pyns.

@adelavega adelavega self-assigned this Aug 23, 2019
@adelavega
Collaborator Author

There is a potential conflict. In HBN, there is only one run, but importantly that single run has no run label, so the output "looks like" a subject-level map:
sub-NDARZL724HAX_task-movieDM_contrast-anyFaces_stat-p_statmap.nii.gz

because typically a subject-level map is denoted by the lack of a run entity.

This causes two problems:

  • If we were to force a subject level map (a "copy"), there would be a conflict
  • If there is only a single run available (or the model only includes one run), but there is a run label (e.g. "run-1"), then copying over is necessary to ensure a predictable location for "subject"-level outputs.

@tyarkoni

I don't think we can safely assign run-1 to run-level stat maps when there's only one run. I suggest using the desc keyword for this. Maybe desc-levelSubject, desc-levelRun, etc. Alternatively, it also doesn't seem crazy to introduce a new level entity for this (i.e., level-run, level-subject, etc.). Tagging @effigies for potential comment from a fitlins perspective.

@tyarkoni

On second thought, in the case where there's only one run, and no run ID is provided, I think it might actually be fine to only have one set of maps. Anyone trying to programmatically read either a run-level map or a subject-level one will still get the right image based on the entities they have available. It might be a little confusing if someone's browsing a directory like this, but I can live with that.

@adelavega
Collaborator Author

I just confirmed that it is necessary for fitlins (as of now) to omit the subject level, if there wouldn't be more than 2 inputs (i.e. runs).

@effigies

effigies commented Aug 28, 2019

Hey, sorry, swamped with non-FitLins and family stuff at the moment. Please let me know if this is urgent, otherwise unlikely to respond until next week.

Edit: Also, ping me next week to remind me, if I don't get to this quickly.

@adelavega
Collaborator Author

Nope, not urgent.

@adelavega adelavega removed their assignment Sep 4, 2019
@adelavega
Collaborator Author

@effigies you have the bandwidth for this now?

@effigies

Yes. Let me look...

@effigies

So my interpretation of this is that you want to always write a subject-level step in your model, not necessarily that FitLins has to implicitly build one? I think that's fine.

Possibly the easiest approach will be to make the first level the subject level if there's a single run for each subject.

@tyarkoni

I agree that we definitely don't want to generate subject-level maps unless the model explicitly requests it. My understanding (which could be wrong) was that fitlins breaks in some way if you include subject in the model but there's only one run available. It should be able to handle this case gracefully (e.g., by implicitly copying the single run-level maps to subject-level and doing nothing else). If that's not the case, and it currently does spit out subject-level maps upon request, no matter how many runs are available, then there may be no problem at all from a fitlins standpoint.

@adelavega
Collaborator Author

Tal's summary was my recollection of the outstanding issue with fitlins right now.

The idea is also that by always including a subject level, even when there's only one run, you always know where to look when you want a "subject"-level map. Even if all fitlins does is copy the run-level map to a file path without the run entity, that would work.

@adelavega adelavega added this to To Do (Frontend) in 0.9 via automation Sep 18, 2019
@effigies

effigies commented Sep 18, 2019

No, we don't currently adjust to special situations, such as a single input, by simply passing it through. However, I think if we handled fixed effects at the subject level, the degenerate case of that would be a pass-through. So maybe the correct approach is to decide how we're going to say "combine runs as fixed effects at the subject level" in the spec, implement that in pybids/FitLins, and then you add that flag or whatever in NeuroScout.

@adelavega
Collaborator Author

Agreed, but not sure what you mean by "add that flag"? Shouldn't the inclusion of a subject level just be fixed effects?

@adelavega adelavega moved this from To Do (Frontend) to In Progress in 0.9 Sep 18, 2019
@effigies

effigies commented Sep 18, 2019

Is that a universal desire, to never treat runs or sessions as random effects?

Edit: I thought from previous discussions that any given level should be able to be marked as a fixed or random effects combination.

@tyarkoni

tyarkoni commented Sep 18, 2019

I think that's right. There are two separate issues, one for fitlins and one for the spec:

  • Fitlins needs to be able to handle the pass-through case at all levels so long as the model is identifiable. Depending on estimator, a model may not be identifiable if there's only a single sample of a random effect (e.g., only one subject in a group-level analysis), but there should never be a problem for a fixed-effects analysis, and it seems fine to treat this as a special case at fitlins level and just copy the output images directly instead of passing anything onto an estimator.
  • In the spec, we will eventually need a way to specify which variables should be treated as random effects. We already talked about this quite a bit, and there was a provisional solution in an earlier version of the spec, but I think I removed it until it was finalized (which hasn't happened yet).

Note though that for all of the models we (and I suspect anyone else) have been working with so far, the distinction between fixed and random is immaterial, because we never have repeated measures that need to be modeled as random effects. I.e., the conventional approach is to model runs within a subject-level analysis as fixed. At the group level, we also model subjects as fixed, because there's only one estimate per subject (the within-subject averaging in the previous step means that implicitly, the subject-level variability is being modeled as a random variable). So at the moment, as long as the fitlins issue is addressed, we should be able to fit pretty much any standard model that one might normally want to fit using a summary statistics approach. Down the line, when we have legit mixed-effects estimators we can use to handle multiple random effects (e.g., stimuli and subjects) in one giant dataset-level analysis, the spec-level issue will need to be addressed.

@effigies

Okay, this may be me being shit at keeping stats straight in my head, but my impression was that the nistats SecondLevelModel treats its inputs as random effects, and what we should generally be doing at the subject level is combining as fixed using a more algebraic approach, e.g. a weighted average using the variance estimate.
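
For reference, the fixed-effects combination described here amounts to an inverse-variance-weighted average of the lower-level estimates. A minimal numpy sketch (array names are hypothetical, and the inputs are assumed to already be extracted as (n_runs, n_voxels) arrays):

import numpy as np

def fixed_effects_combine(effects, variances):
    # Inverse-variance (precision) weights for each run-level estimate.
    weights = 1.0 / variances
    # The combined variance is the inverse of the summed precisions.
    fx_variance = 1.0 / weights.sum(axis=0)
    # Precision-weighted average of the run-level effects.
    fx_effect = (weights * effects).sum(axis=0) * fx_variance
    # Fixed-effects statistic: combined effect over its standard error.
    fx_stat = fx_effect / np.sqrt(fx_variance)
    return fx_effect, fx_variance, fx_stat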

@tyarkoni

Ah, yeah, the subject-level stat maps present an issue. I wasn't considering those. Let me think on it a bit.

@tyarkoni

Okay, I thought about it some more, and also went back and looked at the spec (I was previously looking at an out-of-date draft, and had forgotten that we do in fact have draft language describing variance component specification). Here are my thoughts. Also tagging @nicholst and @cmaumet for any additional insights. (To save folks coming to the thread late the trouble of reading the whole thing, the key question here is how users should specify that test statistics should be computed using a fixed-effects rather than a random-effects analysis.)

The main issue as I see it is that we need a different approach to handle the fixed vs. random distinction in the context of generating summary statistics-based stat maps than in the standard mixed-effects models where variables are assumed to be fixed unless they're specified as part of a variance component. Consider the very common use case where we want to produce single-subject z-stat maps, given n runs as input. If we wanted to model runs as random, this would be trivially handled with an intercept-only model (i.e., y ~ 1, where the y's are the parameter estimates for each voxel). But the fixed-effects analysis is quite different—we probably want to do an inverse variance-weighted meta-analysis that takes as inputs the parameter estimates and the variances for all the runs (@cmaumet, I believe you brought this up in the spec document a long time ago, and I punted on it at the time—well, we can no longer punt :)).

In principle, we could introduce a new "type" of analysis one can specify in "Model" (maybe "FEMA", for fixed-effects meta-analysis). But the problem with that idea (aside from adding complexity) is that the "Model" specification currently assumes we're defining a single model that includes all of the available data, and not a separate model for each variable, which is what we want to do here. I.e., there is no way to say, in the model section, "run a fixed-effects meta-analysis, but do it separately for each variable".

What we could do instead, without having to blow things up, is deal with this in the "Contrasts" section. This also feels a bit weird to me, since a fixed-effects MA on a bunch of estimates is not really a contrast in the traditional sense. But it seems like maybe the least bad solution. My tentative proposal is to add a new "FEMA" value to the list of valid "Type" values (currently just "t" and "F"). This would be interpreted by the estimating software as a fixed-effects meta-analysis of all available estimates. More detailed configuration (e.g., how to weight observations) would either be left to the software's defaults, or specified somewhere else (e.g., in the software-specific configuration section). I'm not thrilled with this, because it leaves it quite unclear how one is supposed to handle the situation where the contrast involves more than one variable (e.g., [-1 1]), but it is at least easy to implement so we can move forward.
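
For concreteness, a sketch of what a subject-level contrast entry might look like under this proposal, written as a Python dict. The contrast name is illustrative; "ConditionList" and "Type" appear elsewhere in this thread, while "Name" and "Weights" are assumed to match the draft spec, and "FEMA" is only the tentative value proposed above:

subject_level_contrast = {
    "Name": "anyFaces",              # illustrative contrast name
    "ConditionList": ["anyFaces"],
    "Weights": [1],                  # FEMA is only well-defined for unit weights
    "Type": "FEMA",                  # fixed-effects meta-analysis of the run-level estimates
}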

Setting aside whatever convention we end up with in the spec, there's also a fairly serious UX issue here, IMO. Thanks to the summary stats tradition, I'm guessing that most people will tend to simply assume that when they see a t statistic at subject level, it means fixed effects, and when they see a t statistic at group level, it means random effects, and it won't occur to them that if they want things to work this way, they'll need to specify "FEMA" as the contrast type in the former case. Then we'll end up with a lot of people wondering why their single-subject t-statistics are so tiny. I really don't know what to do about this other than have implementing software like fitlins issue a warning if someone doesn't specify "FEMA" for a subject-level model... I'm trying pretty hard to avoid building into the spec any conflation of analysis level with analysis type.

Thoughts?

@effigies

To be clear, a "FEMA" applies to a contrast, not to an entire level, so a random effects "t" contrast would sit comfortably (if not necessarily coherently) next to a "FEMA" contrast?

If "FEMA" is undefined for weights that are not [1], then let's just state that, and the appropriate behavior of an implementation would be to throw an error.

@adelavega
Collaborator Author

But to be clear, implementationally, if there is a "t" contrast, running a random effects model is required, whereas with "FEMA" there's no need to run that model, right? But like @effigies said, you could have both? That feels a bit messy, although easy enough to implement.

Is there another name we can come up with that's more intuitive (i.e. not an acronym that makes me think of hurricanes)? For example: FEC (fixed-effects contrast), or IVW (inverse-variance weighted).
Something about meta-analysis at the subject level is not intuitive, even if correct.

Also, what's the implication for the way we've been doing things (random effects)? Since we are only taking the effects + variances for the next level (not the t-stats), how much of a difference does it make?

@adelavega
Collaborator Author

Also, in terms of passthrough being allowed if a model is identifiable, doesn't it now depend on the contrasts requested? (i.e. it wouldn't be allowed when there's only a single run or subject and a t contrast is requested, but if only FEMA contrasts are requested, a pass-through is OK?)

@tyarkoni

Yes, you could specify them alongside one another. I don't think it's completely crazy to want to see the results of treating runs as both fixed and random effects. They answer different questions.

@adelavega there's no real implication for the way we do things, because we don't care about single-subject inference. The propagated estimates remain the same (in principle one could maybe forward the inverse-weighted estimate, but in practice that's something we probably want to leave to the estimator to decide).

Ideally it wouldn't be our job to worry about identifiability; if an estimator can't handle a particular model, it just fails, and the pipeline breaks. In principle, an estimator that worries about edge cases should recognize a one-run case and just return the same stat maps when doing fixed-effects aggregation; when doing a random-effects analysis, breaking seems fine. The concern here is just that since we only support a single estimator right now, we should at least be able to gracefully handle the most common scenario we're likely to encounter (i.e., fixed-effects analysis at run level, where there's a non-trivial chance of getting just one run as input).

@adelavega
Collaborator Author

Makes sense. Any example code for how to implement FEMA?

To move forward w/ NeuroScout (i.e. generate valid models), we do need to settle the spec terminology, so I guess if there's no major opposition we go with the plan above.

Otherwise, the lazy short term solution would be the passthrough in the one-run case even w/ random-effects.

@tyarkoni

I don't think we need to implement any statistical routines in fitlins; this is already handled by nistats in a first-level analysis. The issue from our perspective is being able to map the desired analysis onto the right nistats routine—i.e., instead of automatically routing run-level analysis to FirstLevelModel and anything past that to SecondLevelModel, the choice of nistats model needs to depend on the specification.

I'd rather not rush finalization of the language in the spec, and am hoping @nicholst and/or @cmaumet will weigh in at some point. The expedient thing to do is probably to go with my suggestion above internally in fitlins, and we can always tweak things later. One way or another there needs to be logic internally that maps analyses onto nistats as appropriate; I'm guessing that changing the way the analysis type is detected in the spec will be fairly easy.

@adelavega
Collaborator Author

I was under the impression that nistats didn't have explicit support for fixed effects, yet. Although @effigies says it may be in there somewhere. Agree that code shouldn't live in fitlins.

I'm fine with going w/ the suggestion internally for now. It wouldn't change how previously defined t stats work, so it would be backwards compatible.

@tyarkoni

The nistats FirstLevelModel can handle multiple runs, and will do a fixed-effects analysis. See here.
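
Roughly, the multi-run usage being referred to looks like this (a sketch assuming the nistats API of the time; file names and parameters are placeholders):

from nistats.first_level_model import FirstLevelModel

# Placeholder inputs: preprocessed BOLD series and event files for two runs.
run_imgs = ["sub-01_task-x_run-1_bold.nii.gz", "sub-01_task-x_run-2_bold.nii.gz"]
events = ["sub-01_task-x_run-1_events.tsv", "sub-01_task-x_run-2_events.tsv"]

model = FirstLevelModel(t_r=2.0, smoothing_fwhm=5.0)
model.fit(run_imgs, events=events)

# With several runs fitted, compute_contrast pools the per-run estimates
# as fixed effects before computing the requested statistic.
z_map = model.compute_contrast("speech", output_type="z_score")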

@tyarkoni

The tricky part, from our perspective, is that taking advantage of nistats this way may require fitlins to look ahead a level to see what the model wants to have done with the run-level outputs, which is not so ideal.

@adelavega
Collaborator Author

Ah, yeah that would be a bit of a pain.

There's an open issue to add explicit fixed effects to nistats: https://github.com/nistats/nistats/issues/281

@effigies

Yeah, it's not that big a deal to do the fixed effects ourselves, as long as we know it at some point. It's like one or two lines of code to actually perform the combination. If we can convince Bertrand to expose it as a first-class interface, then cool, but my concern at this point is being able to deterministically derive what should be done from a model spec.

@adelavega
Collaborator Author

I think from Tal's suggestion it would be clear what should be done.

I don't think using FirstLevelModel for multi-session fixed effects would be flexible enough (especially when handling potentially missing columns, since nistats' current behavior is to crash in that situation).

@adelavega
Collaborator Author

Alright, I looked into nistats, and I think it's possible to use the Contrast object and manually build it using effect and variance images that we load (rather than the ones it would compute before building it).

These Contrast objects can then be combined to get a fixed effect contrast and the associated stats.

The other thing that nistats does, which would need to be added, is using a fitted masker for loading and writing out images.

I'll give this a go, and maybe if done well enough it could just be merged into nistats and we could import it.
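
A rough sketch of that plan, assuming the nistats Contrast class and its fixed-effects addition behave as described; constructor arguments, expected array shapes, and file names are assumptions, not verified against a specific nistats release:

from nilearn.input_data import NiftiMasker
from nistats.contrasts import Contrast

# Fit a masker once, then reuse it to move between images and arrays.
masker = NiftiMasker(mask_img="sub-01_desc-brain_mask.nii.gz").fit()

run_contrasts = []
for run in ("1", "2"):
    # Load previously computed run-level effect and variance maps.
    # Assumed shapes: effect (1, n_voxels), variance (n_voxels,).
    effect = masker.transform("sub-01_run-%s_contrast-speech_stat-effect_statmap.nii.gz" % run)
    variance = masker.transform("sub-01_run-%s_contrast-speech_stat-variance_statmap.nii.gz" % run)
    run_contrasts.append(Contrast(effect=effect, variance=variance.ravel(), contrast_type="t"))

# Adding Contrast objects performs a fixed-effects combination in nistats.
fixed_fx = sum(run_contrasts[1:], run_contrasts[0])

# Write the combined z map back out through the same masker.
masker.inverse_transform(fixed_fx.z_score()).to_filename(
    "sub-01_contrast-speech_stat-z_statmap.nii.gz")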

@nicholst

Coming late to the party here.

+1 for FEMA as a flag on a contrast to indicate that a meta-analysis-type analysis is required. While I realise this is counterintuitive to brain imagers, it is actually an accurate description of the analysis.

At the risk of forking the thread, can I inquire about the variance? If "varcopes" are present, but I want to do an "OLS"-style analysis, i.e. ignore them, is it clear how the user makes this happen?

@tyarkoni

@nicholst good point—we don't currently have a way to specify that. Do you have suggestions? If we wanted to build this into the spec, it could be an option in the Contrast specification that only applies when Type="FEMA". Maybe we can call it WeightingScheme or something like that.

The alternative is to leave it to the estimator, and let the user pass whatever they like via the Software section. (I guess even if we go this route, we would still need to come up with some convention for at least fitlins purposes.)

@nicholst

I fear I feel too removed to make good suggestions. To be clear, wanting to do OLS ignoring the variance is a special case that only applies when (at a given level) we have a single observation per subject, but it's a very high-frequency "special" case in some large circles. Anyway, a FEMA option like IgnoreStderrs, or something similar, should capture it.
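
To make that concrete, a purely hypothetical contrast entry combining the two proposals (neither "FEMA" nor "IgnoreStderrs" exists in the spec at this point):

contrast = {
    "Name": "anyFaces",
    "ConditionList": ["anyFaces"],
    "Weights": [1],
    "Type": "FEMA",
    # Hypothetical option: ignore the lower-level variance images and
    # combine the estimates with equal (OLS-style) weights instead.
    "IgnoreStderrs": True,
}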

While we're talking variance, there's also the issue that both SPM and FSL have facilities for (and advocate) group-specific variance in multi-group (e.g. two-sample) analyses. Have we thought about how that will be articulated?

And for more fun, SPM's repeated-measures model in fact by default assumes unstructured and heteroscedastic covariance. (Though I think we agreed that repeated-measures variance modelling like this is a pain, and better handled via random intercepts, slopes, etc.)

@tyarkoni

tyarkoni commented Sep 30, 2019

@nicholst I think unequal variances can in principle be accommodated by the VarianceStructure section in the draft spec. I believe there may even be some examples that include unequal variances. The main issue is the same one that cropped up for the FEMA issue: the natural place to put these kind of specifications is in the Model section... but we only have one Model section, not one per contrast/analysis. So we end up having to put it in Contrasts.

This seems pretty clunky/confusing, so I'm going to think about it some more and see if maybe there's some way to restructure things that makes sense.

EDIT: specifying the covariance structure seems more difficult... My inclination is to push that into the software-specific specification.

@adelavega
Collaborator Author

adelavega commented Sep 30, 2019

The other way to do things would be to separate "steps/levels" and "analyses". For example, if you're asking for both a FEMA and a t contrast at the subject level, you want two types of analyses to be run at the subject level: a random-effects model + contrasts, and fixed-effects contrasts.

One solution would be to allow multiple "models/analyses" that specify the same "level". These two analyses would be given the same inputs but run independently, and then the results from both would be joined and passed to the next level.

This may cause some problems for the next level if not done carefully, though (i.e. two similar inputs, such as the same contrast produced by two different analyses).

0.9 automation moved this from In Progress to Done Sep 30, 2019
@adelavega adelavega reopened this Sep 30, 2019
@adelavega
Collaborator Author

How should AutoContrasts behave given FEMA? Previously, auto contrasts just generated t contrasts. Should pybids intelligently create the appropriate type based on the level?

@tyarkoni

tyarkoni commented Oct 17, 2019

If we want to avoid conditioning major decisions on the level of analysis (which I think we do), then we have to assume that AutoContrasts still implies t. If you want FEMA, you need to say that explicitly. I would suggest allowing non-boolean values, so that people can say "AutoContrasts": "FEMA" or "AutoContrasts": "t". Actually, in the interest of clarity and maintainability, maybe we should mandate string values. We could also generalize the idea and maybe call it "DefaultContrast" and let users specify a full contrast using the same syntax. This would make it a little more work to write a model spec, but improve clarity and reduce cognitive load (easier to understand what "DefaultContrast" is and what it expects).

We should also be clear (I can't find this in the spec anywhere right now) that if the user doesn't explicitly say anything about aggregation at a given level, then all the outputs are simply collected and passed forward, without any aggregation. E.g., if there are multiple runs per subject, and the subject-level model doesn't specify contrasts (or auto contrasts), then the level after subject (typically dataset) will get a concatenation of all run-level estimates for all subjects passed in.

EDIT: Ignore the second paragraph; the spec does make the chaining behavior explicit.

@adelavega
Collaborator Author

I like the idea of mandating strings to specify which kind of contrast. The issue is this doesn't let you specify which variables to create contrasts for.

Here's what I see as the common use cases for AutoContrasts. This is mostly to increase readability and serve as shorthand for human-generated models:

  1. Compute contrasts for all variables in the namespace. For example, all FEMA contrasts at the subject or session level, and all t contrasts at the dataset level.
  2. Shorthand for humans to write their run-level contrasts. In most cases this would be to say: create a t contrast for each non-confound variable. This requires specifying which variables you want, and which type of contrast you want.

For 1) we can allow either a string ("t", "FEMA"), which would create that type of contrast for all variables in the namespace.
And for 2) we could allow an object which lets you specify the condition names and the contrast type.
For example:

{
    "ConditionList": ["speech", "face"],
    "Type": "t"
}

Alternatively, since most of these models will be machine generated we could keep it simple and only allow 1).

adelavega added a commit that referenced this issue Dec 11, 2019