-
Notifications
You must be signed in to change notification settings - Fork 156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ambiguity about how to deal with inherited tsv contents #337
Comments
Can't but help tag BIDS 2.0 suggestion Remove inheritance rule :) More constructively, I concur that more clarity is needed and no intra-column merging was implied. However, it seems like it could get quickly complex. If a lower-level TSV file has a column name already defined above, it may have a different set of onset times.... do we just fill in NA's for all other rows? Given this, I think I'm actually more in favor of Tal's first option: "the TSV file in the present level has precedence in its entity, others are disregarded". It will simplify the spec and parsing of BIDS data. This puts the onus of merging on the data-generator and not the spec. |
I think it's reasonable for non-time-series data to follow the rule that @nicholst proposes, but for time series, I'm not sure that this is as manageable. I'm particularly thinking of derivatives' In this context, a useful constraint would be "if a column has the same name in two files, it must have the same contents", or perhaps that columns can't appear in two files. The situation I'm thinking of is Perhaps this isn't a problem because we've already specified that this refers to "metadata" TSV files, and the thing to do is to explicitly identify which types of TSV are metadata. I would say there are three types:
We can set the rules as appropriate for each case. |
@effigies I'm far from the derivatives discussion, so this is useful. But your examples seem to relate to two files in the same directory, and so isn't about inheritance, right? I agree that there might be an issue about duplicate column names in multiple files in a single directory, but isn't that a distinct discussion? |
I feel like this is a pragmatic decision that avoids complicated rules about column or row merging across TSV files at different levels. --> If two TSV files are defined (one at the upper level, and one at the lower level) ... the one at the lower level has precedence for its purpose.
Going via this route seems to me to be a promising alternative to the pragmatic solution above. In that case we would have to define "merging" rules for a limited set of TSV files, like suggested (participants, scans, phenotype files) [note: I don't know what you mean by "session" TSV files, Chris] |
@nicholst Yes, sorry, this was a bit of a broadening of the topic. I was responding to this portion of the text:
|
On further thought, I think I'm now with @nicholst on this. It would greatly simplify life for everyone except the data generator to say that inheritance applies only to JSON files, and for TSVs, you only get one shot at it. @effigies I'm kind of against having multiple metadata files at a given level apply to a file. It would be a bit of a pain to implement in a clear way. E.g., in the case of tsv files with different |
Same. The point is I want to distinguish between metadata TSV files, such as participants.tsv, sessions.tsv and scans.tsv, and data TSV files, such as stim, physio and timeseries. I think you should be able to have multiple of the latter. Edit: I realize this may be off topic for this issue, and I would be glad to move it to a new issue, if we agree that this distinction exists and it's worth making clear in the spec. |
I also favor this option, but simply removing all files except JSON from the list of "inheritance-allowed-files" would be a backward incompatible change, so it's not a practical solution for our current BIDS version. Luckily (:confused:) the inheritance principle is currently not very clear about how inheritance is dealt with for TSV files (@tyarkoni's original post), so we can now try to clarify/specify it to be as simple as possible without violating backward compatibility. @effigies I agree that it's worth distinguishing different kinds of TSV files (meta, time-series, events, ...). A new issue for this would be nice to keep this one focused on inheritance. |
When discussing inheritance, the spec says this:
This is fine for JSON, but it's not obvious if/whether it also applies to TSV files. What's the expected interpretation here? Is it that only the first TSV file is valid, and the rest are disregarded? Or that the contents of the TSV files should be merged, with events defined at lower levels taking precedence over those at higher levels in the case of matching timing information?
The latter seems like the more sensible approach for at least
events.tsv
files, and I would like to preserve the ability to specify events at multiple levels (consider, e.g., a case where all participants get the same stimulus sequence, but responses are recorded separately for each run). But this requires some detail about the expected implementation... e.g., which columns define a unique event? Forevents.tsv
, is it the combination of onset and duration, and trial_type? This should all go in the spec.The text was updated successfully, but these errors were encountered: