Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Events type #301

Open
rly opened this issue Aug 22, 2019 · 6 comments
Open

Add Events type #301

rly opened this issue Aug 22, 2019 · 6 comments
Assignees
Labels
category: proposal proposed enhancements or new features priority: medium non-critical problem and/or affecting only a small set of NWB users

Comments

@rly
Copy link
Contributor

rly commented Aug 22, 2019

It is often useful to store events as a list of timestamps, without packaging them into a TimeSeries. For example, this list could represent when the animal receives reward, when the animal performs a particular behavior, etc. This is different from TimeIntervals because there is no stop time; these are instantaneous events. Also, TimeIntervals has the baggage of being a DynamicTable. This is simpler than AnnotationSeries which has all the baggage of being a TimeSeries.

@bendichter and I propose two new types for nwb.event.yaml:

  • an Events object for simple events (e.g. when the animal receives reward)
  • a LabeledEvents object for events that can take on multiple values (e.g. if there are multiple reward categories, the reward category can be encoded alongside the event time)
    • enumerated types would work great here, but are a pain to implement, so we propose a mapping type, where there is a 1-D dataset of integer labels (length equal to number of timestamps, value from 0 to number of labels - 1), and a 1-D dataset for the mapping of text (length equal to number of labels), such that the i-th element corresponds to a label value of i (zero-indexed)
@rly
Copy link
Contributor Author

rly commented Oct 4, 2019

See also related discussion about TTL pulses in #306

@rly
Copy link
Contributor Author

rly commented Oct 5, 2019

It is also worth noting that the schema includes neurodata_types BehavioralEpochs, BehavioralEvents, and BehavioralTimeSeries, which contain any number of IntervalSeries (a TimeSeries), TimeSeries, and TimeSeries objects, respectively. In particular, it seems BehavioralEvents is meant to store irregular event times in a collection of TimeSeries objects.

@rly rly self-assigned this Nov 20, 2019
@rly
Copy link
Contributor Author

rly commented Mar 11, 2020

The need for an Events/LabeledEvents/TTL pulse data type has come up multiple times recently. Here is a summary of possible options to move forward:

Option 1:
See #302 for a proposal to add:

  • An Events type for storing simple events. It has a description and a 1D array of timestamps (unit: 'seconds')
  • A LabeledEvents type that inherits from Events for storing events that can take on multiple values. Each timestamp is associated with a uint value in the 'label_keys' dataset. Each unique value of the 'label_keys' dataset is associated with a text label in the 'labels' dataset.
  • The TTLs type would inherit from LabeledEvents. The TTL pulse value would be used as the label key. This type groups all TTL values together in a single dataset, which is often how TTL data come in. If desired, users could separate the TTL events by pulse value and store those data in separate Events types in /processing/events.

Pros:

  • This format is easy to understand and the data would be easy to inspect.
  • The API code would be straightforward to maintain.

Cons:

  • This format does not allow for flexible addition of metadata for each event type, outside of the description attribute. However, it is not clear how often users would use this ability.
  • This format is inconsistent with how spike times are stored in Units.

Option 2:

  • Create a LabeledEvents type that inherits from DynamicTable. Each row corresponds to a different type of event and has an ID, a text label, a text description, and a 1D array of timestamps. The timestamps dataset would be a ragged array.
  • Create a TTLs type that inherits from LabeledEvents.

Pros:

  • This format allows for the addition of user-defined metadata for each event type.

Cons:

  • This provides users with perhaps too much flexibility in adding metadata, defeating the purpose of a good data standard.
  • DynamicTable data are more complicated and difficult to inspect outside of the API.
  • The code for DynamicTable is bulky and has historically required hacks to make them work for particular use cases.

Option 3:
A hybrid between Options 1 and 2.

  • Create the Events, LabeledEvents, and TTLs types as described in Option 1. These types would live in /acquisition.
  • Create an AnnotatedEvents type that inherits from DynamicTable. Users could separate TTL events by pulse value and store those values in this new table, like in Option 2. Users might want to store only the TTL events and event times that are relevant to their analyses here. This table might live in /processing/events.

Pros:

  • The format of the acquired event times is easy to understand and the data would be easy to inspect.
  • The format of the annotated events allows for the addition of user-defined metadata for each event type.

Cons:

  • Event times are duplicated between acquisition events and annotated events, but the latter is organized, annotated, and could be filtered, so I don't think this is a big deal.
  • See cons for option 2 regarding use of a DynamicTable.

Option 4:
Create a subtype of TimeSeries or BehavioralTimeSeries or AnnotationSeries where the 'data' dataset are all ones.

Pros:

  • We can do this already. And the community is already doing this for some datasets.

Cons:

  • This format does not allow for flexible addition of metadata for each event type, outside of the description attribute. However, it is not clear how often users would want this ability.
  • TimeSeries has a lot of baggage regarding the data dataset and uniformly sampled timestamps that are not relevant for event data. Ideally, the type hierarchy would be:
NWBContainer 
     |-> Events 
           |-> LabeledEvents
           |-> TimeSeries

Since both Events and TimeSeries share the timestamps dataset, but that would be a major breaking change.


I am in favor of option 3. Thoughts, @bendichter @oruebel @ajtritt?

Finally, whichever route we choose, should we make an extension or just build this into the next minor version of NWB? I am in favor of building it in, since we know this will have high usage.

@bendichter
Copy link
Contributor

Thanks for really thinking this through and laying it out. I also like 3 and I'd vote for extension because it will allow us to start using this sooner and to vet it before releasing it.

@oruebel
Copy link
Contributor

oruebel commented Mar 11, 2020

I think developing this as an extension first and then merging in once it has been evaluated is probably the way to go, given that there a few options. The extension can be versioned at 0.x throughout and it can be made explicit that it is not intended for production use, but as an evolving proposal for integration with NWB.

@rly
Copy link
Contributor Author

rly commented Apr 9, 2020

Update: The proposed Events and LabeledEvents types will be implemented in an NDX extension in development here: https://github.com/rly/ndx-events

@stephprince stephprince added category: proposal proposed enhancements or new features priority: medium non-critical problem and/or affecting only a small set of NWB users labels May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: proposal proposed enhancements or new features priority: medium non-critical problem and/or affecting only a small set of NWB users
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants