enable writing (MNE-Python) annotations #77

sappelhoff · 2021-08-05T13:38:45Z

Currently pybv has limited functionality to write to the VMRK file:

Events to write in the marker file. This array has either two or three columns. The first column is always the zero-based index of each event (corresponding to the “time” dimension of the data array). The second column is a number associated with the “type” of event. The (optional) third column specifies the length of each event (default 1 sample). Currently all events are written as type “Stimulus” and must be numeric. Defaults to None (not writing any events).

However, the BrainVision VMRK file would support much more, see:

; Each entry: Mk<Marker number>=<Type>,<Description>,<Position in data points>,
; <Size in data points>, <Channel number (0 = marker is related to all channels)>

We could think about a feature that can write annotations as used for example in MNE-Python: Each annotation having an onset, ~~offset~~ duration, and description. We could even ~~go beyond what MNE-Python currently (?) supports and~~ support channel specific annotations (although that could be done later).

Overall I envision that users could pass both "events" and "annotations" to write_raw_brainvision and that these two inputs would be internally merged in a reasonable way and all written to the VMRK file.

Together with #44 that'd bring us one step closer to being able to properly export MNE-Python raw objects to BrainVision, see also: https://github.com/mne-tools/mne-python/blob/53d0f5de8a8a99e3f4211a8dc1361930e3cc3cd8/mne/export/_export.py#L43-L44

WDYT?

The text was updated successfully, but these errors were encountered:

cbrnr · 2021-08-05T14:13:41Z

I didn't know that we don't support MNE annotations – we definitely should!

We could think about a feature that can write annotations as used for example in MNE-Python: Each annotation having an onset, offset, and description.

It should be onset, duration, and description.

We could even go beyond what MNE-Python currently (?) supports and support channel specific annotations (although that could be done later).

I think we actually support channel-specific annotations already, but it's not available in the plotting GUI yet (so users cannot add them other than programmatically).

sappelhoff · 2021-08-06T08:51:04Z

Do you perhaps want to give it a shot @cbrnr? I think it'd also benefit MNELAB 😏

cbrnr · 2021-08-06T12:29:23Z

Do you perhaps want to give it a shot @cbrnr? I think it'd also benefit MNELAB 😏

Give me some time and I can come up with something. Could it be as easy as using mne.events_from_annotations() and passing that to the existing writer?

sappelhoff · 2021-08-06T12:53:36Z

Could it be as easy as using mne.events_from_annotations() and passing that to the existing writer?

That is already supported as far as I can see: You'd just get events via mne.events_from_annotations() that you can then pass to the events parameter of write_brainvision.

In the VMRK these will then be written as such:

Mk3=Stimulus,S  1,6230,1,0
Mk4=Stimulus,S  3,6804,1,0
Mk5=Stimulus,S  4,7849,1,0
Mk6=Stimulus,S  6,8475,1,0

With the "Stimulus" event type, and the MNE event code (an integer) being transformed to a string S i, where i is some int.

In my example above, the duration of each of those events is 1 sample. With pybv we can currently also use the events parameter to write the actual duration in samples. However mne.events_from_annotations does not provide the duration of each event in samples, but only onset, "previous event value", and event value.

So indeed, (maybe what you had in mind) one way to improve the current state is to provide a convenience function that goes from MNE annotations to events as pybv can use them. Namely in the form onset, event value, duration in samples (see screenshot):

Having that said, I was interested in more advanced usage, because the way above still loses the actual annotation in favor of some integer code. And the actual annotation is not written to the BrainVision file then (only the integer code), and you'd have to keep the map somewhere else.

I thought about going beyond the Stimulus,S i style events that pybv can currently write. All is still in line with the BrainVision spec.

Something like this:

Mk2=Comment,ControlBox is not connected via USB,1,1,0

Starting from

; Each entry: Mk<Marker number>=<Type>,<Description>,<Position in data points>,
; <Size in data points>, <Channel number (0 = marker is related to all channels)>

We could have:

Mk --> This stays the marker number ( = event number in the file, so events should be temporally ordered)
Type --> Could be always set to "Comment" for writing annotations
Description --> This is the actual free form string annotation ... for example BAD_eyeblink, to stay close to the MNE usage ... but could be The assistant spilled coffee over the participant
position in data points --> onset in samples
size in data points --> duration in samples
channel number --> which channel the annotation refers to

And I imagine that participants could pass this information to pybv via a new parameter annotations. We could think about what form this parameter should take (dict?, pandas DF?, string array, ...), but overall I think this would be a great way to conserve MNE annotations in BrainVision files while staying true to the file format standard.

When both the events and annotations parameters are passed, both would be written to VMRK.

sappelhoff · 2021-08-07T05:51:48Z

Having slept over it, I think we should NOT introduce a new parameter "annotations" (or the like) and instead make a breaking API change to our parameter "events".

events should accept a dictionary of the following form:

{
    "onset": [],
    "duration": [],
    "description": [],
    "type": [],
    "channel": [],
}

With the following rules:

"onset" and "description" are REQUIRED, all other keys are OPTIONAL
"onset" and "duration" must be integers, reflecting sample indices
"onset" must be in ascending order
"channel" must be integers in the range range(0, n_chs+1), where 0 indicates that an event refers to all channels
"type" must be a string. This is commonly one of:
- Stimulus
- Response
- Comment
- New Segment
"description" may be a string, or an integer, but MUST be an integer for events of type Stimulus and Response, because here we will cast that integer to the form S i, and R i respectively

In the simplest scenario, supplying a dict as below works:

{
    "onset": [6230, 6804, 7849, 8475],
    "description": [1, 3, 4, 6],
}

and would result in:

Mk1=Stimulus,S  1,6230,1,0
Mk2=Stimulus,S  3,6804,1,0
Mk3=Stimulus,S  4,7849,1,0
Mk4=Stimulus,S  6,8475,1,0

That also shows that users can fix their pipelines easily to accommodate the new API:

# "old" events was a np.ndarray of shape (n, 2) or (n, 3)
new_events = {}
new_events["onset"] = events[:, 0]
new_events["description"] = events[:, 1]
if events.shape[1] == 3:
    new_events["duration"] = events[:, 2]

Here an example of a more complex scenario, which shows that users can specify much more information with the new API, facilitating export from MNE annotations:

{
    "onset": [1000, 2000, 3000],
    "duration": [25, 50, 75],
    "description": [1, "BAD signal dropout", 555],
    "type": ["Stimulus", "Comment", "Response"],
    "channel": [0, 64, 0],
}

which would result in:

Mk1=Stimulus,S  1,1000,25,0
Mk2=Comment,BAD signal dropout,2000,50,64
Mk3=Response,R555,3000,75,0

sappelhoff · 2021-08-07T05:55:33Z

Also, I can take this over @cbrnr before asking you in #77 (comment) I hadn't known that I feel strongly about this one 😆

cbrnr · 2021-08-07T06:45:07Z

Please go ahead!

hoechenberger · 2021-08-07T06:50:48Z

In JS and JSON, one would opt to have an array of objects instead; in Python terms: a list of dicts. This is what I would like to see here too. "channels" should be pluralized.

[
    {
        "onset": 1,
        "duration": 3,
        "description": "foo",
        "type": "bar",
        "channels": ["ch1", "ch2"],
    },
    {
        "onset": 5,
        "duration": 3,
        "description": "foo2",
        "type": "baz",
        "channels": ["ch1"],
    }
]

This makes the properties of each individual annotation very clear.

hoechenberger · 2021-08-07T07:02:05Z

I'd like to emphasize that I don't think "0" is a good pick to mean "all channels". I think

all affected channels should be listed explicitly
channels should be referred to by their names, not indices
if we figure listing all channels every time we wish tk add an annotation for all channels is not feasible, we should allow for the string "all" or even "None" instead.

sappelhoff · 2021-08-07T07:02:07Z

This makes the properties of each individual annotation very clear.

I like that notation, but it would make specifying an input for pybv much harder as a user (unwieldy), wouldn't it?

"channels" should be pluralized.

I agree in principle, but I don't think the BrainVision spec allows for that:

; Each entry: Mk<Marker number>=<Type>,<Description>,<Position in data points>,
; <Size in data points>, <Channel number (0 = marker is related to all channels)>

One could think about allowing it, and then internally in pybv write an event as below.

Example input:

{
    "onset": [1000],
    "description": ["BAD signal dropout"],
    "type": ["Comment"],
    "channel": [[1, 23, 55]],
}

Corresponding output:

Mk1=Comment,BAD signal dropout,1000,1,1
Mk2=Comment,BAD signal dropout,1000,1,23
Mk3=Comment,BAD signal dropout,1000,1,55

Note how the same event is written 3 times with only the channel that it refers to differing.

It's ugly, but I don't see another spec-conformant way (apart from not supporting it at all), do you?

sappelhoff · 2021-08-07T07:06:19Z

In response to #77 (comment): We need to follow the BrainVision spec, and your proposal unfortunately does not work within those constraints 😞

hoechenberger · 2021-08-07T07:09:34Z

One could think about allowing it, and then internally in pybv write an event

👍

Example input:

{
    "onset": [1000],
    "description": ["BAD signal dropout"],
    "type": ["Comment"],
    "channel": [[1, 23, 55]],
}

[...]

It's ugly

It is suuuper ugly and nobody could possibly want that. But the JSON-inspired notation I proposed above makes it trivial to resolve this issue, while staying very readable and trivial to parse:

[
    {
        'onset': 1000,
        'description': 'BAD signal dropout',
        'type': 'Comment',
        'channel': 1
    },
    {
        'onset': 1000,
        'description': 'BAD signal dropout',
        'type': 'Comment',
        'channel': 23
    },
    {
        'onset': 1000,
        'description': 'BAD signal dropout',
        'type': 'Comment',
        'channel': 55
    }
]

hoechenberger · 2021-08-07T07:12:04Z

@sappelhoff

Since MNE's Annotations don't immediately allow for an output that would suit this API, I propose we add Annotations.to_json() to MNE-Python. This will resolve the issue.

sappelhoff · 2021-08-07T07:18:53Z

@hoechenberger I feel like you are not reading my messages 😂

This thread is about how to make the most out of the BrainVision format for annotations, I like your proposals and they make sense --> but they don't fit with the BrainVision spec

hoechenberger · 2021-08-07T07:38:58Z

My proposals were for the "annotations" parameter you suggested to introduce... I'm confused now! 😬🙈

sappelhoff · 2021-08-08T07:39:39Z

Okay, I think I get part of it now - I think I was sometimes also talking about how to write it to VMRK, whereas you were always talking about the input into "write_brainvision". :-)

I like that notation, but it would make specifying an input for pybv much harder as a user (unwieldy), wouldn't it?

I am reconsidering my argument here: Let's go with your proposed "list of dicts" structure. It is not significantly harder to prepare, to go from my initial structure to yours it'd just be:

old_annots = {
    "onset": [1000, 2000, 3000],
    "duration": [25, 50, 75],
    "description": [1, "BAD signal dropout", 555],
    "type": ["Stimulus", "Comment", "Response"],
    "channel": [0, 64, 0],
}

# go from old_annots (dict of lists) to new_annots (list of dicts)
new_annots = []
n_annot = len(old_annots["onset"])
for i_annot in range(n_annot):
    annot = {}
    for key in old_annots.keys():
        annot[key] = old_annots[key][i_annot]
    new_annots.append(annot)

# new_annots
[{'onset': 1000,
  'duration': 25,
  'description': 1,
  'type': 'Stimulus',
  'channel': 0},
 {'onset': 2000,
  'duration': 50,
  'description': 'BAD signal dropout',
  'type': 'Comment',
  'channel': 64},
 {'onset': 3000,
  'duration': 75,
  'description': 555,
  'type': 'Response',
  'channel': 0}]

and the gain in readability and clarity of the "list of dicts" beats the simplicity of the other structure.

We can also use that list of dicts structure for coordinates in #44 and have a more consistent API overall

sappelhoff · 2021-08-08T07:56:35Z

I'd like to emphasize that I don't think "0" is a good pick to mean "all channels". I think

all affected channels should be listed explicitly

channels should be referred to by their names, not indices

if we figure listing all channels every time we wish to add an annotation for all channels is not feasible, we should allow for the string "all" or even "None" instead.

Here we misunderstood each other again (you talking ONLY about input, me talking ALSO about writing to VMRK)

Now that that's cleared up, I agree with you:

channels should be listed explicitly and by name (pybv can then convert to integer indices implicitly)
"none" is encoded by an empty list in MNE-Python, and that covers "all" and "none" channels quite well, I think: https://mne.tools/dev/generated/mne.Annotations.html?highlight=ch_names#mne.Annotations (in pybv VMRK, this would be coded as 0)

That just leaves us with a single problem: How to encode in VMRK if one event pertains to several (more than 1, less than "all") channels?

My proposal was:

Mk1=Comment,BAD signal dropout,1000,1,1
Mk2=Comment,BAD signal dropout,1000,1,23
Mk3=Comment,BAD signal dropout,1000,1,55

Do you see a BrainVision spec compliant alternative?

If there is no alternative, there are two options:

do not support it (i.e., only support lists of len 0 and len 1 for channels in annots)
write it as I proposed (but make sure first, that MNE-Python would be able to read that, and that it wouldn't break FT, EEGLAB, Brainstorm)

hoechenberger · 2021-08-08T09:34:28Z

Haha so many misunderstandings!! So sorry about the confusion!!

On a somewhat unrelated side note, I'm currently in Berlin and happy to meet up for a coffee or beer if you feel like it 👍 Just drop me a message on Discord

cbrnr · 2021-08-09T09:16:40Z

If repeating a marker for each affected channel is the only compliant solution then I'd go for it. Just to make sure I understand the format, a trailing zero means "all channels", correct? In MNE-Python, all channels corresponds to an empty list (https://mne.tools/dev/generated/mne.Annotations.html#mne-annotations), so how would the API handle this (I lost track so maybe @sappelhoff can you summarize the current state)?

sappelhoff · 2021-08-09T09:23:37Z

a trailing zero means "all channels", correct?

correct

In MNE-Python, all channels corresponds to an empty list

yes

so how would the API handle this (I lost track so maybe @sappelhoff can you summarize the current state)?

A user passes the following input to write_brainvision's event parameter (which we redesign, ~~breaking API change~~ EDIT: we can remain backward compatible and still accept ndarray next to list of dict):

[{'onset': 1000,
  'duration': 25,
  'description': 1,
  'type': 'Stimulus',
  'channels': []},
 {'onset': 2000,
  'duration': 50,
  'description': 'BAD signal dropout',
  'type': 'Comment',
  'channels': [62, 63, 64]}]

pybv will then write the following to VMRK:

Mk1=Stimulus,S  1,1000,25,0
Mk2=Comment,BAD signal dropout,2000,50,62
Mk3=Comment,BAD signal dropout,2000,50,63
Mk4=Comment,BAD signal dropout,2000,50,64

I think logically these events in VMRK make sense and they are compliant. It's still not pretty, but we have our constraints to work with here. We still need to check that MNE-Python, EEGLAB, Fieldtrip, Brainstorm would properly read this, or at least not produce a bunch of nonsense.

the input for the events parameter of write_brainvision can be easily derived from MNE-Python annotations (if sfreq is available, too) ... and also designed from scratch for non MNE users.

cbrnr · 2021-08-09T09:29:07Z

Thanks! This looks good. We could also directly accept an mne.Annotations object in addition to a list of dicts; I think nobody would want to create this manually just to write some annotations.

In addition to the readers you mentioned, we should first test if BrainVision Analyzer reads such files.

cbrnr · 2022-02-10T10:58:40Z

I just revisited an old MNELAB issue and found out that being able to pass Annotations to write_brainvision() would be really useful. @sappelhoff would you be able to summarize the status and who agreed to do what? I think we haven't even decided on a solution yet.

sappelhoff · 2022-02-10T12:43:46Z

@cbrnr I think we did find a solution, and a consistent API for writing events to VMRK (this issue) and coordinates to VHDR (#44 (comment)).

The basic API is described in #77 (comment) -> input to events becomes a list of dict (or for backward compatibility, can still be a numpy array with 2 or 3 columns; but that won't unlock all features).

Talking about both events (this issue) and coordinates (#44), we "only" need to:

check whether the (i) "duplicate events" (when an annotation is with respect to channels other than "all" or a single one), or (ii) the coordinates section raise any issue in: FieldTrip, EEGLAB, BrainStorm, MNE-Python, BrainVision Analyzer
1. If they do - try to fix them in those readers (and write to Brain Products to perhaps fix their BrainVision Analyzer, if it causes problems, or to improve the BV spec)
implement it
hopefully ship it to MNE Python's export module

I am currently fully blocked due to having to submit my dissertation soon, so if you want to give it a shot - be my guest! Else, I hope to be able to work on this some time this year, as I consider it a major improvement that I definitely want for pybv.

cbrnr · 2022-02-10T14:06:45Z

Thanks @sappelhoff! I don't have time right now either, but I'm happy to help. I think it would be nice if the events parameter accepted an ndarray (the MNE events structure) as well as an mne.Annotations object as well. Or at least we should provide helper functions that convert events and annotations into this new format. Otherwise people will have to perform gymnastics like [{"onset": a["onset"], "duration": a["duration"], "description": a["description"]} for a in raw.annotations] to export their existing annotations.

sappelhoff · 2022-02-10T14:23:09Z

agreed to support ndarray (the "custom" type we currently support, which can be easily indexed from the mne type), mne.annotations, and a standard list of dict

I think it's important to offer non-mne options as well

cbrnr · 2022-02-10T15:07:14Z

Since pybv does not depend on MNE, how would you implement a parameter accepting also an mne.Annotations object? An extra wrapper function just in MNE (raw.export())? That's probably the best option.

sappelhoff · 2022-02-10T15:16:56Z

Since pybv does not depend on MNE, how would you implement a parameter accepting also an mne.Annotations object? An extra wrapper function just in MNE (raw.export())? That's probably the best option

mmh and in "pure pybv" we'd only have ndarray or list of dicts? - Yes, that sounds good.

So far I had thought to have a check of parameter types, and raise if the then optional mne dependency is not present, but I like your suggestion better.

sappelhoff added the enhancement New feature or request label Aug 5, 2021

sappelhoff added this to the 0.6.0 milestone Aug 6, 2021

sappelhoff added effort:medium estimated medium effort task impact:high estimated high impact task labels Aug 6, 2021

sappelhoff added the API influences the API of pybv label Aug 7, 2021

sappelhoff self-assigned this Aug 7, 2021

sappelhoff mentioned this issue Aug 8, 2021

Let pybv write [coordinates] section to vhdr files #44

Open

sappelhoff modified the milestones: 0.6.0, 0.7.0 Sep 27, 2021

This was referenced May 22, 2022

Add tests (CI) running Matlab/Octave for popular BV readers: FieldTrip, EEGLAB, Brainstorm #83

Open

Allow passing list of dict as argument to events parameter (=allow writing mne.annotations to VMRK) #86

Merged

sappelhoff closed this as completed in #86 May 27, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

enable writing (MNE-Python) annotations #77

enable writing (MNE-Python) annotations #77

sappelhoff commented Aug 5, 2021 •

edited

cbrnr commented Aug 5, 2021

sappelhoff commented Aug 6, 2021

cbrnr commented Aug 6, 2021

sappelhoff commented Aug 6, 2021

sappelhoff commented Aug 7, 2021 •

edited

sappelhoff commented Aug 7, 2021

cbrnr commented Aug 7, 2021

hoechenberger commented Aug 7, 2021 •

edited

hoechenberger commented Aug 7, 2021 •

edited

sappelhoff commented Aug 7, 2021 •

edited

sappelhoff commented Aug 7, 2021

hoechenberger commented Aug 7, 2021 •

edited

hoechenberger commented Aug 7, 2021 •

edited

sappelhoff commented Aug 7, 2021

hoechenberger commented Aug 7, 2021

sappelhoff commented Aug 8, 2021 •

edited

sappelhoff commented Aug 8, 2021

hoechenberger commented Aug 8, 2021

cbrnr commented Aug 9, 2021

sappelhoff commented Aug 9, 2021 •

edited

cbrnr commented Aug 9, 2021

cbrnr commented Feb 10, 2022

sappelhoff commented Feb 10, 2022 •

edited

cbrnr commented Feb 10, 2022

sappelhoff commented Feb 10, 2022

cbrnr commented Feb 10, 2022

sappelhoff commented Feb 10, 2022

enable writing (MNE-Python) annotations #77

enable writing (MNE-Python) annotations #77

Comments

sappelhoff commented Aug 5, 2021 • edited

cbrnr commented Aug 5, 2021

sappelhoff commented Aug 6, 2021

cbrnr commented Aug 6, 2021

sappelhoff commented Aug 6, 2021

sappelhoff commented Aug 7, 2021 • edited

sappelhoff commented Aug 7, 2021

cbrnr commented Aug 7, 2021

hoechenberger commented Aug 7, 2021 • edited

hoechenberger commented Aug 7, 2021 • edited

sappelhoff commented Aug 7, 2021 • edited

sappelhoff commented Aug 7, 2021

hoechenberger commented Aug 7, 2021 • edited

hoechenberger commented Aug 7, 2021 • edited

sappelhoff commented Aug 7, 2021

hoechenberger commented Aug 7, 2021

sappelhoff commented Aug 8, 2021 • edited

sappelhoff commented Aug 8, 2021

hoechenberger commented Aug 8, 2021

cbrnr commented Aug 9, 2021

sappelhoff commented Aug 9, 2021 • edited

cbrnr commented Aug 9, 2021

cbrnr commented Feb 10, 2022

sappelhoff commented Feb 10, 2022 • edited

cbrnr commented Feb 10, 2022

sappelhoff commented Feb 10, 2022

cbrnr commented Feb 10, 2022

sappelhoff commented Feb 10, 2022

sappelhoff commented Aug 5, 2021 •

edited

sappelhoff commented Aug 7, 2021 •

edited

hoechenberger commented Aug 7, 2021 •

edited

hoechenberger commented Aug 7, 2021 •

edited

sappelhoff commented Aug 7, 2021 •

edited

hoechenberger commented Aug 7, 2021 •

edited

hoechenberger commented Aug 7, 2021 •

edited

sappelhoff commented Aug 8, 2021 •

edited

sappelhoff commented Aug 9, 2021 •

edited

sappelhoff commented Feb 10, 2022 •

edited