ENH, FIX: Make eyelink occular annotations "channel aware", and call blink annotations "BAD" #11746

Merged (10 commits) on Jun 23, 2023
2 changes: 2 additions & 0 deletions doc/changes/latest.inc
@@ -26,6 +26,7 @@ Enhancements
- Add ``cmap`` argument for the :func:`mne.viz.plot_sensors` (:gh:`11720` by :newcontrib:`Gennadiy Belonosov`)
- When failing to locate a file, we now print the full path in quotation marks to help spot accidentally added trailing spaces (:gh:`11718` by `Richard Höchenberger`_)
- Added :class:`mne.preprocessing.eyetracking.Calibration` to store eye-tracking calibration info, and :func:`mne.preprocessing.eyetracking.read_eyelink_calibration` to read calibration data from EyeLink systems (:gh:`11719` by `Scott Huberty`_)
- Ocular :class:`mne.Annotations` read in by :func:`mne.io.read_raw_eyelink` are now channel aware. This means that if the left eye blinked, the associated annotation will store this in the ``'ch_names'`` key (:gh:`11746` by `Scott Huberty`_)

Bugs
~~~~
@@ -37,6 +38,7 @@ Bugs
- Fix hanging interpreter with matplotlib figures using ``mne/viz/_mpl_figure.py`` in spyder console and jupyter notebooks (:gh:`11696` by `Mathieu Scheltienne`_)
- Fix bug with overlapping text for :meth:`mne.Evoked.plot` (:gh:`11698` by `Alex Rockhill`_)
- For :func:`mne.io.read_raw_eyelink`, the default value of the ``gap_description`` parameter is now ``'BAD_ACQ_SKIP'``, following MNE convention (:gh:`11719` by `Scott Huberty`_)
- Blink :class:`mne.Annotations` read in by :func:`mne.io.read_raw_eyelink` now begin with ``'BAD_'``, i.e. ``'BAD_blink'``, because ocular data are missing during blinks (:gh:`11746` by `Scott Huberty`_)

API changes
~~~~~~~~~~~
246 changes: 246 additions & 0 deletions mne/io/eyelink/_utils.py
@@ -5,6 +5,252 @@
import re
import numpy as np

from ...utils import _check_pandas_installed


def _isfloat(token):
    """Boolean test for whether string can be of type float.

    Parameters
    ----------
    token : str
        Single element from tokens list.
    """
    if isinstance(token, str):
Member:

Instead let's use convenience functions as much as possible, like

_validate_type(token, (str, 'numeric'), 'token')

Then you can just do your try/except without any type business

Contributor Author:

sure, to clarify, that would be:

try:
    _validate_type(token, (str, 'numeric'), 'token')
    return True
except ValueError:
    return False

?

Member:

Currently you raise an error for anything other than str, then try to convert to float and return True/False based on floatiness. So I guess it should be:

_validate_type(token, str, 'token')
try:
    float(token)
except ValueError:
    return False
else:
    return True

        try:
            float(token)
            return True
        except ValueError:
            return False
    else:
        raise ValueError(
            "input should be a string," f" but {token} is of type {type(token)}"
        )
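
The control flow suggested in the review thread can be sketched without MNE's internals. This is a minimal sketch under stated assumptions: `is_float_string` is a hypothetical name, and a plain `isinstance` check stands in for MNE's private `_validate_type` helper, so the exception raised for non-strings here (`TypeError`) is a choice made for this example only.

```python
def is_float_string(token):
    """Return True if the string ``token`` can be converted to float."""
    # Stand-in for the type validation suggested in the review (assumption).
    if not isinstance(token, str):
        raise TypeError(f"token must be a str, got {type(token)}")
    try:
        float(token)
    except ValueError:
        return False
    else:
        return True


print(is_float_string("512.3"))  # a gaze-coordinate-like token
print(is_float_string("..."))  # the flags token
```

Validating the type first keeps the `try`/`except` focused on the only failure that should be swallowed, namely a string that does not parse as a number.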


def _convert_types(tokens):
    """Convert the type of each token in list.

    The tokens input is a list of string elements.
    Posix timestamp strings can be integers, eye gaze position and
    pupil size can be floats. The flags token ("...") remains a string.
    Missing eye/head-target data (indicated by '.' or 'MISSING_DATA')
    are replaced by np.nan.

    Parameters
    ----------
    tokens : list
        List of string elements.

    Returns
    -------
    tokens : list
        Tokens list with elements of various types.
    """
    return [
        int(token)
        if token.isdigit()  # execute this before _isfloat()
        else float(token)
        if _isfloat(token)
        else np.nan
        if token in (".", "MISSING_DATA")
        else token  # remains as string
        for token in tokens
    ]
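
To make the conversion rules concrete, here is a standard-library-only sketch that unrolls the same precedence into a loop. The name `convert_tokens` and the sample line are made up for illustration; `math.nan` is used in place of `np.nan` so the example has no third-party dependency.

```python
import math


def convert_tokens(tokens):
    """Convert EyeLink-style string tokens to int, float, NaN, or str."""
    converted = []
    for token in tokens:
        if token.isdigit():  # checked first so timestamps stay integers
            converted.append(int(token))
            continue
        if token in (".", "MISSING_DATA"):  # missing-data markers
            converted.append(math.nan)
            continue
        try:
            converted.append(float(token))
        except ValueError:
            converted.append(token)  # e.g. the "..." flags token
    return converted


# Illustrative sample line (values invented for this example).
line = "9511881\t512.3\t.\t1034.0\t..."
print(convert_tokens(line.split()))
```

Checking `isdigit()` before attempting `float()` matters: every digit-only string also parses as a float, so reversing the order would turn timestamps into floats.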


def _parse_line(line):
    """Parse tab delimited string from eyelink ASCII file.

    Takes a tab delimited string from eyelink file,
    splits it into a list of tokens, and converts the type
    for each token in the list.
    """
    if len(line):
        tokens = line.split()
        return _convert_types(tokens)
    else:
        raise ValueError("line is empty, nothing to parse")


def _is_sys_msg(line):
    """Flag lines from eyelink ASCII file that contain a known system message.

    Some lines in eyelink files are system outputs usually
    only meant for EyeLink's DataViewer application to read.
    These shouldn't need to be parsed.

    Parameters
    ----------
    line : string
        single line from Eyelink asc file

    Returns
    -------
    bool :
        True if any of the following strings that are
        known to indicate a system message are in the line

    Notes
    -----
    Examples of eyelink system messages:
    - ;Sess:22Aug22;Tria:1;Tri2:False;ESNT:182BFE4C2F4;
    - ;NTPT:182BFE55C96;SMSG:__NTP_CLOCK_SYNC__;DIFF:-1;
    - !V APLAYSTART 0 1 library/audio
    - !MODE RECORD CR 500 2 1 R
    """
    return any(["!V" in line, "!MODE" in line, ";" in line])
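
As a rough illustration of how such a check might be used to filter message lines, the sketch below applies the same three substring tests. The function name `is_sys_msg`, the `MSG <timestamp>` prefix, and the `stimulus_onset` message are assumptions made for this example, not taken from the PR.

```python
def is_sys_msg(line):
    """Return True if ``line`` contains a known system-message marker."""
    return any(marker in line for marker in ("!V", "!MODE", ";"))


lines = [
    "MSG 9511900 !V APLAYSTART 0 1 library/audio",
    "MSG 9511950 !MODE RECORD CR 500 2 1 R",
    "MSG 9512000 stimulus_onset",  # hypothetical experiment message
]
user_lines = [line for line in lines if not is_sys_msg(line)]
print(user_lines)
```

Because these are plain substring tests, any line containing a semicolon is treated as a system message, including an experiment message that happens to use one.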


def _get_sfreq(rec_info):
    """Get sampling frequency from Eyelink ASCII file.

    Parameters
    ----------
    rec_info : list
        the first list in self._event_lines['SAMPLES'].
        The sfreq occurs after RATE: i.e. [..., RATE, 1000, ...].

    Returns
    -------
    sfreq : int | float
    """
    for i, token in enumerate(rec_info):
        if token == "RATE":
            # sfreq is the first token after RATE
            return rec_info[i + 1]
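
A small sketch of the lookup on an already-parsed token list follows. The token list is illustrative, and the sketch differs from the function above in one deliberate way: it raises if no `RATE` token is present, whereas the loop above would fall through and return `None`.

```python
def get_sfreq(rec_info):
    """Return the token that follows "RATE" in a parsed SAMPLES line."""
    for i, token in enumerate(rec_info):
        if token == "RATE":
            return rec_info[i + 1]
    # Design choice of this sketch only: fail loudly when RATE is absent.
    raise ValueError("no RATE token found in rec_info")


# Illustrative token list, as it might look after type conversion.
rec_info = ["SAMPLES", "GAZE", "LEFT", "RIGHT", "RATE", 1000.0, "TRACKING", "CR"]
print(get_sfreq(rec_info))
```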


def _sort_by_time(df, col="time"):
    df.sort_values(col, ascending=True, inplace=True)
    df.reset_index(drop=True, inplace=True)


def _convert_times(df, first_samp, col="time"):
    """Set initial time to 0 and convert from ms to seconds in place.

    Parameters
    ----------
    df : pandas.DataFrame
        One of the dataframes in the self.dataframes dict.
    first_samp : int
        Timestamp of the first sample of the recording. This should
        be the first sample of the first recording block.
    col : str (default 'time')
        Column name to sort pandas.DataFrame by.

    Notes
    -----
    Each sample in an Eyelink file has a posix timestamp string.
    Subtracts the "first" sample's timestamp from each timestamp.
    The "first" sample is inferred to be the first sample of
    the first recording block, i.e. the first "START" line.
    """
    _sort_by_time(df, col)
    for col in df.columns:
        if col.endswith("time"):  # 'time' and 'end_time' cols
            df[col] -= first_samp
            df[col] /= 1000
        if col in ["duration", "offset"]:
            df[col] /= 1000
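
The arithmetic can be checked on a tiny table. This is a hedged sketch with invented timestamps, assuming pandas is installed; it reassigns columns instead of modifying them in place, but applies the same rule: columns ending in "time" are shifted by `first_samp` and scaled to seconds, while "duration" is only scaled.

```python
import pandas as pd

# Two blink-like events, deliberately out of order (values invented).
blinks = pd.DataFrame(
    {
        "time": [9512881.0, 9511981.0],
        "end_time": [9513081.0, 9512081.0],
        "duration": [200.0, 100.0],
    }
)
first_samp = 9511881  # timestamp (ms) of the first sample of the recording

blinks = blinks.sort_values("time").reset_index(drop=True)
for col in blinks.columns:
    if col.endswith("time"):  # 'time' and 'end_time'
        blinks[col] = (blinks[col] - first_samp) / 1000
    if col in ("duration", "offset"):
        blinks[col] = blinks[col] / 1000
print(blinks)
```

After the conversion the earlier event starts 0.1 s into the recording and lasts 0.1 s.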


def _fill_times(
    df,
    sfreq,
    time_col="time",
):
    """Fill missing timestamps if there are multiple recording blocks.

    Parameters
    ----------
    df : pandas.DataFrame
        dataframe of the eyetracking data samples, BEFORE
        _convert_times() is applied to the dataframe
    sfreq : int | float
        sampling frequency of the data
    time_col : str (default 'time')
        name of column with the timestamps (e.g. 9511881, 9511882, ...)

    Returns
    -------
    %(df_return)s

    Notes
    -----
    After _parse_recording_blocks, files with multiple recording blocks will
    have missing timestamps for the duration of the period between the blocks.
    This would cause the ocular annotations (i.e. blinks) to not line up with
    the signal.
    """
    pd = _check_pandas_installed()

    first, last = df[time_col].iloc[[0, -1]]
    step = 1000 / sfreq
    df[time_col] = df[time_col].astype(float)
    new_times = pd.DataFrame(
        np.arange(first, last + step / 2, step), columns=[time_col]
    )
    return pd.merge_asof(
        new_times, df, on=time_col, direction="nearest", tolerance=step / 10
    )
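
The gap-filling step can be illustrated on a toy recording. This is a sketch assuming NumPy and pandas are installed; the timestamps, the 500 Hz rate, and the `xpos` column are invented. It uses the same two calls as the function above: `np.arange` to build a regular time grid and `pd.merge_asof` to attach existing samples to it.

```python
import numpy as np
import pandas as pd

sfreq = 500  # Hz, i.e. one sample every 2 ms
step = 1000 / sfreq

# Two recording blocks with a gap between 1004 ms and 1010 ms.
df = pd.DataFrame(
    {
        "time": [1000.0, 1002.0, 1004.0, 1010.0, 1012.0],
        "xpos": [1.0, 2.0, 3.0, 4.0, 5.0],
    }
)

first, last = df["time"].iloc[[0, -1]]
new_times = pd.DataFrame(np.arange(first, last + step / 2, step), columns=["time"])
# Rows of the regular grid with no sample within step / 10 get NaN data.
filled = pd.merge_asof(
    new_times, df, on="time", direction="nearest", tolerance=step / 10
)
print(filled)
```

The timestamps missing from the gap (1006 and 1008 ms) now exist as rows whose data columns are NaN, so sample indices map linearly onto time again.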


def _find_overlaps(df, max_time=0.05):
    """Merge left/right eye events with onset/offset diffs less than max_time.

    Parameters
    ----------
    df : pandas.DataFrame
        Pandas DataFrame with ocular events (fixations, saccades, blinks)
    max_time : float (default 0.05)
        Time in seconds. Defaults to .05 (50 ms)

    Returns
    -------
    DataFrame: %(df_return)s
        :class:`pandas.DataFrame` specifying overlapped eye events, if any

    Notes
    -----
    The idea is to cumulative sum the boolean values for rows with onset and
    offset differences (against the previous row) that are greater than the
    max_time. If onset and offset diffs are less than max_time then no_overlap
    will become False. Alternatively, if either the onset or offset diff is
    greater than max_time, no_overlap becomes True. Cumulatively summing over
    these boolean values will leave rows with no_overlap == False unchanged
    and hence with the same group number.
    """
    pd = _check_pandas_installed()

    df = df.copy()
    df["overlap_start"] = df.sort_values("time")["time"].diff().lt(max_time)

    df["overlap_end"] = df["end_time"].diff().abs().lt(max_time)

    df["no_overlap"] = ~(df["overlap_end"] & df["overlap_start"])
    df["group"] = df["no_overlap"].cumsum()

    # now use groupby on 'group'. If one left and one right eye in group
    # the new start/end times are the mean of the two eyes
    ovrlp = pd.concat(
        [
            pd.DataFrame(g[1].drop(columns="eye").mean()).T
            if (len(g[1]) == 2) and (len(g[1].eye.unique()) == 2)
            else g[1]  # not an overlap, return group unchanged
            for g in df.groupby("group")
        ]
    )
    # overlapped events get a "both" value in the "eye" col
    if "eye" in ovrlp.columns:
        ovrlp["eye"] = ovrlp["eye"].fillna("both")
    else:
        ovrlp["eye"] = "both"
    tmp_cols = ["overlap_start", "overlap_end", "no_overlap", "group"]
    return ovrlp.drop(columns=tmp_cols).reset_index(drop=True)
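
The cumulative-sum grouping described in the Notes is easier to follow on a small table. The sketch below, assuming pandas is installed and using invented event times, reproduces only the grouping step and not the merging of grouped rows.

```python
import pandas as pd

max_time = 0.05  # seconds

# Rows 0 and 1 are the same blink seen by both eyes; rows 2 and 3 are
# unrelated single-eye events (all values invented for this example).
events = pd.DataFrame(
    {
        "eye": ["L", "R", "L", "R"],
        "time": [1.00, 1.01, 2.00, 2.50],
        "end_time": [1.20, 1.21, 2.30, 2.80],
    }
)

overlap_start = events["time"].diff().lt(max_time)
overlap_end = events["end_time"].diff().abs().lt(max_time)
# True wherever a row does NOT continue the previous row's event.
no_overlap = ~(overlap_start & overlap_end)
# Each True starts a new group; a False row inherits the previous group number.
events["group"] = no_overlap.cumsum()
print(events)
```

The first row always starts a group because its `diff()` is NaN and NaN is never less than `max_time`. Rows 0 and 1 end up sharing group 1, which is what lets the function above average them into a single "both" event.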


# Used by read_eyelink_calibration


def _find_recording_start(lines):
    """Return the first START line in an SR Research EyeLink ASCII file.