Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] FIX: make all annotate_* funcs return annots with consistent orig_time #10067

Merged
merged 7 commits into from Dec 8, 2021

Conversation

sappelhoff
Copy link
Member

@sappelhoff sappelhoff commented Dec 1, 2021

closes #10060

Now all annotate_* functions will return annotations with a consisten orig_time (first thought None would be a good idea, ... now changing it to raw.info["meas_date"] for all)

might be worth including in 0.24.1, cc @larsoner

Copy link
Member Author

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, seems to be a little more tricky than expected ... and I find the docs on how annots.orig_time and raw.meas_date interact a bit hard to understand (the "figure" in the docstr)

If anyone knows what's going on with these two failing tests, I'd be happy about pointers:

  • pytest mne/preprocessing/tests/test_maxwell.py::test_find_bads_maxwell_flat
  • pytest mne/preprocessing/tests/test_flat.py::test_annotate_flat

@hoechenberger
Copy link
Member

hoechenberger commented Dec 1, 2021

and I find the docs on how annots.orig_time and raw.meas_date interact a bit hard to understand

+100
I never understood this stuff, and I've read it many times before

Also not sure if ASCII art is really the best choice here

@hoechenberger
Copy link
Member

If anyone knows what's going on with these two failing tests, I'd be happy about pointers:

The problem occurs when the annotations are concatenated in the test:

        raw_time.set_annotations(raw_time.annotations + annot)
> annot.onset
array([10.])
> raw_time.annotations.onset
array([], dtype=float64)

This produces an annotation with onset at 10 seconds. However:

> raw.times.max()
0.999

Now during set_annotations, we do the following:

                new_annotations.crop(0, self.times[-1] + delta,
                                     emit_warning=emit_warning)
                new_annotations.onset += self._first_time

So we crop away the annotation at 10 seconds

To make it work, I think we'd need to do something like this instead (not tested)

                new_annotations.onset -= self._first_time
                new_annotations.crop(0, self.times[-1] + delta,
                                     emit_warning=emit_warning)

But this will probably break other use cases?

@agramfort
Copy link
Member

personally for me this is a regression.

these annotations can contain thinks like sleep spindles, eye movements and you want to know
at what time of the day this occurred.

I would rather use meas_date for all annotations

@sappelhoff
Copy link
Member Author

I would rather use meas_date for all annotations

I guess that would also be fine. My problem was arising when I wanted to concatenate annots_muscle_zscore and annots_flat as generated by their respective functions:

ValueError: orig_time should be the same to add/concatenate 2 annotations (got None != 1912-06-22 14:01:10.497486+00:00)

both of the annots were created on the same Raw, so if all annotate_* funcs in MNE would use meas_date, the problem would also be solved.

It's just that the current inconsistency is suboptimal.

@sappelhoff sappelhoff changed the title FIX: make all annotate_* funcs return annots with orig_time=None FIX: make all annotate_* funcs return annots with consistent orig_time Dec 2, 2021
Copy link
Member

@agramfort agramfort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

+1 for MRG if CIs are green

thx @sappelhoff

@hoechenberger
Copy link
Member

I like this and I think this should be backported too

@sappelhoff
Copy link
Member Author

I'll push a tiny commit with tests after CIs come back green, then this is ready from my side as well.

@sappelhoff sappelhoff changed the title FIX: make all annotate_* funcs return annots with consistent orig_time [MRG] FIX: make all annotate_* funcs return annots with consistent orig_time Dec 2, 2021
Copy link
Member Author

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good that I added the more thorough approach ... test_movement_annotation_head_correction is actually failing ...

@larsoner
Copy link
Member

larsoner commented Dec 2, 2021

IIRC the plan is to release 1.0 in January, so another point release is unlikely before then, so not sure we need to backport. I guess we could mark things as candidates but not bother actually backporting until we decide we want to do another point release.

@sappelhoff
Copy link
Member Author

sappelhoff commented Dec 2, 2021

So with this PR, all mne.Annotations returned by mne.preprocessing.annotate_* functions will have raw.info["meas_date"] as their orig_time (from the raw passed to the annotate func).

For one test, this leads to errors (pytest mne/preprocessing/tests/test_artifact_detection.py::test_movement_annotation_head_correction) at this point:

raw.set_annotations(annot_rot + annot_tra + annot_dis)

mne/preprocessing/tests/test_artifact_detection.py:43: in test_movement_annotation_head_correction
raw.set_annotations(annot_rot + annot_tra + annot_dis)
:24: in set_annotations
???
mne/io/base.py:698: in set_annotations
new_annotations.crop(tmin=tmin, tmax=tmax,
:24: in crop
???
mne/annotations.py:567: in crop
warn('Omitted %s annotation(s) that were outside data'
mne/utils/_logging.py:365: in warn
warnings.warn_explicit(
E RuntimeWarning: Omitted 4 annotation(s) that were outside data range.

I think the reason why this fails is the interaction when calling raw.set_annotations between orig_time/meas_date and raw.first_samp ... it's just not obvious to me why, and what the right fix would be.

When going into the code with --pdb, I can see the following (which might help one of you to help me 😉 ):

(Pdb) annot_all = annot_rot + annot_tra + annot_dis
(Pdb) annot_all.onset+annot_all.duration
array([ 6.  ,  7.96,  8.34,  8.25, 11.  , 13.  , 13.49, 16.07])
(Pdb) raw
<Raw | test_move_anon_raw.fif, 392 x 20400 (17.0 s), ~68.7 MB, data loaded>
(Pdb) raw.first_samp
10800
(Pdb) raw.info["sfreq"]
1200.0

Basically:

  • raw is 17s long
  • annotations are extending to max 16.07, so should all fit
  • however, the "first_samp" is at 10800 (1200 sfreq), so at about 9s, which is why the last 4 annots must be ommitted (which is what we see in the error) edit: actually, the first 4 annots are being ommitted, see the set_annotations code:

mne-python/mne/io/base.py

Lines 691 to 704 in 963eea2

if annotations.orig_time is None:
new_annotations.crop(0, self.times[-1] + delta,
emit_warning=emit_warning)
new_annotations.onset += self._first_time
else:
tmin = meas_date + timedelta(0, self._first_time)
tmax = tmin + timedelta(seconds=self.times[-1] + delta)
new_annotations.crop(tmin=tmin, tmax=tmax,
emit_warning=emit_warning)
new_annotations.onset -= (
meas_date - new_annotations.orig_time).total_seconds()
new_annotations._orig_time = meas_date
self._annotations = new_annotations

  • when orig_time=None (prior to this PR), the annotation onsets are adjusted by raw.first_samp (raw._first_time is just first_samp converted to secs)
  • however, when orig_time=meas_date (this PR), first_samp is not taken into account 😕

this leads to:

(Pdb) pass  # drop orig_time
(Pdb) annot_all2 = mne.Annotations(annot_all.onset, annot_all.duration, annot_all.description)
(Pdb) pass  # annot onsets and durations are still the same between both versions
(Pdb) annot_all2.onset+annot_all2.duration
array([ 6.  ,  7.96,  8.34,  8.25, 11.  , 13.  , 13.49, 16.07])
(Pdb) annot_all.onset+annot_all.duration
array([ 6.  ,  7.96,  8.34,  8.25, 11.  , 13.  , 13.49, 16.07])
(Pdb) pass  # but setting to raw only works when orig_time=None
(Pdb) raw.set_annotations(annot_all2)
<Raw | test_move_anon_raw.fif, 392 x 20400 (17.0 s), ~68.7 MB, data loaded>
(Pdb) pass  # check annots attached to raw
(Pdb) raw.annotations
<Annotations | 8 segments: BAD_mov_dist (1), BAD_mov_rotat_vel (5), ...>
(Pdb) pass  # when taking first_samp into account, we see that the data is not from 0-17s, but from 9-26s (or similar)
(Pdb) raw.annotations.onset+raw.annotations.duration
array([15.  , 16.96, 17.34, 17.25, 20.  , 22.  , 22.49, 25.07])
(Pdb) pass  # 4 annots are <9s, so these 4 first annots are ommitted
(Pdb) pass  # when setting them to raw and orig_time is NOT None
(Pdb) raw._first_time
9.0

@sappelhoff
Copy link
Member Author

@larsoner do you have an idea how to resolve this?

@larsoner
Copy link
Member

larsoner commented Dec 3, 2021

I'll need to dig into the code a bit :(

Hopefully I can look early next week

doc/changes/latest.inc Outdated Show resolved Hide resolved
@larsoner
Copy link
Member

larsoner commented Dec 6, 2021

Okay, I'll take a look hopefully after that then

@sappelhoff
Copy link
Member Author

Okay, I'll take a look hopefully after that then

Thanks! Should be ready now. My analysis from #10067 (comment) is still where I am stuck.

Comment on lines -165 to +169
hp_ts -= raw.first_samp / sfreq
orig_time = raw.info['meas_date']
if orig_time is None:
hp_ts -= raw.first_samp / sfreq
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sappelhoff this was the key change. We only need to shift the head_pos time reference by first_samp if times are absolute / orig_time is not None

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahhhh yes, makes sense! Thanks for finding that!

I just think you are missing a "not" in your sentence? I.e., we only need to shift if times are NOT absolute / orig_time is None?

@agramfort
Copy link
Member

thx @larsoner 🙏

@sappelhoff merge if happy

Copy link
Member Author

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your help @larsoner

@sappelhoff sappelhoff merged commit 151b8a6 into mne-tools:main Dec 8, 2021
@sappelhoff sappelhoff deleted the fix/annots/orig_time branch December 8, 2021 22:03
@larsoner
Copy link
Member

larsoner commented Dec 9, 2021

@sappelhoff looks like this broke an example:

Traceback (most recent call last):
  File "/home/circleci/project/examples/preprocessing/muscle_detection.py", line 81, in <module>
    raw.set_annotations(annot_muscle)
  File "<decorator-gen-201>", line 24, in set_annotations
  File "/home/circleci/project/mne/io/base.py", line 698, in set_annotations
    new_annotations.crop(tmin=tmin, tmax=tmax,
  File "<decorator-gen-93>", line 24, in crop
  File "/home/circleci/project/mne/annotations.py", line 567, in crop
    warn('Omitted %s annotation(s) that were outside data'
  File "/home/circleci/project/mne/utils/_logging.py", line 365, in warn
    warnings.warn_explicit(
RuntimeWarning: Omitted 1 annotation(s) that were outside data range.

Can you look?

@sappelhoff
Copy link
Member Author

mh, weird. running it locally on up to date main via python examples/preprocessing/muscle_detection.py works without errors.

@sappelhoff
Copy link
Member Author

can you confirm that @hoechenberger ?

@hoechenberger
Copy link
Member

can you confirm that @hoechenberger ?

/Users/hoechenberger/Development/mne-python/examples/preprocessing/muscle_detection.py:81: RuntimeWarning: Omitted 1 annotation(s) that were outside data range.
  raw.set_annotations(annot_muscle)
❯ mne sys_info
Platform:       macOS-12.0.1-x86_64-i386-64bit
Python:         3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 20:33:18)  [Clang 11.1.0 ]
Executable:     /Users/hoechenberger/miniforge3/envs/mne/bin/python3.9
CPU:            i386: 8 cores
Memory:         16.0 GB

mne:            1.0.dev0
numpy:          1.21.4 {blas=NO_ATLAS_INFO, lapack=lapack}
scipy:          1.7.3
matplotlib:     3.5.0 {backend=MacOSX}

sklearn:        1.0.1
numba:          0.53.1
nibabel:        3.2.1
nilearn:        0.8.1
dipy:           1.4.1
cupy:           Not found
pandas:         1.3.4
pyvista:        0.32.1 {OpenGL 4.1 INTEL-18.2.10 via Intel(R) Iris(TM) Plus Graphics OpenGL Engine}
pyvistaqt:      0.5.0
ipyvtklink:     0.2.1
vtk:            9.0.3
PyQt5:          5.12.3
ipympl:         0.8.2
mne_qt_browser: 0.1.7
pooch:          v1.5.2

@sappelhoff
Copy link
Member Author

sappelhoff commented Dec 9, 2021

ok, thanks. got it.

Some first_samp problem :|

>>> annot_muscle
<Annotations | 1 segment: BAD_muscle (1)>
>>> annot_muscle.onset
array([15.17])
>>> annot_muscle.duration
array([1.09666667])
>>> annot_muscle.description
array(['BAD_muscle'], dtype='<U10')
>>> annot_muscle.orig_time
datetime.datetime(2013, 12, 18, 9, 43, tzinfo=datetime.timezone.utc)
>>> raw.info["meas_date"]
datetime.datetime(2013, 12, 18, 9, 43, tzinfo=datetime.timezone.utc)
>>> raw.times
array([0.00000000e+00, 3.33333333e-03, 6.66666667e-03, ...,
       2.99900000e+01, 2.99933333e+01, 2.99966667e+01])
>>> raw.times.max()
29.996666666666666
>>> raw.first_samp
39000
>>> raw

@hoechenberger
Copy link
Member

Some first_samp problem :|

Ohhh yes, our all-time favorite…

@sappelhoff
Copy link
Member Author

It seems like this can be solved like this.

Old:

annot = _annotations_from_mask(raw_copy.times, art_mask, 'BAD_muscle',
orig_time=raw.info['meas_date'])

New:

    annot = _annotations_from_mask(raw_copy.times+raw._first_time, art_mask, 'BAD_muscle',
                                   orig_time=raw.info['meas_date'])

I suspect that this is an issue in all funcs that use _annotations_from_mask, and our tests just didn't catch this. Lucky that an example warned us.

Don't know if this is the right fix 🤷

@sappelhoff
Copy link
Member Author

I'll start a new PR, adding tests that fail .. and will then push commits to fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

inconsistent behavior: annotate_* funcs return annotations with different orig_time (None, or set)
4 participants