Make noisy channel exclusion during Reference compatible with MATLAB PREP #93

a-hurst · 2021-06-24T03:00:57Z

PR Description

Closes #91. This PR does three main things:

Changes Reference.robust_reference to stop permanently excluding channels that are flagged as bad-by-SNR prior to initial re-referencing.
Changes Reference.robust_reference to exclude bad-by-dropout channels from the full set of bad channels whenmatlab_strict is True, matching MatPREP's (likely accidental) behaviour.
Adds an as_dict argument to NoisyChannels.get_bads that returns bad channels by type as a dictionary in order to facilitate the above changes. In addition to being generally useful, it also saves a lot of dict construction in reference.py.

I still need to update the whats_new.rst and matlab_differences.rst here, but I figured given the time-zone difference that it would be better to get the code up for review before I went to bed. Thanks in advance for checking this over!

Merge Checklist

the PR has been reviewed and all comments are resolved
all CI checks pass
(if applicable): the PR description includes the phrase closes #<issue-number> to automatically close an issue
(if applicable): bug fixes, new features, or API changes are documented in whats_new.rst

codecov-commenter · 2021-06-24T03:09:07Z

Codecov Report

Merging #93 (92a10e7) into master (4614ca0) will increase coverage by 0.31%.
The diff coverage is 97.29%.

@@            Coverage Diff             @@
##           master      #93      +/-   ##
==========================================
+ Coverage   98.51%   98.83%   +0.31%     
==========================================
  Files           7        7              
  Lines         675      687      +12     
==========================================
+ Hits          665      679      +14     
+ Misses         10        8       -2

Impacted Files	Coverage Δ
pyprep/reference.py	`98.31% <96.00%> (+1.76%)`	⬆️
pyprep/find_noisy_channels.py	`100.00% <100.00%> (ø)`
pyprep/prep_pipeline.py	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4614ca0...92a10e7. Read the comment docs.

sappelhoff

LGTM! Only minor comments for docs

pyprep/reference.py

a-hurst · 2021-06-26T01:43:03Z

@sappelhoff Okay, I've updated the docs and whats-new, and have addressed your comments. A couple questions before this gets merged:

Any thoughts on whether there should be an API for configuring max_iterations at the PrepPipeline level? There's currently no API for configuring NoisyChannels thresholds from PrepPipeline either, so I wasn't sure what the expectation was for Reference settings. This might be a broader API question that goes beyond the scope of this PR.
While writing out the matlab_differences explanation, I realized two weird things about the current logic we might want to change:
- With the current changes, channels flagged as bad-by-SNR before initial average referencing are no longer permanently flagged as "bad" after initial average referencing, the same as bad-by-deviation/noise/correlation/dropout/RANSAC channels. However, because they're included as part of the self.unusable_channels set, they're still always excluded from the calculated average reference signal! This means that initial bad-by-SNR channels are considered good enough to use when interpolating flagged bad channels, but not good enough to include under any circumstances (even if interpolated) in the average reference. This seems wrong, but is exactly what MATLAB PREP is doing. Any thoughts on whether we want to change this?
- Currently, bad-by-dropout channels are flagged and removed if detected after initial average referencing (provided matlab_strict is False). However, if the signal isn't average-referenced to begin with, wouldn't the initial average referencing make any intermittent flat regions in the data no longer flat? As such, wouldn't we want bad-by-dropout channels to be flagged as unusable prior to initial referencing, the same as with bad-by-NaN and bad-by-flat channels?

sappelhoff · 2021-06-26T06:46:45Z

Any thoughts on whether there should be an API for configuring max_iterations at the PrepPipeline level? There's currently no API for configuring NoisyChannels thresholds from PrepPipeline either, so I wasn't sure what the expectation was for Reference settings

I think that if a parameter is configurable in MatPREP, and it's reasonably easy to make it configurable in PyPrep, then we SHOULD make it possible. These params should always have a reasonable default (like in MatPrep) though. On the other hand, let's not expose more params than MatPrep does (for now), unless we judge it very very important.

This means that initial bad-by-SNR channels are considered good enough to use when interpolating flagged bad channels, but not good enough to include under any circumstances (even if interpolated) in the average reference

🤔 yeah, that sounds off. +1 for an issue about this 1. on pyprep and 2. on matprep. The one for pyprep can be marked to be resolved in version 0.5 --> version 0.4 being targeted more at feature completeness with matprep. If nobody ever reacts to this on matprep, we can do our own, non-matprep-strict thing in pyprep starting with 0.5.

However, if the signal isn't average-referenced to begin with, wouldn't the initial average referencing make any intermittent flat regions in the data no longer flat? As such, wouldn't we want bad-by-dropout channels to be flagged as unusable prior to initial referencing, the same as with bad-by-NaN and bad-by-flat channels?

Yes, that's true, good point. What does matprep do here?

sappelhoff

there are three lines in this diff that are not being covered --> is it possible to reasonably do so with little changes/effort? If not, no problem.

docs/matlab_differences.rst

a-hurst · 2021-06-26T15:40:20Z

I think that if a parameter is configurable in MatPREP, and it's reasonably easy to make it configurable in PyPrep, then we SHOULD make it possible.

Okay, I'll add it to the params dict for now!

+1 for an issue about this 1. on pyprep and 2. on matprep. The one for pyprep can be marked to be resolved in version 0.5 --> version 0.4 being targeted more at feature completeness with matprep. If nobody ever reacts to this on matprep, we can do our own, non-matprep-strict thing in pyprep starting with 0.5.

Sounds good, will do! Maybe this is a place for some experimentation, but my gut instinct would be to default to only marking initial bad-by-flat and bad-by-NaN channels as fully "unusable". I'd imagine that the original intent was to removing channels that contribute more noise than signal from contaminating the average reference, but given how sensitive the 'bad-by-correlation' default settings are (if over 1% of all correlation windows are below threshold, the whole channel is considered bad) and how that can change after average referencing, I don't think initial bad-by-SNR channels should be treated as permanently bad. Anyway, that can all be discussed in the new issue.

Yes, that's true, good point. What does matprep do here?

It just ignores dropout channels completely during referencing: only bad-by-flat/NaN/SNR channels are permanently flagged and removed from the average reference at the onset, and due to the likely bug we've reported already they won't get identified or excluded on subsequent iterations either.

there are three lines in this diff that are not being covered --> is it possible to reasonably do so with little changes/effort? If not, no problem.

The matlab_strict line will get covered in the EEGLAB interpolation PR when I can add the MatPREP comparison tests, but I'll try adding a test case to cover the other main ones!

a-hurst · 2021-06-26T20:31:26Z

Okay, I've made codecov a bit happier now: I covered a line that only gets run if the input signal is completely clean, fixed a bug that prevented a ValueError from getting covered (it was throwing a ValueError earlier, so the test passed but the coverage failed), and worked around a known coverage.py bug where a break immediately after an if statement doesn't get counted as being run.

sappelhoff · 2021-06-27T08:15:40Z

Thanks @a-hurst

…PREP (#93) * Add "as_dict" option for get_bads * Don't permanently exclude initial bad-by-SNRs * Match PREP's noisy channel updating logic * Minor variable name cleanup * use as_dict throughout reference.py * Fix 'ignore' logic * Add back "bad_all" (whoops) * Update matlab_differences.rst * Add max_iterations args, link to dropout issue * Fix diffs mistake, update whats_new.rst * Add PrepPipeline API for max_iterations * Improve test coverage for Reference * Add whats_new entry for SNR changes * Fix quotes to make black happy * Improve Reference test coverage some more * remove unused import

a-hurst added 7 commits June 23, 2021 18:08

Add "as_dict" option for get_bads

b8447d0

Don't permanently exclude initial bad-by-SNRs

6964471

Match PREP's noisy channel updating logic

1d5d32e

Minor variable name cleanup

f1b1232

use as_dict throughout reference.py

7613199

Fix 'ignore' logic

d0121b8

Add back "bad_all" (whoops)

d9b8adf

sappelhoff added this to the 0.4.0 milestone Jun 24, 2021

sappelhoff reviewed Jun 24, 2021

View reviewed changes

pyprep/reference.py Show resolved Hide resolved

pyprep/reference.py Outdated Show resolved Hide resolved

pyprep/reference.py Show resolved Hide resolved

a-hurst added 3 commits June 25, 2021 21:29

Update matlab_differences.rst

9e580a6

Add max_iterations args, link to dropout issue

0888ae8

Fix diffs mistake, update whats_new.rst

9f5f093

sappelhoff reviewed Jun 26, 2021

View reviewed changes

docs/matlab_differences.rst Outdated Show resolved Hide resolved

a-hurst added 6 commits June 26, 2021 14:32

Add PrepPipeline API for max_iterations

93ffb82

Improve test coverage for Reference

b8b280c

Add whats_new entry for SNR changes

6e6a208

Fix quotes to make black happy

cc33439

Improve Reference test coverage some more

0c9b2a9

remove unused import

92a10e7

sappelhoff mentioned this pull request Jun 27, 2021

Weird handling of initial bad-by-SNR channels during robust referencing #94

Open

sappelhoff approved these changes Jun 27, 2021

View reviewed changes

sappelhoff merged commit 3625b26 into sappelhoff:master Jun 27, 2021

a-hurst mentioned this pull request Jun 27, 2021

Numeric equivalence with MATLAB PREP #58

Closed

23 tasks

a-hurst mentioned this pull request Jun 27, 2021

Bad-by-dropout channels sometimes ignored during Reference #97

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make noisy channel exclusion during Reference compatible with MATLAB PREP #93

Make noisy channel exclusion during Reference compatible with MATLAB PREP #93

a-hurst commented Jun 24, 2021 •

edited by sappelhoff

codecov-commenter commented Jun 24, 2021 •

edited

sappelhoff left a comment

a-hurst commented Jun 26, 2021 •

edited

sappelhoff commented Jun 26, 2021

sappelhoff left a comment

a-hurst commented Jun 26, 2021

a-hurst commented Jun 26, 2021

sappelhoff commented Jun 27, 2021

Make noisy channel exclusion during Reference compatible with MATLAB PREP #93

Make noisy channel exclusion during Reference compatible with MATLAB PREP #93

Conversation

a-hurst commented Jun 24, 2021 • edited by sappelhoff

PR Description

Merge Checklist

codecov-commenter commented Jun 24, 2021 • edited

Codecov Report

sappelhoff left a comment

Choose a reason for hiding this comment

a-hurst commented Jun 26, 2021 • edited

sappelhoff commented Jun 26, 2021

sappelhoff left a comment

Choose a reason for hiding this comment

a-hurst commented Jun 26, 2021

a-hurst commented Jun 26, 2021

sappelhoff commented Jun 27, 2021

a-hurst commented Jun 24, 2021 •

edited by sappelhoff

codecov-commenter commented Jun 24, 2021 •

edited

a-hurst commented Jun 26, 2021 •

edited