Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make removeTrend output identical to MATLAB PREP #71

Merged
merged 11 commits into from Apr 29, 2021

Conversation

a-hurst
Copy link
Collaborator

@a-hurst a-hurst commented Apr 26, 2021

PR Description

Closes #55.

Basically, MNE and EEGLAB calculate FIR filter parameters slightly differently (specifically, the length of the filter). To bypass this, I've added a new function that calculates the parameter values the same way that EEGLAB does and uses SciPy's signal.firwin to turn them into a MatPREP-identical filter. Then, I use the same internal method MNE's filter.filter_data does to apply the filter to the signal, producing identical output to MatPREP's removeTrend.

Merge Checklist

  • the PR has been reviewed and all comments are resolved
  • all CI checks pass
  • (if applicable): the PR description includes the phrase closes #<issue-number> to automatically close an issue
  • (if applicable): bug fixes, new features, or API changes are documented in whats_new.rst

@a-hurst
Copy link
Collaborator Author

a-hurst commented Apr 26, 2021

Whoops, looks like I broke something by accident. Serves me right for being too lazy to run unit tests locally before pushing.

@sappelhoff sappelhoff added this to the 0.4.0 milestone Apr 26, 2021
@a-hurst a-hurst marked this pull request as draft April 26, 2021 20:13
@a-hurst
Copy link
Collaborator Author

a-hurst commented Apr 26, 2021

Okay, so I may have jumped the gun a bit: I was looking at the wrong values in MATLAB (the pre-filtered ones) and my FIR filter was also broken (missing the big spike in the middle), so with the two issues combined it looked like it was working.

I've fixed the custom filter itself so it's actually identical to MATLAB's now, but MNE's filter applying function still produces values from EEGLAB's so I'm giving this a rest for now. The actual filter application code in MATLAB looks reasonably simple, so maybe on a future day I'll try adapting that too.

@sappelhoff
Copy link
Owner

If something is unclear from the MNE side, you can always pose a question on https://mne.discourse.group :-)

@a-hurst
Copy link
Collaborator Author

a-hurst commented Apr 28, 2021

@sappelhoff So it absolutely wasn't worth the total time and effort, but sunk cost fallacy kept me going until the end: in the latest commit I've implemented the equivalent of MATLAB's firfilt in Python using SciPy/NumPy.

Now NoisyChannels gives me identical RANSAC correlations regardless of whether I use a .set that's been pre-highpassed in MATLAB (what I've been doing so far for tests) or I do the high-passing in PyPREP with removeTrend. On the flip side, this method is also about 4x slower than the previous one (~8 seconds vs. ~2 seconds for a 64-channel, 694016-sample file), and I really don't know much about filter theory or signal processing so MNE's filtering could be technically superior at trend-removal than the code here.

My current thinking is that maybe this should just be wrapped in matlab_strict unless there's also a technical reason to prefer this style of filtering over the old one? Below I've included a comparison of the RANSAC correlations before and after this PR to illustrate how much of a difference this all makes. Let me know what you think!

Comparisons of RANSAC correlations between removeTrend methods (click to unhide)

Current removeTrend method (MNE's filter_data):

array([[0.89438387, 0.82456243, 0.91580898, ..., 0.98728151, 0.95233041,
        0.94015163],
       [0.91034277, 0.9600703 , 0.64722385, ..., 0.97487885, 0.83335142,
        0.97426197],
       [0.81404095, 0.80873159, 0.80359327, ..., 0.96677663, 0.95435911,
        0.88024998],
       ...,
       [0.8676419 , 0.89621802, 0.8646553 , ..., 0.99084073, 0.96557544,
        0.98570571],
       [0.98061229, 0.95732037, 0.93206238, ..., 0.99242296, 0.97696497,
        0.9704131 ],
       [0.95575104, 0.84842815, 0.7716721 , ..., 0.99250605, 0.95400638,
        0.95062305]])

_eeglab_create_highpass + _eeglab_fir_filter (same as values when using pre-highpassed .set from MatPREP & commenting out PyPREP's removeTrend):

array([[0.89682968, 0.82714556, 0.91665897, ..., 0.98672507, 0.95110975,
        0.94141531],
       [0.91036137, 0.96005206, 0.64718089, ..., 0.97489243, 0.8333695 ,
        0.97430253],
       [0.81397999, 0.80860207, 0.80386369, ..., 0.96684684, 0.95454648,
        0.8802559 ],
       ...,
       [0.867604  , 0.89619309, 0.86480001, ..., 0.99082554, 0.9655723 ,
        0.98570184],
       [0.98061141, 0.95732534, 0.93203051, ..., 0.99242111, 0.97695023,
        0.97039977],
       [0.95548201, 0.8487357 , 0.77087238, ..., 0.99259436, 0.95389967,
        0.95089473]])

@a-hurst a-hurst marked this pull request as ready for review April 28, 2021 00:43
@sappelhoff
Copy link
Owner

My current thinking is that maybe this should just be wrapped in matlab_strict

yes! Let's do that and keep the mne detrending as a default 👍

the differences are quite tiny but it's good you got it to align perfectly --> yet as you say, we don't know how "good" and/or "robust" your code is ... at least relative to MNE's filtering implementation that is tested every minute by lots of users.

@a-hurst
Copy link
Collaborator Author

a-hurst commented Apr 28, 2021

If you know anyone who knows their stuff re: filtering and signal processing theory, it'd be interesting to hear their take on the functional difference between the EEGLAB and MNE methods. Since we're wrapping this in matlab_strict (and thus making it another "deliberate difference"), it'd be nice to know & be able to explain on the 'differences' RST page if MNE's approach is technically/theoretically better than EEGLAB's (beyond the obvious maintenance/speed concerns we've already discussed).

@a-hurst
Copy link
Collaborator Author

a-hurst commented Apr 28, 2021

@sappelhoff Okay, think I've got this sorted now! In addition to wrapping this in matlab_strict and documenting the difference in the deliberate differences page, I also cleaned up the docstrings for removeTrend a bit and changed sample_rate from an optional argument with a default to a required one since that seemed to make more sense.

If either of us learn any more about the costs/benefits of MNE's filtering algorithm vs. EEGLAB's, I'll update the deliberate differences section accordingly.

Copy link
Owner

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, only some tiny clarifications needed if possible

docs/matlab_differences.rst Outdated Show resolved Hide resolved
pyprep/removeTrend.py Outdated Show resolved Hide resolved
pyprep/utils.py Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

Codecov Report

Merging #71 (b62e4cf) into master (0fa6ac4) will increase coverage by 0.17%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #71      +/-   ##
==========================================
+ Coverage   97.15%   97.33%   +0.17%     
==========================================
  Files           7        7              
  Lines         597      637      +40     
==========================================
+ Hits          580      620      +40     
  Misses         17       17              
Impacted Files Coverage Δ
pyprep/find_noisy_channels.py 96.53% <ø> (ø)
pyprep/prep_pipeline.py 100.00% <100.00%> (ø)
pyprep/reference.py 96.55% <100.00%> (ø)
pyprep/removeTrend.py 95.08% <100.00%> (+0.43%) ⬆️
pyprep/utils.py 99.06% <100.00%> (+0.45%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0fa6ac4...b62e4cf. Read the comment docs.

Copy link
Owner

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks a bunch!

@sappelhoff sappelhoff merged commit fa4bad3 into sappelhoff:master Apr 29, 2021
@a-hurst a-hurst mentioned this pull request Apr 29, 2021
23 tasks
@a-hurst
Copy link
Collaborator Author

a-hurst commented Nov 18, 2021

@sappelhoff So I'm currently taking a grad course on signal processing, and I think I finally understand the key difference between MNE's highpass filter and EEGLAB's: MNE defaults to applying the filter using a zero-phase method that doesn't result in any phase shift in the signal (the equivalent of filtfilt in MATLAB/Scipy), whereas the EEGLAB filter function I re-implemented here only does a single forward pass (using lfilter), so you end up with phase shift in the signal.

As far as I can tell, other than theoretically being faster (which it isn't in the current implementation) there's absolutely no reason to use a forwards-only filter like EEGLAB does instead of the backwards-forwards zero-phase approach used by MNE. I'm guessing this is a weird legacy carry-over for EEGLAB from the days when people were running EEG analyses on Pentium 3's and PowerPC G4's so CPU cycles were at a major premium. I'll update the "differences" docs to reflect this when I get a chance!

@sappelhoff
Copy link
Owner

whereas the EEGLAB filter function I re-implemented here only does a single forward pass (using lfilter), so you end up with phase shift in the signal.

😱 gosh ...

I'm guessing this is a weird legacy carry-over for EEGLAB from the days when people were running EEG analyses on Pentium 3's and PowerPC G4's so CPU cycles were at a major premium

you are probably right, EEGLAB has been around for quite some time!

great that you remembered this issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

removeTrend produces different values in PyPREP than MATLAB PREP
3 participants