
Add to NoisyChannel a bad_by_psd method to tackle low-frequency artefacts #143

Open · nabilalibou opened this issue Feb 7, 2024 · 8 comments

@nabilalibou (Contributor)

Hey,

I'd like to use the NoisyChannels class in my preprocessing pipeline, and I'd like to complement the detection by looking at which channels have a PSD value at at least one frequency that is significantly different (given a Z-score threshold) from the others.

In terms of code, it would look a bit like this:

import numpy as np
import scipy.stats

def bad_by_psd(data, fmin=0, fmax=np.inf, sd=3):
    # data is an mne.io.Raw; compute the PSD of the EEG channels
    psd = data.compute_psd(fmin=fmin, fmax=fmax, picks="eeg")
    log_psd = 10 * np.log10(psd.get_data())  # shape: (n_channels, n_freqs)
    # z-score across channels, separately for each frequency bin
    zscore_psd = scipy.stats.zscore(log_psd, axis=0)
    # a channel is bad if it deviates by more than `sd` SDs at any frequency
    mask = np.any(zscore_psd > sd, axis=1)
    return mask  # or it could return directly a list of bad channels

It has given me good results, and it complements the deviation and correlation detection quite well.
In essence, it's a little redundant with find_bad_by_hfnoise() and less explicit, but there are a lot of use cases where we are not interested in frequencies above 50 Hz, and bad_by_psd makes it possible to tackle channels contaminated by low-frequency delta noise (like sweaty electrodes) or by gamma noise. See the usage sketch below.
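
For example, restricting the scan to low frequencies (a sketch; the recording path is hypothetical, and any preloaded mne.io.Raw would do):

import mne

# hypothetical recording path; any preloaded mne.io.Raw works here
data = mne.io.read_raw_brainvision("recording.vhdr", preload=True)

# scan only 0-7 Hz to catch channels contaminated by slow drifts or sweat
# artefacts that a >50 Hz noisiness criterion would miss
mask = bad_by_psd(data, fmin=0, fmax=7, sd=3)

# map the boolean mask back to channel names (assumes no channels are
# pre-marked as bad, so the EEG picks line up with the PSD rows)
eeg_names = [data.ch_names[i] for i in mne.pick_types(data.info, meg=False, eeg=True)]
print([name for name, bad in zip(eeg_names, mask) if bad])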

What do you think about a new way to tackle artifacts based on PSD?

@sappelhoff (Owner)

I am a bit hesitant to start including code in this repository that goes beyond the methods included in the original PREP. Perhaps you could open an issue on the repo of the original PREP: https://github.com/VisLab/EEG-Clean-Tools

It would be interesting to see whether the authors originally considered a method like the one you proposed, and why it's not included.

That said, there are minor deviations/improvements from the original PREP that I am happy to accommodate here, as you for example suggested in #142, and as documented here: https://pyprep.readthedocs.io/en/latest/matlab_differences.html

What are your opinions @a-hurst @yjmantilla?

@nabilalibou (Contributor, Author) commented Feb 8, 2024

I understand that ideally this would first be added to the MATLAB version, or that we would at least get answers to our questions. But the official MATLAB PREP repo https://github.com/VisLab/EEG-Clean-Tools looks a bit abandoned (I see no update in more than 2 years, and no issues/PRs answered in 2 years).

I use the NoisyChannels module in my own automated pipeline (I have the impression that this module is used stand-alone by a lot of users) and I would like to contribute to it, but that would depend on what kind of future/scope you envisage for the Python PREP package.
Among these options, which one best matches your vision:

  1. This repository is intended to stick to the original: to be updated when the original is updated, and, if the original is paused, to serve at least as an archive. Permitted modifications would be quality-of-life changes or strict improvements.

  2. The original version seems to be on pause, so the Python PREP pipeline will drift from it and go beyond it as its own version.

  3. The Python version will have "two modes": one that is a copy of the original (+ QoL changes and other deliberate differences), and another, upgraded Python version that will continue to be refined.

  4. Option 1), but another NoisyChannels is created from the original class as a stand-alone package and continues to be improved independently.

@a-hurst (Collaborator) commented Feb 8, 2024

@sappelhoff Hey Stefan, it's been a while! Hope you're doing well 🙂

Haven't touched EEG data in a few years, but I remember having similar discussions on a smaller scale re: matching MATLAB PREP behaviour exactly vs addressing shortcomings of MatPREP when we stumble on them. The solution we settled on then for improving on the original was having a matlab_strict flag for PyPREP that would strictly match the original behaviour even when it seemed to be buggy/undesirable. Adding a whole new pipeline stage goes a bit beyond those minor tweaks though, so it's worth a new discussion.

My take is that since the original PREP seems to be largely unmaintained now, it makes sense for PyPREP to extend and improve upon the original as long as there are contributors willing to do so! These extensions should probably be optional, and the matlab_strict mode should always aim to match the original behaviour as closely as possible, but as with #83 there are some clear logical problems with how the original works. In the past we've tried being good open-source citizens and collaborating with the original devs on issues we found, but if they've moved on with their careers and no longer have time to maintain it (communication was slow and sparse even in 2021), I don't see that as a reason to limit improvements on their methods.
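
For reference, a minimal sketch of how that existing flag is used (assuming a recent pyprep, where NoisyChannels accepts matlab_strict, and where raw is a preloaded mne.io.Raw):

from pyprep import NoisyChannels

# matlab_strict=True reproduces the original MATLAB behaviour as closely as
# possible, quirks included; the default (False) applies pyprep's documented
# deviations/fixes
nd = NoisyChannels(raw, random_state=42, matlab_strict=True)
nd.find_all_bads()
print(nd.get_bads())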

For the sake of maintainability and project scope, I'd say that PRs for new options should be considered as long as:

  • matlab_strict is respected (a CI test should make sure of that)
  • They have documentation and unit test coverage
  • They clearly show how the new option is an improvement over the original pipeline

Does that make sense as a broad strategy?

Also, unrelatedly: I realize I never finished that API rewrite for PyPREP that I started ages ago (PhD + being the equipment/programming guy for multiple labs has kept me quite busy). I've got a bit more software engineering experience under my belt now vs when I started it, so I'll try to make some time to polish it off sometime soon.

@sappelhoff (Owner)

> @sappelhoff Hey Stefan, it's been a while! Hope you're doing well 🙂

Doing great, thanks! Happy to hear from you, and happy to hear about your experiences in the past years!

> Does that make sense as a broad strategy?

Thanks for taking the time to think this through and communicate it so coherently (also thanks to @nabilalibou for your comments in the post before). I agree with the strategy; however, I believe that this is the crucial point:

> They clearly show how the new option is an improvement over the original pipeline

--> this will oftentimes require setting up test scripts with at least two different datasets. It will require some effort. But if somebody is willing to spend that effort and work diligently on it, I would be happy to review and eventually include such a feature in PyPREP.
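
For instance, a minimal sketch of what such a test script could check, with hypothetical expert labels as the ground truth:

def precision_recall(detected, expert_bads):
    # compare a detector's output against expert "ground truth" labels
    detected, expert_bads = set(detected), set(expert_bads)
    true_positives = len(detected & expert_bads)
    precision = true_positives / len(detected) if detected else 1.0
    recall = true_positives / len(expert_bads) if expert_bads else 1.0
    return precision, recall

# hypothetical usage, repeated over at least two annotated datasets:
# p, r = precision_recall(nd.get_bads(), expert_labels["sub-01"])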

@sappelhoff (Owner)

> so I'll try to make some time to polish it off sometime soon.

sounds great :-)

@nabilalibou (Contributor, Author)

> --> this will oftentimes require setting up test scripts with at least two different datasets. It will require some effort. But if somebody is willing to spend that effort and work diligently on it, I would be happy to review and eventually include such a feature in PyPREP.

In my opinion, for a feature such as a new way of detecting bad channels, you would need a certain amount of EEG data to test on, plus the ground truth (channels labelled good or bad by experts).
In addition to the matlab_strict flag strategy, we could make some features accessible via the API without them being called by default when launching the PREP pipeline. Like an alpha status before the feature is properly tested in terms of pure performance. What do you think?
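
Concretely, something like this sketch, where find_bad_by_psd stands in as the hypothetical opt-in method:

from pyprep import NoisyChannels

nd = NoisyChannels(raw)
nd.find_all_bads()  # runs only the established PREP detectors by default

# hypothetical alpha feature: available via the API, but only on explicit opt-in
nd.find_bad_by_psd(fmin=0, fmax=7, sd=3)
print(nd.get_bads())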

@sappelhoff (Owner)

> you would need a certain amount of EEG data to test on, plus the ground truth (channels labelled good or bad by experts).

agreed

> In addition to the matlab_strict flag strategy, we could make some features accessible via the API without them being called by default when launching the PREP pipeline.

agreed, as long as we then also document their usage in dedicated examples that showcase their usefulness

> Like an alpha status before the feature is properly tested in terms of pure performance. What do you think?

sounds good to me. If you want to take this on, I am willing to review and eventually include it in pyprep if it meets the standards discussed above. One important note, though: I probably won't be able to commit a lot of time to this, so some patience will be required along the way 🙂

@nabilalibou (Contributor, Author)

Ok then here we go :)

@nabilalibou mentioned this issue Apr 14, 2024