Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

detrend_correction #9

Merged
merged 32 commits into from Apr 8, 2020
Merged

Conversation

yjmantilla
Copy link
Collaborator

@yjmantilla yjmantilla commented Mar 28, 2020

This pull request fixes #8 thus closes #8

Regarding Detrends

In find_noisy_channels:

On lines 34-35 of the current version a detrend is done on a trended copy of the raw data as follows:

self.EEGData = self.raw_mne.get_data(picks="eeg")
self.EEGData = removeTrend(self.EEGData, sample_rate=self.sample_rate)

Now on lines 153-154 of the function find_bad_by_nan_flat we have:

self.raw_mne.drop_channels(list(set(self.bad_by_nan + self.bad_by_flat)))
self.EEGData = self.raw_mne.get_data(picks="eeg")

This replaces the EEGData with a trended version again (the idea was to drop the unusable channels I think).

This replacement with a trended version destabilizes the ransac quite a lot.

Furthermore in reference.py:

Through lines 131-135 a detrend is done then NoisyChannels (find_noisy_channels main class) is used, this means the detrend is applied twice.

raw._data = removeTrend(raw.get_data(), sample_rate=self.sfreq)
noisy_detector = NoisyChannels(raw)
noisy_detector.find_all_bads(ransac=self.ransac)

Because of this I decided tu put a boolean to indicate whether to do the detrend internally or assume the signal is already detrended.

Regarding correlation criteria:

In find_noisy_channels frac_bad is initialized with 0.1 as follows:

def find_bad_by_correlation(self, correlation_secs=1.0, correlation_threshold=0.4, frac_bad=0.1):

In the original PREP this values was at 0.01, now I understand this change was made because as they say in the exampe:

we keep all the default parameter settings as described in PREP paper except one, the fraction of bad time windows (we change it from 0.01 to 0.1), because the EEG data is 60s long, which means it gots only 60 time windows. We think the algorithm will be too sensitive if using the default setting.

While this makes sense for the example the default should be the one of the original PREP. More so it would be necessary to make the frac_bad an input parameter to find_noisy_channels to cover cases similar to the example.

In this pull request I have only put back the original default of PREP, I haven't made the change to make the frac_bad a passed parameter.

Copy link
Owner

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch @yjmantilla! I also like the way you solved the problems. I merely added some formatting suggestions for the code to conform to pep8 and pep257.

In this pull request I have only put back the original default of PREP, I haven't made the change to make the frac_bad a passed parameter.

A new pull request for adding that change would be welcome!

pyprep/reference.py Outdated Show resolved Hide resolved
pyprep/find_noisy_channels.py Outdated Show resolved Hide resolved
pyprep/find_noisy_channels.py Outdated Show resolved Hide resolved
@sappelhoff
Copy link
Owner

BTW: if this fixes issue #8 then you can include one of the following snippets in your pull request description:

to automatically close your issue once this PR is merged.

@sappelhoff
Copy link
Owner

also, please feel free to add yourself to the list of authors in https://github.com/sappelhoff/pyprep/blob/master/docs/whats_new.rst (at the bottom). Please follow the existing formatting.

sappelhoff and others added 2 commits March 29, 2020 18:26
Co-Authored-By: Stefan Appelhoff <stefan.appelhoff@mailbox.org>
@yjmantilla
Copy link
Collaborator Author

So I committed your suggestions onto the branch, now it is saying that some checks failed. Reading a bit it seems it is because of the formatting but I don't really know how to solve that because it doesn't quite say what should exactly be corrected (or idk how to see that.).

So if you know how to correct that I would be glad if you told me.

I have two other more modifications I think would be important but I don't want to pollute this pull request because they don't quite have anything to do with the detrend, but just so you know they are:

  • do ransac channel-wise so it doesn't need a lot of ram (it is a bit slower though). Alternatively I have another version that automatically resamples the data to fit memory but I don't really like that approach. I would like to hear your thoughts on this.

  • correct a little thing in performReference so it resembles the original PREP logic better.

@sappelhoff
Copy link
Owner

@yjmantilla yes, I did some general maintenance and docs work yesterday to prepare the package for a next release once your improvements are all in. In that process I also added GitHub Actions to the test suite.

The issues are indeed related to formatting. You can run pip install -r requirements-dev.txt from the pyprep root (or to install individually pip install black and pip install pydocstyle) and then run both, black and pydocstyle of them from the command line to see what the issues are. Or Alternatively use the Makefile and run make check, which calls both black, and pydocstyle.

When you push some new commits to this PR, please also push one to add yourself to the authors list https://github.com/sappelhoff/pyprep/blob/master/docs/whats_new.rst (see my comment: #9 (comment)) and under the Current subheading in the "What's new" doc, add a new subheading Bug and add your PR details there, following previous formatting.

@sappelhoff
Copy link
Owner

do ransac channel-wise so it doesn't need a lot of ram (it is a bit slower though)

how much slower? Did you perform some timing tests? In general, this solution does seem better than automatically subsampling though, I agree with you on that. Maybe this contribution that you plan could also allow us to drop the psutil dependency, which I only added to check for available RAM, because ransac currently needs so much of it.

correct a little thing in performReference so it resembles the original PREP logic better.

separate PR welcome!

@yjmantilla
Copy link
Collaborator Author

yjmantilla commented Mar 30, 2020

Thanks! I'll be checking the formatting and the author thing the next week because in the current one I need to have my research proposal ready. I haven't perform rigorous timing tests but just noticed it. I will take that into account for the PR of the ransac.

Regarding the consumption of RAM maybe psutil wont be required for ransac but I see another issue:

  • the multitaper filter of MNE needs lots of ram too and there is no prior calculation (as with ransac) which tells you how much you will need. Maybe there is a way to lower that consumption but probably that would require a custom-made multitaper filter. I do think that matlab's multitaper doesnt require that much ram.

For some of the data I'm processing it needs 9gb, but I do recognize this data is sampled at 1000Hz with a duration of about 5 minutes and 58 channels so maybe that is quite big. For now I just included that if line_freqs = [ ] then to skip directly to the reference stage.

@sappelhoff
Copy link
Owner

alright, good luck with your research proposal then and see you next week :-)

@sappelhoff sappelhoff mentioned this pull request Apr 4, 2020
@sappelhoff
Copy link
Owner

@yjmantilla how is the situation? Can you estimate when you will have time to work on this?

@yjmantilla
Copy link
Collaborator Author

yjmantilla commented Apr 6, 2020

Hi, Im still working in my research stuff. Should be less than 2 days more (still have classes, homework and stuff but thats the normal). I passed the black formatting, added the bug to whats_new and myself to the authors lists as you suggested. It says there is a conflict with the whats new but I guess those are from the obvious changes.

Now, I dont know if the tests are now passed or not to see if there is anything left to correct in this particular pull request.

Once this one is actually merged I will make pull requests for the other changes I mentioned.

Copy link
Owner

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I see the conflict :-/ can you resolve it?

pyprep/find_noisy_channels.py Outdated Show resolved Hide resolved
docs/source/whats_new.rst Outdated Show resolved Hide resolved
docs/source/whats_new.rst Outdated Show resolved Hide resolved
@yjmantilla
Copy link
Collaborator Author

Ah yes, I see the conflict :-/ can you resolve it?

I think all is according to what you said to me (conflict, black, whats_new, etc). Looking forward to your feedback.

@sappelhoff
Copy link
Owner

Looking forward to your feedback.

There is still a conflict with whats_new unfortunately. This is because when you started the PR, you got a copy of whats_new to your branch ... however, in the meantime I did changes to whats_new UPSTREAM (meaning: in sappelhoff/pyprep, the "master repository"). And you did not get these updates, ... instead you applied your own updates, and now YOUR updates are clashing with MY updates (which happen to be the current master state)

one way to resolve the conflict would be:

  1. discard all your changes to whats_new (i.e., store them somewhere else for now)
    1. could be done using git reset and then git checkout whats_new I think (substituting the correct file paths of course, I am just writing fast here)
  2. switch to your master branch on your fork
  3. git pull upstream master (given that you have a remote called upstream, pointing to sappelhoff/pyprep)
  4. switch back to your development branch: git checkout detrend_correction
  5. rebase your branch on top of my updates: git rebase master
  6. now everything is up to date and you can apply YOUR changes to whats_new, and no conflict should arise

Note: before you do any of this, do: git branch backup from your detrend_correction branch, so that in case something goes wrong, you can always get your backup :)

Let me know if I should explain something in more detail.

@yjmantilla
Copy link
Collaborator Author

Sure, I will try all of this. Sorry for inconvenience and thanks for the explanation. I've never done this level of github hacking before haha.

@sappelhoff
Copy link
Owner

okay, then here are some more handy tips:

  • list all your remote that you can push to or pull from git remote -v
  • add a new remote and name it "upstream": git remote add upstream https://github.com/sappelhoff/pyprep
  • Then pulling and pushing changes can be done like this:
    • git pull upstream master
    • git push origin
    • etc.
    • note that for git push upstream, you would need permissions on sappelhoff/pyprep ... so you do PRs instead

@yjmantilla
Copy link
Collaborator Author

yjmantilla commented Apr 6, 2020

So I did the steps to do the rebase; apparently there aren't conflicts anymore and it passed all the tests. The only thing weird is that there are now 33 files changed, the majority of them seem from the commits you did after I did the fork so maybe that is to be expected? or maybe I did it wrong (?).

In any case I look forward to your feedback as always.

@sappelhoff
Copy link
Owner

sappelhoff commented Apr 7, 2020

I see 🤔 something must have gone wrong during rebasing.

I can't really say WHAT the issue was, possibly it's a GitHub issue as well (see: https://github.community/t5/How-to-use-Git-and-GitHub/GitHub-pull-requests-showing-invalid-diff-for-already-merged/td-p/3000) ... but I can't say for sure.

I took some time to write up the workflow. I repeat a lot of stuff, and I don't really know your git skill level, so it might be that I am explaining way too much ... but this way, I'll be able to link other contributors to this post in the future.

Hope this helps!

Workflow

  • I assume you are working from a command line
  • github.com/sappelhoff/pyprep is upstream
  • github.com/yjmantilla/pyprep is origin (your fork)

Syncing your fork's master with upstream master

  • your start by cloning your fork: git clone github.com/yjmantilla/pyprep
    • you'll have your own master branch there
  • you always want to make sure that your fork's master and upstream master are aligned
    • to do this, you work with git remotes
    • Note: this also means that you NEVER work on master (unless you know what you are doing) ... because you want to always be able to SYNC your fork with upstream, which would mean losing your own work on master
  • use git remote -v to list your configured remotes
    • initially this will only list origin ... a bit like this maybe:
origin	https://github.com/yjmantilla/pyprep (fetch)
origin	https://github.com/yjmantilla/pyprep (push)
  • Now you want to add upstream as a remote. Use git remote add upstream https://github.com/sappelhoff/pyprep
  • again, do git remote -v, it should look like this:
origin	https://github.com/yjmantilla/pyprep (fetch)
origin	https://github.com/yjmantilla/pyprep (push)
upstream	https://github.com/sappelhoff/pyprep (fetch)
upstream	https://github.com/sappelhoff/pyprep (push)

  • Now you can use your upstream remote to make sure your fork's master is ALWAYS up to
    date.
    1. git checkout master to make sure you are on your master branch
    2. Make sure you do not have any changes on your master, because we will discard them!
    3. git pull upstream master SYNC your fork and upstream
    4. sometimes there are issues, so to be safe, do: git reset --hard upstream/master ... this makes sure that both branches are really synced.
    5. ensure with another git pull upstream master ... this should say "already up to date"

... okay. so far so good :-)

If anything went wrong, you could just go the easy way and make a fresh clone in a new directory.

Now we can do the rebase stuff

rebasing

  • before working on any feature: always do git checkout master and git pull upstream master
  • then make your new branch to work on and check it out, e.g. git checkout -b my_feature
    • do your work
    • submit a PR
    • hope you are lucky and nobody did work in between

however IF somebody did work in between, do the following:

  1. sync master through: git checkout master and git pull upstream master
  2. go back to your branch and rebase it: git checkout my_feature and then git rebase master

Now it could be that you are lucky and there no conflicts ... in that case, the rebase just works and you can then finish up by FORCE pushing your rebased branch: git push -f my_feature ... you need to force it, because rebasing changed the history of your branch --> but it should all be safe

in case you are unlucky, there are conflicts and you'll have to resolve them step by step .... git will be in rebase mode and try to rebase one commit after another ... for each commit where conflicts are detected, it'll stop.

Then you have to do: git status to see conflicting files ... then edit these files to resolve conflicts ... then git add <filename> ... and then git rebase --continue to go on to the next commit, rinse and repeat.

NOTE: the conflict resolution part is the dangerous part that can get very messy and where you can actually lose stuff ... so make backups of your branch before.

after everything is resolved, you can again do git push -f my_feature.

if you screw up during rebasing and you panic, you can do git rebase --abort and start again.

the easy way

if nothing helps and you just don't know how to resolve the issues and conflicts ... just make a new branch:
1. git checkout master
1. git pull upstream master
1. git checkout -b my_feature_2nd_attempt

and apply your changes manually.

This is a bit shameful to do, but it's a last resort ... I often did it that way 😬


I suggest you try rebasing this way once more (including the last step "the easy way" as a last resort).

If everything fails, I have a copy of your remote now and I can integrate your changes, so no worries there.

@yjmantilla
Copy link
Collaborator Author

yjmantilla commented Apr 7, 2020

I followed the steps but just after the rebase of the branch it says:

$ git rebase detrend_correction                                                  
Current branch detrend_correction is up to date.

I also just after syncing my fork master with your master I did a git push to my master
because my fork remote master wouldn't sync itself (just my local one).

Here is how the graph looks:

image

Here master is d81c5cf

Maybe at this point I should just try the easy way.

Although Im curious, if you try to do the merge as it is doens't it show somehow that much of the commits are already done since after all it took them from the current master??

@yjmantilla
Copy link
Collaborator Author

If it helps at all here is the log:


user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop
$ git clone https://github.com/yjmantilla/pyprep.git
Cloning into 'pyprep'...
remote: Enumerating objects: 310, done.
remote: Counting objects: 100% (310/310), done.
remote: Compressing objects: 100% (193/193), done.
remote: Total 902 (delta 175), reused 215 (delta 105), pack-reused 592
Receiving objects: 100% (902/902), 26.21 MiB | 2.85 MiB/s, done.
Resolving deltas: 100% (509/509), done.

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop
$ cd pyprep

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git branch -vv
* master 4b5f30a [origin/master] fix html

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git remote -v
origin  https://github.com/yjmantilla/pyprep.git (fetch)
origin  https://github.com/yjmantilla/pyprep.git (push)

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git remote add upstream https://github.com/sappelhoff/pyprep.git

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git remote -v
origin  https://github.com/yjmantilla/pyprep.git (fetch)
origin  https://github.com/yjmantilla/pyprep.git (push)
upstream        https://github.com/sappelhoff/pyprep.git (fetch)
upstream        https://github.com/sappelhoff/pyprep.git (push)

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git checkout master
Already on 'master'
Your branch is up to date with 'origin/master'.

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git pull upstream master
From https://github.com/sappelhoff/pyprep
 * branch            master     -> FETCH_HEAD
 * [new branch]      master     -> upstream/master
Updating 4b5f30a..d81c5cf
Fast-forward
 .github/workflows/python_tests.yml |    49 +
 .gitignore                         |     2 +
 .mailmap                           |     4 +
 .travis.yml                        |    21 -
 LICENSE.txt => LICENSE             |     2 +-
 MANIFEST.in                        |    32 +-
 README.md                          |    54 -
 README.rst                         |    76 +
 docs/Makefile                      |     6 +-
 docs/api.rst                       |    50 +
 docs/conf.py                       |    93 +
 docs/index.rst                     |     5 +
 docs/requirements-rtd.txt          |     2 +
 docs/source/conf.py                |   110 -
 docs/source/examples.rst           |    59 -
 docs/source/index.rst              |    14 -
 docs/source/modules.rst            |     7 -
 docs/source/pyprep.rst             |    22 -
 docs/{source => }/whats_new.rst    |    12 +-
 examples/README.rst                |    10 +
 examples/prep_demo.ipynb           |   499 --
 examples/run_full_prep.py          |   243 +
 examples/test_data/S001R01.edf     |  5529 ------------------
 examples/test_data/S001R02.edf     |  4850 ----------------
 examples/test_data/S001R03.edf     | 10302 ---------------------------------
 examples/test_data/S001R04.edf     | 10854 -----------------------------------
 examples/test_data/S002R01.edf     |  8047 --------------------------
 examples/test_data/S003R01.edf     |  4541 ---------------
 examples/test_data/S004R01.edf     |  8304 ---------------------------
 examples/use_noisy_module.py       |    81 +
 pyprep/noisy.py                    |    48 +-
 pyprep/prep_pipeline.py            |    33 +-
 pyprep/removeTrend.py              |    33 +-
 pyprep/utilities.py                |    17 +-
 requirements-dev.txt               |    12 +
 requirements.txt                   |     2 -
 setup.py                           |    11 +-
 37 files changed, 749 insertions(+), 53287 deletions(-)
 create mode 100644 .github/workflows/python_tests.yml
 create mode 100644 .mailmap
 delete mode 100644 .travis.yml
 rename LICENSE.txt => LICENSE (96%)
 delete mode 100644 README.md
 create mode 100644 README.rst
 create mode 100644 docs/api.rst
 create mode 100644 docs/conf.py
 create mode 100644 docs/index.rst
 create mode 100644 docs/requirements-rtd.txt
 delete mode 100644 docs/source/conf.py
 delete mode 100644 docs/source/examples.rst
 delete mode 100644 docs/source/index.rst
 delete mode 100644 docs/source/modules.rst
 delete mode 100644 docs/source/pyprep.rst
 rename docs/{source => }/whats_new.rst (93%)
 create mode 100644 examples/README.rst
 delete mode 100644 examples/prep_demo.ipynb
 create mode 100644 examples/run_full_prep.py
 delete mode 100644 examples/test_data/S001R01.edf
 delete mode 100644 examples/test_data/S001R02.edf
 delete mode 100644 examples/test_data/S001R03.edf
 delete mode 100644 examples/test_data/S001R04.edf
 delete mode 100644 examples/test_data/S002R01.edf
 delete mode 100644 examples/test_data/S003R01.edf
 delete mode 100644 examples/test_data/S004R01.edf
 create mode 100644 examples/use_noisy_module.py
 create mode 100644 requirements-dev.txt

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git reset --hard upstream/master
HEAD is now at d81c5cf download test data instead of storing it

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git pull upstream master
From https://github.com/sappelhoff/pyprep
 * branch            master     -> FETCH_HEAD
Already up to date.

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git status
On branch master
Your branch is ahead of 'origin/master' by 23 commits.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git branch -vv
* master d81c5cf [origin/master: ahead 23] download test data instead of storing it

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git checkout detrend_correction
Switched to a new branch 'detrend_correction'
Branch 'detrend_correction' set up to track remote branch 'detrend_correction' from 'origin'.

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (detrend_correction)
$ git rebase detrend_correction
Current branch detrend_correction is up to date.

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (detrend_correction)
$ git branch -vv
* detrend_correction 1fe191c [origin/detrend_correction] rebase
  master             d81c5cf [origin/master: ahead 23] download test data instead of storing it

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (detrend_correction)
$ git checkout master
Switched to branch 'master'
Your branch is ahead of 'origin/master' by 23 commits.
  (use "git push" to publish your local commits)

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git push
Total 0 (delta 0), reused 0 (delta 0)
To https://github.com/yjmantilla/pyprep.git
   4b5f30a..d81c5cf  master -> master

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git checkout detrend_correction
Switched to branch 'detrend_correction'
Your branch is up to date with 'origin/detrend_correction'.

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (detrend_correction)
$ git rebase detrend_correction
Current branch detrend_correction is up to date.

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (detrend_correction)
$ git status
On branch detrend_correction
Your branch is up to date with 'origin/detrend_correction'.

nothing to commit, working tree clean

@sappelhoff
Copy link
Owner

oh damn ... I have a typo in my guide -.- sorry @yjmantilla

The issue is here: $ git rebase detrend_correction

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (master)
$ git checkout detrend_correction
Switched to a new branch 'detrend_correction'
Branch 'detrend_correction' set up to track remote branch 'detrend_correction' from 'origin'.

user@DESKTOP-N1TH9G9 MINGW64 ~/Desktop/pyprep (detrend_correction)
$ git rebase detrend_correction
Current branch detrend_correction is up to date.

--> instead, it should be git rebase master, because you want to give the detrend_correction branch a new base (namely: master), from which to start applying the new commits on.

@sappelhoff
Copy link
Owner

Although Im curious, if you try to do the merge as it is doens't it show somehow that much of the commits are already done since after all it took them from the current master??

yes, I am curious about that as well ... it might be a GitHub bug that a diff is shown ... although there is no diff 🤔

I think I have tortured you enough now and I'll try to squash and commit your PR.

@sappelhoff sappelhoff merged commit 7e9899d into sappelhoff:master Apr 8, 2020
@sappelhoff
Copy link
Owner

Yes, seems like some GitHub issue, the squash-and-merge commit of the PR was clean ...

See for yourself by:

git checkout master
git pull upstream master
git reset HEAD~
git diff

And the diff is exactly what you contributed :-) thanks a lot for the fix and also for patiently following git instructions!

🚀

@sappelhoff
Copy link
Owner

also, I slightly changed your whatsnew entries, see: 0bec296

mainly to:

  • make the entries more concise (if people want to know details, they click on the link and can read the PR)
  • make more links to functions and methods using RST syntax ... e.g., link to a function like:
func:`pyprep.noisy.find_bad_epochs`
  • ... this would then link to the API docs ... at least once we clean that up a bit :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ransac return a lot more bad channels than matlab version
2 participants