Add window-wise RANSAC to reduce RAM requirements #66

Merged
merged 10 commits into sappelhoff:master on May 1, 2021

Conversation

a-hurst
Collaborator

@a-hurst a-hurst commented Apr 16, 2021

PR Description

(Updated) This PR does a few things:

  • Adds support for window-wise RANSAC and makes it the default method
  • Reworks channel-wise RANSAC to use the same pre-generated interpolation matrices as window-wise, which speeds things up and reduces the amount of divergent/redundant code between methods
  • Reworks/refactors the channel-wise RANSAC functions a fair bit to better match the function names, argument names, variable names, and docstrings of their window-wise equivalents, for easier comparison and maintainability
  • Adds a progress-printing function for window-wise RANSAC
  • Overhauls the RANSAC unit tests considerably, including the addition of explicit comparison of the window-wise/channel-wise correlation matrices for both regular and MATLAB-strict RANSAC to make sure we get identical results for the same seed between methods (and that MATLAB strict values differ from non-strict ones as expected).

Currently Missing

  • Progress indicators for RANSAC
  • Proper API for matlab_strict mode (handled in earlier PR)
  • Any sort of optimization/parallelization for additional speed (I have some ideas, but will save for a future PR since this one's already pretty big)
  • Any sort of memory checking or channel chunking support (added memory check at start)
  • A clear idea of what to do about MATLAB PREP's weird correlation function (handled in earlier PR)

Other Notes

  • For my test file (sub_002/ses-01 from that open meditation dataset), window-wise RANSAC is about 36% slower than the existing RANSAC approach (192 vs 141 seconds). However, RAM usage during RANSAC is now much lower (maxed out at 2.85 GB), which is nice but also suggests that window-wise RANSAC could be improved to take better advantage of RAM on beefier systems.

Merge Checklist

  • the PR has been reviewed and all comments are resolved
  • all CI checks pass
  • (if applicable): the PR description includes the phrase closes #<issue-number> to automatically close an issue
  • (if applicable): bug fixes, new features, or API changes are documented in whats_new.rst

@a-hurst
Collaborator Author

a-hurst commented Apr 16, 2021

Whoops, accidentally submitted the PR before writing a description! Fixing that now.

Also, apologies for putting so much of this into a single commit: I know that makes it hard to review! I couldn't think of a better way to do this piece-wise though, since apart from the utility functions the rest of the code is interdependent.

@a-hurst a-hurst marked this pull request as draft April 16, 2021 18:55
@sappelhoff
Owner

gonna do a proper review later, but regarding the weird matprep correlation --> is that maybe because the data are already mean-centered? ... i.e., mean(x) is already around 0, so x - mean(x) is unnecessary and you can just square x directly? Just a hunch ... could be completely off.

@a-hurst
Collaborator Author

a-hurst commented Apr 17, 2021

gonna do a proper review later, but regarding the weird matprep correlation --> is that maybe because the data are already mean-centered? ... i.e., mean(x) is already around 0, so x - mean(x) is unnecessary and you can just square x directly? Just a hunch ... could be completely off.

That sounds like a good explanation. That said, is it generally going to be true that the real or predicted signal for each channel during any given 5-second window is centered on zero? I should probably compare the results directly with normal Pearson correlation to see how much of a difference it makes. Also, the "bad-by-correlation" code uses the normal numpy corrcoef correlation function and matches up identically with MATLAB PREP, so it's odd that they'd use this unique correlation method here but not there.

Regardless, if we ultimately decide that Pearson (or some other method of correlation) would make a better default then we'll probably need to adjust the RANSAC bad-correlation threshold as well, since the current one is tuned for the current calculation method. That makes me think this question might be better suited for a future issue/PR.

EDIT: Just tested directly, and it looks like the results are pretty similar (although still somewhat divergent) between the correlation method MatPREP uses and the one we were using before (np.corrcoef). Here are all the correlations for the first RANSAC window with both methods:

Comparisons of `_correlate_arrays` and `np.corrcoef` values

With _correlate_arrays:

>>>  _correlate_arrays(actual, predicted)
array([0.89845652, 0.82875389, 0.91511714, 0.91138047, 0.94749736,
       0.91007793, 0.92377913, 0.79589614, 0.83800662, 0.88510196,
       0.94082362, 0.70040023, 0.91412564, 0.95717966, 0.97681393,
       0.9498704 , 0.96668755, 0.92801575, 0.64204451, 0.91474818,
       0.91883667, 0.97698636, 0.92933287, 0.97297276, 0.98288691,
       0.95930912, 0.98388164, 0.94297334, 0.9536785 , 0.78055453,
       0.91809752, 0.77438666, 0.68174176, 0.81035593, 0.89479281,
       0.95252874, 0.75185651, 0.83022438, 0.92206843, 0.96422559,
       0.91766885, 0.9591769 , 0.94557885, 0.84677134, 0.96856761,
       0.71886495, 0.96437593, 0.62813191, 0.85411866, 0.92932526,
       0.95530658, 0.89418018, 0.96117152, 0.93188952, 0.95649761,
       0.97970686, 0.96139793, 0.9881229 , 0.95055001, 0.94476785])

With np.corrcoef (60 is the number of total channels being compared):

>>> np.diag(np.corrcoef(actual, predicted)[0:60, 60:])
array([0.89604913, 0.82697111, 0.91719863, 0.9110939 , 0.94679295,
       0.91137861, 0.92356072, 0.79522897, 0.83669035, 0.88110246,
       0.94066768, 0.70462759, 0.91424179, 0.95697956, 0.97630347,
       0.94934379, 0.96676154, 0.92770026, 0.62840986, 0.91122078,
       0.91661858, 0.97732134, 0.93090906, 0.97321228, 0.98282457,
       0.96088597, 0.98366312, 0.94097458, 0.95273919, 0.77939706,
       0.91990479, 0.7773422 , 0.67990887, 0.80892266, 0.89474851,
       0.95226623, 0.752651  , 0.83074011, 0.92116674, 0.96506751,
       0.91770327, 0.95860049, 0.94440995, 0.84337725, 0.9674785 ,
       0.73259463, 0.96433739, 0.64966689, 0.85175615, 0.93301079,
       0.95368087, 0.89043239, 0.96199825, 0.93195703, 0.95756804,
       0.97926721, 0.95890694, 0.98788663, 0.9512345 , 0.94354823])
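
The same kind of comparison can be tried on synthetic data. This is only a minimal sketch: it assumes, following the hunch earlier in the thread rather than anything confirmed here, that the MatPREP-style correlation is a normalized dot product with no mean subtraction. `uncentered_corr` and `rowwise_pearson` are hypothetical helper names, not PyPREP functions.

```python
import numpy as np


def uncentered_corr(a, b, axis=1):
    # ASSUMPTION: normalized dot product WITHOUT subtracting the mean,
    # standing in for the MatPREP-style correlation discussed above.
    num = np.sum(a * b, axis=axis)
    den = np.sqrt(np.sum(a ** 2, axis=axis) * np.sum(b ** 2, axis=axis))
    return num / den


def rowwise_pearson(a, b):
    # np.corrcoef(a, b) stacks the rows of a and b into one (2n, 2n) matrix.
    # The top-right (n, n) block holds a-vs-b correlations, and its diagonal
    # pairs row i of `a` with row i of `b`.
    n = a.shape[0]
    return np.diag(np.corrcoef(a, b)[:n, n:])


rng = np.random.default_rng(0)
actual = rng.standard_normal((4, 1000))            # 4 channels, ~zero mean
predicted = actual + 0.5 * rng.standard_normal((4, 1000))

# With near-zero means the two methods roughly agree.
diff_centered = np.abs(
    uncentered_corr(actual, predicted) - rowwise_pearson(actual, predicted)
).max()

# With a DC offset, Pearson is unchanged but the uncentered value inflates.
offset = 5.0
diff_offset = np.abs(
    uncentered_corr(actual + offset, predicted + offset)
    - rowwise_pearson(actual + offset, predicted + offset)
).max()

print(f"max difference, zero-mean data: {diff_centered:.4f}")
print(f"max difference, offset data:    {diff_offset:.4f}")
```

If the two methods only diverge noticeably when the data carry an offset, that would be consistent with the small differences seen in the arrays above.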

Owner

@sappelhoff sappelhoff left a comment

Thanks Austin, could you perhaps adjust the PR title to be more descriptive of what is truly happening in terms of code? The current title

Make PyPREP's RANSAC produce identical results to MATLAB PREP

is ambiguous :-)

pyprep/find_noisy_channels.py (outdated, resolved)
pyprep/ransac.py (outdated, resolved)
pyprep/utils.py Outdated
@@ -121,6 +144,37 @@ def _get_random_subset(x, size, rand_state):
return sample


def _correlate_arrays(a, b, axis=1):
Owner

can we call this _mat_correlate_arrays? Or something the like? So that we better see that this comes from Matlab PREP?

Collaborator Author

So far I've only used the _mat prefix for functions that mimic the behaviour of built-in MATLAB functions. I can add a _matprep prefix to differentiate, but I was thinking for my follow-up PR that I'd make _correlate_arrays like _predict_median_signals and add a matlab_strict flag to switch between PyPREP (np.corrcoef) and MatPREP behaviour.

Owner

but I was thinking for my follow-up PR that I'd make _correlate_arrays like _predict_median_signals and add a matlab_strict flag to switch between PyPREP (np.corrcoef) and MatPREP behaviour.

okay, as long as you don't forget about that, that sounds better 👍

Collaborator

I was thinking that it might be better to use the whole "matlab" prefix instead of "mat". I understand it is more verbose, but one could confuse "mat" as being associated with a matrix.

In any case, this is just a minor suggestion; it is not that relevant.

pyprep/ransac.py (outdated, resolved)
pyprep/ransac.py (outdated, resolved)
@sappelhoff
Owner

That sounds like a good explanation. That said, is it generally going to be true that the real or predicted signal for each channel during any given 5-second window is centered on zero?

I don't know, it was a shot in the dark.

Regardless, if we ultimately decide that Pearson (or some other method of correlation) would make a better default then we'll probably need to adjust the RANSAC bad-correlation threshold as well, since the current one is tuned for the current calculation method. That makes me think this question might be better suited for a future issue/PR.

yes, good point :)

Just tested directly, and it looks like the results are pretty similar (although still somewhat divergent) between the correlation method MatPREP uses and the one we were using before (np.corrcoef).

that does suggest that mean(x) is indeed close to zero, and it makes little difference whether it's subtracted from x or not.

I like np.corrcoef better though 🤔

@a-hurst
Collaborator Author

a-hurst commented Apr 18, 2021

I like np.corrcoef better though 🤔

Me too. I'm going to make the MatPREP correlation method a matlab_strict option in my new PR and leave np.corrcoef as the default. We'll still need to assess whether np.corrcoef produces consistently higher/lower correlations (and thus deserves different threshold values), but for now they seem similar enough to leave as-is.

@sappelhoff sappelhoff added this to the 0.4.0 milestone Apr 19, 2021
Collaborator

@yjmantilla yjmantilla left a comment

Looks good to me. Sorry for joining in late. I have been really busy with uni and work. Really looking forward to testing this implementation against a lot of files I processed through matlab's prep.

Owner

@sappelhoff sappelhoff left a comment

@a-hurst are you taking a break from this PR to come back later? Or are there any blocking items that you need help with?

@a-hurst
Collaborator Author

a-hurst commented Apr 28, 2021

@a-hurst are you taking a break from this PR to come back later? Or are there any blocking items that you need help with?

Since I've broken this down into a bunch of separate PRs now, the only new thing this PR adds is support for window-wise RANSAC, which is slower than the current method but has much lower RAM demands for longer recordings. To finish this up, I'm going to rebase on the current master branch and see if I can improve performance at all. I'll try and do the former today!

@sappelhoff
Owner

sounds great! --> but even if it's a bit slower, it'd still be a win for users with low RAM machines 👍

@a-hurst
Collaborator Author

a-hurst commented Apr 28, 2021

sounds great! --> but even if it's a bit slower, it'd still be a win for users with low RAM machines 👍

When I pull this together, should I make window-wise the new default then, with channel-wise still as an option for people with high-RAM workstations who want to process as quickly as possible?

@sappelhoff
Owner

Sounds good!

@a-hurst a-hurst changed the title Make PyPREP's RANSAC produce identical results to MATLAB PREP Add window-wise RANSAC to reduce RAM requirements Apr 28, 2021
@codecov-commenter

codecov-commenter commented Apr 30, 2021

Codecov Report

Merging #66 (7cbe3bf) into master (145a5a1) will increase coverage by 0.17%.
The diff coverage is 100.00%.

❗ Current head 7cbe3bf differs from pull request most recent head 96b2965. Consider uploading reports for the commit 96b2965 to get more accurate results

@@            Coverage Diff             @@
##           master      #66      +/-   ##
==========================================
+ Coverage   97.31%   97.49%   +0.17%     
==========================================
  Files           7        7              
  Lines         634      678      +44     
==========================================
+ Hits          617      661      +44     
  Misses         17       17              
Impacted Files                   Coverage Δ
pyprep/find_noisy_channels.py    96.53% <100.00%> (ø)
pyprep/prep_pipeline.py          100.00% <100.00%> (ø)
pyprep/ransac.py                 97.85% <100.00%> (+0.58%) ⬆️
pyprep/reference.py              96.55% <100.00%> (ø)
pyprep/utils.py                  99.17% <100.00%> (+0.10%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 145a5a1...96b2965. Read the comment docs.

@a-hurst
Collaborator Author

a-hurst commented Apr 30, 2021

@sappelhoff Okay, so I think this is ready for review now: I haven't updated the whats_new yet and still need to finish exposing the public API for channel-wise RANSAC, but all the key parts are in place now and there are some things I wanted to get your thoughts on before finishing up.

The main thing I wanted to get feedback on is the repurposing of the channel_wise argument to find_bad_by_ransac: basically, I realized it would be confusing to have an argument for window_wise if that's going to be the default method, so I made channel_wise the argument to specify the channel-chunking method of RANSAC. I also added a new argument max_chunk_size to provide the same functionality as the old channel_wise argument (i.e. lets you run 1 channel per chunk) but with a bit more flexibility.

However, I realize that channel_wise is technically currently part of the public API and it's generally very bad practice to change the meaning of an argument between releases. My rationale here is that the current channel_wise functionality was never fully exposed (i.e., you couldn't select one-channel-at-a-time RANSAC from the PrepPipeline class, you could only use it if using NoisyChannels manually), and it was never formally introduced in the What's New, meaning it's unlikely to be used beyond the small circle of contributors here (@yjmantilla does this break anything for you?). Additionally, if I change the default value of max_chunk_size to 1 instead of None, channel_wise will behave exactly as it did before (though I think None is the more useful default).

Apart from that, my main changes are:

  • Reworked channel-wise RANSAC to use the same pre-generated interpolation matrices as window-wise RANSAC, which speeds things up and reduces the amount of divergent/redundant code between methods
  • Renamed the existing private channel-wise RANSAC functions to be consistent with the window-wise function names. Also rearranged their argument names/orders, variable names, and docstrings to match up more closely with their window-wise counterparts (for easier comparison/readability).
  • Added a progress-printing function for window-wise RANSAC
  • Overhauled the RANSAC unit tests considerably, including the addition of explicit comparison of the correlation matrices for window-wise/channel-wise for both regular and MATLAB-strict RANSAC to make sure window-wise and channel-wise produce identical results for the same seed.

Let me know what you think!

@sappelhoff
Owner

I realize that channel_wise is technically currently part of the public API and it's generally very bad practice to change the meaning of an argument between releases.

generally yes, but with the immense changes that are coming in version 0.4.0, it's totally fine to break a couple of things if afterwards everything is better. Strict semantic versioning would make us increment a "Major" version even (so 0.3.1 -> 1.0.0, instead of 0.3.1 -> 0.4.0), but a 1.0.0 release would be too much of a statement at this point, I think.

and it was never formally introduced in the What's New, meaning it's unlikely to be used beyond the small circle of contributors here

another good argument why breaking would be fine here

though I think None is the more useful default

+1 to None 👍

would be interested in your opinion as well here @yjmantilla

@@ -362,6 +362,44 @@ def split_list(mylist, chunk_size):
]


def print_progress(current, end, start=None, stepsize=1, every=0.1):
Owner

why did you opt to code up your own progress tracking function instead of using tqdm? To avoid an additional dependency?

Collaborator Author

lol I had no idea tqdm existed until you mentioned it. I'd recently written more-or-less the same code for a slower R script, so it was just a question of porting it over/documenting/adding tests.

Owner

It'd be fun to see how fast/slow your function is compared to what tqdm says:

Overhead is low -- about 60ns per iteration (80ns with tqdm.gui), and is unit tested against performance regression. By comparison, the well-established ProgressBar has an 800ns/iter overhead.

Collaborator Author

Huh. My guess is that their code is probably better optimized but that my thing is still faster than ProgressBar (which I also didn't know existed until just now): I made no attempt to optimize my own approach, but I don't think there are any particularly expensive or wasteful function calls I'm doing per loop and I'm not trying to write as much to the terminal on every update.
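
For readers unfamiliar with the idea, a lightweight progress printer of the kind described (cheap per-iteration work, output only every so often) might look roughly like the sketch below. This is not the `print_progress` from this PR: only that function's signature appears in the diff, so `progress_message` and the meaning given to its arguments here are assumptions made for illustration.

```python
def progress_message(current, end, start=0, stepsize=1, every=0.1):
    """Return a progress string when the iteration starting at `current`
    crosses a multiple of `every` (a fraction of the loop), else None.

    Hypothetical helper, loosely modeled on the signature shown above.
    """
    total = end - start
    before = (current - start) / total
    after = (current - start + stepsize) / total
    # Small epsilon guards against float results such as 0.3 / 0.1 == 2.999...
    eps = 1e-9
    if int(after / every + eps) > int(before / every + eps):
        return f"Progress: {round(after * 100)}%"
    return None


messages = []
for i in range(20):
    msg = progress_message(i, end=20)
    if msg is not None:  # only writes to the terminal on ~10% boundaries
        messages.append(msg)
        print(msg)
```

The per-iteration cost here is a couple of divisions and comparisons, with terminal writes limited to every 10% of the loop.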

Comment on lines 451 to 456
Whether RANSAC should predict signals for whole chunks of channels
at once instead of predicting signals for each RANSAC window
individually. Channel-wise RANSAC generally has higher RAM demands
than window-wise RANSAC (especially if `max_chunk_size` is
``None``), but can be faster on systems with lots of RAM to spare.
Defaults to ``False``.
Owner

Okay I think I am a little confused by this.

We can run RANSAC in the following ways:

  1. over all channels and over the whole time --> this is how I coded it back then
    1. needs huge amounts of RAM
    2. but if that RAM is available, it's fast
  2. one channel at a time over the whole time --> this is what @yjmantilla implemented (if I recall correctly)
    1. much less RAM needed
    2. significantly slower
  3. calculate RANSAC over all channels but over windows
    1. this is what MATLAB PREP does, and what autoreject does
    2. implemented by @a-hurst
    3. Probably the best default

There would be a fourth potential option -> running single channels over windows ... so that'd be the slowest possible, but the lowest RAM ever (can probably even run on a raspberry pi 🤣 ).

Overall I am not sure how many of these options we should offer and how to document them.

Right now, I think one option should be fine: the matlab/autoreject way of doing all channels at once but over windows ... and if too little RAM is there, we just need to make the windows smaller ... right? (or would that affect the accuracy at some point?)

what are your opinions @a-hurst @yjmantilla
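
To make the difference between the iteration orders concrete, here is a toy sketch. It is not PyPREP's implementation: the "prediction" is just a random mixing matrix standing in for the interpolation matrices, and `window_wise`, `channel_wise`, and `pearson_rows` are hypothetical names. In this sketch `max_chunk_size=None` simply means all channels in one chunk. The point is only that the two orders hold differently shaped arrays in memory while yielding the same per-window correlations.

```python
import numpy as np


def pearson_rows(a, b):
    """Pearson correlation between matching rows of a and b."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    return (a * b).sum(axis=1) / np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))


def window_wise(data, mixing, win_len):
    """Predict ALL channels for ONE window at a time."""
    n_chans, n_samples = data.shape
    n_windows = n_samples // win_len
    corr = np.zeros((n_windows, n_chans))
    for w in range(n_windows):
        win = slice(w * win_len, (w + 1) * win_len)
        predicted = mixing @ data[:, win]  # shape (n_chans, win_len)
        corr[w] = pearson_rows(data[:, win], predicted)
    return corr


def channel_wise(data, mixing, win_len, max_chunk_size=None):
    """Predict a CHUNK of channels over the WHOLE recording at a time."""
    n_chans, n_samples = data.shape
    n_windows = n_samples // win_len
    chunk = n_chans if max_chunk_size is None else max_chunk_size
    corr = np.zeros((n_windows, n_chans))
    for first in range(0, n_chans, chunk):
        chans = slice(first, min(first + chunk, n_chans))
        predicted = mixing[chans] @ data  # shape (chunk, n_samples)
        for w in range(n_windows):
            win = slice(w * win_len, (w + 1) * win_len)
            corr[w, chans] = pearson_rows(data[chans, win], predicted[:, win])
    return corr


rng = np.random.default_rng(42)
n_chans, win_len, n_windows = 8, 50, 6
data = rng.standard_normal((n_chans, win_len * n_windows))
# Toy stand-in for interpolation: each channel is a mix of the OTHER channels.
mixing = rng.standard_normal((n_chans, n_chans))
np.fill_diagonal(mixing, 0.0)

corr_win = window_wise(data, mixing, win_len)
corr_all = channel_wise(data, mixing, win_len)                    # option 1
corr_one = channel_wise(data, mixing, win_len, max_chunk_size=1)  # option 2
print(np.allclose(corr_win, corr_all), np.allclose(corr_win, corr_one))
```

The largest temporary array is `(n_chans, win_len)` in the window-wise loop and `(chunk, n_samples)` in the channel-wise loop, which is where the RAM trade-off between the options comes from.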

Collaborator Author

I'd thought about that too (chunking-by-channel in window-wise RANSAC), but wasn't sure how to best merge the chunking code with the windowing stuff without a massive rewrite.

Right now, window-wise still takes less RAM than one-at-a-time channel-wise unless (channel count x 5 seconds) is larger than the length of the whole recording (i.e., one-channel-at-a-time only takes less RAM if a 32-channel recording is under 2.6 minutes, a 64-channel is under 5.3 mins, a 128-channel is under 10.6 mins, etc.). Thus, in most cases window-wise should be good enough on the RAM front. Still, I wouldn't be opposed to getting that down further: with enough optimization, I could probably get PyPREP running (slowly) without swapping on my old iMac G4!
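
The break-even point above is simple arithmetic and can be checked for other channel counts. A minimal sketch, assuming the default 5-second RANSAC window (`break_even_minutes` is a hypothetical helper name):

```python
def break_even_minutes(n_channels, window_seconds=5.0):
    """Recording length (in minutes) below which one-channel-at-a-time
    RANSAC holds less data in memory than window-wise RANSAC."""
    return n_channels * window_seconds / 60.0


for n in (32, 64, 128):
    print(f"{n} channels: {break_even_minutes(n):.2f} min")
# prints:
# 32 channels: 2.67 min
# 64 channels: 5.33 min
# 128 channels: 10.67 min
```

These agree with the figures quoted above up to rounding.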

Collaborator

I think window-wise is better. I remember talking about this in another thread. In the usual use case, window-wise will consume less RAM than the approach I took.

I think it would be nice to offer a low-RAM version even though it is the slowest; maybe fitting the largest chunk of channels could be done automatically, as is currently done in the channel-wise version. But it is not a priority. All the work @a-hurst is doing is quite amazing; the priority should be that MATLAB PREP comparison.

So +1 making the window-wise the default

Collaborator Author

Thanks @yjmantilla! As for the MATLAB PREP comparison, I just got the MATLAB side of that working today (automated MatPREP on GitHub actions, saving relevant data as artifacts for later testing) so expect some preliminary stuff on that end shortly :)

Owner

image

🤣

Owner

please adjust this docstr according to https://github.com/sappelhoff/pyprep/pull/66/files#r624438768 for consistency :)

Collaborator Author

lol that commit was after I'd spent a good 30-40 minutes trying to figure out why pop_saveset was cryptically telling me I didn't have write access to the artifacts folder.

Turns out, if you use any keyword arguments in a MATLAB function call (e.g. 'savemode', 'onefile'), that somehow means that positional arguments beyond the first one no longer work and you have to use keyword arguments for everything else (i.e. pop_saveset(EEG, outpath, 'savemode', 'onefile'); doesn't work, even though pop_saveset(EEG, 'filename', outpath, 'savemode', 'onefile'); does and pop_saveset(EEG, outpath); does too). And I thought passing argument names as strings was bad enough...

@a-hurst
Collaborator Author

a-hurst commented Apr 30, 2021

@sappelhoff Just thinking of how to implement the public API for this now: do you think these kinds of settings should be part of the prep_params dict, or should they be separate arguments on their own?

EDIT: Also, should NoisyChannels.find_all_bads() also have the channel-wise and max chunk size arguments to pass on to find_bad_by_ransac?

@sappelhoff
Owner

Probably separate parameters 🤔 and yes, the find_all_bads method should have these params as well.

@yjmantilla
Collaborator

yjmantilla commented Apr 30, 2021

@yjmantilla does this break anything for you?

It does not, and even if it did, I would just make a fork.

The main thing I wanted to get feedback on is the repurposing of the channel_wise argument to find_bad_by_ransac: basically, I realized it would be confusing to have an argument for window_wise if that's going to be the default method, so I made channel_wise the argument to specify the channel-chunking method of RANSAC. I also added a new argument max_chunk_size to provide the same functionality as the old channel_wise argument (i.e. lets you run 1 channel per chunk) but with a bit more flexibility.

I agree with this. I don't think many people used that channel-wise argument, and people are aware this is an early-stage Python package, so I think we should do our best to make things intuitive and logically coherent up until the point we are ready to say "this is it, pyprep is operating as it should be".

@a-hurst a-hurst marked this pull request as ready for review May 1, 2021 03:24
@a-hurst
Collaborator Author

a-hurst commented May 1, 2021

@sappelhoff I've now added a proper public API for channel_wise and max_chunk_size and updated the whats_new.rst. I think this is ready for final review!

I'm still open to adding chunking support for window-wise as you suggested, but I think that might be better to do in a future PR given the amount of change in this one as-is: from a cursory look at what's involved I'd have to refactor both channel-wise and window-wise RANSAC a fair bit to make that work (see: all the wrapping try/catching re: RAM-checking that's currently not used at all for window-wise after the initial check further up). While working on this PR I had some thoughts on how to clean that up too, but again I didn't want to overload this with too many peripheral changes.

ransac : bool
To detect channels by ransac or not.
ransac : bool, optional
Whether RANSAC should be for bad channel detection, in addition to
Owner

Suggested change
- Whether RANSAC should be for bad channel detection, in addition to
+ Whether RANSAC should be used for bad channel detection, in addition to

Comment on lines 160 to 162
Whether RANSAC should predict signals for whole chunks of channels
at once instead of predicting signals for each RANSAC window
individually. Channel-wise RANSAC generally has higher RAM demands
Owner

Suggested change
- Whether RANSAC should predict signals for whole chunks of channels
- at once instead of predicting signals for each RANSAC window
- individually. Channel-wise RANSAC generally has higher RAM demands
+ Whether RANSAC should predict signals for chunks of channels over the entire
+ signal length ("channel-wise RANSAC", see `max_chunk_size` parameter).
+ If ``False``, RANSAC will instead predict signals for all channels at once
+ but over a number of smaller time windows instead of over the entire
+ signal length ("window-wise RANSAC").
+ Channel-wise RANSAC generally has higher RAM demands

Collaborator Author

Nice and clear, I like it!

@@ -38,6 +38,19 @@ class PrepPipeline:
ransac : bool, optional
Whether or not to use RANSAC for noisy channel detection in addition to
the other methods in :class:`~pyprep.NoisyChannels`. Defaults to True.
channel_wise : bool, optional
Owner

@@ -32,6 +32,19 @@ class Reference:
ransac : bool, optional
Whether or not to use RANSAC for noisy channel detection in addition to
the other methods in :class:`~pyprep.NoisyChannels`. Defaults to True.
channel_wise : bool, optional
Owner

@@ -70,10 +72,17 @@ def find_bad_by_ransac(
The duration (in seconds) of each RANSAC correlation window. Defaults to
5 seconds.
channel_wise : bool, optional
Whether RANSAC should be performed one channel at a time (lower RAM
Owner

Owner

@sappelhoff sappelhoff left a comment

Cool! I have only one major comment for the docstr, the rest looks good to go!

@a-hurst
Collaborator Author

a-hurst commented May 1, 2021

Cool! I have only one major comment for the docstr, the rest looks good to go!

Fixed in the latest commit, should be good to go now 🙂

Owner

@sappelhoff sappelhoff left a comment

you had a tiny typo and I think you forgot to update one of the many docstrs. These suggestions should fix it. I'll accept them and then merge -> but feel free to double check :-)

Thanks a lot! One leap closer to 0.4 👏

pyprep/find_noisy_channels.py (outdated, resolved)
pyprep/find_noisy_channels.py (outdated, resolved)
pyprep/prep_pipeline.py (outdated, resolved)
pyprep/ransac.py (outdated, resolved)
pyprep/reference.py (outdated, resolved)
@sappelhoff sappelhoff merged commit 051edd6 into sappelhoff:master May 1, 2021
4 participants