Improvements to the windowers #152
Conversation
- adding `epochs_kwargs` to windowing functions
- modifying default value of `stop_offset_samples` to None for consistency
- improving docstrings
- adding random state argument to `create_mne_dummy_raw`
- adding tests
braindecode/datautil/windowers.py
Outdated
@@ -154,7 +168,7 @@ def create_windows_from_events(
 def create_fixed_length_windows(
         concat_ds, start_offset_samples, stop_offset_samples,
         window_size_samples, window_stride_samples, drop_last_window,
-        mapping=None, preload=False, drop_bad_windows=True):
+        mapping=None, preload=False, drop_bad_windows=True, **epochs_kwargs):
I don't like kwargs. I would expose and document here the Epochs parameters you think can be useful.
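For illustration, this is roughly what the suggestion amounts to: instead of forwarding an opaque `**epochs_kwargs`, the windower would expose a few selected `mne.Epochs` arguments explicitly. Which parameters to expose (here `reject`, `flat` and `on_missing`) is only an assumption for this sketch, not the final API.

```python
# Sketch only: explicit, documented Epochs parameters instead of **epochs_kwargs.
# The chosen parameters (reject, flat, on_missing) are an assumption, not the
# final braindecode API.
def create_fixed_length_windows(
        concat_ds, start_offset_samples, stop_offset_samples,
        window_size_samples, window_stride_samples, drop_last_window,
        mapping=None, preload=False, drop_bad_windows=True,
        reject=None, flat=None, on_missing='error'):
    """Create fixed-length windows.

    reject, flat and on_missing are forwarded to mne.Epochs and documented
    here instead of being hidden behind **kwargs.
    """
    ...
```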
Fixed in new commit.
@@ -69,7 +69,7 @@ def create_from_X_y(
     windows_datasets = create_fixed_length_windows(
         base_datasets,
         start_offset_samples=0,
-        stop_offset_samples=0,
+        stop_offset_samples=None,
API change?
That's correct. Is there a recommended way to flag API changes? I added a warning here: https://github.com/braindecode/braindecode/pull/152/files#diff-c031004527cfc6287fbb1741b7f3e1ddR231
I think a warning makes sense for now? Unless somebody suggests something more sophisticated :)
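For context, a minimal sketch of what flagging the change with a warning could look like; the exact trigger condition and wording in the linked diff may differ.

```python
import warnings

# Sketch of the kind of warning referenced above; the actual message and
# trigger condition in the PR may differ.
if stop_offset_samples == 0:
    warnings.warn(
        'stop_offset_samples=0 used to mean "until the end of the recording". '
        'This now has to be expressed with stop_offset_samples=None.',
        UserWarning)
```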
braindecode/util.py
Outdated
@@ -250,7 +252,10 @@ def create_mne_dummy_raw(n_channels, n_times, sfreq, include_anns=True,
     save_fname : dict | None
         Dictionary containing the name the raw data was saved to.
     """
-    data = np.random.rand(n_channels, n_times)
+    if random_state is None or isinstance(random_state, int):
+        random_state = np.random.RandomState(random_state)
use sklearn.utils.check_random_state
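What the suggestion boils down to (sketch): `sklearn.utils.check_random_state` already handles None, integer seeds and `RandomState` instances, so it can replace the manual `isinstance` check above. `_make_dummy_data` below is just a hypothetical stand-in for the relevant part of `create_mne_dummy_raw`.

```python
import numpy as np
from sklearn.utils import check_random_state


def _make_dummy_data(n_channels, n_times, random_state=None):
    # check_random_state accepts None, an int seed or a RandomState instance
    # and always returns a RandomState, replacing the manual handling above.
    rng = check_random_state(random_state)
    return rng.rand(n_channels, n_times)
```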
- using sklearn.utils.check_random_state
- updating corresponding test
What about we rename braindecode.datautil.preprocess to braindecode.datautil.preprocessing? It's a bit longer, but at least we avoid this ambiguity. @robintibor @gemeinl
3 levels of modules seem too much for a small package as it is now. I would expose it in braindecode.datautil or consider a new submodule braindecode.preprocessing.
braindecode/datautil/windowers.py
Outdated
if (stops[-1] + trial_stop_offset_samples >
        ds.raw.n_times + ds.raw.first_samp):
This seems like potentially quite error-prone code, do we have a test for this?
Yes, test_windowers.py::test_windows_cropped used to test this indirectly - I'll add something more specific for this check.
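Roughly, such a test would exercise the guard quoted above with numbers like these (made-up values for illustration, not the actual test in test_windowers.py):

```python
# Made-up numbers illustrating the guard above: with a cropped raw,
# first_samp shifts the absolute sample indices, so the end of the recording
# is at first_samp + n_times rather than n_times.
n_times = 1000                    # samples available in the (cropped) raw
first_samp = 200                  # non-zero because the raw was cropped
stops = [1150]                    # last trial stop, absolute sample index
trial_stop_offset_samples = 100

# 1150 + 100 = 1250 > 200 + 1000 = 1200, so the last window would extend
# past the end of the recording and should be rejected.
assert stops[-1] + trial_stop_offset_samples > n_times + first_samp
```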
Actually, I'm not 100% sure the behaviour of mne.Epochs is correct when the raw file has been cropped. Maybe there's something I'm missing that would make the handling of raw.first_samp simpler... I opened an issue here, mne-tools/mne-python#8201, just in case.
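As a small aside on why first_samp matters here (a standalone MNE example, not code from this PR): cropping a Raw shifts first_samp, while event sample indices stay absolute.

```python
import numpy as np
import mne

# Standalone illustration: cropping shifts first_samp, so sample indices in
# the data no longer start at 0 relative to the original recording.
info = mne.create_info(ch_names=['ch1'], sfreq=100., ch_types='eeg')
raw = mne.io.RawArray(np.random.rand(1, 2000), info)
print(raw.first_samp)   # 0
raw.crop(tmin=5.)       # drop the first 5 s
print(raw.first_samp)   # 500, i.e. 5 s * 100 Hz
```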
So now this will be correct, but needs the newest MNE from master? Just for clarification.
braindecode/datautil/windowers.py
Outdated
window_size_samples = stops[0] - (
    onsets[0] + trial_start_offset_samples)
This does not seem to account for the trial stop offset samples?
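If I read the comment right, the concern is roughly the following (illustrative values, a sketch of the point being made rather than the merged fix):

```python
# Sketch of the point raised above (not necessarily the merged fix):
# when window_size_samples defaults to the full trial interval, the trial
# stop offset should arguably enter the computation as well.
onsets, stops = [1000], [1500]
trial_start_offset_samples, trial_stop_offset_samples = -200, 100

# current line (quoted above) ignores trial_stop_offset_samples:
window_size_samples = stops[0] - (onsets[0] + trial_start_offset_samples)   # 700
# including it would cover the full offset interval instead:
window_size_samples = (stops[0] + trial_stop_offset_samples
                       - (onsets[0] + trial_start_offset_samples))          # 800
```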
braindecode/datautil/windowers.py
Outdated
This function extracts windows of size window_size_samples in the interval
[trial_start_offset_samples, trial_stop_offset_samples] around each event,
with a separation of window_stride_samples between consecutive windows. If
the last window around an event does not end at trial_stop_offset_samples
and drop_last_window is set to False, an additional overlapping window that
ends at trial_stop_offset_samples is created.
in mne: tmin (s)                    trial onset          onset + duration (s)
trial:  |--------------------------------|--------------------------------|
Is the doc here maybe wrong/inconsistent?
Below it says:
trial_stop_offset_samples - Stop offset from original trial stop, in samples.
This is also what happens in the code in _compute_window_inds
(braindecode/braindecode/datautil/windowers.py, lines 290 to 294 in fac6cea):
    starts = np.array([starts]) if isinstance(starts, int) else starts
    stops = np.array([stops]) if isinstance(stops, int) else stops
    starts += start_offset
    stops += stop_offset
To me, "this function extracts windows of size window_size_samples in the interval" reads quite different.
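A small numeric illustration of the semantics in the quoted _compute_window_inds code (made-up numbers): the offsets shift the original trial starts and stops, rather than defining an interval around the event onset alone.

```python
import numpy as np

# Made-up numbers: offsets are added to trial starts/stops, as in the code
# quoted above, so the windowing interval runs from
# onset + start_offset to stop + stop_offset.
starts = np.array([1000])   # trial onset (samples)
stops = np.array([1500])    # trial stop (samples)
start_offset, stop_offset = -200, 100

starts += start_offset      # array([ 800])
stops += stop_offset        # array([1600])
# Windows of window_size_samples are then drawn from [800, 1600], which is
# not the same as "[trial_start_offset_samples, trial_stop_offset_samples]
# around each event" as the docstring currently says.
```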
Good point, I'll fix the docstring.
-    stop = ds.raw.n_times if stop_offset_samples == 0 else stop_offset_samples
-    stop = stop - window_size_samples
+    stop = ds.raw.n_times \
+        if stop_offset_samples is None else stop_offset_samples
+    stop = stop - window_size_samples + ds.raw.first_samp
Yeah great, not even sure what the logic behind the previous decision was, it seems quite strange. I actually wonder if we had intended stop_offset_samples to refer to the offset from the end of the recording, as it is in the from-events windower relative to the end of the trial?
Aah, you have a good point, that's probably what was going on. I think the new approach might be more useful: let's say you want to extract windows from 10 minutes of a recording - you can provide the onset and offset as 2 and 12 min, which is more natural than providing 2 and computing length of recording - (2 + 10). What do you think?
I agree that extracting a particular length from the start of the recording, when you don't have events, might be what you want in many cases, and I would also keep it like this for this PR. The only question long-term is whether we want both options (stop given from the beginning of the recording and from the end of the recording). One could also do it in a dirty way: if negative, assume it is from the end of the recording; if positive, assume it is from the start... but as said, let's keep it this way (from the start of the recording) for this PR.
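For later reference, the "dirty way" mentioned here could look something like the sketch below; it is not implemented in this PR, and `resolve_stop` is a hypothetical helper name.

```python
# Hypothetical helper sketching the idea above: a negative stop offset is
# interpreted relative to the end of the recording, a positive one relative
# to its start, and None means the full recording.
def resolve_stop(stop_offset_samples, n_times):
    if stop_offset_samples is None:
        return n_times
    if stop_offset_samples < 0:
        return n_times + stop_offset_samples   # counted from the end
    return stop_offset_samples                 # counted from the start
```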
Did you see the comments @hubertjb?
@robintibor yes, sorry for the delay! I'm getting back to it today or tomorrow.
- adding test case for out-of-recording windowing
So this seems ready to merge from what I see @hubertjb?
Almost! I just need to test with the latest MNE as you said - we might indeed need to ask for the latest version until the next version is out.
@robintibor the latest commit takes the fix in MNE into account. I guess we should update the requirements to ask for the latest MNE master (or should we wait for an MNE release)? @agramfort
(The Travis failure is because of the MNE version.)
When will this be in pip-installable MNE @agramfort?
@robintibor @hubertjb you can change the CI here to use: pip install -U https://github.com/mne-tools/mne-python/archive/master.zip - the new MNE version should be released on Sept 15.
Ah great, then we can also make a new braindecode version on pip quickly after. And for now just put it as Alex said; if you don't know how to do it, I can also have a try at figuring it out @hubertjb.
Tests are passing with the latest MNE master on Travis! Should we wait for the upcoming MNE release before merging?
Well, what is the ETA now for the MNE release? Last word was "new MNE version should be released on sept 15" @agramfort? Should be soon then, I guess?
It's imminent. Next Monday at the latest.
So do you want to try going back to MNE from pip since the new version is there? @hubertjb Then I can merge.
That'd be great @robintibor, thanks! I just reverted the change to the Travis config file; if tests pass we should be good to go.
Hm, so it seems Travis fails with the old error, any reason for that @hubertjb @agramfort? Also CircleCI seems quite a bit slower than all previous builds (12 min vs 5-8 min when looking at builds from this very same PR: https://app.circleci.com/pipelines/github/braindecode/braindecode?branch=pull%2F152) :/ Don't know why that might be.
Weird... could it be a cache thing? https://docs.travis-ci.com/user/caching/#configuration
It seems the reason is that mne is still at 0.20.8 at https://anaconda.org/conda-forge/mne. When will this change @agramfort?
I don't maintain mne in conda. Please use pip.
Ok, can you move mne to pip in https://github.com/braindecode/braindecode/blob/master/environment.yml @hubertjb? I guess that should work then?
@robintibor it worked! :)
Great work @hubertjb! Merged!
This PR introduces some minor improvements to the windowers:
- handling the case where `first_samp` is not 0 (i.e., the Raw object has been cropped)
- adding `epochs_kwargs` to windowing functions
- modifying the default value of `stop_offset_samples` to None for consistency
- adding a random state argument to `create_mne_dummy_raw`