[MRG] Sleep tutorial #5718

Slasnista · 2018-11-16T12:42:39Z

This PR adds a tutorial to illustrate how to analyze sleep data. (see #5684)

The purpose is to show how one can:

download the data
extract the raw signals and the annotations
extract some features from the raw signals
perform sleep stage classification with a random forest
quantify the performances of the classifier

Todo

Make the data available
- add fetcher
tutorial

Edited by @massich

codecov · 2018-11-16T13:15:02Z

Codecov Report

❗ No coverage uploaded for pull request base (master@e0df416).
The diff coverage is 97.51%.

@@            Coverage Diff            @@
##             master    #5718   +/-   ##
=========================================
  Coverage          ?   88.63%           
=========================================
  Files             ?      373           
  Lines             ?    69321           
  Branches          ?    11665           
=========================================
  Hits              ?    61440           
  Misses            ?     5030           
  Partials          ?     2851

larsoner · 2018-11-16T15:11:11Z

Choose dataset name. sleep is a good choice here.
Create a folder following our naming conventions. Since these are data we are providing, we can call it MNE-sleep-data.
The structure is generally MNE-sleep-data/subject-name-or-id/raw_data.fmt so maybe MNE-sleep-data/anonymous/whatever.edf
Add a version text file MNE-sleep-data/version.txt with 0.1 though we don't really use it yet
tar czfv MNE-sleep-data.tar.gz MNE-sleep-data
Upload to OSF
Look at the "revisions" for the file, it will list the MD5sum
Copy the datasets/sample directory over to datasets/sleep
For other changes, take a look here https://github.com/mne-tools/mne-python/pull/5525/files
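The packaging steps above (version file, tarball, MD5 sum) can also be sketched in Python. This is only an illustration of the listed steps, not part of the PR: the helper names `package_dataset` and `md5sum` are hypothetical, and the upload to OSF is left out.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def package_dataset(root, name="MNE-sleep-data", version="0.1"):
    """Write <name>/version.txt and archive the folder as <name>.tar.gz."""
    folder = Path(root) / name
    folder.mkdir(parents=True, exist_ok=True)
    (folder / "version.txt").write_text(version + "\n")
    archive = Path(root) / f"{name}.tar.gz"
    # Equivalent in spirit to `tar czf MNE-sleep-data.tar.gz MNE-sleep-data`.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(folder, arcname=name)
    return archive


def md5sum(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fid:
        for block in iter(lambda: fid.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = package_dataset(tmp)
        print(archive.name, md5sum(archive))
```

The digest printed here can be compared with the MD5 sum listed under "revisions" once the file is uploaded.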

agramfort · 2018-12-08T21:48:25Z

here is what the example returns:

saturday hacking session...

mne/utils.py

tutorials/plot_sleep.py

agramfort · 2018-12-09T16:41:36Z

@Slasnista @massich I just spent 1 hour to write a fetcher and cleanup the example.

I think I did what needs a deep understanding of mne internals. You should be able to finish this PR without me.

thanks for your efforts

Slasnista · 2018-12-12T13:20:46Z

Hi,

I have just added a few lines of code to pre-process the annotations. This way:

the annotations are given on 30s samples of signals which corresponds to the traditional way of annotating sleep stages
"Sleep stage 3" and "Sleep stage 4" are merged into a single sleep stage to have annotations closer to the AASM rules currently used.

Should such steps be implemented in the read_annotation function or should we let the user choose to perform them ?

massich · 2018-12-12T14:18:48Z

tutorials/plot_sleep.py

+annotations = mne.read_annotations(hyp_fname)
+
+##############################################################################
+# preprocessing annotations


I don't understand all this pre-processing. What is its purpose?

if we only want to merge Sleep stage 3 and Sleep stage 4 we should be able to do so with a function.

And I don't understand the resampling problem. Is it only a problem of this example or an implementation error that we need to test and fix in mne?

The problem comes from the format of annotations. Sleep stages are traditionally annotated over 30s of PSG signal (by both experts and algorithms). For this reason, it could be useful to output annotations already resampled and associated to 30s of signals.

Regarding the merging of sleep stage 3 and 4, nowadays people generally work with 5 sleep stages instead of 6 and merge potential samples of label 'sleep stage 4' with samples of label 'sleep stage 3'.
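A minimal sketch of the stage merging described above, assuming the annotation descriptions are plain strings. Only "Sleep stage 3" and "Sleep stage 4" are named in this thread; the other labels and the integer codes are assumptions made for illustration.

```python
# Hypothetical mapping from annotation descriptions to integer codes.
# Stages 3 and 4 share one code, which merges them into a single class.
EVENT_ID = {
    "Sleep stage W": 1,  # assumed label
    "Sleep stage 1": 2,  # assumed label
    "Sleep stage 2": 3,  # assumed label
    "Sleep stage 3": 4,
    "Sleep stage 4": 4,
    "Sleep stage R": 5,  # assumed label
}


def descriptions_to_codes(descriptions, event_id=EVENT_ID):
    """Map descriptions to codes, skipping descriptions not in event_id."""
    return [event_id[d] for d in descriptions if d in event_id]
```

With this mapping, `descriptions_to_codes(["Sleep stage 3", "Sleep stage 4"])` yields the same code twice, so the classifier sees 5 classes instead of 6.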

A way to overcome the resampling problem might be to have a resampling parameter inside the read_annotations function that allows the user to get already resampled annotations. What do you think ?

I don't really like it. read_annotations should only read. If this type of analysis is always needed then we should do some preprocessing or something.

This example also makes me question if we want to always read the annotations of a file. Maybe this is better:

psg_annotations = read_annotations(edf_file).to_psg()
raw = read_raw_edf(edf_file, annotations=psg_annotations)

or

psg_annotations = read_annotations(edf_file).to_psg()
raw = read_raw_edf(edf_file)
raw.set_annotations(psg_annotations)

instead of

raw = read_raw_edf(edf_file)
raw.annotations.to_psg(inplace=True)

I don't really like it. read_annotations should only read.

Agreed

This example also makes me question if we want to always read the annotations of a file.

I think we should. We should avoid adding options / kwargs to read_raw_* to modify how things are read / represented. The job of the read_raw_* functions should ideally be to transparently read all data from disk in a given file, in as close to the storage format as possible. Then we can have other functions to modify Raw instances as necessary for use cases. (For example, the montage argument of read_raw_* should really not be there, either -- we should do raw.set_montage(...) instead.)

Of your three code snippets, this one looks cleanest to me:

raw = read_raw_edf(edf_file)
raw.annotations.to_psg(inplace=True)

But I think the following is better, since it's more explicit and avoids an otherwise redundant inplace kwarg:

raw = read_raw_edf(edf_file)
raw.set_annotations(raw.annotations.to_psg())

I am -1 on the to_psg method. This adds some Sleep specific code to an Annotations object that should be agnostic.

I'd rather prefer a

df = annotations.to_dataframe()

and then you use df.resample or some other pandas code to do what you want.
Then you can recreate the annotations.
If it's a mess, maybe some simple numpy code can do the job?

agramfort · 2018-12-12T21:09:29Z

tutorials/plot_sleep.py

+annot = annot.resample('30s').ffill()
+annot.reset_index(inplace=True)
+annot.onset = annot.onset.dt.total_seconds()
+annot["duration"] = 30.


I see it now. That's a bit long but it's exactly what I had in mind. Maybe just make it a function so we know where
pandas kungfu ends.

massich · 2018-12-12T21:24:29Z

I would go with something like

```
from mne.annotations import enforce_psg_annotation_protocol
raw = read_raw_edf(fname)
raw.set_annotations(enforce_psg_annotation_protocol(raw.annotations))
```

But I do like

```
enforce_psg_annotation_protocol(raw.annotations, inplace=True)
```

…

On Wed, Dec 12, 2018, 22:09 Alexandre Gramfort wrote:

In tutorials/plot_sleep.py:

+annot = pd.DataFrame()
+annot["onset"] = annotations.onset
+annot["description"] = annotations.description
+annot["duration"] = annotations.duration
+
+# add temporarily a last event to have a correct resampling
+last_onset = annot.onset.values[-1]
+last_duration = annot.duration.values[-1]
+annot.loc[annot.shape[0]] = [last_onset + last_duration, "end", 0]
+
+annot = annot.set_index('onset')
+annot.index = pd.to_timedelta(annot.index, unit='s')
+annot = annot.resample('30s').ffill()
+annot.reset_index(inplace=True)
+annot.onset = annot.onset.dt.total_seconds()
+annot["duration"] = 30.

I see it now. That's a bit long but it's exactly what I had in mind. Maybe just make it a function so we know where pandas kungfu ends.

agramfort · 2018-12-13T06:21:41Z

this would work for me. not sure about the word "enforce" as we tend to use make_ or setup_ but it's a nitpick

massich · 2018-12-13T10:33:56Z

After giving it some thought. I don't think it's a good idea to modify the data representation
(aka, the raw.annotations) so that a process/consumer (aka, events_from_annotations) can produce
the expected data (aka, event onsets). A natural solution is to create another process/consumer
that transforms the data, and prepend it to the original process (pre-process the data).
The problem here is that this pre-process does not change the data into another form nor does
a clean up. This pre-process is a hack to not touch the process/consumer.

Therefore events_from_annotations should do this task. Then the question becomes: what's
the difference between the new and existing behavior, and how do we trigger one or the other?
The current behavior only returns the onsets of the annotations. So we could add a parameter
that returns as many onsets as fit (with a separation time) during the annotation duration.

agramfort · 2018-12-13T10:48:20Z

yes agreed. If you add a chunk_duration param to events_from_annotations then you can pass the annotations untouched and get the valid results. It also avoids the pandas kung fu.
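The idea behind a chunk_duration parameter can be sketched in plain Python: each annotation yields as many equally spaced onsets as fit within its duration. The function name `chunk_onsets` is hypothetical, and this is not MNE's implementation; edge cases such as partial chunks or rounding to samples may be handled differently there.

```python
def chunk_onsets(onsets, durations, chunk_duration=30.0):
    """Return (onset, annotation_index) pairs, one per full chunk.

    An annotation lasting 90 s with chunk_duration=30 s yields onsets at
    +0 s, +30 s and +60 s. Trailing partial chunks are dropped here.
    """
    events = []
    for idx, (onset, duration) in enumerate(zip(onsets, durations)):
        n_chunks = int(duration // chunk_duration)
        events.extend(
            (onset + k * chunk_duration, idx) for k in range(n_chunks)
        )
    return events
```

For example, two annotations starting at 0 s and 90 s with durations 90 s and 45 s give four events: three from the first annotation and one from the second.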

we can do this in this PR or in the next PR ie merge the pandas kung fu for now as the main objective here is to reach a basic sleep scoring task using scikit-learn.

mmagnuski · 2018-12-13T16:09:09Z

merge the pandas kung fu for now

you probably meant this:

agramfort · 2018-12-20T09:56:12Z

@Slasnista @massich I heavily simplified the code thanks to the chunk_duration parameter from events_from_annotations. No more pandas kung fu.

tutorials/plot_sleep.py

massich · 2018-12-20T10:59:03Z

mne/annotations.py

-        kinds = ['%s (%s)' % (kind, sum(d.lower().startswith(kind)
-                                        for d in self.description))
-                 for kind in kinds]
+        counter = collections.Counter(self.description)


mmagnuski · 2018-12-20T19:15:57Z

No more pandas kung fu

Slasnista · 2018-12-21T09:40:58Z

yes I have already done it.

Shall I extract the features from another record to evaluate the model properly ?

agramfort · 2018-12-21T09:47:38Z

yes

…

agramfort · 2018-12-21T14:31:10Z

tutorials/plot_sleep.py

+    raw_train, events_train, event_id_train,
+    tmin=0., tmax=tmax, baseline=None)
+
+tmax = 30. - 1. / raw_test.info['sfreq']  # tmax in included


no need to duplicate this
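The tmax expression in the quoted diff subtracts one sample period because the end point is included in the epoch. A small arithmetic sketch, assuming a sampling frequency of 100 Hz (an assumption for illustration; `n_samples_inclusive` is a hypothetical helper, not an MNE function):

```python
def n_samples_inclusive(tmin, tmax, sfreq):
    """Number of samples in [tmin, tmax] when both end points are included."""
    return int(round((tmax - tmin) * sfreq)) + 1


sfreq = 100.0  # assumed sampling frequency in Hz
tmax = 30.0 - 1.0 / sfreq  # last sample of a 30 s window

n_with_correction = n_samples_inclusive(0.0, tmax, sfreq)
n_without_correction = n_samples_inclusive(0.0, 30.0, sfreq)
```

With the correction, a 30 s epoch holds exactly 30 * sfreq samples; without it, there is one extra sample that belongs to the next epoch.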

agramfort · 2018-12-21T14:32:59Z

tutorials/plot_sleep.py

+prop_cycle = plt.rcParams['axes.prop_cycle']
+colors = prop_cycle.by_key()['color']
+[line.set_color(color) for line, color in zip(ax.get_lines(), colors)]
+plt.legend(list(epochs_train.event_id.keys()))


start by showing PSDs as it explains why power features can be good predictors. Then you can go to the code for classification. I would actually load a second subject here and not before so the beginning is easier to follow.

agramfort · 2018-12-22T13:22:47Z

FunctionTransformer was a great idea.

I would still need to do power ratios and maybe adjust welch parameters.

@Slasnista what do you think?
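One possible reading of the power ratios mentioned above is relative band power: the power in each frequency band divided by the total over all bands. The sketch below works on an already computed PSD given as plain lists. The band edges and function names are assumptions for illustration and are not taken from the tutorial.

```python
# Band edges in Hz; these values are assumed for illustration only.
BANDS = {
    "delta": (0.5, 4.5),
    "theta": (4.5, 8.5),
    "alpha": (8.5, 11.5),
    "sigma": (11.5, 15.5),
    "beta": (15.5, 30.0),
}


def band_powers(freqs, psd, bands=BANDS):
    """Mean PSD value per band; 0.0 if no frequency bin falls in a band."""
    powers = {}
    for name, (fmin, fmax) in bands.items():
        values = [p for f, p in zip(freqs, psd) if fmin <= f < fmax]
        powers[name] = sum(values) / len(values) if values else 0.0
    return powers


def relative_powers(powers):
    """Divide each band power by the total so the values sum to 1."""
    total = sum(powers.values())
    return {
        name: (p / total if total > 0 else 0.0)
        for name, p in powers.items()
    }
```

Such a function could be wrapped in a transformer and placed in front of the classifier, so that the features are less sensitive to the overall signal amplitude.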

massich · 2019-01-14T10:30:27Z

I changed the tests to use pytest tmpdir, which removes a couple of errors and gives a slightly better setup time

~/code/mne-python remotes/slasnista/sleep_tutorial*
(mne) ❯ pytest mne/datasets/sleep_physionet/tests -vv
Test session starts (platform: linux, Python 3.6.6, pytest 4.0.0, pytest-sugar 0.9.2)
cachedir: .pytest_cache
rootdir: /home/sik/code/mne-python, inifile: setup.cfg
plugins: sugar-0.9.2, pudb-0.7.0, faulthandler-1.5.0, cov-2.6.0
collecting ... 
 mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_age_records ✓                                          25% ██▌       
 mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_age ✓                                             50% █████     
 mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_temazepam_records ✓                                    75% ███████▌  
 mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_temazepam ✓                                      100% ██████████
======================================================== slowest 20 test durations =========================================================
11.80s call     mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_age
3.18s call     mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_temazepam
1.90s call     mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_age_records
1.89s setup    mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_age_records
1.85s call     mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_temazepam_records
0.00s setup    mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_age
0.00s teardown mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_temazepam
0.00s setup    mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_temazepam_records
0.00s teardown mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_age_records
0.00s setup    mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_temazepam
0.00s teardown mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_age
0.00s teardown mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_temazepam_records

Results (21.41s):
       4 passed
Exception ignored in: <bound method _TempDir.__del__ of '/tmp/tmp_mne_tempdir_190hcdox'>
Traceback (most recent call last):
  File "/home/sik/code/mne-python/mne/utils.py", line 548, in __del__
  File "/home/sik/miniconda3/envs/mne/lib/python3.6/shutil.py", line 494, in rmtree
TypeError: 'NoneType' object is not callable


~/code/mne-python sleep_tutorial* ⇡ 23s
(mne) ❯ pytest mne/datasets/sleep_physionet/tests -vv
Test session starts (platform: linux, Python 3.6.6, pytest 4.0.0, pytest-sugar 0.9.2)
cachedir: .pytest_cache
rootdir: /home/sik/code/mne-python, inifile: setup.cfg
plugins: sugar-0.9.2, pudb-0.7.0, faulthandler-1.5.0, cov-2.6.0
collecting ... 
 mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_age_records ✓                                          25% ██▌       
 mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_age ✓                                             50% █████     
 mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_temazepam_records ✓                                    75% ███████▌  
 mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_temazepam ✓                                      100% ██████████
======================================================== slowest 20 test durations =========================================================
13.42s call     mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_age
3.28s call     mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_temazepam
1.92s call     mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_temazepam_records
1.91s call     mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_age_records
1.39s setup    mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_age_records
0.00s teardown mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_temazepam
0.00s setup    mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_temazepam_records
0.00s setup    mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_age
0.00s teardown mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_age
0.00s teardown mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_age_records
0.00s setup    mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_temazepam
0.00s teardown mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_temazepam_records

Results (22.48s):
       4 passed

massich · 2019-01-14T13:52:42Z

~/code/mne-python sleep_tutorial* 6s
(mne) ❯ pytest mne/datasets/sleep_physionet/tests -vv                        
Test session starts (platform: linux, Python 3.6.6, pytest 4.0.0, pytest-sugar 0.9.2)
cachedir: .pytest_cache
rootdir: /home/sik/code/mne-python, inifile: setup.cfg
plugins: sugar-0.9.2, pudb-0.7.0, mock-1.10.0, faulthandler-1.5.0, cov-2.6.0
collecting ... 
 mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_age_records ✓                                          25% ██▌       
 mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_age ✓                                             50% █████     
 mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_temazepam_records ✓                                    75% ███████▌  
 mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_temazepam ✓                                      100% ██████████
======================================================== slowest 20 test durations =========================================================
1.92s call     mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_age_records
1.90s call     mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_temazepam_records
1.35s setup    mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_age_records
0.01s call     mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_age
0.00s call     mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_temazepam
0.00s setup    mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_age
0.00s setup    mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_temazepam_records
0.00s setup    mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_temazepam
0.00s teardown mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_temazepam
0.00s teardown mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_age_records
0.00s teardown mne/datasets/sleep_physionet/tests/test_physionet.py::test_sleep_physionet_age
0.00s teardown mne/datasets/sleep_physionet/tests/test_physionet.py::test_run_update_temazepam_records

Results (5.70s):
       4 passed

~/code/mne-python sleep_tutorial* 7s

agramfort · 2019-01-14T14:13:03Z

mne/datasets/sleep_physionet/tests/test_physionet.py

    """Test Sleep Physionet URL handling."""
+    mm = mocker.patch('mne.datasets.sleep_physionet._utils._fetch_file',


I don't get how this can work. mocker is not imported. Your mocker function does not write any file to disk

what do you mean that the mocker is not imported? pytest-mock exposes this fixture called mocker which takes care of setting it up and tearing down.

And I'm aware that I'm writing nothing. But _fetch_one delegates all the writing to the original _fetch_file, and I'm bypassing that, so _fetch_one works just as expected even though _fake_fetch_file does not write a thing.

agramfort · 2019-01-14T16:11:44Z

mne/datasets/sleep_physionet/temazepam.py

+from ._utils import _fetch_one, _data_path, BASE_URL, TEMAZEPAM_SLEEP_RECORDS
+from ._utils import _check_subjects
+
+SLEEP_RECORDS = 'physionet_sleep_records.npy'


this should disappear

+1 I thought I had removed them all. My bad.

agramfort · 2019-01-14T16:12:27Z

mne/datasets/sleep_physionet/temazepam.py

+        The subjects to use. Can be in the range of 0-21 (inclusive).
+    drug : bool
+        If True it's the data with the Temazepam and if False it's
+        the placebo.


left over from old code of mine

massich · 2019-01-14T18:21:57Z

If everything is ok, everything should be green except 3.7, which will be green in #5834

larsoner · 2019-01-14T19:17:56Z

Real flake error

https://travis-ci.org/mne-tools/mne-python/jobs/479456801#L3071

…el deps in readme

larsoner

At last benchmark one was over 10 sec, thus the decorator. But maybe they are faster now

massich · 2019-01-14T21:15:55Z

At last benchmark one was over 10 sec, thus the decorator. But maybe they are faster now

Not the run_update_xx these were fast already. Since they only download small .xls .csv with records of the subjects and hashes of the recordings.

agramfort · 2019-01-14T21:19:05Z

Yes all tests are really fast now

massich · 2019-01-14T21:58:17Z

This is weird: in the second env in travis there was a test that failed using _fetch_file. I restarted it. I guess it was just a network issue, because pytest-mock should clean up automatically and there should be no side effects.

massich · 2019-01-14T23:14:05Z

green !!

massich · 2019-01-14T23:15:11Z

Thx a lot to everyone !!!

massich force-pushed the sleep_tutorial branch from 9925949 to b9abf7c Compare November 16, 2018 13:14

massich force-pushed the sleep_tutorial branch from 45e9276 to e5247af Compare November 16, 2018 14:19

agramfort mentioned this pull request Nov 26, 2018

Splitting file after concatenation #5487

Closed

agramfort force-pushed the sleep_tutorial branch from 9143615 to c17d095 Compare December 9, 2018 16:37

agramfort reviewed Dec 9, 2018

mne/utils.py

agramfort reviewed Dec 9, 2018

tutorials/plot_sleep.py

massich reviewed Dec 12, 2018

agramfort reviewed Dec 12, 2018

massich mentioned this pull request Dec 17, 2018

[MRG] Get multiple spaced events from single annotation #5795

Merged

agramfort force-pushed the sleep_tutorial branch from b2b5e8e to 0cb9a87 Compare December 20, 2018 09:04

massich reviewed Dec 20, 2018

tutorials/plot_sleep.py

massich reviewed Dec 20, 2018

massich force-pushed the sleep_tutorial branch from 07e4a4c to 1f7e771 Compare December 20, 2018 16:31

agramfort reviewed Dec 21, 2018

mock _fetch_file to avoid downloading and speed up the testing

871d2de

agramfort reviewed Jan 14, 2019

massich added 5 commits January 14, 2019 15:56

wip

ee6b3f8

add mock packages to environment.yml

cacecdb

ENH: refactor mocked object call inspection

0ed6f79

TST: no longer requries good network

e3e38e3

add mock dependencies in travis

7e1655e

agramfort reviewed Jan 14, 2019

massich added 3 commits January 14, 2019 17:42

old stuff we forgot

8bddae2

fix docstring

129b903

update requriements for 3.7 travis

cc77141

agramfort added 3 commits January 14, 2019 21:44

just pytest-mock should be explicitely listed and don't list test lev…

ef08992

…el deps in readme

missing

f70e639

these tests are not slow. Just need good network

f934da3

larsoner reviewed Jan 14, 2019

massich merged commit b2de2e9 into mne-tools:master Jan 14, 2019

agramfort mentioned this pull request Jan 15, 2019

MNE usage with sleep data #5684

Closed

This was referenced Feb 13, 2019

Duplicate sleep-edf database MIT-LCP/physionet#97

Closed

[HOTFIX] Update sleep physionet to the complete dataset #5932

Merged

Extracting EDF+ annotations #4494

Closed