# Sleep EEG cleaning

## Introductory notes:
This notebook walks through the cleaning steps provided by `CleaningPipe`:
* Resampling
* Bandpass and notch filtering
* Selecting bad channels
* Interpolating bad channels 
* Annotating bad data spans

Recommended readings:
1. [MNE: The Raw data structure](https://mne.tools/stable/auto_tutorials/raw/10_raw_overview.html)
2. [Learning EEG: Artifacts](https://www.learningeeg.com/artifacts)
3. [MNE: Overview of artifact detection](https://mne.tools/stable/auto_tutorials/preprocessing/10_preprocessing_overview.html)
4. [MNE: Filtering and resampling data](https://mne.tools/stable/auto_tutorials/preprocessing/30_filtering_resampling.html) 
5. [MNE: Handling bad channels](https://mne.tools/stable/auto_tutorials/preprocessing/15_handling_bad_channels.html)
6. [MNE: Annotating continuous data](https://mne.tools/stable/auto_tutorials/raw/30_annotate_raw.html)

## Import data

### Import module
Add pipeline module to path and import its elements (just run this cell).

In [2]:
from sleepeeg.pipeline import CleaningPipe

### Initialize CleaningPipe object

`path_to_eeg` - path to the EEG recording; any file type supported by MNE's [read_raw](https://mne.tools/stable/generated/mne.io.read_raw.html) function will work.

`output_dir` - the directory in which the results will be saved.

In [3]:
pipe = CleaningPipe(
    path_to_eeg=r"C:\Users\Gennadiy\Documents\data\HZ4\HZ4_SLEEP_20210629_132715.mff",
    output_dir=r"C:\Users\Gennadiy\Documents\data\HZ4\processing")

Reading EGI MFF Header from C:\Users\Gennadiy\Documents\data\HZ4\HZ4_SLEEP_20210629_132715.mff...
    Reading events ...
    Assembling measurement info ...
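
Hard-coded Windows paths like the ones above only work on one machine. The sketch below shows one way to validate the input path and prepare the output directory with `pathlib` before building the pipe. The helper `prepare_paths` is hypothetical and not part of `sleepeeg`, and whether `CleaningPipe` creates `output_dir` on its own is not stated here, so treat the `mkdir` call as a precaution.

```python
from pathlib import Path
import tempfile


def prepare_paths(path_to_eeg, output_dir):
    """Return (eeg_path, out_path), creating the output directory if needed.

    Hypothetical helper for illustration; not part of the sleepeeg package.
    """
    eeg_path = Path(path_to_eeg).expanduser()
    out_path = Path(output_dir).expanduser()
    if not eeg_path.exists():
        raise FileNotFoundError(f"EEG recording not found: {eeg_path}")
    out_path.mkdir(parents=True, exist_ok=True)
    return eeg_path, out_path


# Demonstration with a throwaway directory standing in for real data.
with tempfile.TemporaryDirectory() as tmp:
    fake_recording = Path(tmp) / "recording.mff"
    fake_recording.mkdir()  # EGI .mff recordings are typically directories
    eeg_path, out_path = prepare_paths(fake_recording, Path(tmp) / "processing")
    print(eeg_path.name, out_path.is_dir())
```

The two returned paths can then be passed as `path_to_eeg` and `output_dir`.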


## Resample

`sfreq` - desired new sampling frequency

`save` - if True, saves the resampled EEG data and metadata to .fif files in the output directory you provided.

Resampling can take a long time (an hour or more), so be patient.

In [4]:
pipe.resample(
    sfreq=250,
    save=True,
    n_jobs='cuda')  # Runs on the GPU through MNE's CUDA support (requires CuPy).

Writing C:\Users\Gennadiy\Documents\data\HZ4\processing\saved_raw\resampled_250hz_raw.fif
    Writing channel names to FIF truncated to 15 characters with remapping
Overwriting existing file.
Writing C:\Users\Gennadiy\Documents\data\HZ4\processing\saved_raw\resampled_250hz_raw-1.fif
    Writing channel names to FIF truncated to 15 characters with remapping
Closing C:\Users\Gennadiy\Documents\data\HZ4\processing\saved_raw\resampled_250hz_raw-1.fif
Closing C:\Users\Gennadiy\Documents\data\HZ4\processing\saved_raw\resampled_250hz_raw.fif
[done]
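
To illustrate what resampling does to a signal, here is a small sketch on synthetic data using `scipy.signal.resample`, which is FFT-based like MNE's default resampling method. The original sampling rate of the recording is not stated in this notebook, so the 1000 Hz used below is an assumption chosen only for the demonstration.

```python
import numpy as np
from scipy.signal import resample

orig_sfreq = 1000.0  # hypothetical original sampling rate (Hz)
new_sfreq = 250.0    # target rate used in this notebook
duration = 4.0       # seconds of synthetic signal

# Synthetic 10 Hz sine sampled at the original rate.
t = np.arange(int(orig_sfreq * duration)) / orig_sfreq
signal = np.sin(2 * np.pi * 10.0 * t)

# FFT-based resampling to the new number of samples.
n_new = int(round(len(signal) * new_sfreq / orig_sfreq))
resampled = resample(signal, n_new)

# The dominant frequency should be unchanged after resampling.
spectrum = np.abs(np.fft.rfft(resampled))
freqs = np.fft.rfftfreq(n_new, d=1.0 / new_sfreq)
peak_freq = freqs[np.argmax(spectrum)]

print(f"samples: {len(signal)} -> {n_new}")
print(f"new Nyquist frequency: {new_sfreq / 2} Hz")
print(f"peak frequency after resampling: {peak_freq} Hz")
```

After resampling to 250 Hz, only content below the new Nyquist frequency of 125 Hz can be represented, which is why the info summary later in this notebook reports a 125 Hz lowpass.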


## Filter

#### High pass

In [5]:
pipe.filter(
    l_freq=0.3,
    h_freq=None,
    picks=None,  # If None - filters all channels.
    n_jobs='cuda')

Filtering raw data in 1 contiguous segment
Setting up high-pass filter at 0.3 Hz

FIR filter parameters
---------------------
Designing a one-pass, zero-phase, non-causal highpass filter:
- Windowed time-domain design (firwin) method
- Hamming window with 0.0194 passband ripple and 53 dB stopband attenuation
- Lower passband edge: 0.30
- Lower transition bandwidth: 0.30 Hz (-6 dB cutoff frequency: 0.15 Hz)
- Filter length: 2751 samples (11.004 sec)

Using CUDA for FFT FIR filtering
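
The numbers in the filter report above can be reproduced by hand. The sketch below assumes MNE's documented "auto" rules: the lower transition bandwidth is `min(max(l_freq * 0.25, 2), l_freq)`, and for a Hamming-window `firwin` design the filter length is about 3.3 divided by the transition bandwidth (in seconds), rounded up to an odd number of samples. The helper name `auto_fir_length` is hypothetical, and the rounding inside MNE may differ in detail.

```python
def auto_fir_length(sfreq, trans_bandwidth, length_factor=3.3):
    """Odd FIR length in samples for a given transition bandwidth (Hz).

    Hypothetical helper; length_factor=3.3 is the value assumed for a
    Hamming window.
    """
    n = int(round(length_factor / trans_bandwidth * sfreq))
    if n % 2 == 0:
        n += 1  # a zero-phase FIR design needs an odd number of taps
    return n


sfreq = 250.0  # sampling rate after resampling (Hz)
l_freq = 0.3   # high-pass passband edge (Hz)

l_trans = min(max(l_freq * 0.25, 2.0), l_freq)  # lower transition bandwidth
cutoff_6db = l_freq - l_trans / 2               # -6 dB cutoff frequency
n_taps = auto_fir_length(sfreq, l_trans)

print(f"transition bandwidth: {l_trans:.2f} Hz")
print(f"-6 dB cutoff: {cutoff_6db:.2f} Hz")
print(f"filter length: {n_taps} samples ({n_taps / sfreq:.3f} sec)")
```

A lower `l_freq` narrows the transition band and therefore lengthens the filter, which is one reason very low high-pass cutoffs are slow to compute on long recordings.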


#### Notch

In [6]:
pipe.notch(
    freqs=None,  # By default will remove 50 Hz and its harmonics.
    picks='eeg',  # If None - filter all channels
    n_jobs='cuda')

Setting up band-stop filter

FIR filter parameters
---------------------
Designing a one-pass, zero-phase, non-causal bandstop filter:
- Windowed time-domain design (firwin) method
- Hamming window with 0.0194 passband ripple and 53 dB stopband attenuation
- Lower transition bandwidth: 0.50 Hz
- Upper transition bandwidth: 0.50 Hz
- Filter length: 1651 samples (6.604 sec)

Using CUDA for FFT FIR filtering
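
With `freqs=None` the notch targets 50 Hz and its harmonics. Only harmonics below the Nyquist frequency can exist in the data, so at 250 Hz that leaves two frequencies. The sketch below shows the arithmetic; how the pipeline derives its default list internally is an assumption here.

```python
sfreq = 250.0      # sampling rate after resampling (Hz)
line_freq = 50.0   # mains frequency named in the cell above (Hz)
nyquist = sfreq / 2

# Collect the line frequency and every harmonic strictly below Nyquist.
harmonics = []
f = line_freq
while f < nyquist:
    harmonics.append(f)
    f += line_freq

print(harmonics)  # [50.0, 100.0]
```

In regions with 60 Hz mains power, pass an explicit `freqs` list instead of relying on the default.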


## Select bad channels & annotate bad epochs

Create an average reference projection. You can apply and remove the projection from inside the plot. Adding the projection does not change the raw signal itself.

In [7]:
pipe.mne_raw.set_eeg_reference(
    ref_channels='average',
    projection=True)

EEG channel type selected for re-referencing
Adding average EEG reference projection.
1 projection items deactivated
Average reference projection was added, but has not been applied yet. Use the apply_proj method to apply it.


| Field | Value |
|---|---|
| Measurement date | June 29, 2021 10:27:15 GMT |
| Experimenter | Unknown |
| Digitized points | 260 points |
| Good channels | 257 EEG, 5 BIO, 1 ECG, 1 EMG |
| Bad channels | |
| EOG channels | Not available |
| ECG channels | ECG |
| Sampling frequency | 250.00 Hz |
| Highpass | 0.30 Hz |
| Lowpass | 125.00 Hz |
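
Conceptually, an average reference subtracts the mean across channels from every channel at each time point, and this can be written as a projection matrix. The NumPy sketch below uses synthetic data to show the idea; it is an illustration of the math, not MNE's internal implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_samples = 8, 1000

# Synthetic data with an offset shared by all channels.
data = rng.normal(size=(n_channels, n_samples)) + 5.0
original = data.copy()

# Average-reference projector: identity minus the channel-averaging matrix.
proj = np.eye(n_channels) - np.ones((n_channels, n_channels)) / n_channels

# Applying the projector yields a re-referenced copy; `data` is untouched.
referenced = proj @ data

print("channel mean is zero:", np.allclose(referenced.mean(axis=0), 0.0))
print("projector is idempotent:", np.allclose(proj @ proj, proj))
print("original data unchanged:", np.array_equal(data, original))
```

Keeping the reference as a projector, rather than modifying the data, is what makes it possible to toggle it on and off while inspecting the signal.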


#### Select bad channels

`save_bad_channels` - if true, saves bad channels selected in this plot session to *bad_channels.txt* file.

`scalings` - Scaling factors for the traces. If a dictionary where any value is 'auto', the scaling factor is set to match the 99.5th percentile of the respective data. If 'auto', all scalings (for all channel types) are set to 'auto'. If None, defaults to:
```
dict(mag=1e-12, grad=4e-11, eeg=20e-6, eog=150e-6, ecg=5e-4,
     emg=1e-3, ref_meg=1e-12, misc=1e-3, stim=1,
     resp=1, chpi=1e-4, whitened=1e2)
```

`use_opengl` - Whether to use OpenGL when rendering the plot (requires [pyopengl](https://pyopengl.sourceforge.net/documentation/installation.html)).

Also, try the `butterfly` mode by pressing 'b' while in the plot; it can reveal additional bad channels that are hard to see in the normal view.

In [22]:
pipe.plot(
    save_bad_channels=True,
    save_annotations=False,
    butterfly=False,
    scalings="auto",
    use_opengl=False,
)

Channels marked as bad:
['E3', 'E18', 'E37', 'E31', 'E32', 'E82', 'E25', 'E2', 'E10', 'E11', 'E12', 'E91', 'E216', 'E209', 'E228', 'E118', 'E69', 'E250', 'E246', 'E248', 'E67', 'E255', 'E208', 'VREF', 'E244', 'E9', 'E245', 'E240', 'E137']
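
To make the `'auto'` option for `scalings` more concrete, the sketch below computes a percentile-based scale on a synthetic trace and compares it with the absolute peak. The two-sided 0.5/99.5 percentile computation is an assumption about how such a scale can be derived; the exact computation inside MNE may differ.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic EEG-like trace in volts with a single large artifact spike.
eeg = rng.normal(scale=20e-6, size=100_000)
eeg[500] = 5e-3

# Percentile-based scale: ignores the most extreme 0.5% on each side.
low, high = np.percentile(eeg, [0.5, 99.5])
robust_scale = max(abs(low), abs(high))

# Peak-based scale: dominated by the single spike.
peak_scale = np.max(np.abs(eeg))

print(f"percentile-based scale: {robust_scale:.2e} V")
print(f"peak-based scale:       {peak_scale:.2e} V")
```

A percentile-based scale keeps the traces readable even when a few samples contain very large artifacts, which is useful when scanning for bad channels.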


If you want to continue with previously saved bad channels, use `pipe.read_bad_channels()`. The function imports the channels from the *bad_channels.txt* file.

In [9]:
pipe.read_bad_channels(
    # Path to a txt file with one bad channel name per row.
    # If None, defaults to '{output_dir}/bad_channels.txt'.
    path=None
    )
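
The file format described in the comment above (one channel name per row) is simple enough to read and write by hand, for example to edit the list outside the plot. The sketch below is a minimal illustration; the helper names `write_bad_channel_names` and `load_bad_channel_names` are hypothetical and are not the pipeline's own functions.

```python
from pathlib import Path
import tempfile


def write_bad_channel_names(path, channels):
    """Write one channel name per line (hypothetical helper)."""
    Path(path).write_text("\n".join(channels) + "\n", encoding="utf-8")


def load_bad_channel_names(path):
    """Read channel names, skipping blank lines and repeated names."""
    names = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        name = line.strip()
        if name and name not in names:
            names.append(name)
    return names


# Round trip through a temporary file standing in for bad_channels.txt.
with tempfile.TemporaryDirectory() as tmp:
    txt = Path(tmp) / "bad_channels.txt"
    write_bad_channel_names(txt, ["E3", "E18", "VREF"])
    bads = load_bad_channel_names(txt)

print(bads)  # ['E3', 'E18', 'VREF']
```

In plain MNE, a list like this can be assigned to `raw.info["bads"]` to mark the channels as bad.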

#### Interpolate bad channels

Interpolate bad channels using [spherical spline interpolation](https://mne.tools/stable/overview/implementation.html#bad-channel-repair-via-interpolation).

`reset_bads` - if True, clears the list of bad channels in the metadata after interpolation. This is safe here because the bad channels were already saved to a txt file.

In [10]:
pipe.mne_raw.interpolate_bads(reset_bads=True)

Interpolating bad channels
    Automatic origin fit: head of radius 96.5 mm
Computing interpolation matrix from 237 sensor positions
Interpolating 20 sensors


| Field | Value |
|---|---|
| Measurement date | June 29, 2021 10:27:15 GMT |
| Experimenter | Unknown |
| Digitized points | 260 points |
| Good channels | 257 EEG, 5 BIO, 1 ECG, 1 EMG |
| Bad channels | |
| EOG channels | Not available |
| ECG channels | ECG |
| Sampling frequency | 250.00 Hz |
| Highpass | 0.30 Hz |
| Lowpass | 125.00 Hz |


#### Select bad epochs

`butterfly` - if True, the plot starts in butterfly mode; press 'b' to switch to the normal view.

`save_annotations` - if True, saves the annotations created for bad epochs to the *annotations.txt* file.

`scalings` - Scaling factors for the traces. If a dictionary where any value is 'auto', the scaling factor is set to match the 99.5th percentile of the respective data. If 'auto', all scalings (for all channel types) are set to 'auto'. If None, defaults to:
```
dict(mag=1e-12, grad=4e-11, eeg=20e-6, eog=150e-6, ecg=5e-4,
     emg=1e-3, ref_meg=1e-12, misc=1e-3, stim=1,
     resp=1, chpi=1e-4, whitened=1e2)
```

`use_opengl` - Whether to use OpenGL when rendering the plot (requires [pyopengl](https://pyopengl.sourceforge.net/documentation/installation.html)).

In [24]:
pipe.plot(
    save_bad_channels=False,
    save_annotations=True,
    butterfly=True,
    scalings="auto",
    use_opengl=False,
)

Channels marked as bad:
['E3', 'E18', 'E37', 'E31', 'E32', 'E82', 'E25', 'E2', 'E10', 'E11', 'E12', 'E91', 'E216', 'E209', 'E228', 'E118', 'E69', 'E250', 'E246', 'E248', 'E67', 'E255', 'E208', 'VREF', 'E244', 'E9', 'E245', 'E240', 'E137']
Overwriting existing file.


If you want to continue with previously saved annotations, use `pipe.read_annotations()`. The function imports the annotations from the *annotations.txt* file.

In [11]:
pipe.read_annotations(
    # Path to a txt file with MNE-style annotations.
    # If None, defaults to '{output_dir}/annotations.txt'.
    path=None
)

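
Once bad spans are annotated, it is often useful to know how much of the recording they cover. Annotations are stored as onset and duration pairs, and spans drawn in the plot can overlap, so they should be merged before summing. The sketch below shows one way to do this in plain Python; the span values are hypothetical, and `merge_spans` is not part of the pipeline.

```python
def merge_spans(spans):
    """Merge overlapping (onset, duration) spans into (start, end) pairs."""
    intervals = sorted((onset, onset + duration) for onset, duration in spans)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical bad spans in seconds: (onset, duration).
bad_spans = [(10.0, 5.0), (12.0, 6.0), (100.0, 2.5)]

merged = merge_spans(bad_spans)
total_bad = sum(end - start for start, end in merged)

print(merged)      # [(10.0, 18.0), (100.0, 102.5)]
print(total_bad)   # 10.5
```

Dividing `total_bad` by the recording length in seconds gives the fraction of data excluded by the annotations.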


## Save the cleaned and annotated signal to a file

In [12]:
pipe.save_raw('cleaned_raw.fif')

Writing C:\Users\Gennadiy\Documents\data\HZ4\processing\saved_raw\cleaned_raw.fif
    Writing channel names to FIF truncated to 15 characters with remapping
Overwriting existing file.
Writing C:\Users\Gennadiy\Documents\data\HZ4\processing\saved_raw\cleaned_raw-1.fif
    Writing channel names to FIF truncated to 15 characters with remapping
Closing C:\Users\Gennadiy\Documents\data\HZ4\processing\saved_raw\cleaned_raw-1.fif
Closing C:\Users\Gennadiy\Documents\data\HZ4\processing\saved_raw\cleaned_raw.fif
[done]
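
The log above shows a second file with a `-1` suffix because large recordings are split across several FIF files. The back-of-the-envelope sketch below estimates the number of parts, assuming that `save_raw` relies on MNE's defaults (single-precision samples and a split size of about 2 GB), which this notebook does not confirm. The 3-hour duration is a hypothetical value, and header overhead is ignored.

```python
import math


def estimate_fif_parts(n_channels, sfreq, duration_s,
                       bytes_per_sample=4, split_size=2 * 1024**3):
    """Estimate raw data size in bytes and the number of split files.

    Hypothetical helper: assumes 4-byte samples and a ~2 GB split size,
    and ignores file headers.
    """
    n_samples = int(sfreq * duration_s)
    data_bytes = n_channels * n_samples * bytes_per_sample
    n_parts = max(1, math.ceil(data_bytes / split_size))
    return data_bytes, n_parts


# Channel counts reported earlier in this notebook: EEG + BIO + ECG + EMG.
n_channels = 257 + 5 + 1 + 1

data_bytes, n_parts = estimate_fif_parts(
    n_channels, sfreq=250.0, duration_s=3 * 3600  # hypothetical 3 h recording
)

print(f"approximate data size: {data_bytes / 1024**3:.2f} GiB")
print(f"estimated number of files: {n_parts}")
```

When loading the result later, point the reader at the first file; MNE follows the links to the remaining parts as long as they stay in the same directory.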
