# ParametrizationDefinition

### nm_settings.json

In order to estimate multimodal features of neurophysiological data, certain parametrization steps are required. 
Here the following two parametrization files are explained: 
 - `nm_settings.json`
 - `nm_channels.csv`
 
The `nm_settings.json` specifies the general feature processing. Most importantly, all provided features can be enabled and disabled: 
```json
"methods": 
{
        "raw_resampling": false,
        "normalization": true,
        "kalman_filter": true,
        "re_referencing": true,
        "notch_filter": true,
        "bandpass_filter": true,
        "raw_hjorth": true,
        "sharpwave_analysis": true,
        "return_raw": true,
        "project_cortex": false,
        "project_subcortex": false,
        "pdc": false,
        "dtf": false
}
```
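
As a minimal sketch of how such a `methods` block can be inspected and toggled, the following uses only Python's standard `json` module. The shortened settings text is a hypothetical excerpt for illustration, not the full `nm_settings.json`, and no py_neuromodulation function is assumed.

```python
import json

# Hypothetical, shortened settings document; only part of the "methods"
# block shown above is reproduced here.
settings_text = """
{
    "methods": {
        "raw_resampling": false,
        "normalization": true,
        "kalman_filter": true,
        "bandpass_filter": true,
        "sharpwave_analysis": true
    }
}
"""

settings = json.loads(settings_text)

# Disable sharpwave features and enable resampling.
settings["methods"]["sharpwave_analysis"] = False
settings["methods"]["raw_resampling"] = True

# List the methods that are currently switched on.
enabled = sorted(name for name, on in settings["methods"].items() if on)
print(enabled)
```

Editing the dictionary in memory and writing it back with `json.dump` keeps the file valid JSON, which is easier to get right than editing it by hand.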

**raw_resampling** defines a resampling rate to which the original data is downsampled. This can be advantageous, since higher sampling frequencies incur higher computational cost. The resampling frequency is set in the method-specific settings:

```json
"raw_resampling_settings": {
        "resample_freq": 1000
    }
```

**normalization** normalizes the data with respect to the past *normalization_time*, using the *mean*, the *median*, or one of the other methods defined in `nm_normalization.py`:

```json
"raw_normalization_settings": {
        "normalization_time": 10,
        "normalization_method": "median"
    }

```
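
The idea of normalizing against a past window can be sketched as follows. This is not the code from `nm_normalization.py`: the function name is hypothetical, and two assumptions are made purely for illustration, namely that *normalization_time* is given in seconds and that normalization is a plain subtraction of the window statistic.

```python
from statistics import mean, median


def normalize_last_sample(samples, sfreq, normalization_time, method="median"):
    """Center the newest sample on a statistic of the preceding window.

    Assumptions (not taken from nm_normalization.py): `normalization_time`
    is in seconds and the normalization is a plain subtraction.
    """
    n_window = max(1, int(normalization_time * sfreq))
    window = samples[-n_window:]          # the past `normalization_time`
    stat = {"mean": mean, "median": median}[method]
    return samples[-1] - stat(window)


# Four samples at 1 Hz, normalized against the last 3 seconds.
samples = [1.0, 2.0, 3.0, 10.0]
print(normalize_last_sample(samples, sfreq=1, normalization_time=3))
```

The median is less sensitive to short artifacts in the window than the mean, which is one reason to prefer it for noisy recordings.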

**kalman_filter** smooths the estimated band power features using the white noise acceleration model (see ["Improved detection of Parkinsonian resting tremor with feature engineering and Kalman filtering"](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6927801/), Yao et al. 19, for a great reference). The white noise acceleration model is specified by the prediction interval $T_p$ (Hz). The process noise covariance $Q$ is then defined by $\sigma_w$, while $\sigma_v$ describes the measurement noise:

$
  Q=
  \left[ {\begin{array}{cc}
   \sigma_w^2\frac{T_p^{3}}{3} & \sigma_w^2\frac{T_p^{2}}{2}\\
   \sigma_w^2\frac{T_p^{2}}{2} & \sigma_w^2T_p\\
  \end{array} } \right]
$


```json
"kalman_filter_settings": {
        "Tp": 0.1,
        "sigma_w": 0.7,
        "sigma_v": 1,
        "frequency_bands": [
            "low gamma",
            "high gamma",
            "all gamma"
        ]
    }
```
Individual frequency bands (specified in **frequency_ranges**) can be selected for Kalman filtering (see ["Real-time epileptic seizure prediction using AR models and support vector machines"](https://pubmed.ncbi.nlm.nih.gov/20172805/), Chisci et al. 10).
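
To make the role of `Tp`, `sigma_w` and `sigma_v` concrete, here is a minimal pure-Python sketch of a two-state Kalman filter (level and rate of change) applied to a scalar feature series. It assumes the standard symmetric form of $Q$ for the white noise acceleration model, a measurement noise variance of $\sigma_v^2$, and an arbitrary initial state and covariance. The function names are hypothetical and this is not the library's implementation.

```python
def process_noise(Tp, sigma_w):
    """Process noise covariance Q of the white noise acceleration model."""
    q = sigma_w ** 2
    return [[q * Tp ** 3 / 3, q * Tp ** 2 / 2],
            [q * Tp ** 2 / 2, q * Tp]]


def kalman_filter(measurements, Tp=0.1, sigma_w=0.7, sigma_v=1.0):
    """Filter a scalar feature series; the state is [level, rate of change]."""
    Q = process_noise(Tp, sigma_w)
    R = sigma_v ** 2                          # measurement noise variance
    x0, x1 = 0.0, 0.0                         # initial state (assumed)
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0   # initial covariance (assumed)
    filtered = []
    for z in measurements:
        # Predict with F = [[1, Tp], [0, 1]]: x <- F x, P <- F P F^T + Q
        x0 = x0 + Tp * x1
        a00 = p00 + Tp * (p10 + p01) + Tp ** 2 * p11 + Q[0][0]
        a01 = p01 + Tp * p11 + Q[0][1]
        a10 = p10 + Tp * p11 + Q[1][0]
        a11 = p11 + Q[1][1]
        # Update with H = [1, 0]: only the level is observed
        s = a00 + R                           # innovation variance
        k0, k1 = a00 / s, a10 / s             # Kalman gain
        y = z - x0                            # innovation
        x0, x1 = x0 + k0 * y, x1 + k1 * y
        p00, p01 = (1 - k0) * a00, (1 - k0) * a01
        p10, p11 = a10 - k1 * a00, a11 - k1 * a01
        filtered.append(x0)
    return filtered


# A constant input should be tracked without a steady-state offset.
smoothed = kalman_filter([5.0] * 200)
print(round(smoothed[-1], 3))
```

A larger `sigma_w` relative to `sigma_v` makes the filter follow the measurements more closely, while a smaller ratio yields stronger smoothing.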

**re_referencing** is an important aspect of electrophysiological signal processing. Bipolar and common average rereferencing are most commonly applied. For that reason, a separate file, `nm_channels.csv`, can be specified before feature estimation; otherwise it is set up automatically during parametrization when the data is read via BIDS. `nm_channels.csv` contains a column *rereference* that specifies the rereferencing method:
 - *average* for common average rereference (across a channel type, e.g. ecog)
 - bipolar rereferencing, by specifying the channel name to rereference to, e.g. *LFP_BS_STN_L_1*
 - *none* for no rereferencing being used for this particular channel 
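
The three options listed above can be sketched in plain Python as follows. The function and the channel names are hypothetical. As an assumption, the common average here includes the channel itself and is taken over all channels marked *average*; whether the library excludes the channel itself is not stated in this document.

```python
def rereference(data, scheme):
    """Apply per-channel rereferencing.

    data:   dict mapping channel name -> list of samples
    scheme: dict mapping channel name -> "average", "none", or the name of
            the channel to subtract (bipolar)
    """
    # Sample-wise mean over all channels marked "average". Assumption: the
    # channel itself is part of its own average and all of them share a type.
    avg_names = [ch for ch, ref in scheme.items() if ref == "average"]
    avg = []
    if avg_names:
        n_samples = len(data[avg_names[0]])
        avg = [sum(data[ch][i] for ch in avg_names) / len(avg_names)
               for i in range(n_samples)]

    out = {}
    for ch, ref in scheme.items():
        if ref == "none":
            out[ch] = list(data[ch])
        elif ref == "average":
            out[ch] = [x - m for x, m in zip(data[ch], avg)]
        else:  # bipolar: subtract the named reference channel
            out[ch] = [x - r for x, r in zip(data[ch], data[ref])]
    return out


data = {"ECOG_1": [1.0, 2.0], "ECOG_2": [3.0, 6.0],
        "LFP_1": [5.0, 5.0], "LFP_2": [1.0, 2.0]}
scheme = {"ECOG_1": "average", "ECOG_2": "average",
          "LFP_1": "LFP_2", "LFP_2": "none"}
print(rereference(data, scheme))
```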

**notch_filter** enables notch filters at the specified *line_noise* frequency and its harmonics.
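
Which frequencies this covers can be illustrated with a short helper. The function name is hypothetical, and restricting the harmonics to those below the Nyquist frequency is an assumption; how many harmonics the library actually filters is not stated here.

```python
def notch_frequencies(line_noise, sfreq):
    """Return line_noise and its harmonics that lie below the Nyquist frequency."""
    nyquist = sfreq / 2
    freqs = []
    f = line_noise
    while f < nyquist:
        freqs.append(f)
        f += line_noise
    return freqs


# 50 Hz line noise in a recording sampled at 1000 Hz
print(notch_frequencies(50, 1000))
```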

**bandpass_filter** enables band power feature estimation. Settings are defined as follows:
```json
"bandpass_filter_settings": {
    "segment_lengths": {
        "theta": 1000,
        "alpha": 500,
        "low beta": 333,
        "high beta": 333,
        "low gamma": 100,
        "high gamma": 100,
        "HFA": 100
    },
    "bandpower_features": {
        "activity": true,
        "mobility": false,
        "complexity": false
    },
    "log_transform" : true
}
```

Each key of *segment_lengths* is a frequency-band name (e.g. *theta*); the corresponding frequency range (4 to 8 Hz for *theta*) is defined in **frequency_ranges**. The *segment_lengths* value defines the time range, in milliseconds, of FIR-filtered data that is used for feature estimation. Here the previous 1000 ms of the signal filtered between 4 and 8 Hz are used to estimate the *theta* features. Shorter segments can be beneficial for higher-frequency bands, e.g. gamma, where estimating band power over e.g. 100 ms may yield temporally more specific features.
A common way to estimate band power is to take the variance of the FIR-filtered data. This is equivalent to the *activity* [Hjorth](https://en.wikipedia.org/wiki/Hjorth_parameters) parameter. The *bandpower_features* key in *bandpass_filter_settings* selects which of the *activity*, *mobility* and *complexity* Hjorth parameters are computed. To estimate Hjorth parameters of the raw, unfiltered signal, enable the **raw_hjorth** method. *log_transform* can help disentangle frequency bands and thereby achieve higher performance with e.g. LDA.
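
The Hjorth parameters and the role of the segment length can be sketched in pure Python as follows. The band-pass filtering step itself is not shown, the function names are hypothetical, and using the natural logarithm for *log_transform* is an assumption.

```python
import math
from statistics import pvariance


def diff(x):
    """First-order difference of a sequence."""
    return [b - a for a, b in zip(x, x[1:])]


def mobility(x):
    """Hjorth mobility: sqrt(var(dx) / var(x))."""
    return math.sqrt(pvariance(diff(x)) / pvariance(x))


def hjorth(filtered, sfreq, segment_length_ms, log_transform=False):
    """Hjorth parameters of the most recent `segment_length_ms` of a
    band-pass filtered signal (the filtering itself is not shown)."""
    n = int(segment_length_ms * sfreq / 1000)
    seg = filtered[-n:]
    activity = pvariance(seg)             # band power estimate
    mob = mobility(seg)
    comp = mobility(diff(seg)) / mob      # Hjorth complexity
    if log_transform:
        activity = math.log(activity)     # assumption: natural log of the power
    return {"activity": activity, "mobility": mob, "complexity": comp}


# A 6 Hz sine sampled at 1000 Hz stands in for theta-filtered data.
signal = [math.sin(2 * math.pi * 6 * t / 1000) for t in range(2000)]
print(hjorth(signal, sfreq=1000, segment_length_ms=1000))
```

For a pure sine the activity is close to half the squared amplitude and the complexity is close to 1, which makes a sine a convenient sanity check.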

**sharpwave_analysis** enables the calculation of temporal sharpwave features. See ["Brain Oscillations and the Importance of Waveform Shape"](https://www.sciencedirect.com/science/article/abs/pii/S1364661316302182) (Cole et al. 17) for a great motivation to use these features. Sharpwave features are estimated after band-pass filtering the signal between *filter_low_cutoff* and *filter_high_cutoff*. Peak and trough features can each be calculated, as enabled by the respective *estimate* key. One or more sharpwaves may be detected in the current data batch. Each feature is then aggregated across these sharpwaves by the *mean*, *median*, *maximum*, *minimum* or *variance*, as defined by the *estimator*.
```json
"sharpwave_analysis_settings": {
    "sharpwave_features": {
        "peak_left": false,
        "peak_right": false,
        "trough": false,
        "width": false,
        "prominence": true,
        "interval": true,
        "decay_time": false,
        "rise_time": false,
        "sharpness": true,
        "rise_steepness": false,
        "decay_steepness": false,
        "slope_ratio": false
    },
    "filter_low_cutoff": 5,
    "filter_high_cutoff": 80,
    "detect_troughs": {
        "estimate": true,
        "distance_troughs": 10,
        "distance_peaks": 5
    },
    "detect_peaks": {
        "estimate": true,
        "distance_troughs": 5,
        "distance_peaks": 10
    },
    "estimator": {
        "mean": [
            "interval"
        ],
        "median": null,
        "max": [
            "prominence",
            "sharpness"
        ],
        "min": null,
        "var": null
    },
    "apply_estimator_between_peaks_and_troughs" : true
}
```
A separate tutorial on sharpwave features is provided in the documentation. 
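
How the *estimator* block maps per-sharpwave values to one value per feature can be sketched as follows. The input layout, the output key names and the use of the population variance for *var* are assumptions made for illustration; the detection of the sharpwaves themselves is not shown.

```python
from statistics import mean, median, pvariance

ESTIMATORS = {"mean": mean, "median": median, "max": max, "min": min,
              "var": pvariance}


def aggregate_sharpwaves(per_wave, estimator_settings):
    """Collapse per-sharpwave values of one batch into one value per feature.

    per_wave:           dict feature name -> list with one value per
                        detected sharpwave (hypothetical layout)
    estimator_settings: the "estimator" block of the settings, i.e.
                        estimator name -> list of feature names or None
    """
    out = {}
    for est_name, features in estimator_settings.items():
        for feature in features or []:    # a null entry means "not used"
            out[f"{feature}_{est_name}"] = ESTIMATORS[est_name](per_wave[feature])
    return out


estimator = {"mean": ["interval"], "median": None,
             "max": ["prominence", "sharpness"], "min": None, "var": None}
per_wave = {"interval": [10.0, 14.0],
            "prominence": [0.5, 2.0],
            "sharpness": [1.0, 0.25]}
print(aggregate_sharpwaves(per_wave, estimator))
```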

Next, raw signals can be returned by enabling the **return_raw** method.

**project_cortex** and **project_subcortex** then project the features of individual channels onto a common cortical or subcortical grid, defined by *grid_cortex.tsv* and *subgrid_cortex.tsv*. For both projections a *max_dist* parameter needs to be specified; within this distance, data is linearly interpolated, weighted by the inverse distance to the grid point.
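
The inverse-distance weighting can be sketched as follows. The function name, the coordinate layout and the handling of grid points without a nearby channel (returning `None`) are assumptions for illustration, not the library's behavior.

```python
import math


def project_to_grid(grid_points, channel_coords, channel_values, max_dist):
    """Interpolate channel features onto grid points.

    Each grid point receives the inverse-distance weighted average of all
    channels closer than `max_dist`; grid points without such a channel
    get None.
    """
    projected = []
    for g in grid_points:
        weights, values = [], []
        for coord, value in zip(channel_coords, channel_values):
            d = math.dist(g, coord)
            if d < max_dist:
                weights.append(1.0 / max(d, 1e-9))  # guard against d == 0
                values.append(value)
        if weights:
            total = sum(weights)
            projected.append(sum(w * v for w, v in zip(weights, values)) / total)
        else:
            projected.append(None)
    return projected


grid = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
coords = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
values = [3.0, 6.0]
print(project_to_grid(grid, coords, values, max_dist=5))
```

In this example the first grid point is at distance 1 and 2 from the two channels, so the weights are 1 and 0.5; the second grid point has no channel within *max_dist*.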

Additionally, **coherence** enables connectivity features for certain *frequency_bands* between pairs of *channels*, provided as a Python list.

### nm_channels.csv

As described above, rereferencing will be estimated automatically given the specified channel types. To demonstrate a typical rereferencing scheme, here the example BIDS data in 'py_neuromodulation/examples/data' is read.

First, the paths need to be specified and the necessary libraries imported:

```python
import py_neuromodulation as pn
import os
```

```python
PATH_BIDS = os.path.join(pn.__path__[0], "..", "examples", "data")
PATH_RUN = os.path.join(
    PATH_BIDS,
    "sub-testsub",
    "ses-EphysMedOff",
    "ieeg",
    "sub-testsub_ses-EphysMedOff_task-buttonpress_run-0_ieeg.vhdr"
)
PATH_OUT = os.getcwd()
```

```python
pn_stream = pn.nm_BidsStream.BidsStream(
    PATH_RUN=PATH_RUN,
    PATH_BIDS=PATH_BIDS,
    PATH_OUT=PATH_OUT
)
```

```
Extracting parameters from C:\Users\ICN_admin\Documents\py_neuromodulation\py_neuromodulation\..\examples\data\sub-testsub\ses-EphysMedOff\ieeg\sub-testsub_ses-EphysMedOff_task-buttonpress_run-0_ieeg.vhdr...
Setting channel info structure...
Reading channel info from C:\Users\ICN_admin\Documents\py_neuromodulation\py_neuromodulation\..\examples\data\sub-testsub\ses-EphysMedOff\ieeg\sub-testsub_ses-EphysMedOff_task-buttonpress_run-0_channels.tsv.
Reading in coordinate system frame MNI152NLin2009bAsym: None.
Reading electrode coords from C:\Users\ICN_admin\Documents\py_neuromodulation\py_neuromodulation\..\examples\data\sub-testsub\ses-EphysMedOff\ieeg\sub-testsub_ses-EphysMedOff_acq-StimOff_space-mni_electrodes.tsv.
```




Then the `nm_channels.csv` file is automatically specified given the respective channel types. Since one channel contains an 'analog' substring, it is automatically assigned to be a *target* channel.

Based on the 'seeg' type (could also be 'dbs'), the subsequent depth local field potential channel is selected for rereferencing. For 'ecog' channels, the average rereference function is automatically assigned.

Note also that the *used* column marks whether a channel is used for feature analysis.

```python
pn_stream.nm_channels
```

| | name | rereference | used | target | type | status | new_name |
|---|---|---|---|---|---|---|---|
| 0 | ANALOG_R_ROTA_CH | | 0 | 0 | misc | good | ANALOG_R_ROTA_CH |
| 1 | ECOG_L_1_SMC_AT | average | 1 | 0 | ecog | good | ECOG_L_1_SMC_AT-avgref |
| 2 | ECOG_L_2_SMC_AT | average | 1 | 0 | ecog | good | ECOG_L_2_SMC_AT-avgref |
| 3 | ECOG_L_3_SMC_AT | average | 1 | 0 | ecog | good | ECOG_L_3_SMC_AT-avgref |
| 4 | ECOG_L_4_SMC_AT | average | 1 | 0 | ecog | good | ECOG_L_4_SMC_AT-avgref |
| 5 | ECOG_L_5_SMC_AT | average | 1 | 0 | ecog | good | ECOG_L_5_SMC_AT-avgref |
| 6 | ECOG_L_6_SMC_AT | average | 1 | 0 | ecog | good | ECOG_L_6_SMC_AT-avgref |
| 7 | EEG_AO | | 0 | 0 | misc | good | EEG_AO |
| 8 | LFP_L_1_STN_BS | LFP_L_567_STN_BS | 1 | 0 | seeg | good | LFP_L_1_STN_BS-LFP_L_567_STN_BS |
| 9 | LFP_L_234_STN_BS | LFP_L_1_STN_BS | 1 | 0 | seeg | good | LFP_L_234_STN_BS-LFP_L_1_STN_BS |
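
The naming pattern visible in the *new_name* column can be reproduced with a few lines of standard-library Python. The helper name is hypothetical, and treating an empty *rereference* entry like *none* is an assumption based on the rows shown; the three CSV rows are copied from the output above with a reduced set of columns.

```python
import csv
import io


def derive_new_name(name, rereference):
    """Reproduce the naming pattern visible in the new_name column."""
    if rereference in ("", "none", None):
        return name                       # not rereferenced
    if rereference == "average":
        return f"{name}-avgref"           # common average
    return f"{name}-{rereference}"        # bipolar: append the reference name


# A few rows in the layout of nm_channels.csv (reduced set of columns)
csv_text = """name,rereference,used,target,type
ANALOG_R_ROTA_CH,,0,0,misc
ECOG_L_1_SMC_AT,average,1,0,ecog
LFP_L_1_STN_BS,LFP_L_567_STN_BS,1,0,seeg
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Keep only channels marked as used and derive their output names.
used = [derive_new_name(r["name"], r["rereference"])
        for r in rows if r["used"] == "1"]
print(used)
```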
