# Physiological or Pathological states?

__An end-to-end deep-learning model to estimate probability of pathological state using local-field-potential recordings__


Anthony Lee 2025-02-01

## Abstract
This project uses deep learning to extract features from local field potential (LFP) recordings of patients with movement disorders. Movement disorders often result from abnormal synchrony in the basal ganglia region of the brain, so therapies may involve a surgically implanted stimulator (Deep Brain Stimulation, DBS) that disrupts the abnormal neuronal synchrony. Recognizing the pathological state from the sensed LFP data would allow stimulation to be delivered just in time to relieve the symptoms of the disorder. Instead of the usual manual feature-extraction steps, this project explores using a deep learning model to capture the features and the latent relationships between them.

## Introduction
The brain encodes information in complex forms beyond frequency modulation (FM) and amplitude modulation (AM). However, most common analysis techniques focus on these two aspects, for example by decomposing the signal into its frequency spectrum and calculating the mean power of a certain frequency band. Details such as waveform sharpness, waveform asymmetry, or cross-frequency coupling (Rodriguez et al., 2023) are not captured by manual feature extraction such as power spectral density (PSD) analysis. The goal is therefore to capture these features with a deep learning model.

This notebook thus creates a deep convolutional network with a fully connected classifier using LFP recordings sampled at 2048 Hz from patients with Essential Tremor. The data is labeled using motion captured by bipolar EMG sensors on the flexor carpi radialis (Rodriguez et al., 2023).

## Purpose
The purpose of this project is to study the feasibility of using a deep learning network to capture the latent patterns in neuronal LFP signals. Current DBS therapies are semi-closed, with physicians heavily involved in tuning the therapy protocol. Even with such involvement, the patient receives only a limited set of therapy patterns in a one-size-fits-all manner. Allowing the DBS system to actively deliver different therapy regimens depending on the state of the patient is crucial for limiting the side effects of DBS as a therapy device. Reducing physician involvement could also make DBS therapy more accessible to the larger patient population, as the percentage of the human population with movement disorders is increasing.


## Methods

### About the data
The data were collected and published by He et al. (2021), who investigated the effect of high-frequency thalamic stimulation as a therapy for patients with Essential Tremor, a movement disorder. The data were collected in repeated trials while the patient repeated a specified movement such as holding a stationary posture, reaching for a target, or moving pegs on a pegboard. A pathological state is labeled by "thresholding [i.e., 0.25] the RMS power of the signal of interest" as measured by the accelerometer or EMG signals (Rodriguez et al., 2023).
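
The thresholding rule quoted above can be sketched roughly as follows. This is only an illustration: the moving-window length, the function name `label_by_rms`, and the assumption that the signal is already scaled so that a threshold of 0.25 is meaningful are all hypothetical and not taken from the original labeling code.

```python
import numpy as np

def label_by_rms(signal, window, threshold=0.25):
    """Return one boolean per sample: True where the moving RMS exceeds the threshold."""
    signal = np.asarray(signal, dtype=float)
    # Moving mean of the squared signal, same length as the input.
    kernel = np.ones(window) / window
    mean_square = np.convolve(signal ** 2, kernel, mode="same")
    return np.sqrt(mean_square) > threshold

# Synthetic example: one quiet second followed by one high-amplitude second.
fs = 2048
t = np.arange(2 * fs) / fs
emg = np.where(t < 1.0, 0.05, 1.0) * np.sin(2 * np.pi * 5 * t)
labels = label_by_rms(emg, window=fs // 4)  # window length is an assumption
```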

To increase the number of data samples, a moving window of 1 second (equivalent to 2048 data points) is used to split each recording into overlapping segments. The labels within each window are then averaged, resulting in a value between 0 and 1 that can be interpreted as the likelihood or intensity of the pathological state. All segments are then randomly split into non-test and test sets with a 90/10 split. The non-test set is further randomly split into training and validation sets with a 70/30 split. The test set is the holdout set that is never shown to the model during training and thus serves as the benchmark data.
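
A minimal sketch of the windowing and splitting steps described above, using NumPy only. The function names, the `stride` parameter, and the synthetic recording are assumptions made for illustration; the notebook's own implementation may differ.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(lfp, labels, window=2048, stride=1):
    """Cut a recording into overlapping windows and average the labels in each."""
    x = sliding_window_view(lfp, window)[::stride]
    y = sliding_window_view(labels.astype(float), window)[::stride].mean(axis=1)
    return x, y

def split_indices(n, rng):
    """Shuffle indices, hold out 10% for testing, then split the rest 70/30."""
    order = rng.permutation(n)
    n_test = n // 10
    test, non_test = order[:n_test], order[n_test:]
    n_train = int(0.7 * len(non_test))
    return non_test[:n_train], non_test[n_train:], test

rng = np.random.default_rng(0)
lfp = rng.standard_normal(3 * 2048)      # three seconds of synthetic "LFP"
labels = np.arange(lfp.size) >= 2048     # tremor label is True after the first second
x, y = make_windows(lfp, labels, stride=256)
train_idx, val_idx, test_idx = split_indices(len(x), rng)
```

Each entry of `y` is the fraction of tremor-labeled samples in the corresponding window, which is the soft target between 0 and 1 described in the text.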

### Model selection and architecture
Seven one-dimensional (1D) convolutional neural network (CNN) layers are used as the feature extractor, followed by a single fully connected layer as the classifier. Each CNN layer is followed by a non-linear activation (i.e., SiLU) and a custom compressor layer that compresses the range of the data. The compressor layer design is based on Rodriguez et al. (2023). The compressor layer has learnable slope and bias parameters to accommodate the wide range of LFP values; a 10-fold difference in LFP value range (in millivolts) is not uncommon between individuals.
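
To illustrate what follows each convolutional layer, the sketch below applies SiLU and then a range-compressing function with a slope and a bias. The exact form of the compressor is not given in this document, so the `tanh(slope * x + bias)` form used here is purely an assumption for illustration, and in a real model `slope` and `bias` would be learnable parameters rather than fixed numbers.

```python
import numpy as np

def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def compressor(x, slope, bias):
    """Hypothetical compressor: an affine map followed by tanh, bounding outputs to (-1, 1)."""
    return np.tanh(slope * x + bias)

# Feature values spanning a wide range, as LFP amplitudes can between individuals.
features = np.array([-50.0, -1.0, 0.0, 1.0, 50.0])
compressed = compressor(silu(features), slope=0.1, bias=0.0)
```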

### Hyperparameters
The hyperparameters tuned for this model are "hidden layer dimension (`n_hidden`)", "number of layers (`depth`)", "kernel size (`kernel`)", "learning rate for the ADAM optimizer (`lr`)", and "weight decay for the ADAM optimizer (`weight_decay`)".

The search space was too large to search exhaustively without a distributed GPU setup. Relying on a limited number of tests (3 trials) and the hyperparameters used in Rodriguez et al. (2023), the hyperparameters were set to 45, 7, 32, 1e-3, and 1e-8, respectively.
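
For reference, the chosen values can be paired with their parameter names in a small configuration object. The class name `HyperParams` is hypothetical; only the field names and values come from the text above.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class HyperParams:
    n_hidden: int = 45          # hidden layer dimension
    depth: int = 7              # number of layers
    kernel: int = 32            # kernel size
    lr: float = 1e-3            # learning rate for the Adam optimizer
    weight_decay: float = 1e-8  # weight decay for the Adam optimizer

params = HyperParams()
print(asdict(params))
```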

### Data augmentation
![Data table](figures/DataTable.jpg)

The original data source has only 8 patients, each with 6-9 minutes of recordings for the "Posture" activity in the ON-treatment state. Each patient therefore has approximately 400-510 non-overlapping one-second data segments for training. Neurons operate on the timescale of milliseconds, so a one-second window is sufficiently wide to observe neuronal activity.

To further enlarge the dataset, a sliding window was used to increase the number of data points to approximately 700,000-1,000,000 per patient. The result is a very large dataset that was tricky to manage without triggering an out-of-memory (OOM) error.
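
The segment counts above follow from simple arithmetic on the recording length. The sketch below assumes a stride of one sample for the sliding window, which is not stated explicitly in the text but is consistent with the order of magnitude reported; `n_windows` is a hypothetical helper.

```python
def n_windows(n_samples, window, stride=1):
    """Number of full windows that fit in a recording of n_samples."""
    if n_samples < window:
        return 0
    return (n_samples - window) // stride + 1

fs = 2048  # sampling rate in Hz, so a one-second window holds 2048 samples
for minutes in (6, 9):
    n_samples = minutes * 60 * fs
    non_overlapping = n_windows(n_samples, fs, stride=fs)
    overlapping = n_windows(n_samples, fs, stride=1)
    print(minutes, "min:", non_overlapping, "non-overlapping,", overlapping, "overlapping")
```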

## EDA of the dataset
The original dataset consists of 8 patients performing various activities (i.e., "Pegboard", "Posture", and "Pouring"), each recorded in an ON-treatment and an OFF-treatment state. The activity indicates what the patient was doing when the recording was captured.

For this analysis, only the "Posture", ON-treatment recordings are used for each patient, because this is the only dataset that is available for all patients. It also has the least variability in label quality, since movement can affect how tremor is detected and thus labeled.

### LFP recording
A one-second zoomed-in view of the LFP data is shown below for segments labeled "with tremor (TRUE)" and "without tremor (FALSE)". Because the sampling rate of this dataset is high (2048 Hz), a zoomed-in view gives a better picture of the shape and pattern of the data.

Each patient's recording consists of four channels, corresponding to the four sensing channels on their DBS electrodes. The sensing channels differ spatially; for the spatial mapping of the electrodes, see He et al. (2021).

For simplicity, the signals are averaged across the channels to create an aggregated-channel reading.

![LFP Recording Example with Tremor](figures/LFP_of_with_tremor_zoomed_in.svg)
![LFP Recording Example without Tremor](figures/LFP_of_without_tremor_zoomed_in.svg)

### PSD is indistinguishable
Power spectral density (PSD) analysis is the common method for analyzing neuronal signals. However, the PSDs for the ON-state and OFF-state are indistinguishable, especially when the instrument used for data collection has such a high sampling frequency.

Movement disorders such as Parkinson's Disease have a more distinct power pattern in the beta-frequency range (Radcliffe et al., 2023). Such a pattern can be seen in the figure below, where the PSD in the beta range has lower power when no tremor is detected.

![PSD of with and without tremor](figures/PSD_of_tremor_and_no_tremor.svg)
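
For readers unfamiliar with band-power analysis, the sketch below computes the power in a frequency band from a plain one-sided periodogram. It is not the estimator used to produce the figure, which is not specified here, and the 13-30 Hz beta band is a commonly used definition that is assumed rather than taken from this document.

```python
import numpy as np

def band_power(signal, fs, f_lo, f_hi):
    """Integrate a one-sided periodogram between f_lo and f_hi (in Hz)."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / (fs * n)
    # Fold negative frequencies into the positive side (DC and Nyquist are not doubled).
    if n % 2 == 0:
        psd[1:-1] *= 2.0
    else:
        psd[1:] *= 2.0
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return float(psd[mask].sum() * (freqs[1] - freqs[0]))

fs = 2048
t = np.arange(fs) / fs                 # one second of data
beta_tone = np.sin(2 * np.pi * 20 * t) # synthetic 20 Hz oscillation, amplitude 1
beta = band_power(beta_tone, fs, 13, 30)
gamma = band_power(beta_tone, fs, 60, 90)
```

For a unit-amplitude sine inside the band, the band power should be close to the signal's mean square of 0.5, while a band that excludes the tone should contain almost no power.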


## Discussion

### Sliding window: too much data, overfitting?
Augmenting the data with the sliding-window method may be causing the model to overfit earlier than expected, as the number of samples becomes overwhelmingly larger than the number of parameters in the model: approximately 6 million data points compared with 800 parameters.

Some tests have shown that the model became very insensitive to changes in LFP values, and further investigation is needed to understand the underlying cause.

### Manage dataset to prevent OOM
The augmented dataset is too large to hold in memory and also impractical to store locally, as repeated file reads and writes would dramatically slow down model training. Instead, I created views of the same underlying in-memory dataset to improve speed and storage efficiency.

This method is still suboptimal, as accessing data through views is still IO-bound and has slowed down model training dramatically (one epoch of training takes about 90 min). However, it is much faster than repeatedly reading from and writing to storage.
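
One way such views can be built is to keep a single copy of each recording in memory and slice a window out of it on demand. The class below is a hypothetical sketch of that idea and is not the notebook's actual implementation; `WindowView` and its parameters are illustrative names.

```python
import numpy as np

class WindowView:
    """Index overlapping windows of one in-memory recording without copying it."""

    def __init__(self, lfp, labels, window=2048, stride=1):
        self.lfp = lfp
        self.labels = labels
        self.window = window
        self.stride = stride

    def __len__(self):
        return max(0, (len(self.lfp) - self.window) // self.stride + 1)

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        start = i * self.stride
        stop = start + self.window
        # Basic slicing returns a view, so the sample data is not duplicated.
        return self.lfp[start:stop], float(self.labels[start:stop].mean())

lfp = np.arange(10_000, dtype=np.float32)
labels = np.zeros(10_000, dtype=bool)
dataset = WindowView(lfp, labels)
segment, target = dataset[5]
```

Because every segment is a view into the same array, memory use stays proportional to the raw recording rather than to the number of overlapping windows.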

### Averaging across channels is too simplistic
Combining the four recording channels by averaging them may be too simplistic an approach to aggregation, and channel information is lost as a result. As seen below, the signals recorded in each channel vary drastically in range, so averaging across channels scales up the low-amplitude signals and scales down the high-amplitude signals.

This could be problematic as the amplitude of the neuron signal also encodes crucial information.

![Shared y LFP with Tremor](figures/LFP_of_with_tremor_zoomed_in_shared_y.svg)
![Shared y LFP without Tremor](figures/LFP_of_without_tremor_zoomed_in_shared_y.svg)
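
The effect of averaging channels with very different amplitude ranges can be illustrated on synthetic data. The per-channel scales below are made-up values chosen only to exaggerate the difference, and the random signals stand in for real LFP channels.

```python
import numpy as np

rng = np.random.default_rng(0)
scales = np.array([1.0, 2.0, 5.0, 50.0])  # hypothetical per-channel amplitude ranges
lfp = rng.standard_normal((4, 2048)) * scales[:, None]  # shape: (channels, samples)
aggregated = lfp.mean(axis=0)

# Correlation of each individual channel with the averaged trace.
corr = np.array([np.corrcoef(channel, aggregated)[0, 1] for channel in lfp])
print(np.round(corr, 2))
```

In this synthetic case the averaged trace is almost entirely determined by the highest-amplitude channel, so whatever the other three channels carry contributes very little to the model input.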

## Conclusion

Using Binary Cross Entropy (BCE) as the loss metric, I was able to train the model until the training loss and validation loss converged at epoch 7. Training beyond this point causes the two losses to diverge, which indicates overfitting.

![Training and validation loss](figures/Train_val_loss_over_epoch.svg)
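
Because the targets are window-averaged labels between 0 and 1 rather than hard 0/1 classes, it helps to see how BCE behaves with such soft targets. The function below is a plain NumPy sketch of the loss, not the training code; the `eps` clipping value is an assumption added for numerical safety.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Mean binary cross entropy between predicted probabilities and (soft) targets."""
    pred = np.clip(np.asarray(pred, dtype=float), eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)
    loss = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float(loss.mean())

hard_case = bce([0.5], [1.0])     # uninformative prediction for a hard label
soft_match = bce([0.3], [0.3])    # prediction equal to a soft label
soft_miss = bce([0.6], [0.3])     # prediction away from the soft label
```

With soft targets the loss is minimized when the prediction equals the target, but the minimum is the entropy of the target rather than zero, so the absolute loss values in the plot should be read with that floor in mind.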

The results are promising; however, further investigation is needed to understand the information captured at each of the seven 1D convolutional layers.

This shows promise for closing the feedback loop of a DBS therapy device by providing just-in-time stimulation when the model observes a pathological state. However, a better understanding of the false-positive tolerance is needed before any clinical application. Additionally, the power draw of such a model is still too great for a DBS system with a very limited power reserve. Finally, long-term studies are needed to determine how often the model must be re-trained or further trained over the lifespan of a patient.

## Next steps

### Power efficient model development
A technical next step is to profile the power consumption of such a model once it is compiled into embedded system code. The software model should also be compared with a hardware-encoded model to understand the power required for the model to be a viable solution for clinical use.

### Dissect the blackbox
The project showed promise in predicting the pathological state with very limited data diversity (8 patients). Value can still be drawn from analyzing the patterns that each convolutional layer captures, to further the understanding of how neuronal signal patterns in the basal ganglia differ between movement disorders and the physiological state.

### Training and tuning using distributed system
As models grow, larger and more powerful systems are required. Although this model can be computed locally, further adaptation is needed for it to run in a distributed manner. This is beyond the scope of this preliminary study, but a further development of this project would be to adapt the model for computation in a distributed cluster environment.


## References
Rodriguez, F., He, S., & Tan, H. (2023). The potential of convolutional neural networks for identifying neural states based on electrophysiological signals: Experiments on synthetic and real patient data. Frontiers in Human Neuroscience, 17, 1134599. https://doi.org/10.3389/fnhum.2023.1134599

He, S., Baig, F., Mostofi, A., Pogosyan, A., Debarros, J., Green, A. L., Aziz, T. Z., Pereira, E., Brown, P., & Tan, H. (2021). Closed-Loop Deep Brain Stimulation for Essential Tremor Based on Thalamic Local Field Potentials. Movement Disorders, 36(4), 863–873. https://doi.org/10.1002/mds.28513

Radcliffe, E. M., Baumgartner, A. J., Kern, D. S., Al Borno, M., Ojemann, S., Kramer, D. R., & Thompson, J. A. (2023). Oscillatory beta dynamics inform biomarker-driven treatment optimization for Parkinson’s disease. Journal of Neurophysiology, 129(6), 1492–1504. https://doi.org/10.1152/jn.00055.2023

Buhlmann, J., Hofmann, L., Tass, P. A., & Hauptmann, C. (2011). Modeling of a Segmented Electrode for Desynchronizing Deep Brain Stimulation. Frontiers in Neuroengineering, 4. https://doi.org/10.3389/fneng.2011.00015

