# Major Depressive Disorder Diagnosis

 - - -

## Previous Research Summary

Title: "Heart rate variability for treatment response between patients with major depressive disorder versus panic disorder: A 12-week follow-up study" (K.W. Choi et al.)
> Hypothesis #1: Patients with MDD and PD showed different HRV profiles compared to healthy controls. \
> Hypothesis #2: It is possible to predict the responder groups among the MDD and PD patients, using differences in HRV indices between the stress and rest phases. \
> Methods: 28 MDD patients, 29 PD patients, 39 healthy control subjects - for 12 weeks follow-up. \
> Results: 
>> pNN50 --- Patients with MDD and PD demonstrated lower pNN50. \
>> LF/HF ratio --- Patients with MDD and PD showed higher LF/HF ratio than control during 'stress' phase. \
>> LF/HF ratio --- Responders in the PD group showed lower LF/HF ratio during 'stress' phase compared to non-responders. \
>> Heart Rate --- Responders in the MDD group showed lower heart rate during 'all three' phases compared to non-responders. \
>> LF/HF ratio and pNN50 --- Possible to predict treatment response in patients with MDD using LF/HF ratio and pNN50. \

> Variables(time-domain): \
>> SDNN (Standard deviation of normal-to-normal intervals) -- sympathetic and parasympathetic activities. \
>> RMSSD (Root mean square of successive differences) -- parasympathetic modulation. \
>> pNN50 (Percentage of successive normal-to-normal intervals differing by more than 50 ms) -- parasympathetic modulation. \

> Variables(frequency-domain): \
>> LF (low frequency, 0.04 ~ 0.15 Hz) -- modulated by sympathetic and parasympathetic activities. \
>> HF (high frequency, 0.15 ~ 0.4 Hz) -- modulated by parasympathetic activities. \
>> LF/HF ratio -- ratio of LF and HF -- measures balance between sympathetic and parasympathetic activities. \
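
The time-domain variables listed above can be illustrated with a minimal sketch. The function name `time_domain_hrv` and the sample intervals are hypothetical, and the use of the sample standard deviation (`ddof=1`) for SDNN is an assumption; the recording software behind this dataset may compute these values differently.

```python
import numpy as np

def time_domain_hrv(rr_ms):
    """Return SDNN, RMSSD and pNN50 for a sequence of NN intervals in milliseconds."""
    rr = np.asarray(rr_ms, dtype=float)
    diffs = np.diff(rr)                          # successive differences between adjacent intervals
    sdnn = rr.std(ddof=1)                        # sample standard deviation of all intervals (assumed ddof)
    rmssd = np.sqrt(np.mean(diffs ** 2))         # root mean square of successive differences
    pnn50 = 100.0 * np.mean(np.abs(diffs) > 50)  # percentage of successive differences above 50 ms
    return {"SDNN": sdnn, "RMSSD": rmssd, "pNN50": pnn50}

# Hypothetical NN intervals (ms), for illustration only
rr_example = [800, 810, 790, 860, 800, 805]
metrics = time_domain_hrv(rr_example)
print(metrics)
```

In this toy series, two of the five successive differences exceed 50 ms, so pNN50 is 40%.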

- - -

In [None]:
import os
import time
import random
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import torch
import torchvision
import torch.nn as nn

In [None]:
from scipy.stats import pearsonr
from scipy.stats import ttest_ind
from scipy.stats import bartlett
from scipy.stats import ks_2samp
from scipy.stats import shapiro
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report,confusion_matrix
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import MinMaxScaler
from xgboost import XGBClassifier

from torch.nn import functional as F
from torch.autograd import Variable
from torch.utils.data import Dataset, TensorDataset

In [None]:
print("PyTorch Version: ",torch.__version__)
print("Torchvision Version: ",torchvision.__version__)

In [None]:
class Args:
    # arguments
    epochs=50
    bs=16
    lr=0.001
    momentum=0.9
    num_classes=3
    verbose='store_true'
    seed=674

args = Args()    

np.random.seed(args.seed)
random.seed(args.seed)
torch.manual_seed(args.seed)

In [None]:
#Setting torch environment

if torch.cuda.is_available():
    DEVICE = torch.device('cuda')
else:
    DEVICE = torch.device('cpu')
    
print('Using PyTorch version:', torch.__version__, ' Device: ', DEVICE)

- - -

# Data Handling

## Dataset check

In [None]:
# Load the HRV dataset
hrv_df = pd.read_csv('E:/RESEARCH/Datasets/HRV/HRV_samsung/HRV_REV_all.csv', sep=',')
hrv_df.head()

In [None]:
hrv_df.shape

HRV measuring steps
* b1 - s - b2 - r - b3 - c
* Phase: b(baseline between each phase), s(stress phase), r(relaxation phase), c(recovery phase)
* Disorder(=label): 1(Depression), 2(Panic Disorder), 3(Control)
* Each has following variables (Total 13 variables)
> SDNN, NN50, PNN50, RMSSD, VLF, LF, HF, LF/HF, POWER, HR, RESP, SC, TEMP
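
Because every HRV column carries its phase as a prefix, a phase can be selected with a regular expression on the column names. The sketch below uses a toy frame with hypothetical values. The stricter pattern assumes that variable names start with an uppercase letter (as in `b1SDNN`), which should be checked against the real column list.

```python
import pandas as pd

# Toy frame with hypothetical values; column names mimic the phase-prefix convention
toy = pd.DataFrame({
    "sub": [1, 2],
    "disorder": [1, 3],
    "b1SDNN": [40.0, 55.0],
    "sSDNN": [35.0, 50.0],
    "b2SDNN": [38.0, 53.0],
    "rSDNN": [42.0, 58.0],
})

# A bare prefix also catches the subject-ID column "sub"
loose = list(toy.filter(regex="^s").columns)

# Requiring an uppercase letter after the prefix keeps only stress-phase variables
strict = list(toy.filter(regex="^s[A-Z]").columns)

print(loose)
print(strict)
```

With the bare `^s` pattern, the `sub` column has to be dropped afterwards. The stricter pattern avoids that extra step.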

In [None]:
hrv_df.columns

In [None]:
hrv_df["disorder"].value_counts() ## MDD 136, PD 149, Control 194

- - -

## Data preprocessing

* Select the analysis task
> MDDPD (MDD vs PD), MDDC (MDD vs Control), PDC (PD vs Control), NONE (all three groups)
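
One possible way to express the task selection is a lookup table from task name to label codes. This is only a sketch: `TASK_LABELS`, `select_task`, and the toy data are hypothetical names and values, and the label codes follow the convention stated earlier (1 = MDD, 2 = PD, 3 = Control).

```python
import pandas as pd

# Label codes: 1 = MDD, 2 = PD, 3 = Control
TASK_LABELS = {
    "MDDPD": [1, 2],
    "MDDC": [1, 3],
    "PDC": [2, 3],
    "NONE": [1, 2, 3],
}

def select_task(df, task):
    """Return the rows of df whose 'disorder' label belongs to the chosen task."""
    return df[df["disorder"].isin(TASK_LABELS[task])]

# Hypothetical data for illustration
toy = pd.DataFrame({"disorder": [1, 2, 3, 1, 3], "b1HR": [70, 82, 65, 74, 68]})
mddc = select_task(toy, "MDDC")
print(mddc)
```

Unlike an if/elif chain with a catch-all `else`, a misspelled task name raises a `KeyError` here instead of silently selecting all three groups.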

In [None]:
## Select the task to be analyzed
# task = "MDDC"
task = "NONE"

In [None]:
if task == "MDDPD":
    hrv = hrv_df[hrv_df["disorder"].isin([1,2])] ## for MDD vs PD task
elif task == "MDDC":
    hrv = hrv_df[hrv_df["disorder"].isin([1,3])] ## for MDD vs Control task
elif task == "PDC":
    hrv = hrv_df[hrv_df["disorder"].isin([2,3])]       ## for PD  vs Control task
else:
    hrv = hrv_df ## for MDD vs PD vs Control task

In [None]:
hrv.shape

In [None]:
## Scaler for min-max normalization (rescales each feature to the [0, 1] range).
scaler = MinMaxScaler()

In [None]:
## Separating HRV dataset by experimental steps.
hrv_only = hrv.drop(columns=['sub', 'VISIT', 'disorder', 'age','gender','HAMD', 'HAMA', 'PDSS', 'ASI', 'APPQ','PSWQ','SPI','PSS','BIS','SSI']) ## keep only the HRV feature columns.
hrv_only[:] = scaler.fit_transform(hrv_only[:])  ## Min-max scaling; remove this line if scaling is not needed.
hrv_b1 = hrv.filter(regex='^b1')
hrv_s = hrv.filter(regex='^s')
hrv_b2 = hrv.filter(regex='^b2')
hrv_r = hrv.filter(regex='^r')
hrv_b3 = hrv.filter(regex='^b3')
hrv_c = hrv.filter(regex='^c')

In [None]:
hrv_only.head()

In [None]:
hrv_only.shape

In [None]:
## Check whether each phase contains the same number of variables.
print("HRV baseline #1 shape is:", hrv_b1.shape[1])
print("HRV stress shape is:", hrv_s.shape[1])
print("HRV baseline #2 shape is:", hrv_b2.shape[1])
print("HRV rest shape is:", hrv_r.shape[1])
print("HRV baseline #3 shape is:", hrv_b3.shape[1])
print("HRV recovery shape is:", hrv_c.shape[1])

In [None]:
hrv_s = hrv_s.drop(columns=['sub'])

In [None]:
## Renaming the columns for further calculation.
## We need to generate new dataframes to compare the phases.
hrv_sub = hrv.loc[:, ['sub']]
hrv_disorder = hrv.loc[:,['disorder']] -1 ## 0(Depression), 1(Panic Disorder), 2(Control)
hrv_gender = hrv.loc[:,['gender']]
hrv_HAMD = hrv.loc[:,['HAMD']]
hrv_PDSS = hrv.loc[:,['PDSS']]

hrv_variables = ["SDNN", "NN50", "PNN50", "RMSSD", "VLF", "LF", "HF", "LF/HF", "POWER", "HR", "RESP", "SC", "TEMP"]
hrv_b1_rename = hrv_b1.set_axis(hrv_variables, axis=1)
hrv_b2_rename = hrv_b2.set_axis(hrv_variables, axis=1)
hrv_b3_rename = hrv_b3.set_axis(hrv_variables, axis=1)
hrv_s_rename = hrv_s.set_axis(hrv_variables, axis=1)
hrv_r_rename = hrv_r.set_axis(hrv_variables, axis=1)
hrv_c_rename = hrv_c.set_axis(hrv_variables, axis=1)

In [None]:
hrv_disorder.value_counts()

- - -

## Comparisons between Phases

* HRV measuring steps: b1 - s - b2 - r - b3 - c
* Each has following variables (Total 13 variables): SDNN, NN50, PNN50, RMSSD, VLF, LF, HF, LF/HF, POWER, HR, RESP, SC, TEMP

Since the experimental steps are "b1-s-b2-r-b3-c", there are five transitions between consecutive phases. A sixth comparison, stress vs. rest, is added at the end.
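
As a compact sketch of the same idea, the difference frames can be built in one pass over a list of phase pairs. The frames below are hypothetical stand-ins with a shortened variable list. Note that pandas subtraction aligns on column labels, which is why the per-phase frames need identical column names first.

```python
import pandas as pd

order = ["b1", "s", "b2", "r", "b3", "c"]
base = pd.DataFrame({"SDNN": [40.0, 55.0], "HR": [70.0, 64.0]})

# Hypothetical data: each later phase adds 1 to every value of the previous phase
phase_frames = {name: base + i for i, name in enumerate(order)}

# Five consecutive transitions plus the stress-vs-rest comparison
pairs = [("b1", "s"), ("s", "b2"), ("b2", "r"), ("r", "b3"), ("b3", "c"), ("s", "r")]
diffs = {f"{a}_{b}": phase_frames[a] - phase_frames[b] for a, b in pairs}

print(diffs["s_r"])
```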

### 1) Baseline 1 - Stress phase

In [None]:
hrv_b1_s_sub = hrv_b1_rename - hrv_s_rename
hrv_b1_s_sub.head()

### 2) Stress - Baseline 2 phase

In [None]:
hrv_s_b2_sub = hrv_s_rename - hrv_b2_rename
hrv_s_b2_sub.head()

### 3) Baseline2 - Rest phase

In [None]:
hrv_b2_r_sub = hrv_b2_rename - hrv_r_rename
hrv_b2_r_sub.head()

### 4) Rest - Baseline 3 phase

In [None]:
hrv_r_b3_sub = hrv_r_rename - hrv_b3_rename
hrv_r_b3_sub.head()

### 5) Baseline 3 - Recovery phase

In [None]:
hrv_b3_c_sub = hrv_b3_rename - hrv_c_rename
hrv_b3_c_sub.head()

### 6) Stress - Rest phase

* This is what SMC checks for the research

In [None]:
hrv_s_r_sub = hrv_s_rename - hrv_r_rename
hrv_s_r_sub.head()

- - -

## Data preprocessing for ML

* The "hrv_only" data contains all HRV features from every measurement step (b1, s, b2, r, b3, c)

In [None]:
hrv_only.shape

In [None]:
hrv_arr = hrv_only.values
hrv_arr.shape

In [None]:
hrv_arr[0]

- - -

# Data Visualization

In [None]:
hrv.describe()

## Age and Disorder

In [None]:
sns.set_style('whitegrid')
g = sns.FacetGrid(hrv, col='disorder')
g.map(plt.hist, 'age', bins=20)

## Gender and Disorder

In [None]:
sns.set_style('whitegrid')
g = sns.FacetGrid(hrv, col='disorder')
g.map(plt.hist, 'gender', bins=20)

- - -

# Statistical Approaches

* Concept: Can we generate additional data from the current limited dataset, based on statistical theory?
> 1. Check the distribution of each data feature (SDNN, ...) and visualize it.
> 2. Calculate the correlation coefficients between variables, along with regression coefficients.
> 3. Calculate the mean, SD, and other statistics of each feature to characterize its distribution.
>> Working assumption: most features are approximately normal with different μ and σ. Strictly, the CLT guarantees this only for sample means, so the normality of each feature is checked with the Shapiro-Wilk test below.
> 4. Generate a random dataset based on these distributions, correlations, and regression coefficients.
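
The steps above can be sketched as follows, under the strong assumption that the features are jointly normal. All numbers are hypothetical. In practice the mean and covariance would be estimated separately for each group (MDD, PD, Control), and only after checking normality.

```python
import numpy as np

rng = np.random.default_rng(674)

# Hypothetical "observed" data: 200 samples of 3 correlated features
true_cov = np.array([[1.0, 0.6, 0.2],
                     [0.6, 1.0, 0.4],
                     [0.2, 0.4, 1.0]])
observed = rng.multivariate_normal(mean=[50.0, 10.0, 75.0], cov=true_cov, size=200)

# Steps 2-3: estimate the mean vector and the covariance matrix from the observed data
mu_hat = observed.mean(axis=0)
cov_hat = np.cov(observed, rowvar=False)

# Step 4: draw new samples that preserve the estimated means and correlations
synthetic = rng.multivariate_normal(mean=mu_hat, cov=cov_hat, size=1000)

print(np.corrcoef(observed, rowvar=False).round(2))
print(np.corrcoef(synthetic, rowvar=False).round(2))
```

Synthetic samples drawn this way carry no information beyond the estimated mean and covariance, so they cannot substitute for an independent test set.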

## HRV Variable Distributions

* dataset lists:
> baseline1 ~ stress  -- hrv_b1_s_sub \
> stress ~ baseline2  -- hrv_s_b2_sub \
> baseline2 ~ rest    -- hrv_b2_r_sub \
> rest ~ baseline3    -- hrv_r_b3_sub \
> baseline3 ~ recovery -- hrv_b3_c_sub \
> stress ~ rest  -- hrv_s_r_sub

In [None]:
data_vis = hrv_s_r_sub

* Generating new dataframe that we want to check the distribution of.

In [None]:
hrv_visual = pd.concat([data_vis, hrv_disorder],axis=1)

In [None]:
## Separating dataframe into three different groups (CONTROL, MDD, PD)
hrv_CON = hrv_visual[hrv_visual["disorder"] == 2]
hrv_MDD = hrv_visual[hrv_visual["disorder"] == 0]
hrv_PD = hrv_visual[hrv_visual["disorder"] == 1]

* Total 13 variables: SDNN, NN50, PNN50, RMSSD, VLF, LF, HF, LF/HF, POWER, HR, RESP, SC, TEMP

In [None]:
## Set the variable that we want to check
var = "TEMP"

In [None]:
CON = hrv_CON[var]
MDD = hrv_MDD[var]
PD = hrv_PD[var]

* Comparing one variable for three groups

In [None]:
plt.figure(figsize = (10,5))
sns.set_style("whitegrid")
plt.grid(True)
plt.xlabel('Variable: {}'.format(var), fontsize=10)
plt.ylabel('Density',fontsize=10)

sns.kdeplot(CON)
sns.kdeplot(MDD)
sns.kdeplot(PD)

# plt.legend()
plt.legend(['Control', 'Major Depressive Disorder', 'Panic Disorder'], fontsize=10)

# plt.savefig('./data/figures/distributions/stress_rest/TEMP.png')

In [None]:
## T-test for equal mean value check
## if p-value < 0.05, two distributions do not have equal mean values.
print(">T-TEST")
print("Mean value check for CON and MDD, p-value: {:.3f}".format(ttest_ind(CON, MDD).pvalue))
print("Mean value check for CON and PD, p-value: {:.3f}".format(ttest_ind(CON, PD).pvalue))
print("Mean value check for MDD and PD, p-value: {:.3f}".format(ttest_ind(MDD, PD).pvalue))
print("-----------------------------------------------")


## Bartlett-test for equal variability check
## if p-value < 0.05, two distributions do not have equal variance.
print(">Bartlett-test")
print("Equal Variability test for CON and MDD, p-value: {:.3f}".format(bartlett(CON, MDD).pvalue))
print("Equal Variability test for CON and PD, p-value: {:.3f}".format(bartlett(CON, PD).pvalue))
print("Equal Variability test for MDD and PD, p-value: {:.3f}".format(bartlett(MDD, PD).pvalue))
print("-----------------------------------------------")


## Shapiro-Wilk test for normal distribution check
## if p-value < 0.05, distribution is not following normal distribution.
print(">Shapiro-Wilk test")
print("Normal distribution test for CON, p-value: {:.3f}".format(shapiro(CON).pvalue))
print("Normal distribution test for MDD, p-value: {:.3f}".format(shapiro(MDD).pvalue))
print("Normal distribution test for PD, p-value: {:.3f}".format(shapiro(PD).pvalue))
print("-----------------------------------------------")


## Kolmogorov-Smirnov test for equal distribution check
## if p-value < 0.05, two distributions are not following same distribution. 
print(">Kolmogorov-Smirnov test")
print("Equal distributions test between CON and MDD, p-value: {:.3f}".format(ks_2samp(CON, MDD).pvalue))
print("Equal distributions test between CON and PD, p-value: {:.3f}".format(ks_2samp(CON, PD).pvalue))
print("Equal distributions test between MDD and PD, p-value: {:.3f}".format(ks_2samp(MDD, PD).pvalue))
print("-----------------------------------------------")

In [None]:
## Baseline #1 series; the names match the ones used in the density plot.
b1SDNN = hrv_only['b1SDNN']
b1NN50 = hrv_only['b1NN50']
b1PNN50 = hrv_only['b1PNN50']
b1RMSSD = hrv_only['b1RMSSD']
b1VLF = hrv_only['b1VLF']
b1LF = hrv_only['b1LF']
b1HF = hrv_only['b1HF']
b1LFHF = hrv_only['b1LF/HF']
b1POWER = hrv_only['b1POWER']
b1RESP = hrv_only['b1RESP']
b1TEMP = hrv_only['b1TEMP']
b1HR = hrv_only['b1HR']

* All variables

In [None]:
plt.figure(figsize = (10,5))
sns.set_style("whitegrid")
plt.grid(True)
plt.xlabel('Min-max scaled variables',fontsize=10)
plt.ylabel('Density',fontsize=10)

sns.kdeplot(b1SDNN)
sns.kdeplot(b1NN50)
sns.kdeplot(b1RMSSD)
# sns.kdeplot(b1VLF)
sns.kdeplot(b1LF)
# sns.kdeplot(b1HF)
sns.kdeplot(b1LFHF)
# sns.kdeplot(b1POWER)
# sns.kdeplot(b1PNN50)
sns.kdeplot(b1RESP)
sns.kdeplot(b1TEMP)
sns.kdeplot(b1HR)

# plt.legend()
plt.legend(['b1SDNN', 'b1NN50', 'b1RMSSD', 'b1LF', 'b1LF/HF', 'b1RESP', 'b1TEMP', 'b1HR'], fontsize=10)

- - -

## Central Limit Theorem approach

- - -

## Correlation between data features

* To generate a new dataset from each feature distribution, we need to know the correlation and regression coefficients.

In [None]:
hrv_visual.columns

In [None]:
hrv_visual.corr()

* Visualize the correlation

In [None]:
plt.figure(figsize = (15,15))
corrMat = hrv_visual.corr()
sns.heatmap(corrMat, annot=True)
plt.show()

* Check whether each correlation coefficient is statistically significant
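
A sketch of this check over every pair of columns (not only neighbouring ones) is given here, using `itertools.combinations` together with `scipy.stats.pearsonr`. The toy data and the column names `A`, `B`, `C` are hypothetical.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
x = rng.normal(size=100)
toy = pd.DataFrame({
    "A": x,
    "B": 2.0 * x + rng.normal(scale=0.1, size=100),  # strongly related to A
    "C": rng.normal(size=100),                       # independent noise
})

rows = []
for left, right in combinations(toy.columns, 2):
    r, p = pearsonr(toy[left], toy[right])  # coefficient and two-sided p-value
    rows.append({"Variable #1": left, "Variable #2": right,
                 "correlation": r, "p-value": p})

pairwise = pd.DataFrame(rows)
pairwise["significant"] = pairwise["p-value"] < 0.05
print(pairwise)
```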

In [None]:
## pearsonr function shows individual correlation coefficient with p-value
pearsonr(hrv_visual['SDNN'], hrv_visual['NN50'])

In [None]:
## Loop over adjacent column pairs and compute the correlation coefficient and its p-value.
col = list(hrv_visual)
corr_result = []
for i in range(len(col) - 1):
    a = hrv_visual[col[i]]
    b = hrv_visual[col[i + 1]]
    r, p = pearsonr(a, b)
    corr_result.append((r, p))

In [None]:
corr_result_df = pd.DataFrame(corr_result, columns=['correlation', 'p-value'])

In [None]:
var_names = []
for i in range(0,len(col)-1):
    cur_var = (col[i], col[i+1])
    var_names.append(cur_var)

In [None]:
var_names_df = pd.DataFrame(var_names, columns=['Variable #1', 'Variable #2'])

In [None]:
correlation_df = pd.concat([var_names_df, corr_result_df], axis=1)

In [None]:
correlation_df['reliability'] = np.where(correlation_df['p-value']<0.05, "o", "x")

In [None]:
correlation_df

In [None]:
sd = np.std(hrv_visual['SDNN'])

In [None]:
hrv_visual.mean()

In [None]:
hrv_visual.std()

## Regression Coefficients

* To generate a new dataset from each feature distribution, we need to know the correlation and regression coefficients.
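
To make the role of the regression coefficients concrete, the sketch below turns a coefficient matrix into class probabilities with a softmax. It assumes the multinomial formulation of logistic regression for a three-class problem. The coefficient and intercept values are hypothetical, laid out as one row per class and one column per feature.

```python
import numpy as np

# Hypothetical coefficients: 3 classes x 3 features
coef = np.array([[ 0.8, -0.2,  0.1],
                 [-0.3,  0.5,  0.0],
                 [-0.5, -0.3, -0.1]])
intercept = np.array([0.1, 0.0, -0.1])

def predict_proba(features):
    """Softmax over the per-class linear scores."""
    scores = features @ coef.T + intercept
    scores = scores - scores.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

sample = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
probs = predict_proba(sample)
print(probs.round(3))
```

A positive coefficient raises the score of its class as the feature increases. Coefficient magnitudes are only comparable across features when the features share a common scale.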

In [None]:
hrv_visual.columns

In [None]:
# features = hrv_visual[['SDNN', 'NN50', 'PNN50', 'RMSSD', 'VLF', 'LF', 'HF', 'LF/HF', 'POWER', 'HR', 'RESP', 'SC', 'TEMP']]
# features = hrv_visual[['SDNN', 'NN50', 'PNN50', 'RMSSD', 'LF/HF', 'HR']]
features = hrv_visual[['PNN50', 'LF/HF', 'HR']] ## variables highlighted in the previous research (Professor Jeon).

disorder = hrv_visual[['disorder']]

In [None]:
train_features, test_features, train_labels, test_labels = train_test_split(features, disorder)

In [None]:
model = LogisticRegression()
model.fit(train_features, train_labels.values.ravel())  ## flatten labels to 1-D as scikit-learn expects

In [None]:
print(model.score(train_features, train_labels))

In [None]:
print(model.coef_)

- - -

# Data Analysis

## Data Selection

In [None]:
X = hrv_b1_s_sub
Y = hrv_disorder

In [None]:
var_selection = ["SDNN", "NN50","PNN50", "RMSSD", "LF", "HF", "LF/HF", "HR"] ## Choose the variables that must be adopted for input values
X = X.loc[:,var_selection]

In [None]:
## Generating dataset with y label on it. 
hrv_data = pd.concat([hrv_s_r_sub, hrv_disorder], axis=1)

In [None]:
hrv_data.head()

## Train-Test Split

In [None]:
X.columns

In [None]:
## Split X and Y into training dataset and test dataset
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size = 0.2, random_state = 42)

In [None]:
print("x_train dataset shape is", x_train.shape)
print("y_train dataset shape is", y_train.shape)

print("x_test dataset shape is", x_test.shape)
print("y_test dataset shape is", y_test.shape)

In [None]:
## Converting dataframe format into numpy array
x_train_np = x_train.to_numpy()
y_train_np = y_train.to_numpy()
x_test_np = x_test.to_numpy()
y_test_np = y_test.to_numpy()

In [None]:
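## Use TensorDataset to create datasets from the ndarrays
## (float32 features to match the model weights; 1-D int64 labels as required by the loss)
train_dataset = TensorDataset(torch.tensor(x_train_np, dtype=torch.float32), torch.tensor(y_train_np.ravel(), dtype=torch.long))
test_dataset = TensorDataset(torch.tensor(x_test_np, dtype=torch.float32), torch.tensor(y_test_np.ravel(), dtype=torch.long))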

In [None]:
## Setting trainloader and testloader for training
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=args.bs, shuffle=True, num_workers=4)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=args.bs, shuffle=False, num_workers=4)

- - -

## Regression

In [None]:
logistic_reg = LogisticRegression(solver='lbfgs', max_iter = 4000)
logistic_reg.fit(x_train, y_train.values.ravel())

In [None]:
predictions = logistic_reg.predict(x_test)

In [None]:
print(confusion_matrix(y_test,predictions))

- - -

## Decision Tree

In [None]:
dt_model=DecisionTreeClassifier()
dt_model.fit(x_train, y_train)

In [None]:
dt_pred = dt_model.predict(x_test)

In [None]:
print(confusion_matrix(y_test,dt_pred))

In [None]:
print(classification_report(y_test,dt_pred))

- - -

## Random Forest Classification

In [None]:
rf= RandomForestClassifier(n_estimators=5000)
rf.fit(x_train, y_train.values.ravel())

In [None]:
rf_pre=rf.predict(x_test)

In [None]:
print(confusion_matrix(y_test, rf_pre))

In [None]:
print(classification_report(y_test, rf_pre))

- - -

## XGBoost Classifier

In [None]:
xgboost = XGBClassifier(n_estimators=1000, eval_metric='mlogloss')
xgboost.fit(x_train, y_train)

In [None]:
xg_pred = xgboost.predict(x_test)

In [None]:
print(confusion_matrix(y_test, xg_pred))

In [None]:
print(classification_report(y_test, xg_pred))

- - -

## Multi-Layer Perceptron

* Simple MLP

In [None]:
input_size = x_train.shape[1]

In [None]:
class MLP_HRV(nn.Module):
    def __init__(self):
        super(MLP_HRV, self).__init__()
        self.layer1 = nn.Linear(input_size, 128)
        self.layer2 = nn.Linear(128, 128)
        self.layer3 = nn.Linear(128, 3)

    def forward(self, x):
        x = x.view(-1, input_size)
        x = self.layer1(x)
        x = F.relu(x)
        x = self.layer2(x)
        x = F.relu(x)
        x = self.layer3(x)
        x = F.log_softmax(x, dim=1)
        return x

In [None]:
model = MLP_HRV().to(DEVICE)
print(model)

In [None]:
criterion = nn.NLLLoss()  ## the model already returns log-probabilities via log_softmax
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

In [None]:
x_train_tensor = torch.tensor(x_train.values, dtype=torch.float32)  ## separate name so the x_train DataFrame stays intact

In [None]:
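for epoch in range(args.epochs):
    loss = 0.0
    batch = len(train_loader)  ## number of mini-batches per epoch

    for inputs, labels in train_loader:
        ## cast to the dtypes the model and loss expect, then move to the device
        inputs = inputs.view(-1, input_size).float().to(DEVICE)
        labels = labels.view(-1).long().to(DEVICE)

        optimizer.zero_grad()
        hypothesis = model(inputs)
        cost = criterion(hypothesis, labels)
        cost.backward()
        optimizer.step()
        loss += cost.item() / batch  ## accumulate a plain float so the graph is not retained

    print('Epoch:', '%03d' % (epoch + 1), 'Training loss =', '{:.5f}'.format(loss))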

- - -

## Convolutional Neural Network

In [None]:
print("X shape is ", X.shape)
print("Y shape is ", Y.shape)

In [None]:
X.head()

- - -

## Autoencoder

* Here, we use an autoencoder to extract the core features from the dataset
* An autoencoder is useful for reducing the dimensionality of a high-dimensional dataset
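
As a minimal sketch of the idea, an autoencoder is trained to reconstruct its own input through a narrow latent layer, and the latent output then serves as the reduced feature set. The layer sizes, learning rate, epoch count, and the random stand-in data below are all assumptions chosen for illustration and are not taken from the dataset.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

n_features, latent_dim = 12, 3         # hypothetical sizes
data = torch.rand(64, n_features)      # random stand-in for min-max scaled HRV features

encoder = nn.Sequential(nn.Linear(n_features, 8), nn.ReLU(), nn.Linear(8, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 8), nn.ReLU(), nn.Linear(8, n_features))

params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=0.01)
criterion = nn.MSELoss()

losses = []
for epoch in range(200):
    optimizer.zero_grad()
    latent = encoder(data)                   # compressed representation
    reconstruction = decoder(latent)         # attempt to rebuild the input
    loss = criterion(reconstruction, data)   # reconstruction error
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print("first loss:", losses[0], "last loss:", losses[-1])
```

After training, `encoder(data)` yields one `latent_dim`-sized vector per sample, which could be passed to any of the classifiers used earlier.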

In [None]:
task_data = hrv_s_b2_sub  ## separate name so the `task` string selected earlier is not overwritten
# data_auto = pd.concat([task_data, hrv_disorder], axis=1)
data_auto = pd.concat([hrv_only, hrv_disorder], axis=1)

In [None]:
data_auto.head()

In [None]:
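class Autoencoder(nn.Module):
    ## hidden_dim and latent_dim are placeholder assumptions to be tuned;
    ## input_dim defaults to the number of HRV feature columns.
    def __init__(self, input_dim=hrv_only.shape[1], hidden_dim=32, latent_dim=8):
        super(Autoencoder, self).__init__()
        
        ## encoder is similar to a simple neural network
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden_dim), # gradually reducing dimensionality
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim // 2),
            nn.ReLU(),
            nn.Linear(hidden_dim // 2, latent_dim),
        )
        ## decoder restores the dimensionality to the original dataset size
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim // 2), # gradually increasing dimensionality
            nn.ReLU(),
            nn.Linear(hidden_dim // 2, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, input_dim),
        )
        
    def forward(self, x):
        encoded = self.encoder(x)         ## creating latent variable 'encoded'
        decoded = self.decoder(encoded)   ## generating reconstructed input 'decoded'
        return encoded, decoded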

In [None]:
autoencoder = Autoencoder().to(DEVICE)
optimizer = torch.optim.Adam(autoencoder.parameters(), lr = args.lr)  ## Adam for optimization function.
criterion = nn.MSELoss()  ## Using MSE(Mean Squared Error) to calculate the differences between original data and decoded data