# Evaluating Dataset Performance in Mental Health Classification

| Method      | Type | F1-Mac | P-Mac | R-Mac | F1-Mic | P-Mic | R-Mic | Avg.   |
|-------------|------|--------|-------|-------|--------|-------|-------|--------|
| **Dataset 1** |      |        |       |       |        |       |       |        |
| SVM         | ML   | 0.46   | 0.71  | 0.42  | 0.62   | 0.75  | 0.52  | 0.58   |
| Light GBM   | ML   | 0.58   | 0.48  | 0.80  | 0.65   | 0.52  | 0.86  | 0.65   |
| XGBoost     | ML   | 0.57   | 0.61  | 0.54  | 0.65   | 0.68  | 0.61  | 0.61   |
| GAN-BERT    | DL   | 0.70   | 0.69  | 0.72  | 0.75   | 0.73  | 0.77  | 0.73   |
| BERT        | DL   | 0.74   | 0.72  | 0.77  | 0.79   | 0.76  | 0.83  | 0.77   |
| BART        | DL   | 0.76   | 0.70  | 0.81  | 0.80   | 0.74  | 0.86  | 0.78   |
| **Dataset 2** |      |        |       |       |        |       |       |        |
| SVM         | ML   | 0.04   | 0.04  | 0.04  | 0.07   | 0.13  | 0.04  | 0.06   |
| Light GBM   | ML   | 0.04   | 0.04  | 0.04  | 0.06   | 0.12  | 0.04  | 0.06   |
| XGBoost     | ML   | 0.07   | 0.06  | 0.07  | 0.11   | 0.19  | 0.07  | 0.10   |
| GAN-BERT    | DL   |        |       |       |        |       |       |        |
| BERT        | DL   | 0.80   | 0.81  | 0.80  | 0.79   | 0.79  | 0.79  | 0.79   |
| BART        | DL   |        |       |       |        |       |       |        |
| **Dataset 3** |      |        |       |       |        |       |       |        |
| SVM         | ML   | 0.12   | 0.13  | 0.12  | 0.16   | 0.25  | 0.12  | 0.15   |
| Light GBM   | ML   | 0.13   | 0.13  | 0.13  | 0.17   | 0.26  | 0.13  | 0.16   |
| XGBoost     | ML   | 0.17   | 0.17  | 0.16  | 0.22   | 0.34  | 0.16  | 0.20   |
| GAN-BERT    | DL   |        |       |       |        |       |       |        |
| BERT        | DL   | 0.90   | 0.91  | 0.90  | 0.90   | 0.90  | 0.90  | 0.90   |
| BART        | DL   |        |       |       |        |       |       |        |

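The `Avg.` column is not defined in the table itself. A minimal sketch, assuming it is the unweighted mean of the six metrics in each row, reproduces the value reported for SVM on Dataset 1 (the variable names are illustrative):

```python
from statistics import mean

# Six metric values from the SVM row of Dataset 1, in table column order:
# F1-Mac, P-Mac, R-Mac, F1-Mic, P-Mic, R-Mic.
svm_dataset1 = [0.46, 0.71, 0.42, 0.62, 0.75, 0.52]

# Assumption: "Avg." is the plain (unweighted) mean of the six metrics.
avg = round(mean(svm_dataset1), 2)
print(avg)  # 0.58, matching the Avg. column for this row
```
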

## Dataset 1

In [4]:
emotion_list = ['anger', 'brain dysfunction (forget)', 'emptiness', 'hopelessness', 'loneliness', 'sadness', 'suicide intent', 'worthlessness']

### 1. SVM

In [3]:
from DepressionEmo.svm import SVM_MentalHealthClassifier

classifier = SVM_MentalHealthClassifier(
    train_path='./DepressionEmo/Dataset/train.json',
    model_path='./Model/Dataset1/svc_model.pkl',
    emotion_list=emotion_list,
    val_path='./DepressionEmo/Dataset/val.json',
    test_path='./DepressionEmo/Dataset/test.json',
    dataset_type='dataset1'
)

x_train, x_val, x_test, y_train, y_val, y_test = classifier.load_and_process_data()
model = classifier.train_model(x_train, y_train)
metrics = classifier.evaluate_model(model, x_test, y_test)

Fitting SVC took 36.41 seconds
{'f1_micro': 0.6178690538880113, 'recall_micro': 0.5238095238095238, 'precision_micro': 0.753102267864784, 'f1_macro': 0.4619270084520876, 'recall_macro': 0.4203468366154963, 'precision_macro': 0.706436681375938}

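The internals of `evaluate_model` are not shown in this notebook. The sketch below assumes the standard definitions behind the keys in the dictionary above: micro scores pool true-positive, false-positive, and false-negative counts over all labels, while macro scores average the per-label scores. The helper names `prf` and `micro_macro_scores` are hypothetical, and undefined ratios are set to 0.0 by choice.

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from raw counts; 0.0 when a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def micro_macro_scores(y_true, y_pred):
    """y_true and y_pred are lists of 0/1 label vectors, one vector per sample."""
    n_labels = len(y_true[0])
    counts = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        counts.append((tp, fp, fn))

    # Micro: pool the counts over all labels, then compute the scores once.
    p_mic, r_mic, f_mic = prf(*(sum(c[i] for c in counts) for i in range(3)))

    # Macro: compute the scores per label, then take the unweighted mean.
    per_label = [prf(*c) for c in counts]
    p_mac, r_mac, f_mac = (sum(s[i] for s in per_label) / n_labels for i in range(3))

    return {
        "f1_micro": f_mic, "recall_micro": r_mic, "precision_micro": p_mic,
        "f1_macro": f_mac, "recall_macro": r_mac, "precision_macro": p_mac,
    }


# Toy data: label 0 is always predicted correctly, label 1 is never predicted.
y_true = [[1, 0], [1, 0], [1, 0], [0, 1]]
y_pred = [[1, 0], [1, 0], [1, 0], [0, 0]]
scores = micro_macro_scores(y_true, y_pred)
print(scores["f1_micro"], scores["f1_macro"])
```

In the toy data the missed rare label pulls macro F1 down to 0.5 while micro F1 stays near 0.86, the same direction as the gap between `f1_macro` and `f1_micro` in the SVM result above.
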

### 2. Light GBM

In [4]:
from DepressionEmo.light_gbm import LightGBM_MentalHealthClassifier

classifier = LightGBM_MentalHealthClassifier(
    train_path='DepressionEmo/Dataset/train.json',
    val_path='DepressionEmo/Dataset/val.json',
    test_path='DepressionEmo/Dataset/test.json',
    model_save_path='./Model/Dataset1/LightGBM.txt',
    emotion_list=emotion_list,
    label_list=emotion_list,
    dataset_type='dataset1'
)

classifier.train_model()

{'f1_micro': 0.6454908929364727, 'recall_micro': 0.8590008867868756, 'precision_micro': 0.5169898594556129, 'f1_macro': 0.5782674332411973, 'recall_macro': 0.7994555332835106, 'precision_macro': 0.478785406973599}

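To carry a result like the one above into the summary table, the values have to be reordered into the table's column order and rounded to two decimals. A small sketch of that step follows; the helper `metrics_to_row` is hypothetical and not part of the `DepressionEmo` package, and the average is assumed to be the unweighted mean of the six metrics.

```python
# Table column order: F1-Mac, P-Mac, R-Mac, F1-Mic, P-Mic, R-Mic.
COLUMNS = [
    "f1_macro", "precision_macro", "recall_macro",
    "f1_micro", "precision_micro", "recall_micro",
]


def metrics_to_row(method, kind, metrics):
    """Format one metrics dictionary as a markdown table row."""
    values = [metrics[key] for key in COLUMNS]
    average = sum(values) / len(values)  # assumed definition of the Avg. column
    cells = [method, kind] + [f"{v:.2f}" for v in values] + [f"{average:.2f}"]
    return "| " + " | ".join(cells) + " |"


# Light GBM result on Dataset 1, copied from the output above.
lightgbm_dataset1 = {
    "f1_micro": 0.6454908929364727,
    "recall_micro": 0.8590008867868756,
    "precision_micro": 0.5169898594556129,
    "f1_macro": 0.5782674332411973,
    "recall_macro": 0.7994555332835106,
    "precision_macro": 0.478785406973599,
}

row = metrics_to_row("Light GBM", "ML", lightgbm_dataset1)
print(row)  # | Light GBM | ML | 0.58 | 0.48 | 0.80 | 0.65 | 0.52 | 0.86 | 0.65 |
```
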

### 3. XGBoost

In [5]:
from DepressionEmo.xgb import XGB_MentalHealthClassifier

classifier = XGB_MentalHealthClassifier(
    train_path='DepressionEmo/Dataset/train.json',
    val_path='DepressionEmo/Dataset/val.json',
    test_path='DepressionEmo/Dataset/test.json',
    model_path='./Model/Dataset1/XGB.pkl',
    max_depth=8,
    n_estimators=100,
    learning_rate=0.5,
    emotion_list=emotion_list
)

x_train, x_test, x_val, y_train, y_test, y_val = classifier.load_data()
model, label_encoder = classifier.train_model(x_train, y_train)
result = classifier.evaluate_model(model, label_encoder, x_test, y_test)
print("Evaluation Results:", result)

Evaluation Results: {'f1_micro': 0.6454133458352868, 'recall_micro': 0.6104049660065031, 'precision_micro': 0.6846816976127321, 'f1_macro': 0.5747847886708597, 'recall_macro': 0.5423582185167811, 'precision_macro': 0.6135163527186734}


### 4. BERT

Train

In [1]:
!python ./DepressionEmo/bert.py  --mode "train" --model_name "bert-base-cased" --epochs 25 --batch_size 8 --max_length 256 --train_path "DepressionEmo/Dataset/train.json" --val_path "DepressionEmo/Dataset/val.json" --test_path "DepressionEmo/Dataset/test.json"

Test

In [2]:
!python ./DepressionEmo/bert.py --mode "test" --train_path "DepressionEmo/Dataset/train.json" --val_path "DepressionEmo/Dataset/val.json" --test_path "DepressionEmo/Dataset/test.json" --max_length 300 --test_batch_size 16

### 5. GAN-BERT

Train

In [2]:
!python ./DepressionEmo/gan.py --mode "train" --model_name "bert-base-cased" --lr_discriminator 2e-5 --lr_generator 2e-5 --epochs 25 --batch_size 8

There are 1 GPU(s) available.
We will use the GPU:

Test

### 6. BART

## Dataset 2

In [6]:
label_list = [0,1,2]

In [7]:
emotion_list = ['Figurative Mentions', 'Non-Health Mentions', 'Health mentions']

### 1. SVM

In [8]:
from DepressionEmo.svm import SVM_MentalHealthClassifier

classifier = SVM_MentalHealthClassifier(
    train_path='./Dataset2/RHMD-Health-Mention-Dataset/RHMD_3_Class.csv',
    model_path='./Model/Dataset2/svc_model.pkl',
    emotion_list=emotion_list,
    label_list=label_list,
    dataset_type='dataset2'
)

x_train, x_val, x_test, y_train, y_val, y_test = classifier.load_and_process_data()
model = classifier.train_model(x_train, y_train)
metrics = classifier.evaluate_model(model, x_test, y_test)

Fitting SVC took 10.27 seconds
{'f1_micro': 0.0658682634730539, 'recall_micro': 0.043941411451398134, 'precision_micro': 0.13147410358565736, 'f1_macro': 0.04330708661417323, 'recall_macro': 0.04280155642023346, 'precision_macro': 0.04382470119521912}



### 2. Light GBM

In [9]:
from DepressionEmo.light_gbm import LightGBM_MentalHealthClassifier

classifier = LightGBM_MentalHealthClassifier(
    train_path='./Dataset2/RHMD-Health-Mention-Dataset/RHMD_3_Class.csv',
    model_save_path='./Model/Dataset2/LightGBM.txt',
    emotion_list=emotion_list,
    label_list=label_list,
    dataset_type='dataset2'
)

classifier.train_model()

{'f1_micro': 0.05997001499250375, 'recall_micro': 0.03994673768308921, 'precision_micro': 0.12024048096192384, 'f1_macro': 0.039486673247778874, 'recall_macro': 0.038910505836575876, 'precision_macro': 0.04008016032064128}


### 3. XGBoost

In [10]:
from DepressionEmo.xgb import XGB_MentalHealthClassifier

classifier = XGB_MentalHealthClassifier(
    train_path='./Dataset2/RHMD-Health-Mention-Dataset/RHMD_3_Class.csv',
    model_path='./Model/Dataset2/XGB.pkl',
    max_depth=8,
    n_estimators=100,
    learning_rate=0.5,
    emotion_list = emotion_list,
    label_list=label_list,
    dataset_type='dataset2'
)

x_train, x_test, x_val, y_train, y_test, y_val = classifier.load_data()
model, label_encoder = classifier.train_model(x_train, y_train)
result = classifier.evaluate_model(model, label_encoder, x_test, y_test)
print("Evaluation Results:", result)


Evaluation Results: {'f1_micro': 0.10637278390033542, 'recall_micro': 0.07390146471371505, 'precision_micro': 0.18974358974358974, 'f1_macro': 0.06733393994540492, 'recall_macro': 0.07198443579766538, 'precision_macro': 0.06324786324786325}



### 4. BERT

Train

In [1]:
!python ./DepressionEmo/bert.py  --mode "train" --model_name "bert-base-cased" --epochs 25 --batch_size 8 --max_length 256 --train_path "./Dataset2/RHMD-Health-Mention-Dataset/RHMD_3_Class.csv" --dataset_type "dataset2"

Test

In [1]:
!python ./DepressionEmo/bert.py --mode "test" --train_path "./Dataset2/RHMD-Health-Mention-Dataset/RHMD_3_Class.csv" --max_length 300 --test_batch_size 16 --dataset_type "dataset2"

### 5. GAN-BERT

### 6. BART

## Dataset 3

In [11]:
emotion_list = [0, 1]

### 1. SVM

In [12]:
from DepressionEmo.svm import SVM_MentalHealthClassifier

classifier = SVM_MentalHealthClassifier(
    train_path='./Dataset3/dreaddit-train.csv',
    model_path='./Model/Dataset3/svc_model.pkl',
    emotion_list=emotion_list,
    label_list=emotion_list,
    test_path='./Dataset3/dreaddit-test.csv',
    dataset_type='dataset3'
)

x_train, x_val, x_test, y_train, y_val, y_test = classifier.load_and_process_data()
model = classifier.train_model(x_train, y_train)
metrics = classifier.evaluate_model(model, x_test, y_test)

Extra columns in train: []
Extra columns in test: []


Fitting SVC took 2.59 seconds
{'f1_micro': 0.1613316261203585, 'recall_micro': 0.11819887429643527, 'precision_micro': 0.2540322580645161, 'f1_macro': 0.1218568665377176, 'recall_macro': 0.1171003717472119, 'precision_macro': 0.12701612903225806}



### 2. Light GBM

In [13]:
from DepressionEmo.light_gbm import LightGBM_MentalHealthClassifier

classifier = LightGBM_MentalHealthClassifier(
    train_path='./Dataset3/dreaddit-train.csv',
    model_save_path='./Model/Dataset3/LightGBM.txt',
    emotion_list=emotion_list,
    label_list=emotion_list,
    test_path='./Dataset3/dreaddit-test.csv',
    dataset_type='dataset3'
)

classifier.train_model()

Extra columns in train: []
Extra columns in test: []
{'f1_micro': 0.1710691823899371, 'recall_micro': 0.1275797373358349, 'precision_micro': 0.2595419847328244, 'f1_macro': 0.128060263653484, 'recall_macro': 0.12639405204460966, 'precision_macro': 0.1297709923664122}


### 3. XGBoost

In [14]:
from DepressionEmo.xgb import XGB_MentalHealthClassifier

classifier = XGB_MentalHealthClassifier(
    train_path='./Dataset3/dreaddit-train.csv',
    test_path='./Dataset3/dreaddit-test.csv',
    model_path='./Model/Dataset3/XGB.pkl',
    max_depth=8,
    n_estimators=100,
    learning_rate=0.5,
    emotion_list = emotion_list,
    label_list=emotion_list,
    dataset_type='dataset3'
)

x_train, x_test, x_val, y_train, y_test, y_val = classifier.load_data()
model, label_encoder = classifier.train_model(x_train, y_train)
result = classifier.evaluate_model(model, label_encoder, x_test, y_test)
print("Evaluation Results:", result)


Extra columns in train: []
Extra columns in test: []
Evaluation Results: {'f1_micro': 0.22081218274111675, 'recall_micro': 0.16322701688555347, 'precision_micro': 0.3411764705882353, 'f1_macro': 0.16603053435114504, 'recall_macro': 0.16171003717472118, 'precision_macro': 0.17058823529411765}



### 4. BERT

Train

In [1]:
!python ./DepressionEmo/bert.py  --mode "train" --model_name "bert-base-cased" --epochs 25 --batch_size 8 --max_length 256 --train_path "./Dataset3/dreaddit-train.csv" --test_path "./Dataset3/dreaddit-test.csv" --dataset_type "dataset3"

Test

In [1]:
!python ./DepressionEmo/bert.py --mode "test" --train_path "./Dataset3/dreaddit-train.csv" --test_path "./Dataset3/dreaddit-test.csv" --max_length 300 --test_batch_size 16 --dataset_type "dataset3"

### 5. GAN-BERT

### 6. BART