
Multi-modal Depression Estimation based on Sub-attentional Fusion

This is the complete project of my bachelor thesis on automatic depression estimation using deep learning. The corresponding publication can be found here or here.

This repository includes:

  • The source code of the final model, "Sub-attentional ConvBiLSTM with AVT modality"
  • The source code of the other models
  • Trained weights for each model
  • Other useful scripts
  • Documentation and images

DAIC-WOZ Depression Database

The DAIC-WOZ depression database is used in this thesis. Here is the official link, where you can submit a request and receive your own username and password for accessing the database.

Some visualizations of each data type exploited in this work:

  1. Visual data of micro-facial expressions

  2. Acoustic data of log-mel spectrograms

  3. Text data of sentence embeddings
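For orientation, the acoustic representation above can be produced with standard tools. The following is a minimal sketch using librosa, not the exact preprocessing pipeline of this thesis; the sample rate, mel-band count, and file name are assumptions for illustration only.

import librosa
import numpy as np

# Minimal sketch: compute a log-mel spectrogram from one audio file.
# Parameters here are assumptions, not the thesis' exact settings.
def extract_log_mel(wav_path, sr=16000, n_mels=80):
    y, sr = librosa.load(wav_path, sr=sr)            # load and resample
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)      # power -> dB, shape (n_mels, n_frames)

# log_mel = extract_log_mel("<participant>_AUDIO.wav")  # hypothetical file name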

To download the database automatically, a script, download_DAIC-WOZ.py, is provided in the daic_woz_preprocessing directory. Enter that directory and run the script with the following two commands:

cd <path/to/daic_woz_preprocessing>

python download_DAIC-WOZ.py --out_dir=<where/to/store/absolute_path> --username=<the_given_username> --password=<the_given_password>
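Under the hood, the download amounts to authenticated HTTP requests. If you ever need to fetch a single file manually, a minimal sketch with the requests library could look like this; the URL is a placeholder, not the actual database address, and this is not the script's own code.

import requests

# Sketch of one authenticated file download (not the actual script's code).
def download_file(url, username, password, out_path):
    with requests.get(url, auth=(username, password), stream=True) as r:
        r.raise_for_status()                       # fail early on bad credentials
        with open(out_path, "wb") as f:
            for chunk in r.iter_content(chunk_size=1 << 20):  # 1 MiB chunks
                f.write(chunk)

# download_file("https://<database-host>/<session>.zip", "<username>", "<password>", "session.zip")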

After downloading the dataset, if you wish to preprocess the data as we did, two kinds of database-generation scripts are provided in the database_generation_v1 and database_generation_v2 directories under the daic_woz_preprocessing folder.

  • For database_generation_v1: This generates the training, validation, and test datasets from the train set, development set, and test set, respectively.

    To use this script, first go into the folder daic_woz_preprocessing/Excel for splitting data and find the GT (ground-truth) file you need for splitting the data. In our case, these are "train_split_Depression_AVEC2017.csv", "dev_split_Depression_AVEC2017.csv", and "full_test_split.csv"; alternatively, you can use the ones provided by the official DAIC-WOZ. Then open the script you want to use and go to the line if __name__ == '__main__':, near which you will find the comments # output root, # read gt file, and # get all files path of participant. Assign to the corresponding variables the absolute paths of the output location, the GT file, and each data type. Now you can run the following example command to generate your dataset:

    python <path/to/script/name>
    # for example: python daic_woz_preprocessing/database_generation_v1/database_generation_train.py

    Be aware!!! The generated database can exceed 200 GB, since it includes every variable used in this thesis for the comparison studies in the experiments. Therefore, please remove the np.save calls in the sliding_window function (sketched after this list) for the feature types you do not need, for instance those for coordinates+confidence, spectrograms, HOG features, action units, etc. Moreover, you may not need the original dataset, which was tested in this thesis and shown to be of limited use; please also exclude all code related to it if you do not need it.

  • For database_generation_v2: Similar to database_generation_v1, but this time the training dataset is generated from "train-set + develop-set" and the test dataset is the test-set itself.

    To use this script, first go into the folder daic_woz_preprocessing/Excel for splitting data and find the GT files you need for splitting the data; in our case, these are "full_train_split_Depression_AVEC2017.csv" and "full_test_split.csv". For the remaining steps, please refer to database_generation_v1 above.
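The sliding_window function mentioned under database_generation_v1 segments each participant's feature sequence into fixed-length, overlapping clips and saves one .npy file per clip and feature type. A rough sketch of the idea follows; the window and hop parameters and the file-naming scheme are assumptions, not the scripts' exact values.

import numpy as np

def sliding_window(data, window_size, hop_size, out_dir, prefix):
    # Cut a (T, ...) feature array into overlapping windows and save each one.
    if len(data) < window_size:
        return  # sequence too short for a single window
    n_windows = 1 + (len(data) - window_size) // hop_size
    for i in range(n_windows):
        clip = data[i * hop_size : i * hop_size + window_size]
        # The real scripts call np.save once per feature type here; drop the
        # calls you do not need to keep the generated database small.
        np.save(f"{out_dir}/{prefix}_{i:04d}.npy", clip)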

Sub-attentional ConvBiLSTM with AVT modality

The overall architecture of the Sub-attentional ConvBiLSTM model: Sub-attentional_ConvBiLSTM

The CNN layers and BiLSTM blocks serve as feature extractors, followed by 8 different attentional late-fusion layers with 8 classification heads. For more details, please refer to the paper.

The architecture of each attentional fusion layer, which combines a global attention and a local attention, is illustrated below: attentional fusion layer
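The exact layer definitions live in models/fusion.py. As a rough PyTorch illustration of the global-plus-local gating idea, one such layer might look like this; the dimensions and gating details are assumptions for illustration, not the thesis implementation.

import torch
import torch.nn as nn

class AttentionalFusion(nn.Module):
    # Sketch of one sub-attentional fusion layer: a global gate computed from
    # the time-pooled features plus a local gate computed per time step.
    def __init__(self, feat_dim):
        super().__init__()
        self.global_att = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.Sigmoid())
        self.local_att = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.Sigmoid())

    def forward(self, fused):                                  # (batch, time, feat_dim)
        g = self.global_att(fused.mean(dim=1, keepdim=True))   # (batch, 1, feat_dim)
        l = self.local_att(fused)                              # (batch, time, feat_dim)
        return fused * g * l                                   # attended features

# Per the architecture above, eight such layers, one per classification head,
# are applied to the fused AVT features in parallel.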

Installation

  1. Install the Nvidia driver, CUDA, and cuDNN to enable Nvidia GPU support

  2. Create a root directory (e.g. DepressionEstimation) and clone this repository into it

    git clone </path/to/repository>

    Alternatively, you can download this repository as a ZIP file and unzip it

  3. Install Anaconda from the official link

  4. Install the dependencies in an Anaconda environment

    Import the environment directly from the environment.yml file by running:

    conda env create --name <env_name> --file=<path/to/environment.yml>
    # for example: conda env create --name depression_estimation --file=environment.yml
    # to create the environment at a specific path, use --prefix <path/to/env> instead of --name

    This will create an environment with the name you chose, which includes all the libraries that DepressionEstimation needs

    • If the method above doesn't work, you will have to create a new environment and install the requirements manually. Some libraries might still be missing afterwards; you can fix this by running the code, reading the error message to see which library or module is missing, and then installing that library with conda in this environment:

       # Please replace "myenv" with the environment name.
       conda create --name myenv
       # go into root directory of DepressionEstimation
       cd <path/to/root/directory>
       # Install requirements
       pip3 install -r requirements.txt
    • For more information about managing Anaconda environments, please check this link

  5. Download the model weights and store them in each model's models/<which model>/model_weights directory. The following pre-trained weights are available:

  6. (Optional) To train or test on MS COCO, install pycocotools from one of these repos. They are forks of the original pycocotools with fixes for Python 3 and Windows (the official repo no longer seems to be maintained).

    Alternatively, run this to install the package with conda:

    conda install -c conda-forge pycocotools
    

Implementation

To run a model, first choose the model you want in the models directory:

cd models/<desired model>
# for example: cd models/AVT_ConvLSTM_Sub-Attention

Each model folder has the following structure:

<AVT_ConvLSTM_Sub-Attention>/
    config/
        config_inference.yaml
        config_phq-subscores.yaml
    dataset/
        dataset.py
        utils.py
    models/
        convlstm.py
        evaluator.py
        fusion.py
        ...
    model_weights/
        <where to store the pretrained weights>
        ...
    main_inference.py
    main_phq-subscores.py
    utils.py

If a folder does not exist, e.g. "model_weights", please create it yourself. Note that each configuration file under config/ corresponds to one main script (main_xxxxx.py), and the utility script (utils.py) next to the main scripts contains all the local functions they use. Please keep the structure consistent as shown, otherwise the model won't work.
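Each main script reads its configuration file at startup, which boils down to standard YAML parsing. A minimal sketch follows; the key in the last line is hypothetical, so check the actual config/*.yaml files for the real keys.

import yaml

# Minimal sketch of reading a config file into nested dicts/lists.
with open("config/config_inference.yaml") as f:
    config = yaml.safe_load(f)

# device = config["MODEL"]["DEVICE"]  # hypothetical key, for illustration only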

Test a Model

To test a model, first make sure the data has been generated and adjust the configuration in config_inference.yaml as desired. Also download the pre-trained weights if needed. Then run:

python main_inference.py

For more complex settings, run the following command and set each option as desired:

python main_inference.py --config_file=<path/to/config.yaml> --device=<'cuda' or 'cpu'> --gpu=<'gpu ID' can be multiple like '2, 3'> --save=<True or False>
# for example: python main_inference.py --config_file=config/config_inference.yaml --device=cuda --gpu=2,3 --save=False
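The flags above map onto a standard argparse setup. The following is a sketch of how such a CLI might be declared; it is an assumption for illustration, not necessarily main_inference.py's exact code, and the defaults are assumed.

import argparse

# Sketch of an argparse setup matching the documented flags.
parser = argparse.ArgumentParser(description="DepressionEstimation inference")
parser.add_argument("--config_file", default="config/config_inference.yaml")
parser.add_argument("--device", choices=["cuda", "cpu"], default="cuda")
parser.add_argument("--gpu", default="0", help="GPU ID(s), can be multiple, e.g. '2,3'")
parser.add_argument("--save", type=lambda s: s.lower() == "true", default=False,
                    help="whether to save outputs (True/False)")
args = parser.parse_args()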

Train a New Model

To train a model, likewise first make sure the data has been generated and adjust the configuration in config_phq-subscores.yaml as desired. Then run:

python main_phq-subscores.py

For more complex settings, run the following command and set each option as desired:

python main_phq-subscores.py --config_file=<path/to/config.yaml> --device=<'cuda' or 'cpu'> --gpu=<'gpu ID' can be multiple like '2, 3'> --save=<True or False>
# for example: python main_phq-subscores.py --config_file=config/config_phq-subscores.yaml --device=cuda --gpu=2,3 --save=True
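Since the model's 8 classification heads predict the 8 PHQ-8 sub-scores (each item is scored 0-3), the overall severity score is their sum. A hedged sketch of aggregating the head outputs, assuming each head emits logits over the four possible item scores:

import torch

def phq8_total(head_logits):
    # head_logits: list of 8 tensors, each of shape (batch, 4).
    subscores = [logits.argmax(dim=1) for logits in head_logits]  # each (batch,)
    return torch.stack(subscores, dim=1).sum(dim=1)               # totals in [0, 24]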

Results

[Figure: visualization_of_recombination]
