ACL 2019 SRW: "Enriching Neural Models with Targeted Features for Dementia Detection" code.
Repository contents:

- data/
- feature_sets/
- 0_pitt_transcript_preprocessing_and_pickle.py
- 1_pitt_anagraphic_information.py
- 2_psycolinguistic_features_computation_and_merge.py
- 4_final_model.py
- README.md
- __init__.py


# Enriching Neural Models with Targeted Features for Dementia Detection

The implementation of our model is contained in `4_final_model.py`. Here we provide the latest version of the CNN-LSTM model, including:

- hand-crafted features
- the attention mechanism
- class-weight balancing
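
The exact architecture and hyperparameters are those defined in `4_final_model.py`; purely as an illustration, a minimal `tf.keras` sketch of this kind of model might look as follows (the layer sizes, `MAX_LEN`, `VOCAB_SIZE`, `N_HAND`, and the class weights below are placeholder assumptions, not values from the paper):

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

MAX_LEN = 300        # assumed maximum transcript length (tokens)
VOCAB_SIZE = 20000   # assumed vocabulary size
EMB_DIM = 300        # GloVe 300d, as in the instructions below
N_HAND = 20          # assumed number of hand-crafted features

# Text branch: embeddings -> 1D convolution -> LSTM returning all timesteps.
tokens = layers.Input(shape=(MAX_LEN,), name="tokens")
x = layers.Embedding(VOCAB_SIZE, EMB_DIM)(tokens)
x = layers.Conv1D(64, 5, padding="same", activation="relu")(x)
x = layers.MaxPooling1D(2)(x)
h = layers.LSTM(64, return_sequences=True)(x)

# Simple additive attention: score each timestep, softmax, weighted sum.
scores = layers.Dense(1, activation="tanh")(h)
alphas = layers.Softmax(axis=1)(scores)
context = layers.Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([h, alphas])

# Hand-crafted feature branch, concatenated with the attended text vector.
hand = layers.Input(shape=(N_HAND,), name="hand_crafted")
merged = layers.Concatenate()([context, hand])
output = layers.Dense(1, activation="sigmoid")(merged)

model = Model(inputs=[tokens, hand], outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Class-weight balancing for the Control/Dementia imbalance (illustrative values):
# model.fit([X_tokens, X_hand], y, class_weight={0: 1.0, 1: 1.5}, epochs=10)
```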

## Download the Pitt Corpus

To run the experiments, you need to download the Pitt Corpus transcripts from https://dementia.talkbank.org/access/English/Pitt.html.

We do not include this data in our submission because it is not publicly distributable and authorization is required to use it. To obtain authorization, follow the instructions at https://dementia.talkbank.org/.

## Folder Structure

Once the data has been downloaded, it is necessary to maintain the folder structure below; empty folders are left in our supplement as a reference:

```
root/
└── data/
    └── Pitt_transcripts/
        ├── Control/
        │   ├── cookie/
        │   └── fluency/
        └── Dementia/
            ├── cookie/
            ├── fluency/
            ├── recall/
            └── sentence/
```
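
As a convenience (not part of the original repository), a short check like the following can confirm the layout before running the preprocessing scripts; the paths mirror the tree above:

```python
import os

EXPECTED_DIRS = [
    "data/Pitt_transcripts/Control/cookie",
    "data/Pitt_transcripts/Control/fluency",
    "data/Pitt_transcripts/Dementia/cookie",
    "data/Pitt_transcripts/Dementia/fluency",
    "data/Pitt_transcripts/Dementia/recall",
    "data/Pitt_transcripts/Dementia/sentence",
]

# Report any expected folder that is missing from the working directory.
missing = [d for d in EXPECTED_DIRS if not os.path.isdir(d)]
if missing:
    print("Missing folders:\n  " + "\n  ".join(missing))
else:
    print("Folder structure looks correct.")
```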

## Instructions

Run the following steps in order:

1. Run `0_pitt_transcript_preprocessing_and_pickle.py`. This preprocesses the interviews and creates a `.pickle` file.

2. Run `1_pitt_anagraphic_information.py`. Starting from `anagraphic_modded.csv` (this part of the dataset is freely available), this script produces a `.pickle` file containing the patients' demographic information.

3. Run `2_psycolinguistic_features_computation_and_merge.py`. This script merges the files produced above and computes the other linguistic features mentioned in the paper. It produces `pitt_full_interview_features.pickle`, which is required to run the model.

4. Download the 300d GloVe embeddings from http://nlp.stanford.edu/data/glove.6B.zip and place them in the `glove.6B` folder (a loading sketch follows this list).

5. Run `4_final_model.py`. This trains the model and evaluates it on three different data shuffles, producing a list of three dictionaries with the main classifier metrics obtained on each split.
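
For step 4, the downloaded archive contains `glove.6B.300d.txt`. The loader below is only a minimal, hypothetical sketch (the `load_glove` helper and its path argument are not taken from the repository scripts) of how those vectors can be read into a dictionary:

```python
import numpy as np

def load_glove(path="glove.6B/glove.6B.300d.txt"):
    """Read GloVe vectors into a {word: np.ndarray} dictionary."""
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            # Each line is a word followed by its space-separated float components.
            parts = line.rstrip().split(" ")
            embeddings[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return embeddings

glove = load_glove()
print(len(glove), "vectors of dimension", len(next(iter(glove.values()))))
```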
