ACL 2019 SRW: "Enriching Neural Models with Targeted Features for Dementia Detection" code.

Enriching Neural Models with Targeted Features for Dementia Detection

The implementation details of our model are contained in the model file of this repository. Here we provide the latest version of the CNN-LSTM model, including:

  • hand-crafted features
  • the attention mechanism
  • class weight balancing
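
As an illustration of the class weighting idea, here is a minimal sketch of inverse-frequency weighting, a common heuristic for imbalanced binary data. It is an assumption for illustration, not necessarily the scheme used in this repository; the function name `balanced_class_weights` and the toy label list are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so their errors count more in the loss.
    This is a generic heuristic, assumed here for illustration only.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 0 = control, 1 = dementia
labels = [0, 0, 0, 1]
weights = balanced_class_weights(labels)
print(weights)
```

A dictionary of this shape can be passed to training routines that accept per-class weights, for example the `class_weight` argument of Keras `Model.fit`.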

Download Pitt Corpus

To run the experiments, first download the Pitt Corpus transcripts, which are distributed through DementiaBank.

We do not include the data in our submission because it is private and authorization is needed to use it. To obtain authorization, follow the access instructions provided by DementiaBank.

Folder Structure

Once the data has been downloaded, arrange it in the following folder structure; empty folders are left in our supplement as a reference:

 - data:
 -- Pitt_transcripts:
 --- Control:
 ---- cookie:
 ---- fluency:
 --- Dementia:
 ---- cookie:
 ---- fluency:
 ---- recall:
 ---- sentence:
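
The layout above can be created or checked with a short helper. This is a minimal sketch that assumes the `data` folder sits in the current working directory; the helper names (`expected_dirs`, `missing_dirs`, `create_layout`) are hypothetical and not part of the repository.

```python
from pathlib import Path

# Sub-folders per group, copied from the listing above.
LAYOUT = {
    "Control": ["cookie", "fluency"],
    "Dementia": ["cookie", "fluency", "recall", "sentence"],
}


def expected_dirs(root="data"):
    """List every task folder expected under <root>/Pitt_transcripts."""
    base = Path(root) / "Pitt_transcripts"
    return [base / group / task for group, tasks in LAYOUT.items() for task in tasks]


def missing_dirs(root="data"):
    """Return the expected folders that do not exist yet."""
    return [d for d in expected_dirs(root) if not d.is_dir()]


def create_layout(root="data"):
    """Create any missing folders; existing ones are left untouched."""
    for d in expected_dirs(root):
        d.mkdir(parents=True, exist_ok=True)
```

Calling `create_layout()` once and then `missing_dirs()` before running the pipeline gives a quick check that the transcripts were copied into the right places.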


Then run the following steps in order:

  1. Run the preprocessing file. This will preprocess the interviews and create a .pickle file.

  2. Run the demographics script. It produces a .pickle file containing demographic information for the patients, starting from the file anagraphic_modded.csv (this section of the dataset is freely available).

  3. Run the feature-merging file. It merges the files produced above and computes the other linguistic features mentioned in the paper. It produces "pitt_full_interview_features.pickle", which is necessary to run the model.

  4. Download the 300-dimensional GloVe embeddings and place them into the glove.6B folder.

  5. Run the model file. It trains the model and tests it on three different data shuffles, producing a list of three dictionaries that contain the fundamental classifier metrics obtained on each split.
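
The evaluation described in the last step can be pictured as follows. This is a minimal sketch under stated assumptions, not the repository's code: the metric names, the 20% test fraction, the seeds, and the `train_and_predict` callback are all hypothetical stand-ins for the actual model training.

```python
import random


def binary_metrics(y_true, y_pred):
    """Compute basic binary classification metrics (class 1 is the positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def evaluate_over_shuffles(samples, labels, train_and_predict,
                           seeds=(0, 1, 2), test_fraction=0.2):
    """Return one metrics dictionary per shuffle (three by default).

    train_and_predict(train_x, train_y, test_x) is a hypothetical callback
    that fits a model and returns predicted labels for test_x.
    """
    results = []
    for seed in seeds:
        indices = list(range(len(samples)))
        random.Random(seed).shuffle(indices)  # a different shuffle per seed
        n_test = max(1, int(len(indices) * test_fraction))
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        y_pred = train_and_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in test_idx],
        )
        results.append(binary_metrics([labels[i] for i in test_idx], y_pred))
    return results
```

Averaging each metric over the returned dictionaries gives a single summary figure that is less sensitive to any one split.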
