This repository contains the code developed for the paper "GPT-D: Inducing Dementia-related Linguistic Anomalies by Deliberate Degradation of Artificial Neural Language Models".
NOTE ON DATA: While the data used in this paper are publicly available, we are not able to redistribute any of them under the Data Use Agreements with Dementia Bank and the Carolinas Conversations Collection (CCC). To obtain these data, individual investigators need to contact Dementia Bank and CCC and request access.
- `scripts`: folder that contains all code, scripts, and Jupyter notebooks
    - `data`: folder for the cleaned data. We are not allowed to redistribute the data; please contact the data providers for permission. Once the data are pre-processed, place them in this folder for processing.
    - `baseline.py`: fine-tunes the base BERT model and validates it on other datasets
    - `cal_lex.py`: calculates the mean log lexical frequency of the text generated by GPT-2 and GPT-D and runs a t-test
    - `cal_ppl.py`: runs cross-validation for the paired perplexity paradigm
    - `cal_eval.py`: runs 5-fold cross-validation for the cumulative method
    - `desp.py`: runs the ground truth of the datasets
    - `find_best_conf.py`: finds the best impairment configuration on each dataset for the cumulative method
    - `util_fun.py`: utility functions used for this paper
- `results`: folder that contains the wrapped-up results from the scripts
    - `notebooks`: folder that contains the Jupyter notebook visualizations
    - `across_share.ipynb`: displays the best-performing impairment patterns generated by the cumulative method
    - `cross_validation.ipynb`: displays the cross-validation results for the cumulative impairment method
    - `visualization.ipynb`: shows the saliency visualization of GPT-2 and GPT-D
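For readers unfamiliar with the statistic that `cal_lex.py` is described as computing, the following is a minimal sketch of a mean log lexical frequency followed by a two-sample t-statistic. It is not the code from this repository: the function names, the toy frequency table, and the choice of Welch's unequal-variance t-statistic are all assumptions made for illustration, and the actual script may use a different frequency resource and test.

```python
import math
from statistics import mean, variance


def mean_log_frequency(tokens, freq_table):
    """Average natural-log frequency of the tokens found in freq_table.

    freq_table is a hypothetical word -> corpus count mapping; tokens
    missing from the table are skipped (an assumption of this sketch).
    """
    logs = [math.log(freq_table[t]) for t in tokens if t in freq_table]
    if not logs:
        raise ValueError("no token was found in the frequency table")
    return mean(logs)


def welch_t(sample_a, sample_b):
    """Welch's t-statistic for two independent samples (no p-value)."""
    se = math.sqrt(variance(sample_a) / len(sample_a)
                   + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se


# Toy, made-up counts used only to exercise the functions.
toy_freq = {"the": 1000, "cat": 10}
score = mean_log_frequency(["the", "cat", "zzz"], toy_freq)

# Made-up per-text scores for two groups of generated text.
t_stat = welch_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

In practice one score would be computed per generated text, and the two lists passed to `welch_t` would hold the scores for the GPT-2 and GPT-D outputs respectively.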
Please refer to `cumulative_eval_result.md` for the cumulative method evaluation results.
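As a rough illustration of the paired perplexity idea that `cal_ppl.py` is described as evaluating, the sketch below computes perplexity from per-token log-probabilities and compares a control model with a degraded one. It is a sketch under stated assumptions rather than the repository's implementation: the function names are hypothetical, the log-probabilities are made-up numbers instead of real GPT-2 or GPT-D outputs, and combining the two perplexities as a ratio is one possible choice (a difference is another).

```python
import math


def perplexity(token_log_probs):
    """Perplexity from a list of per-token natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def paired_perplexity_ratio(control_log_probs, degraded_log_probs):
    """Ratio of control-model to degraded-model perplexity for one transcript.

    Using a ratio is an assumption of this sketch.
    """
    return perplexity(control_log_probs) / perplexity(degraded_log_probs)


# Made-up log-probabilities for a two-token transcript under each model.
control = [math.log(0.5), math.log(0.25)]
degraded = [math.log(0.25), math.log(0.25)]

ratio = paired_perplexity_ratio(control, degraded)
```

A per-transcript score of this kind can then be thresholded or fed to a classifier inside each cross-validation fold.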