This is the official repository for ACL 2022 paper, GPT-D: Inducing Dementia-related Linguistic Anomalies by Deliberate Degradation of Artificial Neural Language Models

LinguisticAnomalies/hammer-nets

This repository contains the code developed for the paper "GPT-D: Inducing Dementia-related Linguistic Anomalies by Deliberate Degradation of Artificial Neural Language Models".

NOTE ON DATA: While the data used in this paper are publicly available, we are not able to redistribute any of them under the Data Use Agreements with Dementia Bank and the Carolinas Conversations Collection (CCC). To obtain these data, individual investigators need to contact Dementia Bank and CCC and request access.

Folders

  • scripts: folder that contains all code, scripts and Jupyter notebooks

    • data: this folder holds the cleaned data. Please note that we are not allowed to redistribute the data; please contact the data providers for permission. Once you have pre-processed the data, place them under this folder for the scripts to process.
    • baseline.py: this script fine-tunes the base BERT model and validates it on another dataset
    • cal_lex.py: this script calculates the mean log lexical frequency of the text generated by GPT-2 and GPT-D and runs a t-test
    • cal_ppl.py: this script runs cross-validation for the paired perplexity paradigm
    • cal_eval.py: this script runs 5-fold cross-validation for the cumulative method
    • desp.py: this script is designed to run ground truth of dataset
    • find_best_conf.py: this script finds the best impairment configuration on the different datasets for the cumulative method
    • util_fun.py: this script contains the utility functions used for this paper
  • results: folder that contains the summarized results from the scripts

    • notebooks: folder that contains the Jupyter notebooks used for visualization
      • across_share.ipynb: this notebook displays the best-performing impairment patterns generated by the cumulative method
      • cross_validation.ipynb: this notebook displays the cross-validation results for the cumulative impairment method
      • visualization.ipynb: this notebook shows the saliency visualization of GPT-2 and GPT-D
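
For readers unfamiliar with the paired perplexity paradigm mentioned for cal_ppl.py, the following is a minimal sketch of the underlying arithmetic, not the code in this repository. It assumes per-token log-probabilities have already been obtained from two language models; the function names, the toy numbers and the direction of the ratio (intact over degraded) are illustrative assumptions.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def perplexity_ratio(logprobs_intact, logprobs_degraded):
    """Ratio of the two models' perplexities on the same transcript.

    Assumption: the ratio is taken as intact / degraded; the scripts in this
    repository may define it the other way round.
    """
    return perplexity(logprobs_intact) / perplexity(logprobs_degraded)


# Toy log-probabilities for one three-token transcript (made-up values).
intact = [math.log(0.5)] * 3     # hypothetical scores from the intact model
degraded = [math.log(0.25)] * 3  # hypothetical scores from the degraded model
ratio = perplexity_ratio(intact, degraded)
```

One such ratio per transcript can then serve as a single feature for separating the two groups within each cross-validation fold.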

Please refer to cumulative_eval_result.md for cumulative method evaluation results.
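
The mean log lexical frequency comparison described for cal_lex.py can be illustrated along the same lines. This is a sketch under stated assumptions rather than the repository's implementation: the frequency table is a made-up toy, tokens are split on whitespace, and Welch's unequal-variance t statistic is used because this README does not say which t-test variant the script runs.

```python
import math
from statistics import mean, variance

# Hypothetical toy frequency table (counts per million words). The frequency
# norms actually used by cal_lex.py are not described in this README.
TOY_FREQ = {"the": 50000.0, "is": 10000.0, "thing": 800.0,
            "boy": 300.0, "taking": 150.0, "cookie": 20.0}


def mean_log_freq(text, freq=TOY_FREQ):
    """Average natural-log frequency over the tokens found in the table."""
    logs = [math.log(freq[tok]) for tok in text.lower().split() if tok in freq]
    if not logs:
        raise ValueError("no token of the text is in the frequency table")
    return mean(logs)


def welch_t(a, b):
    """Welch's t statistic for two independent samples (assumed variant)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Made-up example texts standing in for output from two models.
sample_a = [mean_log_freq(t) for t in ("the boy is taking the cookie",
                                       "the boy is taking the thing")]
sample_b = [mean_log_freq(t) for t in ("the thing is the thing",
                                       "the thing is the boy")]
t_stat = welch_t(sample_a, sample_b)
```

A negative `t_stat` here means the first sample uses lower-frequency words on average than the second.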
