Code and data to reproduce Stoltz and Taylor (2019) "Concept Mover's Distance"
1_FINAL_CMD_Function.R
1_cmd_bible_cleaned.R
1_cmd_homer_cleaned.R
1_cmd_jcss_packages.R
1_cmd_shakespeare_cleaned.R
1_cmd_sotu_cleaned.R
2_kjv_meta.csv
2_list_basic_concepts.csv
2_shakes_meta.csv
README.md

Concept Mover's Distance: Reproduction Guide

Dustin S. Stoltz and Marshall A. Taylor

This repository contains the code and data necessary to reproduce the measures, graphs, and plots for Stoltz and Taylor (2019) "Concept Mover's Distance," forthcoming in the Journal of Computational Social Science. A preprint is available on SocArXiv at https://osf.io/preprints/socarxiv/5hc4z/.

In the paper, we propose a method for measuring a text's engagement with a focal concept using distributional representations of the meaning of words. This measure relies on Word Mover's Distance, which uses word embeddings to determine similarities between two documents. In our approach, which we call Concept Mover's Distance, a document is measured by the minimum distance the words in the document need to travel to arrive at the position of a "pseudo document" consisting of only words denoting a focal concept.
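
To make the idea concrete, here is a minimal base R sketch of the single-word case. It is not the function in 1_FINAL_CMD_Function.R: the embeddings are made-up two-dimensional values, the names `embeddings`, `cosine_distance`, and `toy_concept_distance` are hypothetical, and cosine distance is an assumption chosen for the illustration. When the pseudo document holds only one word, all of a document's word mass has to travel to that word, so the transport cost reduces to a frequency-weighted average of word-to-concept distances.

```r
# Toy 2-dimensional embeddings (made-up values, for illustration only).
embeddings <- rbind(
  love  = c(0.9, 0.1),
  heart = c(0.8, 0.3),
  war   = c(0.1, 0.9),
  sword = c(0.2, 0.8)
)

# Cosine distance between two numeric vectors (an assumption of this sketch).
cosine_distance <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Distance from a tokenized document to a one-word pseudo document.
# With a single target word, every unit of mass travels to that word,
# so the cost is the frequency-weighted mean distance to the concept.
toy_concept_distance <- function(tokens, concept, emb = embeddings) {
  tokens  <- tokens[tokens %in% rownames(emb)]  # drop out-of-vocabulary words
  weights <- table(tokens) / length(tokens)     # normalized term frequencies
  dists   <- sapply(names(weights), function(w) {
    cosine_distance(emb[w, ], emb[concept, ])
  })
  sum(as.numeric(weights) * dists)
}

doc_romance <- c("love", "heart", "love", "sword")
doc_battle  <- c("war", "sword", "war", "heart")

# A smaller value means the document sits closer to the concept.
toy_concept_distance(doc_romance, "love")
toy_concept_distance(doc_battle, "love")
```

In this toy setup the first document yields a smaller distance to "love" than the second. Pseudo documents with several words require solving (or relaxing) the full transport problem, which this sketch does not attempt.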

To reproduce the figures in the paper, download all scripts and CSVs to a local folder, and load the packages in the 1_cmd_jcss_packages.R script. The remaining scripts are self-contained, and each corresponds to its respective section of the paper. Some of the figures require downloading text from Project Gutenberg, which may take some time.
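
As a rough sketch of that setup step, the snippet below checks that the scripts and CSVs named in the repository listing are present in a local folder before sourcing the packages script. The helper `missing_files` and the variable `repo_dir` are hypothetical names introduced here, and the folder path is a placeholder to replace with your own.

```r
# Scripts and CSVs from the repository file listing.
expected_files <- c(
  "1_FINAL_CMD_Function.R",
  "1_cmd_bible_cleaned.R",
  "1_cmd_homer_cleaned.R",
  "1_cmd_jcss_packages.R",
  "1_cmd_shakespeare_cleaned.R",
  "1_cmd_sotu_cleaned.R",
  "2_kjv_meta.csv",
  "2_list_basic_concepts.csv",
  "2_shakes_meta.csv"
)

# Return the names of expected files that are not found in `dir`.
missing_files <- function(dir, files = expected_files) {
  files[!file.exists(file.path(dir, files))]
}

# Placeholder: the local folder where the files were downloaded.
repo_dir <- "."

missing <- missing_files(repo_dir)
if (length(missing) == 0) {
  # Load the packages used by the remaining scripts.
  source(file.path(repo_dir, "1_cmd_jcss_packages.R"))
} else {
  message("Missing files: ", paste(missing, collapse = ", "))
}
```

Once the packages script has run without errors, the other scripts can be sourced or stepped through individually.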

