fbkarsdorp/nnfit


A fundamental problem in research into language and cultural change is the difficulty of distinguishing processes of stochastic drift (also known as neutral evolution) from processes that are subject to certain selection pressures. In this article, we describe a new technique based on Deep Neural Networks, in which we reformulate the detection of evolutionary forces in cultural change as a binary classification task. Using Residual Networks for time series trained on artificially generated samples of cultural change, we demonstrate that this technique is able to efficiently, accurately and consistently learn which aspects of the time series are distinctive for drift and selection. We compare the model with a recently proposed statistical test, the Frequency Increment Test, and show that the neural time series classification system provides a possible solution to some of the key problems of this test.
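To make the two ingredients above concrete, here is a minimal sketch, not the code from this repository. It assumes a simple Wright-Fisher model as the generator of artificial time series (a selection coefficient of 0 gives pure drift) and computes the t-statistic that underlies the Frequency Increment Test from rescaled frequency increments. The function names `simulate_wright_fisher` and `fit_t_statistic`, their parameters, and the choice to skip increments that start from a frequency of 0 or 1 are illustrative assumptions; the simulation settings used in the paper may differ.

```python
import math
import random
import statistics


def simulate_wright_fisher(n_agents, n_steps, start_freq, selection=0.0, seed=None):
    """Simulate the frequency of one variant over time (hypothetical helper).

    selection=0.0 corresponds to neutral drift; selection > 0 favours the variant.
    """
    rng = random.Random(seed)
    freq = start_freq
    series = [freq]
    for _ in range(n_steps):
        # Selection biases the probability that an agent adopts the variant.
        p = (1 + selection) * freq / ((1 + selection) * freq + (1 - freq))
        # Binomial sampling of the next generation (stdlib only, so written as a sum).
        count = sum(rng.random() < p for _ in range(n_agents))
        freq = count / n_agents
        series.append(freq)
    return series


def fit_t_statistic(freqs, times=None):
    """Return (t, degrees of freedom) for the rescaled frequency increments.

    Increments starting from a frequency of 0 or 1 are skipped, because the
    rescaling is undefined there (an assumption of this sketch).
    """
    if times is None:
        times = list(range(len(freqs)))
    increments = []
    for i in range(1, len(freqs)):
        prev = freqs[i - 1]
        if 0 < prev < 1:
            scale = math.sqrt(2 * prev * (1 - prev) * (times[i] - times[i - 1]))
            increments.append((freqs[i] - prev) / scale)
    if len(increments) < 2:
        raise ValueError("need at least two usable increments")
    mean = statistics.mean(increments)
    sd = statistics.stdev(increments)
    if sd == 0:
        raise ValueError("increments have zero variance")
    t = mean / (sd / math.sqrt(len(increments)))
    return t, len(increments) - 1


if __name__ == "__main__":
    drift = simulate_wright_fisher(1000, 50, 0.5, selection=0.0, seed=1)
    selected = simulate_wright_fisher(1000, 50, 0.1, selection=0.5, seed=1)
    print("drift:", fit_t_statistic(drift))
    print("selection:", fit_t_statistic(selected))
```

Under drift the increments should scatter around zero, so the t-statistic compares their mean against zero; a p-value would additionally require the Student t distribution, which is omitted here to keep the sketch free of third-party dependencies.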

DOI: https://doi.org/10.1017/ehs.2020.52

Getting started

See the supplementary materials for a brief tutorial describing how to train your own models.

Data

Code to reconstruct the past-tense data set can be obtained from https://github.com/mnewberry/ldrift. To run the past-tense analysis in notebooks/past-tense.ipynb, save the frequency list under data/coha-past-tense.txt.

Requirements

All code is implemented in Python 3.7. The full list of dependencies needed to run the code is given in the requirements.txt file. This repository may change over time. To reproduce the analyses in the paper with the exact code that was used, please download the submission release: https://github.com/fbkarsdorp/nnfit/releases/tag/v1.0

Training

To train your own models, run src/train.py and follow the instructions therein.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.