cre

cre is an R package for fitting various interpretations of the Constant Rate Effect (CRE; Kroch 1989) to historical linguistic data. It is in active development, and the currently available versions should not be considered entirely stable, although versions in the master branch of this repository are more stable than those in the dev branch.

Installation

With devtools:

devtools::install_github("hkauhanen/cre")
library(cre)

Basic usage

A package vignette is in preparation. In the meantime, the fitting routine based on nonlinear least squares (Kauhanen & Walkden 2018) can be used as follows. The folder inst/extdata contains a mock random data set in various formats; the file paths below are relative to the root of a local copy of this repository:

df <- read.csv("inst/extdata/mockdata_long.csv")
fit <- fit.cre.nls(df, format="long", model="logistic", budget=100)

Setting model="logistic" fits the classical model, a family of logistic curves that share a common slope (Kroch 1989); setting model="bias" fits the production bias model (Kauhanen & Walkden 2018); setting model="VRE" fits a family of logistic curves whose slopes and intercepts vary freely. The budget argument specifies how much computational budget is devoted to the fitting: larger values explore parameter combinations at a finer resolution and hence lead to better fits, at the cost of longer running times. See the manual for details.
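As a rough illustration of the difference between the shared-slope and freely varying families, the following base R sketch draws two contexts under each assumption. It is not the package's internal code: the function logistic_curve, the parameter values and the context names are all hypothetical, and the production bias model is left out because its functional form is not given here.

```r
# Illustrative sketch only; not the cre package's internal code.
# All names and parameter values below are hypothetical.

# Logistic curve with slope s and intercept k: p(t) = 1 / (1 + exp(-(k + s * t)))
logistic_curve <- function(t, s, k) plogis(k + s * t)

times <- seq(0, 100, by = 10)

# Shared-slope family: contexts have one common slope but different intercepts
shared_slope <- 0.1
intercepts <- c(context_a = -6, context_b = -4)
cre_curves <- sapply(intercepts, function(k) logistic_curve(times, shared_slope, k))

# Freely varying family: each context has its own slope and intercept
slopes <- c(context_a = 0.1, context_b = 0.2)
vre_curves <- mapply(function(s, k) logistic_curve(times, s, k), slopes, intercepts)

# On the logit scale, shared-slope curves are parallel lines,
# so the gap between the two contexts is the same at every time point
logit_gap <- qlogis(cre_curves[, "context_b"]) - qlogis(cre_curves[, "context_a"])
print(round(logit_gap, 6))
```

In the shared-slope case the logit gap equals the difference between the two intercepts throughout, which is the pattern the classical model encodes; in the freely varying case the gap changes over time.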

The result, fit, is a list of five elements: the data set used, the best-fitting parameters found, the residual sum of squares, the residual sum of squares normalized by the number of data points, and the number of data points.
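For readers unfamiliar with the two error measures mentioned above, here is a minimal base R sketch of how a residual sum of squares and its normalized counterpart are computed. The observed and predicted values are made up for illustration, and the variable names are hypothetical; they are not the names of the elements in fit.

```r
# Hypothetical observed and model-predicted proportions (not package output)
observed  <- c(0.10, 0.25, 0.60, 0.85)
predicted <- c(0.12, 0.30, 0.55, 0.80)

rss <- sum((observed - predicted)^2)  # residual sum of squares
n <- length(observed)                 # number of data points
rss_normalized <- rss / n             # residual sum of squares per data point

print(c(rss = rss, n = n, rss_normalized = rss_normalized))
```

Normalizing by the number of data points makes fits to data sets of different sizes easier to compare.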

Alternative input formats

In addition to long format, the data can be a wide-format table of frequencies or a data frame of (binary) response-level data. See again inst/extdata for examples. The following all lead to the same outcome:

# long-format frequency data
df <- read.csv("inst/extdata/mockdata_long.csv")
fit <- fit.cre.nls(df, format="long", model="logistic", budget=100)

# wide-format frequency data
df <- read.csv("inst/extdata/mockdata_wide.csv")
fit <- fit.cre.nls(df, format="wide", model="logistic", budget=100)

# response-level data
df <- read.csv("inst/extdata/mockdata_responses.csv")
df <- frequentize(df)
fit <- fit.cre.nls(df, format="long", model="logistic", budget=100)
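To give a sense of how response-level data relate to frequency data, the base R sketch below aggregates binary responses into counts and proportions per date and context. This is only an assumption about the general kind of aggregation involved: it is not the implementation of frequentize, and the column names (date, context, response, successes, total) are hypothetical and need not match the layout of the files in inst/extdata.

```r
# Hypothetical response-level data: one row per token, response coded 0/1.
# Column names are assumptions for illustration, not the package's format.
responses <- data.frame(
  date     = c(1400, 1400, 1400, 1500, 1500, 1500),
  context  = c("a", "a", "b", "a", "b", "b"),
  response = c(1, 0, 0, 1, 1, 0)
)

# Count innovative responses and total tokens in each date/context cell
freq <- aggregate(response ~ date + context, data = responses,
                  FUN = function(x) c(successes = sum(x), total = length(x)))

# aggregate() returns the two counts as a matrix column; flatten it
freq <- do.call(data.frame, freq)
names(freq) <- c("date", "context", "successes", "total")

# Relative frequency of the innovative variant per cell
freq$proportion <- freq$successes / freq$total
print(freq)
```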

Issues?

If you find a bug, please file an issue. If you have a feature request, please consider emailing Henri: henri@henr.in.

In the pipeline

The following features are in active development, listed roughly in order of urgency:

Implement routines for fitting traditional VARBRUL-style logistic regressions
Enrich documentation with examples
Document the more mysterious operations of the fitting routines (such as auto-guessing of parameter ranges)
Write vignettes
Implement statistical tests for rigorous selection between competing models
Tidy up code

References

Kauhanen, H. & Walkden, G. (2018) Deriving the Constant Rate Effect. Natural Language & Linguistic Theory, 36(2), 483–521. https://doi.org/10.1007/s11049-017-9380-1

Kroch, A. (1989) Reflexes of grammar in patterns of language change. Language Variation and Change, 1(3), 199–244.
