This package allows you to fit Maximum Entropy Optimality Theory models to data sets, generate the predictions made by such models for novel data, and compare the fit of different models using a variety of metrics.
The authors of this package are Connor Mayer, Adeline Tan, and Kie Zuraw.
If you publish work that uses maxent.ot, please cite the following paper and repository:
Mayer, C., Tan, A., & Zuraw, K. (in press). Introducing maxent.ot: an R package for Maximum Entropy constraint grammars. Phonological Data and Analysis.
Mayer, C., Tan, A., & Zuraw, K. (2024). maxent.ot: A package for doing Maximum Entropy Optimality Theory in R (Version 1.0.0) [Computer software]. 10.5281/zenodo.7246366
You can install the released version of maxent.ot from CRAN with:
install.packages("maxent.ot")
You can install the development version from GitHub with:
if (!require(devtools)) {
  install.packages("devtools", repos = "http://cran.us.r-project.org")
}
if (!require(maxent.ot)) {
  devtools::install_github("connormayer/maxent.ot")
}
The following example workflow fits two MaxEnt OT models to the same data (with different constraint sets), examines their predicted frequencies, and compares their fits using a likelihood ratio test.
library(maxent.ot)
# Get paths to input files.
# This file has two constraints
data_file_simple <- system.file(
  "extdata", "sample_data_frame.csv", package = "maxent.ot"
)
# This file has three constraints
data_file_complex <- system.file(
  "extdata", "sample_data_frame_large.csv", package = "maxent.ot"
)
# Read files into data frames
df_simple <- read.csv(data_file_simple)
df_complex <- read.csv(data_file_complex)
# Fit weights to both data sets, regularized with a Gaussian prior
# on the weights (mu = 0, sigma = 10)
simple_model <- optimize_weights(df_simple, mu = 0, sigma = 10)
complex_model <- optimize_weights(df_complex, mu = 0, sigma = 10)
# Examine predicted probabilities of each model
# Also displayed: log likelihood (of weights given prediction data)
predict_probabilities(df_simple, simple_model$weights)
#> $loglik
#> [1] -1.444645
#>
#> $predictions
#>    Input    Output Freq Constraint1 Constraint2  Predicted Observed       Error
#> 1 Input1 Output1-1    1           1           0 0.51384754      0.5  0.01384754
#> 2 Input1 Output1-2    1           0           1 0.48615246      0.5 -0.01384754
#> 3 Input2 Output2-1    1           0           0 0.94404279      1.0 -0.05595721
#> 4 Input2 Output2-2    0           0           1 0.05595721      0.0  0.05595721
predict_probabilities(df_complex, complex_model$weights)
#> $loglik
#> [1] -1.444644
#>
#> $predictions
#>    Input    Output Freq Constraint1 Constraint2 Constraint3  Predicted Observed
#> 1 Input1 Output1-1    1           1           0           1 0.51385019      0.5
#> 2 Input1 Output1-2    1           0           1           0 0.48614981      0.5
#> 3 Input2 Output2-1    1           0           0           1 0.94404422      1.0
#> 4 Input2 Output2-2    0           0           1           0 0.05595578      0.0
#>         Error
#> 1  0.01385019
#> 2 -0.01385019
#> 3 -0.05595578
#> 4  0.05595578
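The Predicted column above can be understood through the standard MaxEnt formulation: each candidate's harmony is the weighted sum of its constraint violations, and its probability is exp(-harmony) normalized over the candidates that share an input. The following base R sketch illustrates that calculation on the two-constraint tableau. The helper `maxent_probs` and the weight values are hypothetical, chosen for illustration only; they are not part of maxent.ot and are not the weights fitted by `optimize_weights`.

```r
# Violation profiles copied from the two-constraint example above.
tableau <- data.frame(
  Input = c("Input1", "Input1", "Input2", "Input2"),
  Output = c("Output1-1", "Output1-2", "Output2-1", "Output2-2"),
  Constraint1 = c(1, 0, 0, 0),
  Constraint2 = c(0, 1, 0, 1)
)

# Hypothetical weights for illustration only (NOT the fitted weights).
weights <- c(Constraint1 = 1.0, Constraint2 = 2.0)

# Illustrative helper, not a maxent.ot function.
maxent_probs <- function(tableau, weights) {
  # Violation matrix: one row per candidate, one column per constraint.
  violations <- as.matrix(tableau[, names(weights), drop = FALSE])
  # Harmony: weighted sum of violations for each candidate.
  harmony <- as.vector(violations %*% weights)
  unnormalized <- exp(-harmony)
  # Normalize within each input so candidate probabilities sum to 1.
  unnormalized / ave(unnormalized, tableau$Input, FUN = sum)
}

tableau$Predicted <- maxent_probs(tableau, weights)
print(tableau)
```

Candidates with more heavily weighted violations receive lower probability, and the probabilities for each input sum to one, which is the pattern visible in the package output above.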
# Compare model fit to training data using the likelihood ratio test
compare_models(simple_model, complex_model, method='lrt')
#>            description       chi_sq k_delta   p_value
#> 1 df_complex~df_simple 2.451046e-06       1 0.9987508
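The columns in this output are consistent with a standard likelihood ratio test, in which the test statistic is twice the difference in log likelihood between the larger and smaller model, compared against a chi-squared distribution whose degrees of freedom equal the difference in the number of constraints. The sketch below works through that arithmetic in base R. It assumes `compare_models` follows this textbook procedure, which the source does not spell out, and it uses the rounded log likelihoods printed above, so the statistic only approximately matches the reported `chi_sq`.

```r
# Log likelihoods as printed by predict_probabilities() above (rounded).
loglik_simple <- -1.444645
loglik_complex <- -1.444644

# The complex model has one more constraint than the simple one.
k_delta <- 1

# Test statistic: twice the log likelihood gain of the larger model.
chi_sq <- 2 * (loglik_complex - loglik_simple)

# Upper-tail probability under a chi-squared distribution with k_delta df.
p_value <- pchisq(chi_sq, df = k_delta, lower.tail = FALSE)

# The same p-value computation applied to the unrounded statistic
# reported by compare_models().
p_value_reported <- pchisq(2.451046e-06, df = k_delta, lower.tail = FALSE)

print(c(chi_sq = chi_sq, p_value = p_value, p_value_reported = p_value_reported))
```

A p-value this close to 1 indicates that the third constraint yields essentially no improvement in fit on these data, so the test gives no grounds for preferring the more complex model.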