psychtm: A package for text mining in psychological research

The goal of psychtm is to make text mining models and methods accessible for social science researchers, particularly within psychology. This package allows users to

Estimate the SLDAX topic model and popular models subsumed by SLDAX, including SLDA, LDA, and regression models;
Obtain posterior inferences;
Assess model fit using coherence and exclusivity metrics.

Installation

Install the released version from CRAN:

install.packages("psychtm")

Alternatively, you can install the most current development version:

If necessary, first install the devtools R package:

install.packages("devtools")

Option 1: Install the latest stable version from GitHub

devtools::install_github("ktw5691/psychtm")

Option 2: Install the latest development snapshot

devtools::install_github("ktw5691/psychtm@devel")
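Before installing, it can help to check which packages are already available. The following is a minimal sketch in base R; `missing_pkgs()` is a hypothetical helper written for this illustration and is not part of psychtm or devtools.

```r
# Hypothetical helper (not part of psychtm): return the packages in `pkgs`
# whose namespaces cannot be loaded in this R session.
missing_pkgs <- function(pkgs) {
  is_available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_available]
}

to_install <- missing_pkgs(c("psychtm", "lda"))
if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from CRAN
}
```

The install call is left commented out so that running the sketch only reports what is missing.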

Example

This basic example shows how to (1) prepare text documents stored in a data frame; (2) fit a supervised topic model with covariates (SLDAX); and (3) summarize the regression relationships from the estimated SLDAX model.

library(psychtm)
library(lda) # Required if using `prep_docs()`

data(teacher_rate)  # Synthetic student ratings of instructors
docs_vocab <- prep_docs(teacher_rate, "doc")
vocab_len <- length(docs_vocab$vocab)
fit_sldax <- gibbs_sldax(rating ~ I(grade - 1),
                         data = teacher_rate,
                         docs = docs_vocab$documents,
                         V = vocab_len,
                         K = 2,
                         model = "sldax")
eta_post <- post_regression(fit_sldax)

summary(eta_post)
#> 
#> Iterations = 1:100
#> Thinning interval = 1 
#> Number of chains = 1 
#> Sample size per chain = 100 
#> 
#> 1. Empirical mean and standard deviation for each variable,
#>    plus standard error of the mean:
#> 
#>                 Mean       SD  Naive SE Time-series SE
#> I(grade - 1) -0.2656 0.007307 0.0007307      0.0007307
#> topic1        4.6165 0.122216 0.0122216      0.0804883
#> topic2        4.8189 0.034301 0.0034301      0.0034301
#> effect_t1    -0.2024 0.134106 0.0134106      0.0884898
#> effect_t2     0.2024 0.134106 0.0134106      0.0884898
#> sigma2        1.1422 0.028296 0.0028296      0.0028296
#> 
#> 2. Quantiles for each variable:
#> 
#>                  2.5%     25%     50%     75%    97.5%
#> I(grade - 1) -0.27849 -0.2711 -0.2659 -0.2601 -0.25175
#> topic1        4.34365  4.5709  4.6584  4.6945  4.76228
#> topic2        4.75032  4.7994  4.8181  4.8420  4.87593
#> effect_t1    -0.51412 -0.2639 -0.1828 -0.1086 -0.01216
#> effect_t2     0.01216  0.1086  0.1828  0.2639  0.51412
#> sigma2        1.08793  1.1245  1.1445  1.1599  1.20649
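Two relationships in the printed summary can be checked by hand. The sketch below copies values from the output above and verifies that (a) the naive standard error equals SD / sqrt(number of draws), which is how summaries in this format define it, and (b) in this two-topic fit, the mean of `effect_t1` equals the `topic1` mean minus the `topic2` mean. Point (b) is an observation about these particular numbers; reading `effect_t*` as a contrast between a topic's coefficient and the remaining topics is an assumption here, so see `?post_regression` for the definition.

```r
# Values copied from the printed summary above (100 draws, 1 chain).
n_draws   <- 100
post_mean <- c(topic1 = 4.6165, topic2 = 4.8189, effect_t1 = -0.2024)
post_sd   <- c(topic1 = 0.122216, topic2 = 0.034301)
naive_se  <- c(topic1 = 0.0122216, topic2 = 0.0034301)

# (a) Naive SE ignores autocorrelation: SD / sqrt(number of draws).
naive_check <- post_sd / sqrt(n_draws)
print(all.equal(naive_check, naive_se, tolerance = 1e-6))

# (b) With K = 2, the printed topic effect matches the difference in means.
contrast_t1 <- unname(post_mean["topic1"] - post_mean["topic2"])
print(round(contrast_t1, 4))
```

Note that the time-series SE for `topic1` (0.0805) is several times its naive SE (0.0122), which suggests autocorrelated draws. A run of 100 iterations is short, so longer chains are advisable before drawing conclusions.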

For a more detailed walkthrough of the package's key functionality, see the vignette(s):

browseVignettes("psychtm")

How to Cite the Package

Wilcox, K. T., Jacobucci, R., Zhang, Z., & Ammerman, B. A. (2021). Supervised latent Dirichlet allocation with covariates: A Bayesian structural and measurement model of text and covariates. PsyArXiv. https://doi.org/10.31234/osf.io/62tc3

Common Troubleshooting

Ensure that appropriate C++ compilers are installed on your computer:

Mac users will have to download Xcode and its related Command Line Tools (found within Xcode’s Preference Pane under Downloads/Components).
Windows users may need to install Rtools. For easier command line use, select the installer option that adds Rtools to the system PATH.
Most Linux distributions should already have up-to-date compilers.
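A quick way to see whether build tools are visible to R is to look them up on the PATH. This is a rough sketch in base R; the tool names below are common examples chosen for illustration, not a list specified by the package, and R may be configured to use a different compiler than the ones found here.

```r
# Look up common build tools on the PATH; "" means the tool was not found.
tools <- Sys.which(c("make", "gcc", "g++", "clang", "clang++"))
found <- tools[nzchar(tools)]

if (length(found) == 0) {
  message("No compiler toolchain found on the PATH.")
} else {
  print(found)
}
```

If nothing is found, install the toolchain for your platform as described above and restart R.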

Limitations

This package uses a Gibbs sampling algorithm that can be memory-intensive for a large corpus.

Getting Help

If you think you have found a bug, please open an issue and provide a minimal, complete, and verifiable example.
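Bug reports are easier to act on when they include version details alongside the failing code. The snippet below is one possible preamble for such an example, using only base R and utils; the seed value is arbitrary.

```r
# Fix the random seed so that sampler output is reproducible.
set.seed(1)

# Report the R version and, if installed, the psychtm version.
info <- sessionInfo()
cat("R version:", info$R.version$version.string, "\n")
if (requireNamespace("psychtm", quietly = TRUE)) {
  cat("psychtm version:", as.character(utils::packageVersion("psychtm")), "\n")
}
```

Follow this preamble with the smallest data set and function call that reproduce the problem.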
