

Tools for Statistical Content Analysis
created at TU Dortmund University.


tosca is a framework for statistical methods in content analysis. It offers a pipeline for preprocessing text corpora and for modeling them through an interface to the implementation of Latent Dirichlet Allocation in the lda package. Plot routines are provided for the descriptive analysis of both text corpora (before modeling) and fitted topic models (after modeling). The package also implements Chang's intruder words and intruder topics, as well as reasoned sampling of text IDs, which yields effective sets of texts for human labeling/coding when estimating precision and recall.


See the vignette for examples of how to use tosca.
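
As a minimal sketch, the vignette can also be located from within R. This assumes tosca has already been installed (for example from CRAN via the commented-out `install.packages` call); only base R functions are used, and the exact vignette names are whatever the installed package provides.

```r
# One-time installation, assuming the package is available on CRAN:
# install.packages("tosca")

# Check whether tosca is installed without attaching it.
has_tosca <- requireNamespace("tosca", quietly = TRUE)

if (has_tosca) {
  # List the vignettes shipped with the installed package.
  vigs <- vignette(package = "tosca")
  print(vigs$results[, c("Item", "Title"), drop = FALSE])

  # Open the vignette index in a browser when working interactively.
  if (interactive()) browseVignettes("tosca")
}
```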


For a BibTeX entry, please use citation(package = "tosca").
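
The call above returns a citation object. The following sketch shows one way to turn it into BibTeX text using base R's `toBibtex`. It assumes tosca is installed, and the file name `tosca.bib` is only a hypothetical example.

```r
# Check whether tosca is installed; citation() fails for missing packages.
has_tosca <- requireNamespace("tosca", quietly = TRUE)

bib <- NULL
if (has_tosca) {
  # Retrieve the package's citation information.
  cit <- citation(package = "tosca")

  # Convert the citation to BibTeX lines and show them.
  bib <- toBibtex(cit)
  print(bib)

  # Optionally save the entry to a file (hypothetical file name):
  # writeLines(bib, "tosca.bib")
}
```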


This R package is licensed under the GPLv3. For feature requests, issues, and bugs, please use the issue tracker.

