Tools to create and manage CWB-indexed corpora

Tools to Create, Manipulate and Manage Corpora for the Corpus Workbench (CWB)

The Corpus Workbench (CWB) is an indexing and query engine for working efficiently with large, linguistically annotated corpora. The 'cwbtools' package offers a set of tools to create, manipulate and manage CWB-indexed corpora conveniently from within R. It complements packages that use the CWB as a backend for text mining with R, namely 'rcqp' and 'RcppCWB' for low-level access to CWB-indexed corpora, and 'polmineR' as a toolset to implement common text mining workflows.

Installation

The package is a 'GitHub only' package so far. The most convenient way to install it is to use the installation mechanism offered by the devtools package. The procedure is the same for Windows, Linux, and macOS. On Windows, you may need to have Rtools installed to use the full functionality of 'devtools'.

First, check that devtools is installed, and install it if it is missing.

if (!"devtools" %in% installed.packages()[,"Package"]) install.packages("devtools")

Then install the cwbtools package.

devtools::install_github("PolMine/cwbtools")
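To confirm that the installation succeeded, a small check along the following lines may help. This is a minimal sketch: `check_package()` is a hypothetical helper written for illustration, not a function of cwbtools or devtools, and it relies only on base R (`requireNamespace()` and `utils::packageVersion()`).

```r
# Hypothetical helper (not part of cwbtools or devtools): reports whether a
# package can be loaded and, if so, which version is installed.
check_package <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    message(pkg, " ", as.character(utils::packageVersion(pkg)), " is installed")
    invisible(TRUE)
  } else {
    message(pkg, " is not installed")
    invisible(FALSE)
  }
}

# Reports on cwbtools without raising an error if it is missing.
check_package("cwbtools")
```

If the helper reports that cwbtools is not installed, re-running the `devtools::install_github()` call above and reading its output is the likely next step.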

Acknowledgements

The CWB is a classical indexing and query engine. Its character as an open source project is of great value. The enduring effort of the CWB developers is gratefully acknowledged.