Computational Text Analysis Workshop Materials

Computational Text Analysis Workshop


This workshop was originally prepared for the 2015 Digital Humanities @ Berkeley Summer Institute. It has since been taught elsewhere.

This course introduces students to modern quantitative text analysis techniques, with the ultimate goal of providing the skills necessary to apply these methods in their own research. We will use the open-source programming language R. Demonstration corpora are provided.

Topics Covered

  • Acquiring and Preprocessing Texts
  • Discriminating Words
  • Dictionary Methods and Sentiment Analysis
  • The Vector Space Model and the Geometry of Text (Multi-dimensional Scaling, Most Similar Texts, Clustering)
  • Topic Models
  • Quantifying Style: Grammar, Alliteration, and other Poetic Concerns

See the entire syllabus here.


This workshop uses the R programming language. See the software requirements here.

Students are strongly encouraged to complete this brief tutorial to learn the basic syntax of the R programming language.


Rochelle Terman: