A 1-hr introduction to Computational Text Analysis for beginners. Created for TextXD in 2019 by Jaren Haber

Computational text analysis: A brief introduction for TextXD 2019



This workshop gives newcomers a foundation for applying computational text analysis methods in their work. The focus is on high-level descriptions of what existing methods do and on user-friendly implementations. It is drawn from the first day of a four-day workshop on computational text analysis held at the D-Lab; the following days cover regular expressions, unsupervised methods, and supervised methods.

Workshop goals

  • Provide a general roadmap of computational text analysis (CTA)
  • Build intuitions about using text as data
  • Gain practice with preprocessing and more
  • Understand at a high level:
    • how a few primary CTA methods work
    • what kinds of questions they answer
    • how to design and implement a CTA project


We will get our hands dirty implementing some of the methods in Python. To follow the implementation details, you will need some familiarity with Python. If you haven't programmed in Python, or at all, you are still welcome to participate and learn the big ideas behind the methods.
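To give a flavor of what "text as data" and preprocessing can look like in Python, here is a minimal sketch using only the standard library. It is not taken from the workshop notebooks: the function names, the tiny stopword list, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stopword list (an assumption, not the workshop's list).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}


def preprocess(text):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in STOPWORDS]


def word_counts(text):
    """Turn a text into a bag of words: a mapping from token to frequency."""
    return Counter(preprocess(text))


# Hypothetical sample text.
sample = "The cat sat on the mat, and the cat slept."
counts = word_counts(sample)
print(counts.most_common(3))
```

Counting tokens this way discards word order, which is the simplification behind many bag-of-words approaches. Real projects typically rely on a dedicated library for tokenization and stopword lists rather than a hand-rolled regular expression.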

Getting started & software prerequisites

For simplicity, just click the "Launch Binder" button to create a virtual environment ready for this workshop.

If you want to run the code on your own computer, you have two options. You can use Anaconda to make installation easy: download Anaconda. Or, if you already have Python 3.x installed along with the libraries listed in requirements.txt, you're welcome to clone this repository and follow along on your own machine. You can also install all the necessary packages like so:

pip3 install -r requirements.txt
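If you are unsure whether the installation succeeded, a short script along these lines can report which packages from requirements.txt are still missing. This is a rough sketch, not part of the repository: the helper names `parse_requirements` and `missing_packages` are hypothetical, it assumes Python 3.8 or later (for `importlib.metadata`), and it only handles simple `name` or `name==version` style lines, not URLs or editable installs.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path


def parse_requirements(lines):
    """Extract bare package names from simple requirements-style lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name ends at the first version specifier, extras bracket,
        # environment marker, or whitespace.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        if name:
            names.append(name)
    return names


def missing_packages(names):
    """Return the subset of names with no installed distribution."""
    missing = []
    for name in names:
        try:
            version(name)
        except PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    requirements_file = Path("requirements.txt")
    if requirements_file.exists():
        wanted = parse_requirements(requirements_file.read_text().splitlines())
        missing = missing_packages(wanted)
        print("Missing packages:", ", ".join(missing) if missing else "none")
    else:
        print("requirements.txt not found; run this from the repository root.")
```

Run it from the root of the cloned repository. The check looks up installed distributions by name, so it does not verify that the installed versions satisfy any pins in the file.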


It's OK Not To Know! That's our motto at D-Lab. D-Lab is open to researchers and professionals from all disciplines and levels of experience.



If you spot a problem with these materials, please open an issue describing the problem.


These materials have evolved over a number of years. They were first developed for the D-Lab by Laura Nelson & Teddy Roland, with contributions and revisions made by Ben Gebre-Medhin, Geoff Bacon, and most recently by Caroline Le Pennec-Caldichoury.
