Material for a 3-day workshop on computational text analysis for humanists and social scientists

Computational Text Analysis for Humanists and Social Scientists

A Digital Humanities Workshop for Claremont Colleges

22-24 May 2017, 10AM - 4PM

Instructors: Hovig Tchalian and Laura Nelson


Increasingly, humanity’s cultural material is being captured and stored in the form of electronic text. From historical documents, literature and poems, diaries, political speeches, and government documents, to emails, text messages, and social media, students from the humanities and social sciences now have access to immense amounts of rich and diverse text. Scholars are increasingly using computational methods to analyze these new sources of text in order to ask and answer a diverse array of questions about the social world: Does social media reflect public political opinion, or drive it? What determines trust in online communities? What types of blog posts get censored in China and why? Are diurnal and seasonal mood cycles cross-cultural? What was the form of cultural and institutional change through the “civilizing process” in England between the 16th and 20th centuries? What is the life cycle of a literary genre? Where do textual allusions occur in Classical Latin poetry? Can the FBI really analyze 650,000 emails in 3 days? (Spoiler: Yes, they can!)

In this workshop you will learn cutting-edge methods to analyze large amounts of text and explore questions fundamental to the humanities and social sciences. We will not have computers read the text for us. Instead, we will harness computers' superior ability to count and extract patterns from text, and use this output to enhance our own critical thinking and interpretive analyses.
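To give a flavor of what "counting" means in practice, here is a minimal sketch using only the Python standard library. The function name `count_words`, the sample sentence, and the very simple tokenization rule are illustrative assumptions, not the workshop's own materials.

```python
import re
from collections import Counter


def count_words(text):
    """Lowercase the text, pull out runs of letters, and count each word."""
    # A deliberately crude tokenizer: anything that is not a-z separates words.
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


# Hypothetical sample text, used only for illustration.
sample = "The sea was calm. The sea was grey, and the sky was grey."
counts = count_words(sample)

# Show the three most frequent words with their counts.
print(counts.most_common(3))
```

The counts themselves say nothing about meaning; they are raw material that the researcher still has to interpret.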

Topics Covered

  • Using Python and Jupyter for Text Analysis
  • Natural Language Processing
  • Document Term Matrix
  • Word Weighting
  • Dictionaries
  • Vector Space Models
  • Topic Models
  • Interpreting Output
  • Research Implications of Text Analysis
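As a rough preview of two of the topics above, the following sketch builds a small document-term matrix and applies one common word-weighting scheme (tf-idf, here with a plain `log(N / df)` inverse document frequency). It uses only the standard library; the function names and the three toy documents are assumptions made for illustration, and libraries used in practice may apply different smoothing or normalization.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(docs):
    """Return (vocab, matrix): one row per document, one column per term."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def tfidf(matrix):
    """Weight raw counts by log(N / document frequency) for each term."""
    n_docs = len(matrix)
    n_terms = len(matrix[0])
    # Document frequency: in how many documents does each term appear?
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    return [
        [row[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
        for row in matrix
    ]


# Hypothetical toy corpus, used only for illustration.
docs = ["the whale swam", "the ship sailed", "the whale sank the ship"]
vocab, matrix = document_term_matrix(docs)
weights = tfidf(matrix)

print(vocab)
for row in matrix:
    print(row)
```

Under this weighting, a word that appears in every document (here, "the") receives a weight of zero, while a word unique to one document is weighted most heavily.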


This workshop will be taught in the open-source programming language Python using the Jupyter environment. No prior programming experience is required or assumed, but your time in this workshop will be much more productive if you come in with some basic knowledge of Python. Participants are strongly encouraged to complete this brief tutorial to learn the basic syntax of the Python programming language.
