DH Computational Text Analysis Workshop Materials
00-Introduction
01-Intro-to-NLP
02-Intro to Python
03-Operationalizing
04-Discriminating-Words
05-Dictionary-Method
06-Literary Distinction (Probably)
07-Topic Modeling
08-Workshop-Overview
A-Syllabus.md
B-Annotated Bibliography.md
README.md

DHBSI 2016: Computational Text Analysis

15-19 August 2016

Instructors: Laura Nelson and Teddy Roland

Overview

Scholars across many disciplines now find themselves face-to-face with massive amounts of digitized data. In the humanities and many social science disciplines, this data often takes the form of unstructured text. This course introduces students to cutting-edge ways of structuring and analyzing digitized text-as-data, and does so by exploring questions fundamental to the humanities. The ultimate goal is to encourage students to think about novel ways of applying these techniques to their own data and research questions, and to provide the skills they need to do so. We will use Python, a free and open-source programming language, and we will provide demonstration corpora.

Topics Covered

  • Principles of Natural Language Processing
  • Introduction to Python for NLP
  • Discriminating Words
  • Dictionary Methods
  • Textual Classification
  • Topic Modeling
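
To give a flavor of one topic on this list, here is a minimal sketch of a dictionary method: score a text by the share of its words that appear in a predefined word list. It is not taken from the workshop notebooks. The word list, the function names, and the simple regex tokenizer are all assumptions made for illustration.

```python
import re

# Hypothetical mini-dictionary, invented for this illustration only.
POSITIVE_WORDS = {"good", "happy", "love"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def dictionary_score(text, dictionary):
    """Return the fraction of tokens in `text` that appear in `dictionary`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0  # avoid dividing by zero on empty input
    hits = sum(1 for token in tokens if token in dictionary)
    return hits / len(tokens)


score = dictionary_score("I love good books", POSITIVE_WORDS)
print("Share of dictionary words: {:.2f}".format(score))
```

In practice the dictionary would be a published or hand-curated lexicon, and tokenization would likely rely on an NLP library rather than a regular expression.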

Requirements

This workshop will be taught in Python, an open-source programming language. Participants should install Anaconda for Python 3.5 on their laptops before the first class.
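
One way to confirm the installation is to check the interpreter version from within Python. The snippet below is a small sketch, not part of the workshop materials. The helper name `meets_requirement` is hypothetical, and treating newer versions as acceptable is an assumption (the workshop names Python 3.5 specifically).

```python
import sys


def meets_requirement(version_info=sys.version_info, minimum=(3, 5)):
    """Return True if the major.minor version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


# Report the running interpreter version and whether it passes the check.
print("Python {}.{} detected; requirement met: {}".format(
    sys.version_info[0], sys.version_info[1], meets_requirement()))
```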

Suggested Reading

  1. Ted Underwood. Seven ways humanists are using computers to understand text
  2. H. Andrew Schwartz and Lyle H. Ungar (2015). Data-Driven Content Analysis of Social Media: A Systematic Overview of Automated Methods