jhaber-zz/computational-text-analysis-intro-2019

Computational text analysis: A brief introduction for TextXD 2019

Binder

Overview

This workshop gives newcomers a foundation for applying computational text analysis methods in their work. The focus is on high-level descriptions of what existing methods do and on user-friendly implementations. It is drawn from the first day of a four-day workshop on computational text analysis held at the D-Lab. The following days cover regular expressions, unsupervised methods, and supervised methods.

Workshop goals

  • Provide a general roadmap of computational text analysis (CTA)
  • Build intuitions about using text as data
  • Gain practice with preprocessing and more
  • Understand at a high level:
    • how a few primary CTA methods work
    • what kinds of questions they answer
    • how to design and implement a CTA project

Prerequisites

We will get our hands dirty implementing some of the methods in Python. If you would like to follow along with the implementation details, you will need some familiarity with Python. If you haven't programmed in Python, or at all, you are still welcome to participate and learn the big ideas behind the methods.

Getting started & software prerequisites

For simplicity, just click the "Launch Binder" button to create a virtual environment ready for this workshop.

If you want to run the code on your own computer, you have two options. You can use Anaconda to make installation easy: download Anaconda. Or, if you already have Python 3.x installed along with the libraries listed in requirements.txt, you're welcome to clone this repository and follow along on your own machine. You can also install all the necessary packages like so:

pip3 install -r requirements.txt
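
After installing, it can help to confirm that the packages are actually available before the workshop starts. The following is a minimal sketch, not part of the workshop materials: the helper names `requirement_names` and `missing_packages` are hypothetical, and it assumes requirements.txt uses plain `name` or `name==version` style lines. It reads the file and reports any package that is not installed in the current environment.

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_names(text):
    """Extract bare package names from requirements-style text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blank lines and pip options
            continue
        # Keep only the leading name, before extras or version specifiers
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")
    if req_file.exists():
        missing = missing_packages(requirement_names(req_file.read_text()))
        print("Missing packages:", ", ".join(missing) if missing else "none")
    else:
        print("requirements.txt not found; run this from the repository root.")
```

Run it from the repository root. If it lists any missing packages, rerunning the pip3 command above should install them.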

IOKN2K

It's OK Not To Know! That's our motto at D-Lab. D-Lab is open to researchers and professionals from all disciplines and levels of experience.

Resources

Contributing

If you spot a problem with these materials, please open an issue describing the problem.

Acknowledgments

These materials have evolved over a number of years. They were first developed for the D-Lab by Laura Nelson & Teddy Roland, with contributions and revisions made by Ben Gebre-Medhin, Geoff Bacon, and most recently by Caroline Le Pennec-Caldichoury.

About

A 1-hr introduction to Computational Text Analysis for beginners. Created for TextXD in 2019 by Jaren Haber
