This repository has been archived by the owner on May 31, 2023. It is now read-only.

Topic modeling for the programming languages literature.

You can run our tool!


The analysis directory holds the R scripts we used to generate figures for the paper.

The lda directory holds the Python and bash scripts we used to run David Blei's LDA-C. Output is written to the out directory.

The sessions directory holds the (not quite finished) analysis of session data for POPL.

The www directory holds the website's frontend and backend.
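
For readers unfamiliar with LDA-C, which the lda scripts drive: it reads a corpus as one line per document, of the form `[M] [term_1]:[count] ... [term_N]:[count]`, where `[M]` is the number of unique terms in the document and each term is an integer index. The sketch below shows one way to produce that format from tokenized documents. It is an illustration only, not the code in the lda directory; the function name `to_ldac` and the sample tokens are hypothetical.

```python
from collections import Counter


def to_ldac(docs):
    """Convert tokenized documents to LDA-C style lines.

    `docs` is a list of token lists. Returns (lines, vocab), where vocab maps
    each term to the integer index used in the lines. Hypothetical helper,
    not taken from this repository.
    """
    vocab = {}
    lines = []
    for tokens in docs:
        counts = Counter()
        for tok in tokens:
            # Assign the next free index the first time a term is seen.
            idx = vocab.setdefault(tok, len(vocab))
            counts[idx] += 1
        pairs = [f"{i}:{c}" for i, c in sorted(counts.items())]
        # The leading number is the count of unique terms in the document.
        lines.append(" ".join([str(len(counts))] + pairs))
    return lines, vocab


if __name__ == "__main__":
    sample = [["type", "inference", "type"], ["monad", "type"]]
    ldac_lines, vocabulary = to_ldac(sample)
    print("\n".join(ldac_lines))
```

In practice the tokens would first be filtered and normalized (for example with nltk's stopword list and WordNet lemmatizer) before being counted.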

Using our tool

You'll need David Blei's LDA-C, compiled, with the lda binary on your PATH. You'll also need the Python library nltk, with its stopwords and wordnet corpora downloaded.
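
A quick way to confirm these prerequisites before running anything is sketched below. It assumes the LDA-C executable is named `lda`, as stated above, and that the nltk data lives under the standard `corpora/` resource path; `check_prerequisites` is a hypothetical helper, not part of this repository.

```python
import shutil


def check_prerequisites():
    """Return a dict mapping each prerequisite name to True or False."""
    # LDA-C is found if an executable called `lda` is on the PATH.
    status = {"lda": shutil.which("lda") is not None}
    try:
        import nltk
    except ImportError:
        status.update({"nltk": False, "stopwords": False, "wordnet": False})
        return status
    status["nltk"] = True
    for name in ("stopwords", "wordnet"):
        try:
            # nltk.data.find raises LookupError when a resource is missing.
            nltk.data.find(f"corpora/{name}")
            status[name] = True
        except LookupError:
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

If either corpus is reported missing, `nltk.download("stopwords")` and `nltk.download("wordnet")` should fetch them.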

To do the R analysis, you'll need R with ggplot2 installed.