Bookworm Mallet integration

This is a Bookworm extension.

This particular extension supplements a Bookworm's "master_bookcounts" file with a master_topicWords file that otherwise resembles it but also includes a topic column, created by MALLET. All the necessary work happens when you run make in this directory, though some dependencies are not installed automatically.
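
To make the relationship between the two files concrete, here is a minimal sketch in Python. It is not the code this extension runs: the function names, the tuple layout (book id, word, topic, count), and the sample data are all assumptions made for illustration. It only shows the idea that a topic-annotated count table is an ordinary book/word count table with one more key column, and that summing over that column recovers the plain counts.

```python
from collections import Counter


def topic_word_counts(assignments):
    """Aggregate per-token topic assignments into sorted rows.

    `assignments` is an iterable of (book_id, word, topic) tuples, one per
    token. This layout is hypothetical, not the extension's real format.
    Returns a list of (book_id, word, topic, count) rows.
    """
    counts = Counter()
    for book_id, word, topic in assignments:
        counts[(book_id, word, topic)] += 1
    return [(b, w, t, n) for (b, w, t), n in sorted(counts.items())]


def collapse_topics(rows):
    """Sum over the topic column to recover plain per-book word counts."""
    totals = Counter()
    for book_id, word, _topic, n in rows:
        totals[(book_id, word)] += n
    return dict(totals)


# Illustrative tokens only: "whale" in book 1 is split across two topics.
tokens = [(1, "whale", 3), (1, "whale", 3), (1, "whale", 7), (2, "sea", 3)]
rows = topic_word_counts(tokens)
plain = collapse_topics(rows)
```

Here `rows` keeps one row per (book, word, topic) combination, so a single word in a single book can appear under several topics, while `plain` has the same shape as an ordinary word count table.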

Unlike a "good" Bookworm extension, this one actually has a few bits of code in the API to support it, because the query syntax isn't transparently supported for anything other than master_bookcounts. But I think it's worth it in this case, because it makes it possible to break apart a topic model at the unigram level.


This runs on MALLET, which is the work of a lot of people.

The stopwords list begins with Matt Jockers' list of names to filter for topic modeling.