This is a Bookworm extension.

This particular extension supplements a Bookworm's "master_bookcounts" file with a "master_topicWords" file that otherwise resembles it but includes a topic column, created by MALLET. All the necessary work happens when you run make in the directory; some dependencies are not installed automatically.

Unlike a "good" Bookworm extension, this one actually has a few bits of code in the API to support it. (That's because the syntax for something other than master_bookcounts isn't transparently supported.) But I think it's worth it in this case, because it makes it possible to break apart a topic model at the unigram level.
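
To make "breaking apart a topic model at the unigram level" a bit more concrete, here is a minimal sketch in Python. It is not the extension's actual code: the row layout (bookid, wordid, topic, count), the sample values, and the helper name topic_counts_for_word are all assumptions made for illustration. The idea is only that once each word count carries a topic label, you can ask how a single word's occurrences divide among topics.

```python
from collections import defaultdict

# Hypothetical rows shaped like (bookid, wordid, topic, count).
# The column names, their order, and these values are assumptions,
# not the real master_topicWords layout.
rows = [
    (1, 10, 0, 3),
    (1, 10, 2, 1),
    (2, 10, 0, 2),
    (2, 11, 1, 5),
]


def topic_counts_for_word(rows, wordid):
    """Sum the occurrences of one word under each topic, across all books."""
    totals = defaultdict(int)
    for _bookid, wid, topic, count in rows:
        if wid == wordid:
            totals[topic] += count
    return dict(totals)


if __name__ == "__main__":
    # Word 10 appears 5 times under topic 0 and once under topic 2.
    print(topic_counts_for_word(rows, 10))
```

With a plain master_bookcounts-style file you would only get one total per word per book; the extra topic column is what allows the per-topic split.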


This runs on MALLET, which is the work of a lot of people.

The stopwords list begins with Matt Jockers' list of names to filter for topic modeling.