Bookworm-Mallet

Bookworm Mallet integration

This is a Bookworm extension.

This extension supplements a Bookworm's "master_bookcounts" file with a master_topicWords file that otherwise resembles it but includes a topic column, created by MALLET. All the necessary work happens when you run make in the directory; some dependencies are not installed automatically.
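To make the relationship between the two files concrete, here is a minimal sketch, not the code this extension actually runs. It assumes master_bookcounts rows have the shape (bookid, wordid, count) and that the topic-aware rows add one topic column; the column names and both function names are hypothetical, chosen only for illustration.

```python
from collections import Counter


def topic_word_rows(token_assignments):
    """Aggregate per-token (bookid, wordid, topic) assignments into
    (bookid, wordid, topic, count) rows. The row layout is an assumption."""
    counts = Counter(token_assignments)
    return sorted((b, w, t, n) for (b, w, t), n in counts.items())


def collapse_topics(rows):
    """Sum counts over the topic column, giving (bookid, wordid, count)
    rows shaped like the assumed master_bookcounts layout."""
    totals = Counter()
    for bookid, wordid, _topic, count in rows:
        totals[(bookid, wordid)] += count
    return sorted((b, w, n) for (b, w), n in totals.items())


# Hypothetical token-level assignments: (bookid, wordid, topic)
tokens = [(1, 10, 0), (1, 10, 0), (1, 10, 3), (2, 10, 3)]
rows = topic_word_rows(tokens)
flat = collapse_topics(rows)
```

Under these assumptions, dropping the topic column and summing gives back the plain word counts, which is the sense in which the two files resemble each other.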

Unlike a "good" bookworm extension, this one actually has a few bits of code in the API to support it, because querying something other than master_bookcounts isn't transparently supported. But I think it's worth it in this case, because it makes it possible to break apart a topic model at the unigram level.
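As an illustration of what breaking a topic model apart at the unigram level can mean, the sketch below computes how one word's occurrences are split across topics. It is an assumption-laden example, not the Bookworm API: the (bookid, wordid, topic, count) row shape and the function name word_topic_shares are hypothetical.

```python
from collections import defaultdict


def word_topic_shares(rows, wordid):
    """Return {topic: share} for one word, where share is the fraction of
    that word's total count assigned to each topic."""
    totals = defaultdict(int)
    for _bookid, w, topic, count in rows:
        if w == wordid:
            totals[topic] += count
    grand = sum(totals.values())
    if grand == 0:
        return {}  # word does not occur in the rows
    return {topic: n / grand for topic, n in totals.items()}


# Hypothetical rows: (bookid, wordid, topic, count)
example_rows = [(1, 10, 0, 2), (1, 10, 3, 1), (2, 10, 3, 1), (2, 11, 0, 5)]
shares = word_topic_shares(example_rows, 10)
```

Because the counts are kept per word and per topic, the same rows can answer questions about a single word's topics as well as a single topic's words.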

Acknowledgements

This runs off of Mallet, the work of a lot of people.

The stopwords list begins with Matt Jockers' list of names to filter for topic modeling.