topic modeling on Monarch data
Python
cs261-writeup
deliverables
notebook
old
LDA.ipynb
LDA.py
README
classify.py
dataset stats.py
docmaker.py
extract.py
html2text.py
ldasimlib.py
ldasimscript.py
parallel_parse.py
preparedata.py
settings.py
text_processing.py
utils.py

README

This repository contains tools for studying topic modeling on spam data from Monarch.

Topic inference is done with MALLET, using latent Dirichlet allocation (LDA).

## Setup and usage

To run the scripts, first unpack the MALLET archive:

    tar -xzf mallet-2.0.6.tar.gz

settings.py holds the global settings shared by all the scripts.
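
As a rough illustration of the MALLET-based LDA step, the sketch below builds the two command lines MALLET's command-line tool typically needs: `import-dir` to turn a directory of text files into a corpus file, then `train-topics` to fit the topic model. It is not taken from this repository's scripts. The unpacked directory name `mallet-2.0.6`, the helper names, and the output file names are all assumptions made for the example.

```python
import os
import subprocess

# Assumption: the archive unpacks to ./mallet-2.0.6 with its launcher at bin/mallet.
MALLET_BIN = os.path.join("mallet-2.0.6", "bin", "mallet")


def import_dir_command(text_dir, corpus_file, mallet_bin=MALLET_BIN):
    """Build the command that converts a directory of text files into a MALLET corpus."""
    return [
        mallet_bin, "import-dir",
        "--input", text_dir,
        "--output", corpus_file,
        "--keep-sequence",     # topic models need token sequences, not feature vectors
        "--remove-stopwords",  # drop common English function words
    ]


def train_topics_command(corpus_file, num_topics, out_prefix, mallet_bin=MALLET_BIN):
    """Build the command that fits an LDA model with the given number of topics."""
    return [
        mallet_bin, "train-topics",
        "--input", corpus_file,
        "--num-topics", str(num_topics),
        # Hypothetical output names: top words per topic, topic mixture per document.
        "--output-topic-keys", out_prefix + "_keys.txt",
        "--output-doc-topics", out_prefix + "_doc_topics.txt",
    ]


def run(command):
    """Run one MALLET command and raise if it exits with a non-zero status."""
    subprocess.check_call(command)
```

With MALLET unpacked, calling `run(import_dir_command("docs", "corpus.mallet"))` followed by `run(train_topics_command("corpus.mallet", 20, "spam"))` would be one way to go from a directory of text documents to topic keys and per-document topic proportions. The directory, corpus, and prefix names here are placeholders.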