MALLET

gioiastevens edited this page Apr 29, 2014 · 12 revisions

How to find MALLET in DH Box

You can access the bash shell at xx.xxx.xxx.xxx:4200. Log in there with your DH Box username and password, then run MALLET from the command line. The MALLET files are located under /dhbox/MALLET/bin.
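Once logged in, a first topic-modeling run might look like the sketch below. It assumes the launcher script is at /dhbox/MALLET/bin/mallet, as the path above suggests, and that a folder of plain-text files exists at `./corpus`; that folder name, the output file names, and the topic count of 20 are hypothetical placeholders to adapt, not values defined by DH Box.

```shell
# Assumed install location, based on the path given on this page.
MALLET_HOME="${MALLET_HOME:-/dhbox/MALLET}"
# Hypothetical folder containing one plain-text file per document.
CORPUS_DIR="${CORPUS_DIR:-./corpus}"

if [ -x "$MALLET_HOME/bin/mallet" ] && [ -d "$CORPUS_DIR" ]; then
  # Step 1: convert the text files into MALLET's binary format.
  "$MALLET_HOME/bin/mallet" import-dir \
    --input "$CORPUS_DIR" \
    --output corpus.mallet \
    --keep-sequence \
    --remove-stopwords

  # Step 2: train a topic model (20 topics is an arbitrary starting point).
  "$MALLET_HOME/bin/mallet" train-topics \
    --input corpus.mallet \
    --num-topics 20 \
    --output-topic-keys topic_keys.txt \
    --output-doc-topics doc_topics.txt

  MALLET_STATUS="ran"
else
  # Nothing is executed if MALLET or the corpus folder is missing.
  echo "MALLET or corpus folder not found; check MALLET_HOME and CORPUS_DIR."
  MALLET_STATUS="skipped"
fi
```

If the run completes, `topic_keys.txt` should list the top words for each topic and `doc_topics.txt` the topic proportions for each document. Both variables can be overridden on the command line, for example `CORPUS_DIR=~/letters bash run_mallet.sh`, where the script name is again only an illustration.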

Overview

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

MALLET site

[MALLET: MAchine Learning for LanguagE Toolkit](http://mallet.cs.umass.edu/)

Documentation

[MALLET Documentation](http://mallet.cs.umass.edu/download.php)

Tutorials

[Machine Learning with MALLET by David Mimno](http://mallet.cs.umass.edu/mallet-tutorial.pdf)

[The Programming Historian "Getting Started with Topic Modeling and MALLET" by Shawn Graham, Scott Weingart, and Ian Milligan](http://programminghistorian.org/lessons/topic-modeling-and-mallet)

Examples

[Topic Modeling Martha Ballard’s Diary by Cameron Blevins](http://historying.org/2010/04/01/topic-modeling-martha-ballards-diary/)

[Mining the Dispatch by Rob Nelson](http://dsl.richmond.edu/dispatch/)