Geographical Topic Model Using multi-Dirichlet process mixtures

(C) Copyright 2012, Christoph Carl Kling

Based on "Knoceans" by Gregor Heinrich (gregor :: arbylon : net)
and JGibbsLDA by Xuan-Hieu Phan and Cam-Tu Nguyen (ncamtu :: gmail : com)
published under GNU GPL.

Tartarus Snowball stemmer by Martin Porter and Richard Boulton published under 
BSD License (see ), with Copyright 
(c) 2001, Dr Martin Porter, and (for the Java developments) Copyright (c) 2002, 
Richard Boulton. 

Java Delaunay Triangulation (JDT) by boaz88 :: gmail : com published under Apache License 2.0 

MGTM is free software; you can redistribute it and/or modify it 
under the terms of the GNU General Public License as published by the Free 
Software Foundation; either version 3 of the License, or (at your option) 
any later version.

MGTM is distributed in the hope that it will be useful, but WITHOUT 
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS 
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111-1307 USA


This is the implementation of MGTM, a geographical topic model using multi-Dirichlet processes.

The source code of MGTM is found in /sourcecode/nhdp3/ 

The topic sampler is found in
The parameter samplers in
The model options in
The MisesFisher clustering and the Delaunay triangulation call in

A list of variables used in the model is given in variables.html

Example call of MGTM for the car dataset (available on request from MGTM[at]

java -Xmx3000M -jar MGTD.jar -dir ./example/ -dfile car.txt -est -L 500 -beta 0.5 -gamma 1.0 -alpha0 1.0 -Alpha 1.0 -sampleHyper true -gammaa 1.0 -gammab 0.1 -alpha0a 1.0 -alpha0b 0.1 -Alphaa 0.1 -Alphab 0.1 -delta 10.0 -savestep 5 -twords 20 -niters 200

dir is the directory of the dataset; the output is also written to this directory.
L is the number of geographical regions (clusters) used for the initial clustering.
gamma, beta, alpha0, Alpha and delta are the initial values of the parameters of the Dirichlet distributions.
Alphaa and Alphab, and the corresponding pairs for the other parameters (gammaa/gammab, alpha0a/alpha0b), are the hyper-parameters of the Gamma priors on the Dirichlet parameters.
The number of topics is not set by the user; it is inferred by the model.
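
The example call can also be assembled programmatically, which makes the grouping of the options easier to see. The following is only a sketch: the class name MgtmCommand and the method build are hypothetical, and the comments on -est, -savestep, -twords and -niters are assumptions based on the JGibbsLDA options of the same name, since this README does not describe them. Only flags and values from the example call above are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of MGTM: rebuilds the example call as an argument list.
class MgtmCommand {

    static List<String> build(String dir, String dataFile) {
        List<String> cmd = new ArrayList<>(Arrays.asList(
                "java", "-Xmx3000M", "-jar", "MGTD.jar",
                "-dir", dir,          // dataset directory, also used for the output
                "-dfile", dataFile,   // dataset file inside that directory
                "-est"));             // assumption: estimate a new model, as in JGibbsLDA
        // Number of geographical regions for the initial clustering.
        cmd.addAll(Arrays.asList("-L", "500"));
        // Initial values of the Dirichlet parameters.
        cmd.addAll(Arrays.asList("-beta", "0.5", "-gamma", "1.0",
                "-alpha0", "1.0", "-Alpha", "1.0"));
        // Hyper-parameter sampling and the Gamma hyper-parameter pairs.
        cmd.addAll(Arrays.asList("-sampleHyper", "true",
                "-gammaa", "1.0", "-gammab", "0.1",
                "-alpha0a", "1.0", "-alpha0b", "0.1",
                "-Alphaa", "0.1", "-Alphab", "0.1"));
        cmd.addAll(Arrays.asList("-delta", "10.0"));
        // Assumption (JGibbsLDA naming): save interval, top words per topic, iterations.
        cmd.addAll(Arrays.asList("-savestep", "5", "-twords", "20", "-niters", "200"));
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build("./example/", "car.txt");
        System.out.println(String.join(" ", cmd));
        // To launch the sampler instead of printing the command:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Keeping the arguments in a list avoids shell quoting problems when the dataset directory contains spaces.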

Data format:
The first line gives the number of documents in the file.
Every following line corresponds to a document, using the format: 
latitude longitude word1 word2 ... 

Example file format for three documents:

3
56.3 6.4 this is a test
46.2 5.2 words are separated by spaces
65.3 12.3 that is all you need
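
A file in this format can be checked before it is passed to MGTM. The sketch below is not MGTM's own reader; the names GeoDocReader, GeoDoc and parse are hypothetical. It assumes that tokens are separated by whitespace, that blank lines can be skipped, and that a mismatch between the count on the first line and the number of documents should be treated as an error.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical validator for the data format described above (not MGTM's reader).
class GeoDocReader {

    /** One document: a coordinate pair plus its words. */
    static final class GeoDoc {
        final double latitude;
        final double longitude;
        final List<String> words;

        GeoDoc(double latitude, double longitude, List<String> words) {
            this.latitude = latitude;
            this.longitude = longitude;
            this.words = words;
        }
    }

    /** Parses the lines of a dataset file; the first line holds the document count. */
    static List<GeoDoc> parse(List<String> lines) {
        if (lines.isEmpty()) {
            throw new IllegalArgumentException("empty input");
        }
        int expected = Integer.parseInt(lines.get(0).trim());
        List<GeoDoc> docs = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue; // assumption: blank lines are ignored
            }
            String[] tokens = line.split("\\s+");
            if (tokens.length < 2) {
                throw new IllegalArgumentException(
                        "line " + (i + 1) + ": expected latitude and longitude");
            }
            double lat = Double.parseDouble(tokens[0]);
            double lon = Double.parseDouble(tokens[1]);
            // Everything after the two coordinates is a word of the document.
            List<String> words =
                    new ArrayList<>(Arrays.asList(tokens).subList(2, tokens.length));
            docs.add(new GeoDoc(lat, lon, words));
        }
        if (docs.size() != expected) {
            throw new IllegalArgumentException(
                    "header announces " + expected + " documents, found " + docs.size());
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "3",
                "56.3 6.4 this is a test",
                "46.2 5.2 words are separated by spaces",
                "65.3 12.3 that is all you need");
        for (GeoDoc d : parse(lines)) {
            System.out.println(d.latitude + " " + d.longitude + " " + d.words);
        }
    }
}
```

To validate a real file, the lines can be read with java.nio.file.Files.readAllLines and passed to parse.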

