Demo code for the paper "Tag-Weighted Dirichlet Allocation"

The paper is at http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6729528
Authors: Shuangyin Li, Guan Huang, Ruiyang Tan, Rong Pan
Sun Yat-sen University

For any questions about the code, please contact us by email:
shuangyinli AT cse.ust.hk
panr AT sysu.edu.cn

License

Copyright 2013 Shuangyin Li, Guan Huang, Ruiyang Tan, Rong Pan
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Install

cd src/ && make

Usage

### Input file format:
DocNumLabels label1 label2 ... @ DocNumWords word1 word2 ...
DocNumLabels label1 label2 ... @ DocNumWords word1 word2 ...
DocNumLabels label1 label2 ... @ DocNumWords word1 word2 ...

Each row represents one document together with its labels. DocNumLabels is the number of labels of the document, and DocNumWords is the number of words in the document. Each label is an integer ID that represents one label, and each word is an integer ID that represents one word.
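
The row layout described above can be read with a small parser. The following is a minimal sketch and is not taken from the code in src/: the `Document` struct and the `parse_document_line` function are hypothetical names, and the sketch assumes that fields are separated by whitespace and that a row is rejected when a count does not match the number of IDs that follow it.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One document in the input format: integer label IDs and integer word IDs.
// (Hypothetical helper type, not part of the twda sources.)
struct Document {
    std::vector<int> labels;
    std::vector<int> words;
};

// Parses one row of the form
//   DocNumLabels label1 label2 ... @ DocNumWords word1 word2 ...
// Returns false if the row is malformed or a declared count does not match.
bool parse_document_line(const std::string& line, Document& doc) {
    std::istringstream in(line);
    doc.labels.clear();
    doc.words.clear();

    int num_labels = 0;
    if (!(in >> num_labels) || num_labels < 0) return false;
    for (int i = 0; i < num_labels; ++i) {
        int label = 0;
        if (!(in >> label)) return false;  // fewer labels than declared
        doc.labels.push_back(label);
    }

    // The "@" token separates the label part from the word part.
    std::string separator;
    if (!(in >> separator) || separator != "@") return false;

    int num_words = 0;
    if (!(in >> num_words) || num_words < 0) return false;
    for (int i = 0; i < num_words; ++i) {
        int word = 0;
        if (!(in >> word)) return false;  // fewer words than declared
        doc.words.push_back(word);
    }

    // Anything left over means more words than declared.
    std::string extra;
    return !(in >> extra);
}
```

For example, the made-up row `2 3 7 @ 4 10 11 10 42` would parse into two labels (3 and 7) and four words (10, 11, 10 and 42).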

demo/twda.demo.input is a simple demo input file.
demo/label.txt is the label dictionary file. The word in row 1 is the name of label 0.
demo/words.dic is the word dictionary file.
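
To map the integer IDs back to readable names, a dictionary file can be loaded into a vector so that the entry in row 1 sits at index 0. This is only a sketch: `load_dictionary` is a hypothetical helper, and it assumes one entry per row. That layout is stated above for demo/label.txt; for demo/words.dic it is an assumption.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads one dictionary entry per row. The entry in row 1 is stored at
// index 0, so names[id] is the name for integer ID `id`.
// (Hypothetical helper, not part of the twda sources.)
std::vector<std::string> load_dictionary(std::istream& in) {
    std::vector<std::string> names;
    std::string line;
    while (std::getline(in, line)) {
        names.push_back(line);
    }
    return names;
}
```

Passing a `std::ifstream` opened on demo/label.txt would give a vector in which `names[0]` is the name of label 0.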


### Training:

./twda est <input data file> <setting.txt> <num_topics> <model save dir>

Example:

./src/twda est demo/twda.demo.input src/setting.txt 10 demo/model

Some model training parameters are set in the file "setting.txt".

### Inference:

./twda inf <input data file> <setting.txt> <model dir> <prefix> <output dir>

Example:

./src/twda inf demo/twda.demo.input src/setting.txt demo/model/ final demo/output/

Inference writes the doc-topics-dis.txt file to the output dir. The file gives the topic distribution for the input data file. Apply exp(.) to the values in the file to obtain the exact probabilities.
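
The exp(.) step can be sketched as follows. The layout of doc-topics-dis.txt is not described here, so the sketch assumes that each row holds whitespace-separated values, one per topic. `row_to_probabilities` is a hypothetical helper, not a function from the twda sources.

```cpp
#include <cmath>
#include <sstream>
#include <string>
#include <vector>

// Converts one row of whitespace-separated values into probabilities
// by applying exp(.) to each value.
// Assumption: one value per topic on each row of doc-topics-dis.txt.
std::vector<double> row_to_probabilities(const std::string& row) {
    std::istringstream in(row);
    std::vector<double> probabilities;
    double value = 0.0;
    while (in >> value) {
        probabilities.push_back(std::exp(value));
    }
    return probabilities;
}
```

Under that assumption, the function would be called once per row read from demo/output/doc-topics-dis.txt.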