SEMAFORE: SEmantic visualization with MAniFOld REgularization
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
sample
README
semafore.pl

README

SEMAFORE: SEmantic visualization with MAniFOld REgularization
-------------------------------------------------------------


INTRODUCTION

This is an implementation of SEMAFORE - a semantic visualization method from Le & Lauw (AAAI 2014, JAIR 2016).

Usage:

	perl semafore.pl	--num_topics $num_topics
				--dim $dim
				--lambda $lambda
				--alpha $alpha,
				--beta $beta,
				--gamma $gamma,
				--basis_function $basis_function
				--EM_iter $EM_iter,
				--Quasi_iter $Quasi_iter
				--data $data
				--graph $graph
				--output_file $output_file

Arguments:
	$num_topics: number of topics
	$dim: number of dimensions (default 2)
	$lambda: regularization parameter (default 10)
	$alpha: Dirichlet parameter (default 0.01)
	$beta: covariance for Gaussian prior of topic coordinates (default 0.1*$num_docs)
	$gamma: covariance for Gaussian prior of document coordinates (default 0.1*$num_topics)
	$basis_function: 0 for Gaussian, 1 for Student-t (with 1 degree of freedom) (default 0)
	$EM_iter: number of iterations for EM (default 100)
	$Quasi_iter: maximum iterations of Quasi-Newton (default 10)
	$data: input data
	$graph: neighborhood graph
	$output_file: output file

Details:

+ This implementation needs Algorithm::LBFGS library for quasi-Newton method L-BFGS.
  The library can be downloaded at http://search.cpan.org/~laye/Algorithm-LBFGS-0.16/lib/Algorithm/LBFGS.pm.
  To install,
	
	  cpan Algorithm::LBFGS
	
+ If $graph is not passed, SEMAFORE turns into Probabilistic Latent Semantic Visualization (PLSV) (Iwata et al., 2008)
	
+ Example of input data with 3 documents (numbers are ids of words):
	0 1 1 2 2 3 4 4 5 6 7 7 8 8 8 8 9 10 11 12 13 13 14 14 15 15 15 16
	17 18 19 20 20 21 22 23 24 25 25 25 25 25 25 26 27 27 28 29
	30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 50 51 52 53 54 54 55 56 57 58
			
+ Neighborhood graph is represented by a matrix A: NxN. N is the number of documents and A[i,j]=A[j,i] is the weight of the edge ij.
  For example,
	0 1 0
	1 0 1
	0 1 0

			
HOW TO CITE
If you use SEMAFORE for your research, please cite:

	@article{le16a,
	    title={Semantic Visualization with Neighborhood Graph Regularization},
	    author={Le, Tuan MV and Lauw, Hady W},
	    journal={Journal of Artificial Intelligence Research},
	    volume={55},
	    pages={1091--1133},
	    year={2016}
	}
	
	The paper can be downloaded from: http://jair.org/papers/paper4983.html
	
Or

	@inproceedings{le2014manifold,
	    title={Manifold learning for jointly modeling topic and visualization},
	    author={Le, Tuan MV and Lauw, Hady W},
	    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
	    year={2014}
	}
	
	The paper can be downloaded from: http://www.hadylauw.com/publications/aaai14.pdf


BIBLIOGRAPHY

@inproceedings{iwata2008probabilistic,
    title={Probabilistic latent semantic visualization: topic model for visualizing documents},
    author={Iwata, Tomoharu and Yamada, Takeshi and Ueda, Naonori},
    booktitle={Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining},
    pages={363--371},
    year={2008},
    organization={ACM}
}