SEMAFORE: SEmantic visualization with MAniFOld REgularization
-------------------------------------------------------------


INTRODUCTION

This is an implementation of SEMAFORE, a semantic visualization method by Le & Lauw (AAAI 2014, JAIR 2016).

Usage:

	perl semafore.pl	--num_topics $num_topics \
				--dim $dim \
				--lambda $lambda \
				--alpha $alpha \
				--beta $beta \
				--gamma $gamma \
				--basis_function $basis_function \
				--EM_iter $EM_iter \
				--Quasi_iter $Quasi_iter \
				--data $data \
				--graph $graph \
				--output_file $output_file

Arguments:
	$num_topics: number of topics
	$dim: number of dimensions (default 2)
	$lambda: regularization parameter (default 10)
	$alpha: Dirichlet parameter (default 0.01)
	$beta: covariance for Gaussian prior of topic coordinates (default 0.1*$num_docs)
	$gamma: covariance for Gaussian prior of document coordinates (default 0.1*$num_topics)
	$basis_function: 0 for Gaussian, 1 for Student-t (with 1 degree of freedom) (default 0)
	$EM_iter: number of iterations for EM (default 100)
	$Quasi_iter: maximum iterations of Quasi-Newton (default 10)
	$data: input data file (one document per line, given as word ids)
	$graph: neighborhood graph file (optional; an NxN matrix of edge weights)
	$output_file: output file
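
The defaults above can be made concrete with a small sketch that assembles the command line for one run. This is only an illustration: the helper name build_semafore_command, the file names docs.txt and coords.txt, and the choice to pass every default explicitly are assumptions, not part of SEMAFORE. The flag names are the ones listed under Usage.

```python
def build_semafore_command(num_topics, num_docs, data, output_file, graph=None,
                           dim=2, lam=10, alpha=0.01, beta=None, gamma=None,
                           basis_function=0, em_iter=100, quasi_iter=10,
                           script="semafore.pl"):
    """Assemble the argument list for one run (hypothetical helper)."""
    # Documented defaults: beta = 0.1 * num_docs, gamma = 0.1 * num_topics.
    if beta is None:
        beta = num_docs / 10
    if gamma is None:
        gamma = num_topics / 10

    options = [
        ("--num_topics", num_topics),
        ("--dim", dim),
        ("--lambda", lam),
        ("--alpha", alpha),
        ("--beta", beta),
        ("--gamma", gamma),
        ("--basis_function", basis_function),  # 0 = Gaussian, 1 = Student-t
        ("--EM_iter", em_iter),
        ("--Quasi_iter", quasi_iter),
        ("--data", data),
    ]

    cmd = ["perl", script]
    for flag, value in options:
        cmd += [flag, str(value)]
    # Without --graph, the README says the model reduces to PLSV.
    if graph is not None:
        cmd += ["--graph", graph]
    cmd += ["--output_file", output_file]
    return cmd


# Hypothetical file names; 1000 documents and 20 topics.
cmd = build_semafore_command(num_topics=20, num_docs=1000,
                             data="docs.txt", output_file="coords.txt")
print(" ".join(cmd))
```

The returned list can be handed to subprocess.run(cmd, check=True) once Perl and Algorithm::LBFGS are installed.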

Details:

+ This implementation requires the Algorithm::LBFGS library, which provides the L-BFGS quasi-Newton method.
  The library is available at http://search.cpan.org/~laye/Algorithm-LBFGS-0.16/lib/Algorithm/LBFGS.pm.
  To install it, run:
	
	  cpan Algorithm::LBFGS
	
+ If $graph is not passed, SEMAFORE reduces to Probabilistic Latent Semantic Visualization (PLSV) (Iwata et al., 2008).
	
+ Example of input data with 3 documents (one document per line; numbers are word ids):
	0 1 1 2 2 3 4 4 5 6 7 7 8 8 8 8 9 10 11 12 13 13 14 14 15 15 15 16
	17 18 19 20 20 21 22 23 24 25 25 25 25 25 25 26 27 27 28 29
	30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 50 51 52 53 54 54 55 56 57 58
			
+ The neighborhood graph is represented by a symmetric NxN matrix A, where N is the number of documents and A[i,j]=A[j,i] is the weight of the edge between documents i and j.
  For example,
	0 1 0
	1 0 1
	0 1 0
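
The two input formats described above can be produced with a short script. The following is a minimal sketch under stated assumptions: the function names are hypothetical, word ids are assigned in order of first appearance, and ids are sorted within each document only to mirror the example (this README does not say whether order matters). How neighbors are chosen is left to the caller; edges are passed in as (i, j, weight) triples.

```python
def encode_documents(docs):
    """Map tokens to integer ids (first-appearance order); one id list per document."""
    vocab = {}
    encoded = []
    for tokens in docs:
        ids = [vocab.setdefault(tok, len(vocab)) for tok in tokens]
        encoded.append(sorted(ids))  # sorted only to match the example layout
    return encoded, vocab


def format_data(encoded):
    """One document per line, word ids separated by spaces."""
    return "\n".join(" ".join(str(i) for i in doc) for doc in encoded) + "\n"


def adjacency_matrix(num_docs, edges):
    """Build a symmetric NxN weight matrix from (i, j, weight) triples."""
    a = [[0] * num_docs for _ in range(num_docs)]
    for i, j, w in edges:
        a[i][j] = w
        a[j][i] = w  # keep A[i,j] == A[j,i]
    return a


def format_graph(a):
    """One matrix row per line, weights separated by spaces."""
    return "\n".join(" ".join(str(w) for w in row) for row in a) + "\n"


# Toy corpus of 3 tokenized documents (made up for illustration).
docs = [["topic", "model", "topic"], ["graph", "model"], ["graph", "manifold"]]
encoded, vocab = encode_documents(docs)
data_text = format_data(encoded)

# Chain graph 0 - 1 - 2 with unit weights, as in the matrix example above.
graph_text = format_graph(adjacency_matrix(3, [(0, 1, 1), (1, 2, 1)]))

print(data_text, end="")
print(graph_text, end="")
```

Writing data_text and graph_text to files gives inputs suitable for --data and --graph, respectively.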

			
HOW TO CITE

If you use SEMAFORE for your research, please cite:

	@article{le16a,
	    title={Semantic Visualization with Neighborhood Graph Regularization},
	    author={Le, Tuan MV and Lauw, Hady W},
	    journal={Journal of Artificial Intelligence Research},
	    volume={55},
	    pages={1091--1133},
	    year={2016}
	}
	
	The paper can be downloaded from: http://jair.org/papers/paper4983.html
	
Or

	@inproceedings{le2014manifold,
	    title={Manifold learning for jointly modeling topic and visualization},
	    author={Le, Tuan MV and Lauw, Hady W},
	    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
	    year={2014}
	}
	
	The paper can be downloaded from: http://www.hadylauw.com/publications/aaai14.pdf


BIBLIOGRAPHY

@inproceedings{iwata2008probabilistic,
    title={Probabilistic latent semantic visualization: topic model for visualizing documents},
    author={Iwata, Tomoharu and Yamada, Takeshi and Ueda, Naonori},
    booktitle={Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining},
    pages={363--371},
    year={2008},
    organization={ACM}
}
