Skip to content

skywalker5/S3CMTF_code

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

##########################################################################################
#   S3CMTF: Fast, Accurate, and Scalable Method for Coupled Matrix-Tensor Factorization
#   
#   Authors: Dongjin Choi (skywalker5@snu.ac.kr), Seoul National University
#            Jun-gi Jang (elnino4@snu.ac.kr), Seoul National University
#            U Kang (ukang@snu.ac.kr), Seoul National University
#   
#   Version : 1.0
#   Date: 2017-06-22
#   Main contact: Dongjin Choi
#
#   This software is free of charge under research purposes.
#   For commercial purposes, please contact the author.
##########################################################################################

1. Introduction

    S3CMTF is a fast and scalable coupled matrix-tensor factorization method for sparse tensor-matrix coupled data.
    
	S3CMTF performs fast coupled matrix-tensor factorization into HOSVD format 
	with high accuracy by lock-free parallel SGD updates for factor matricices
	and core tensor.
    
	Moreover, S3CMTF has two versions: a basic version S3CMTF-base and an improved 
	algorithm S3CMTF-opt.
	Both outperform the previous coupled matrix-tensor factorization algorithms.

2. Files
	
	This package contains following files

	- Makefile : generate excutable S3CMTF
	- S3CMTF-base.cpp : source code of S3CMTF-base
	- S3CMTF-opt.cpp : source code of S3CMTF-opt
	- demo/ : demo running files
	  - demo/run.sh : shell script for demo run
	  - demo/yelp_toy_train.tensor : small train tensor data for demo run
	  - demo/yelp_toy_test.tensor : small test tensor data for demo run
	  - demo/user_user_toy.matrix : coupled matrix data for user-user relation
	  - demo/business_city_toy.matrix : coupled matrix data for business-city relation
	  - demo/business_category_toy.matrix : coupled matrix data for business-category relation
	  - demo/config_demo.txt : SNeCT configuration for demo run
	  - demo/S3CMTF-base : S3CMTF-base executable file
	  - demo/S3CMTF-opt : S3CMTF-opt executable file
	- lib/ : library files
	  - lib/armadillo-7.700.0.tar.xz : armadillo linear algebra library
	  - lib/blas-3.7.0.tgz : BLAS linear algebra package
	  - lib/lapack-3.7.0.tgz : LAPACK linear algebra package

3. Usage
	
	[Step 1] Install Armadillo and OpenMP libraries.

		Armadillo and OpenMP libraries are required to run S3CMTF.

		Armadillo library is attached in lib directory, and also available at http://arma.sourceforge.net/download.html.

		Notice that Armadillo needs LAPACK and BLAS libraries, and they are also attached in lib directory.

		Above OpenMP version 2.0 is required for S3CMTF.

	[Step 2] Adjusting config and source files

		Before you run S3CMTF, you need to edit configuration and source files appropriately.

		The format of configuration file is as follows.

		[Line 1]  Tensor order 'N'
		[Line 2]  Tensor dimensionalities I_n (n=1...N) 'I_1' 'I_2' ... 'I_N'
		[Line 3]  Tensor rank J_n (n=1...N) 'J_1' 'J_2' ... 'J_N'
		[Line 4]  Number of parallel threads P
		[Line 5]  Number of entries of train data tensor
		[Line 6]  Number of entries of test data tensor
		[Next lines repeated for the number of coupled matrices]
			[Line 6+i]  'Coupled mode' 'Path of i-th coupled matrix' 'Dimensionality of i-th coupled dimension' 'Number of entries of i-th coupled matrix'
			
		
		After editing configuration file, all pre-defined values in source files (S3CMTF-base.cpp, S3CMTF-opt.cpp) must be adjusted.
		The details of pre-defined values are as follows.

		#define MAX_ORDER 							<- Must be larger than tensor order N
		#define MAX_INPUT_DIMENSIONALITY 			<- Must be larger than tensor dimensionalities I_n
		#define MAX_CORE_TENSOR_DIMENSIONALITY		<- Must be larger than tensor rank J_n
		#define MAX_ENTRY		  					<- Must be larger than number of observable entries of input tensor 
		#define MAX_CORE_SIZE						<- Must be larger than number of observable entries of core tensor
		#define MAX_ITER	 						<- Maximum iteration number, default value is set to 2000
 
 		*The example of configuration file is in demo folder.
 		*Notice that S3CMTF ver 1.0 uses static variables for faster performance. We are planning to make dynamic version for next version.
 	
 	[Step 3] Compile and run S3CMTF

		If you successfully install all libraries, "make" command will create two binary files, which are "S3CMTF-base" and "S3CMTF-opt".

		"make S3CMTF-base" only creates "S3CMTF-base", and "make S3CMTF-opt" only creates "S3CMTF-opt"

		Two binary files take nine arguments, which are path of configuration file, path of input train tensor file, path of input test tensor file, path of result directory, number of coupled matrices, initial learning rate, lambda_reg, and lambda_c.

		ex) ./S3CMTF-opt config.txt input_train.txt input_test.txt result 1 0.001 0.1 0.1 

		If you put command properly, S3CMTF will write all values of factor matrices and core tensor in the path of result.

		ex) result/FACTOR1, result/CORETENSOR, result/CFACTOR1

4. Demo

	Please follow this procedure to run demo to understand how to run S3CMTF-base and S3CMTF-opt.

	1. cd demo
	2. sh run.sh

	Then, S3CMTF-base and S3CMTF-opt are run on a demo tensor and three coupled matrices. Demo tensor has a size of 10,000x10,000x100 with 9,991 observable entries and splitted into train and test tensor.
	After execution, you can see factorization results in 'result' directory, while the intermediate process is presented on screen. following output files are generated.
	'demo/result/FACTOR<n>' shows the n-th factor matrix. The factor matrix is a stack of vectors representing latent factor weights of each entity.
	'demo/result/CFACTOR<n>' shows the n-th coupled factor matrix. The coupled factor matrix is a stack of vectors representing latent factor weights of each coupled entity.
	'demo/result/CORETENSOR' shows the core tensor of Tucker decomposition. The core tensor represents the degree of strength between factors in different dimensions.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published