Scalable Tensor Factorization
Ext-RESCAL is a memory efficient implementation of RESCAL, a state-of-the-art algorithm for DEDICOM tensor factorization. Ext-RESCAL is written in Python and relies on the SciPy Sparse module.
- 3-D sparse tensor factorization 
- Joint 3-D sparse tensor and 2-D sparse matrix factorization (extended version) 
- The implementation provably scales well to the domains with millions of nodes on the affordable hardware
- Handy input format
 M. Nickel, V. Tresp, H. Kriegel. A Three-way Model for Collective Learning on Multi-relational Data // Proceedings of the 28th International Conference on Machine Learning (ICML'2011). - 2011.
 M. Nickel, V. Tresp, H. Kriegel. Factorizing YAGO: Scalable Machine Learning for Linked Data // Proceedings of the 21st international conference on World Wide Web (WWW'2012). - 2012.
- Link Prediction
- Collaborative Filtering
- Entity Search
- Python 2.7+
- Numpy 1.6+
- SciPy 0.12+
1) Let's imagine we have the following semantic graph:
Each tensor slice represents an adjacency matrix of the corresponding predicate (member-of, genre, cites). Run the RESCAL algorithm to decompose a 3-D tensor with 2 latent components and zero regularization on the test data:
python rescal.py --latent 2 --lmbda 0 --input tiny-example --outputentities entity.embeddings.csv --outputfactors latent.factors.csv --log rescal.log
The test data set represents a tiny entity graph of 3 adjacency matrices (tensor slices) in the row-column representation. See the directory tiny-example. Ext-RESCAL will output the latent factors for the entities into the file entity.embeddings.csv.
2) Then, we assume that there is an entity-term matrix:
Run the extended version of RESCAL algorithm to decompose a 3-D tensor and 2-D matrix with 2 latent components and regularizer equal to 0.001 on the test data (entity graph and entity-term matrix):
python extrescal.py --latent 2 --lmbda 0.001 --input tiny-mixed-example --outputentities entity.embeddings.csv --outputterms term.embeddings.csv --outputfactors latent.factors.csv --log extrescal.log
Development and Contribution
This is a fork of the original code base provided by Maximilian Nickel. Ext-RESCAL has been developed by Nikita Zhiltsov. Ext-RESCAL may contain some bugs, so, if you find any of them, feel free to contribute the patches via pull requests into the develop branch.
0.3 (March 12, 2013):
- Fix random sampling for the basic task
- Add output of latent factors
0.2 (February 26, 2013):
- Add an opportunity to approximate the objective function via random sampling
- Bug fixes
- Change the default settings
0.1 (January 31, 2013):
- The basic implementation of both the algorithms
The original algorithms are an intellectual property of the authors of the cited papers.
The author is not responsible for implications from the use of this software.
Licensed under the GNU General Public License version 3 (GPLv3) ; you may not use this work except in compliance with the License. You may obtain a copy of the License in the LICENSE file, or at: