Bayesian Factorization with Side Information in C++ with Python wrapper

SMURFF - Scalable Matrix Factorization Framework


What is Bayesian Matrix Factorization

Matrix factorization is a common machine learning technique for recommender systems, such as those that suggest books on Amazon or movies on Netflix.

Figure: Matrix Factorization

The idea of these methods is to approximate the user-movie rating matrix R as a product of two low-rank matrices U and V such that R ≈ U × Vᵀ. In this way U and V are constructed from the known ratings in R, which is usually very sparsely filled. Recommendations can then be made from the approximation U × Vᵀ, which is dense. If M × N is the dimension of R, then U and V have dimensions M × K and N × K, where the number of latent dimensions K is typically much smaller than M and N.
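The shapes involved can be illustrated with a small NumPy sketch. It is an illustration only: it uses plain, non-Bayesian alternating least squares with a ridge penalty rather than SMURFF's own method, and the function name `factorize`, its parameters, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def factorize(R, mask, K=2, reg=0.1, iters=100, seed=0):
    """Approximate R by U @ V.T using only the entries where mask is True.

    Hypothetical helper for illustration: regularized alternating least
    squares, not the Bayesian sampler that SMURFF implements.
    """
    rng = np.random.default_rng(seed)
    M, N = R.shape
    U = rng.normal(scale=0.1, size=(M, K))  # one K-dim row per user
    V = rng.normal(scale=0.1, size=(N, K))  # one K-dim row per item
    ridge = reg * np.eye(K)
    for _ in range(iters):
        # Refit each user row from the items that user has rated.
        for i in range(M):
            Vi = V[mask[i]]
            U[i] = np.linalg.solve(Vi.T @ Vi + ridge, Vi.T @ R[i, mask[i]])
        # Refit each item row from the users who rated that item.
        for j in range(N):
            Uj = U[mask[:, j]]
            V[j] = np.linalg.solve(Uj.T @ Uj + ridge, Uj.T @ R[mask[:, j], j])
    return U, V

# Synthetic rank-2 "ratings" with roughly half of the entries observed.
rng = np.random.default_rng(42)
M, N, K = 30, 20, 2
R_true = rng.normal(size=(M, K)) @ rng.normal(size=(N, K)).T
mask = rng.random((M, N)) < 0.5

U, V = factorize(R_true, mask, K=K)
R_hat = U @ V.T  # dense M x N approximation, including unobserved entries

rmse_observed = np.sqrt(np.mean((R_hat - R_true)[mask] ** 2))
rmse_hidden = np.sqrt(np.mean((R_hat - R_true)[~mask] ** 2))
print("U:", U.shape, "V:", V.shape, "R_hat:", R_hat.shape)
print("RMSE on observed entries:", rmse_observed)
print("RMSE on hidden entries:", rmse_hidden)
```

The hidden-entry error is the quantity that matters for recommendation, since those are the ratings the model never saw during fitting.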

Bayesian probabilistic matrix factorization (BPMF) has been shown to be more robust to overfitting than non-Bayesian matrix factorization.

What is SMURFF

SMURFF is a highly optimized and parallelized framework for Bayesian matrix and tensor factorization. SMURFF supports multiple matrix factorization methods:

  • BPMF, the basic version;
  • Macau, adding support for high-dimensional side information to the factorization;
  • GFA, doing Group Factor Analysis.

Macau and BPMF can also perform tensor factorization.

Examples

Documentation is generated from Jupyter Notebooks. You can find the notebooks in docs/notebooks and the resulting documentation at smurff.readthedocs.io.

Installation

Using conda:

conda install -c vanderaa smurff

Compile from source code: see INSTALL.rst

Contributors

  • Jaak Simm (Macau C++ version, Cython wrapper, Macau MPI version, Tensor factorization)
  • Tom Vander Aa (OpenMP optimized BPMF, Matrix Cofactorization and GFA, Code Reorg)
  • Adam Arany (Probit noise model)
  • Tom Haber (Original BPMF code)
  • Andrei Gedich
  • Ilya Pasechnikov
  • Thanh Le Van (synthetic out-of-matrix prediction example)

Acknowledgements

This work was partly funded by the European projects ExCAPE (http://excape-h2020.eu) and EXA2CT, and the Flemish Exaptation project.