Julia implementation of low-rank determinantal point process (DPP) learning and prediction algorithms. Two learning algorithms are provided: an optimization-based algorithm that uses stochastic gradient ascent (SGA), and a Bayesian algorithm that uses stochastic gradient Hamiltonian Monte Carlo (SGHMC).
For details on low-rank DPPs, including SGA-based learning and prediction, see the Low-Rank Factorization of Determinantal Point Processes paper (slides). For more on Bayesian low-rank DPPs, see the Bayesian Low-Rank Determinantal Point Processes paper (slides).
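As background for both learners, a low-rank DPP parameterizes the kernel as L = V Vᵀ, where V is an N × K item-trait matrix with K ≪ N, and fits V to a collection of observed baskets. The sketch below shows the standard low-rank DPP log-likelihood that this setup is built around; the function name `lowRankDppLogLikelihood` is illustrative and is not this package's internal API.

```julia
using LinearAlgebra

# Illustrative sketch (not this package's internal API): the log-likelihood
# of a set of observed baskets under a low-rank DPP with kernel L = V * V',
# where V is an N x K item-trait matrix (K << N).
function lowRankDppLogLikelihood(V::Matrix{Float64}, baskets::Vector{Vector{Int}})
    # Sylvester's identity: det(L + I) = det(I_K + V'V), so the normalizer
    # needs only a K x K determinant rather than an N x N one.
    logNormalizer = logdet(Symmetric(I + V' * V))
    ll = 0.0
    for basket in baskets
        Vb = V[basket, :]                  # traits of the items in this basket
        ll += logdet(Symmetric(Vb * Vb'))  # log det(L_A); assumes |A| <= K
    end
    return ll - length(baskets) * logNormalizer
end
```

SGA ascends (a regularized version of) this objective directly, while SGHMC treats it as the log-likelihood term in a posterior over V and draws samples from that posterior.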
Within Julia, use the package manager:

```julia
using Pkg
Pkg.add(PackageSpec(url="https://github.com/cgartrel/LowRankDPP.jl.git"))
```
The Amazon baby registry dataset is included in the `data/` directory. This dataset is described in the Expectation-Maximization for Learning Determinantal Point Processes paper.
`DPPExamples.jl` contains a number of examples that show how to convert CSV data files into the JLD files required by this low-rank DPP package, perform low-rank DPP learning using the SGA and SGHMC learning algorithms, compute predictions using models generated by both types of learning algorithms, and compute prediction metrics (mean percentile rank and precision@N).
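For concreteness, here is a hedged sketch of how these two metrics are commonly defined; the package's own implementations may differ in conventions (for instance, whether lower or higher mean percentile rank is better), and both function names here are illustrative.

```julia
# Illustrative metric definitions: mean percentile rank (MPR) and
# precision@N for a ranked list of predicted item ids evaluated against a
# held-out set of item ids. Conventions vary; these are common choices.
function meanPercentileRank(rankedItems::Vector{Int}, heldOut::Vector{Int})
    position = Dict(item => i for (i, item) in enumerate(rankedItems))
    n = length(rankedItems)
    # Percentile rank of a held-out item: 100 * (rank from the top) / n.
    # Under this convention, lower MPR means better predictions.
    return sum(100 * position[item] / n for item in heldOut) / length(heldOut)
end

# Fraction of the top-N predicted items that appear in the held-out set.
precisionAtN(rankedItems::Vector{Int}, heldOut::Vector{Int}, N::Int) =
    length(intersect(rankedItems[1:N], heldOut)) / N
```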
To run the examples for the full CSV data conversion, learning, prediction, and prediction metrics pipeline for SGA-based models, use the following functions from `DPPExamples.jl`:

```julia
using LowRankDPP

convertCsvToBasketsExample()
dppLearningExample()
predictionExample()
predictionMetricsExample()
```
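For intuition about the prediction step, a basket-completion score for a candidate item j can be formed from the determinant ratio det(L_{A ∪ {j}}) / det(L_A), the relative probability of the observed basket A with j added. Below is a minimal sketch of that ranking under the low-rank kernel L = V Vᵀ; the function name `rankCandidates` is an illustrative assumption, not this package's API.

```julia
using LinearAlgebra

# Hedged sketch of next-item prediction for one basket: rank each candidate
# item j by log det(L_{A ∪ {j}}) - log det(L_A) under L = V * V'.
function rankCandidates(V::Matrix{Float64}, basket::Vector{Int}, candidates::Vector{Int})
    Vb = V[basket, :]
    baseLogDet = logdet(Symmetric(Vb * Vb'))
    scores = map(candidates) do j
        Vaug = vcat(Vb, V[j:j, :])             # basket augmented with item j
        logdet(Symmetric(Vaug * Vaug')) - baseLogDet
    end
    return candidates[sortperm(scores, rev = true)]  # best candidates first
end
```

Since `baseLogDet` is constant across candidates, it does not change the ordering; it is kept here only to make the determinant-ratio interpretation explicit.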
To run the examples for the learning and prediction pipeline for SGHMC-based models, use the following functions from `DPPExamples.jl`:

```julia
using LowRankDPP

dppLearningBayesianExample()
predictionForMCMCSamplesExample()
```
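Because SGHMC produces a set of posterior samples of the item-trait matrix rather than a single point estimate, prediction typically averages over those samples. The sketch below shows one simple form of that model averaging; the function name and the exact averaging rule (summing per-sample log scores) are illustrative assumptions, and the package's own routine may differ.

```julia
using LinearAlgebra

# Hedged sketch: aggregate candidate scores across SGHMC posterior samples
# of V, then rank candidates by the aggregated score.
function averageScores(Vsamples::Vector{Matrix{Float64}}, basket::Vector{Int},
                       candidates::Vector{Int})
    total = zeros(length(candidates))
    for V in Vsamples
        Vb = V[basket, :]
        for (k, j) in enumerate(candidates)
            Vaug = vcat(Vb, V[j:j, :])
            total[k] += logdet(Symmetric(Vaug * Vaug'))  # per-sample log score
        end
    end
    return candidates[sortperm(total, rev = true)]
end
```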
The provided hyperparameter settings of the learning algorithms should work for most of the included Amazon baby registry data. However, these hyperparameters will likely need to be tuned for other datasets. In particular, the `epsFixed`, `epsInitialDecay`, and `numIterationsFixedEps` settings in `doStochasticGradientAscent` (from `DPPLearning.jl`) will need to be tuned to ensure proper convergence to a local maximum for SGA learning, while the `stepSizeLarger`, `stepSizeIntermediate`, `stepSizeSmaller`, `numIterationsLargerStepSize`, and `numIterationsIntermediateStepSize` settings in `runStochasticGradientHamiltonianMonteCarloSampler` (from `DPPLearningBayesian.jl`) will need to be tuned to ensure proper convergence to a local mode for SGHMC learning.
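As a rough guide to how the SGHMC step-size settings fit together, the parameter names suggest a piecewise-constant schedule: a larger step size for an initial phase, an intermediate one next, and a smaller one thereafter. The sketch below is an inference from those names only, so consult `runStochasticGradientHamiltonianMonteCarloSampler` in `DPPLearningBayesian.jl` for the actual behavior.

```julia
# Hedged illustration (inferred from parameter names, not from the source):
# a plausible piecewise-constant SGHMC step-size schedule at iteration t.
function sghmcStepSize(t, stepSizeLarger, stepSizeIntermediate, stepSizeSmaller,
                       numIterationsLargerStepSize, numIterationsIntermediateStepSize)
    if t <= numIterationsLargerStepSize
        return stepSizeLarger
    elseif t <= numIterationsLargerStepSize + numIterationsIntermediateStepSize
        return stepSizeIntermediate
    else
        return stepSizeSmaller
    end
end
```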