Notes on Regularization
=======================

In some cases, e.g., if the data is sparse, the iterative algorithms underlying the parameter inference functions might not converge. A pragmatic solution to this problem is to add a little bit of regularization.

Inference functions in choix provide a generic regularization argument: ``alpha``. When :math:`\alpha = 0`, regularization is turned off; setting :math:`\alpha > 0` turns it on. In practice, if regularization is needed, we recommend starting with a small value (e.g., :math:`10^{-4}`) and increasing it if necessary.

Below, we briefly explain how the regularization parameter is used inside the various parameter inference functions.

Markov-chain based algorithms
-----------------------------

For Markov-chain based algorithms such as Luce Spectral Ranking and Rank Centrality, :math:`\alpha` is used to initialize the transition rates of the Markov chain.

In the special case of pairwise-comparison data, this can be loosely understood as placing, for each pair of items, an independent Beta prior on the probability of the comparison outcome.
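To make this concrete, the sketch below builds a comparison-outcome Markov chain whose transition rates all start at :math:`\alpha` instead of zero, so the chain stays irreducible even when some items were never compared. This is a simplified, self-contained stand-in (the function name and details are illustrative), not choix's actual implementation.

```python
import numpy as np

def spectral_estimate(n_items, comparisons, alpha=1e-4):
    """Toy spectral estimator for pairwise data given as (winner, loser) pairs.

    Illustrative sketch only: `alpha` initializes every transition rate,
    mirroring the regularization described above.
    """
    # Transition rates start at alpha instead of zero, so the chain is
    # irreducible even if some items were never compared.
    rates = np.full((n_items, n_items), alpha, dtype=float)
    for winner, loser in comparisons:
        rates[loser, winner] += 1.0  # the loser "transfers weight" to the winner
    np.fill_diagonal(rates, 0.0)
    # Turn the rates into a row-stochastic transition matrix.
    scale = rates.sum(axis=1).max()
    chain = rates / scale
    np.fill_diagonal(chain, 1.0 - chain.sum(axis=1))
    # The stationary distribution is the left eigenvector for eigenvalue 1.
    evals, evecs = np.linalg.eig(chain.T)
    stat = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    stat = np.abs(stat) / np.abs(stat).sum()
    # Log-transform and center, to get parameters on the usual theta scale.
    theta = np.log(stat)
    return theta - theta.mean()
```

With :math:`\alpha = 0` and a never-compared item, the chain would have an absorbing component and the estimate would degenerate; a small positive :math:`\alpha` keeps every stationary probability strictly positive.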

Minorization-maximization algorithms
------------------------------------

In the case of minorization-maximization algorithms, the exponentiated model parameters :math:`e^{\theta_1}, \ldots, e^{\theta_n}` are each endowed with an independent Gamma prior distribution with scale :math:`\alpha + 1`. See Caron & Doucet (2012) for details.
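The effect of the prior can be sketched as follows: in each MM update, :math:`\alpha` acts like a small number of pseudo-observations added to both the numerator and denominator. The code below is a minimal illustration under that simplification; choix's exact parameterization follows Caron & Doucet (2012), and the function name is hypothetical.

```python
import numpy as np

def mm_pairwise(n_items, comparisons, alpha=1e-4, n_iter=100):
    """Minorization-maximization for a Bradley-Terry model, sketched.

    Illustrative only: the Gamma prior is folded in as `alpha`
    pseudo-observations in each update.
    """
    wins = np.zeros(n_items)
    counts = np.zeros((n_items, n_items))  # number of times i and j were compared
    for winner, loser in comparisons:
        wins[winner] += 1.0
        counts[winner, loser] += 1.0
        counts[loser, winner] += 1.0
    pi = np.ones(n_items)  # exponentiated parameters e^theta
    for _ in range(n_iter):
        # Standard MM denominator, plus alpha so items that never won
        # (or were never compared) keep a strictly positive estimate.
        denom = (counts / (pi[:, None] + pi[None, :])).sum(axis=1)
        pi = (wins + alpha) / (denom + alpha)
        pi = pi / np.exp(np.log(pi).mean())  # fix the scale (geometric mean 1)
    return np.log(pi)
```

Without the :math:`\alpha` terms, an item with zero wins would be driven to :math:`e^{\theta} = 0` and the iteration would not converge to finite parameters.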

Other algorithms
----------------

The scipy-based optimization functions use an :math:`\ell_2`-regularizer on the parameters :math:`\theta_1, \ldots, \theta_n`. In other words, each parameter is endowed with an independent Gaussian prior with variance :math:`1/\alpha`.
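The penalized objective is :math:`-\log \mathcal{L}(\theta) + \frac{\alpha}{2} \lVert \theta \rVert^2`. While choix delegates the optimization to scipy, the dependency-free sketch below minimizes the same objective for pairwise data with plain gradient descent; the function name and step-size choices are illustrative.

```python
import numpy as np

def opt_pairwise(n_items, comparisons, alpha=1e-4, lr=0.5, n_iter=500):
    """Gradient descent on the l2-penalized Bradley-Terry log-likelihood.

    Illustrative sketch: choix uses scipy optimizers instead of this
    hand-rolled loop, but the objective is the same.
    """
    theta = np.zeros(n_items)
    for _ in range(n_iter):
        grad = alpha * theta  # gradient of the penalty (alpha / 2) * ||theta||^2
        for winner, loser in comparisons:
            # P(winner beats loser) = sigmoid(theta_winner - theta_loser)
            p = 1.0 / (1.0 + np.exp(theta[loser] - theta[winner]))
            grad[winner] -= 1.0 - p  # gradient of the negative log-likelihood
            grad[loser] += 1.0 - p
        theta -= lr * grad
    return theta - theta.mean()
```

Note that larger values of :math:`\alpha` shrink the estimates toward zero, consistent with the Gaussian prior's variance being :math:`1/\alpha`.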