# Mallows Model (MM) for complete permutations

The Mallows Model (MM) is an exponential family of probability models for permutation data. Let $\sigma$ denote a ranking. Formally, the MM under a distance $d$ (Kendall's-$\tau$, Hamming, ...) is expressed as follows:
$$p(\sigma)=\dfrac{\exp(-\theta d(\sigma, \sigma_0))}{\psi(\theta)}$$ 
where $\psi(\theta) = \prod_{j=1}^{n-1} \frac{1-\exp(-\theta(n-j+1))}{1 - \exp(-\theta)}$ is the normalisation constant (this closed form is the one for Kendall's-$\tau$ distance).
$\sigma_0$ is the central permutation; it is the mode of the distribution iff the dispersion parameter $\theta > 0$. In that case, the probability of a permutation decreases exponentially with its distance to $\sigma_0$, and $\theta$ controls how fast it decays.
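
As a sanity check on the formulas above, the following sketch evaluates $p(\sigma)$ by brute force for a small $n$ under Kendall's-$\tau$ distance and confirms that the probabilities sum to one. It uses only NumPy and the standard library; the helper names (`kendall_tau`, `psi`, `mallows_pmf`) are illustrative and are not taken from the `mallows_model` package.

```python
import itertools
import numpy as np

def kendall_tau(sigma, sigma0):
    """Number of item pairs that sigma and sigma0 order differently."""
    n = len(sigma)
    return sum(
        1
        for i, j in itertools.combinations(range(n), 2)
        if (sigma[i] - sigma[j]) * (sigma0[i] - sigma0[j]) < 0
    )

def psi(n, theta):
    """Normalisation constant of the MM under Kendall's-tau distance."""
    j = np.arange(1, n)
    return np.prod((1 - np.exp(-theta * (n - j + 1))) / (1 - np.exp(-theta)))

def mallows_pmf(sigma, sigma0, theta):
    """p(sigma) = exp(-theta * d(sigma, sigma0)) / psi(theta)."""
    return np.exp(-theta * kendall_tau(sigma, sigma0)) / psi(len(sigma), theta)

n, theta = 4, 0.5              # small illustrative values
sigma0 = tuple(range(n))       # identity as central permutation
probs = {s: mallows_pmf(s, sigma0, theta)
         for s in itertools.permutations(range(n))}

total = sum(probs.values())    # should be 1 up to rounding error
mode = max(probs, key=probs.get)
print(total, mode)
```

With $\theta > 0$ the most probable permutation found by the enumeration is $\sigma_0$ itself, in line with the statement above.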

The MM is also often written as $$p(\sigma)=\dfrac{\phi^{d(\sigma, \sigma_0)}}{\prod_{j=1}^{n-1} \frac{1-\phi^{n-j+1}}{1 - \phi}}$$ with $\phi = \exp(-\theta)$. The following function computes $\theta$ from $\phi$.

*Remark:* here $\phi \in (0, 1]$, which corresponds to $\theta \ge 0$.

In [1]:
import numpy as np
import mallows_model as mm

In [2]:
phi = .7

In [3]:
theta = mm.phi_to_theta(phi)
theta

0.35667494393873245

Conversely, the parameter $\theta$ can be converted back to $\phi$ with:

In [4]:
phi = mm.theta_to_phi(theta)
phi

0.7
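
Both conversions follow directly from $\phi = \exp(-\theta)$. The sketch below reproduces them with plain NumPy; `to_theta` and `to_phi` are hypothetical stand-ins written for illustration, not the package's own implementation.

```python
import numpy as np

def to_theta(phi):
    # Invert phi = exp(-theta); assumes 0 < phi <= 1.
    return -np.log(phi)

def to_phi(theta):
    # Direct application of phi = exp(-theta).
    return np.exp(-theta)

theta = to_theta(0.7)
phi = to_phi(theta)
print(theta, phi)
```

The round trip recovers the starting value of $\phi$ up to floating-point error.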

The following function converts between the two parameters automatically: pass exactly one of $\theta$ or $\phi$ and leave the other as `None`. It returns both values.

In [5]:
mm.check_theta_phi(phi=phi, theta=None)

(array(0.35667494), array(0.7))

Usually, the MM is given with the following equivalent expression $$p(\sigma) \propto \phi^{d(\sigma,\sigma_0)},$$
where $\phi^{d(\sigma,\sigma_0)}=\exp(-\theta d(\sigma,\sigma_0))$.

# Mallows Model for Top-$k$ rankings

**Definition** A top-$k$ ranking ($k \le n$) $\sigma$ is a ranking $\sigma = (\sigma(1), \sigma(2), \dotsc, \sigma(n))$ in which only the first $k$ ranks are known, that is, only the ranks of the items $i$ such that $\sigma(i) \le k$.

The probability of a top-$k$ ranking $\sigma$ is

$$p(\sigma) = \exp(-\theta d(\sigma, \sigma_0))\frac{\psi(n-k, \theta)}{\psi(n, \theta)}$$

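where $\psi(n,\theta)$ is the normalisation constant of the MM over $n$ items, i.e. $\psi(\theta)$ from the previous section with the dependence on $n$ made explicit.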
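
The expression above can be checked numerically on a small case. The sketch below assumes Kendall's-$\tau$ distance, the identity as $\sigma_0$, 0-based item labels, and takes $d(\sigma, \sigma_0)$ for a top-$k$ ranking to be the number of inversions that involve at least one of the $k$ ranked items. It then compares the closed form with the sum of full-ranking probabilities over every completion of the top-$k$ ranking. All helper names are illustrative, not part of the package.

```python
import itertools
import numpy as np

def psi(n, theta):
    """Kendall's-tau normalisation constant over n items (1.0 when n <= 1)."""
    j = np.arange(1, n)
    return np.prod((1 - np.exp(-theta * (n - j + 1))) / (1 - np.exp(-theta)))

def inversions(ordering):
    """Kendall's-tau distance between an ordering of items and the identity."""
    return sum(1 for a, b in itertools.combinations(ordering, 2) if a > b)

def topk_distance(top_items):
    """Inversions involving the ranked items; top_items[j] is the item at rank j."""
    d = 0
    for j, item in enumerate(top_items):
        smaller_already_placed = sum(1 for prev in top_items[:j] if prev < item)
        # Every smaller item not yet placed must come later: one inversion each.
        d += item - smaller_already_placed
    return d

n, theta = 5, 0.5        # small illustrative values
top_items = (3, 0)       # item 3 is ranked first, item 0 second
k = len(top_items)

closed_form = (np.exp(-theta * topk_distance(top_items))
               * psi(n - k, theta) / psi(n, theta))

# Brute force: add up the full-ranking probabilities of all completions.
rest = [i for i in range(n) if i not in top_items]
brute_force = sum(
    np.exp(-theta * inversions(top_items + tail)) / psi(n, theta)
    for tail in itertools.permutations(rest)
)
print(closed_form, brute_force)
```

The two values agree up to rounding, because the inversions among the unranked items sum to $\psi(n-k, \theta)$ when marginalised out.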

An example of a top-5 ranking (with $n=10$) as represented in the package, where unknown ranks are stored as `NaN`, would be:

In [6]:
alpha = np.array([ 4.,  0., np.nan, 1., np.nan, 3., np.nan, 2., np.nan, np.nan])
alpha

array([ 4.,  0., nan,  1., nan,  3., nan,  2., nan, nan])
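
To read such a vector, it can help to recover which items occupy the known ranks. The sketch below assumes, consistently with the definition above, that `alpha[i]` holds the 0-based rank of item `i`; it relies on `np.argsort` placing `NaN` entries last.

```python
import numpy as np

alpha = np.array([4., 0., np.nan, 1., np.nan, 3., np.nan, 2., np.nan, np.nan])

# k is the number of items whose rank is known.
k = int(np.count_nonzero(~np.isnan(alpha)))

# argsort sorts NaN to the end, so the first k indices are the ranked items,
# listed from the best rank to the worst.
top_items = np.argsort(alpha)[:k]
print(k, top_items)
```

Here item 1 is ranked first, followed by items 3, 7, 5 and 0; the remaining five items are unranked.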