# Advanced Learning - Class 6

### Multidimensional Scaling (MDS)

MDS is a method used for the visualization of **any kind of data** (e.g. networks, quantitative, categorical, texts, etc.) for which **we can compute a distance**.

MDS can be particularly useful for **dimensionality reduction** of quantitative data. The idea of MDS is to find a representation of the data in a low-dimensional space (of dimension $d$) which **preserves the topology of the input data**.

*In practice, the distance between two data points in the latent space should be as close as possible to the distance between these two points in the original space*.

This is an <u>optimization problem</u> such that:

$$\min_{Z}\ \sum_{i=1}^{n}\ \sum_{j=1,\, j\neq i}^{n}\big(\text{dist}_1(x_i, x_j) - \text{dist}_2(z_i, z_j)\big)^2$$

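Where $x_i$ is an original data point, $z_i$ is its projection in the latent (visualization) space, and $\text{dist}_1$ and $\text{dist}_2$ are the distances used in the original and latent spaces, respectively.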
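To make the objective concrete, here is a minimal Python sketch that evaluates it for a given layout. It assumes that both $\text{dist}_1$ and $\text{dist}_2$ are Euclidean distances; the function names `euclidean` and `stress` and the toy data are illustrative choices, not part of the course material.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points given as sequences of floats."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def stress(D, Z):
    """MDS objective: sum over ordered pairs i != j of (D[i][j] - ||z_i - z_j||)^2."""
    n = len(D)
    return sum(
        (D[i][j] - euclidean(Z[i], Z[j])) ** 2
        for i in range(n)
        for j in range(n)
        if i != j
    )

# Toy data (assumption): three points in 3-D that lie on a line,
# so a 1-D layout can preserve all pairwise distances.
X = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (3.0, 3.0, 3.0)]
D = [[euclidean(a, b) for b in X] for a in X]

s3 = math.sqrt(3.0)
Z_exact = [(0.0,), (s3,), (3.0 * s3,)]  # 1-D layout that keeps every distance
Z_poor = [(0.0,), (1.0,), (2.0,)]       # 1-D layout that distorts the distances

print("stress of exact layout:", stress(D, Z_exact))
print("stress of poor layout: ", stress(D, Z_poor))
```

The exact layout gives a value that is zero up to rounding, while the distorted layout gives a clearly positive value: the objective measures how much the layout deforms the original distances.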

<hr>

**Exercise**: Use MDS to position the nodes of the following network.

*original/adjacency matrix*

| | A | B | C | D | E | 
| --- | --- | --- | --- | --- | --- |
| A | 0 | 1 | 1 | 1 | 0 |
| B | 1 | 0 | 0 | 0 | 1 |
| C | 1 | 0 | 0 | 0 | 1 |
| D | 1 | 0 | 0 | 0 | 1 |
| E | 0 | 1 | 1 | 1 | 0 |

*shortest-path distances (upper triangle shown; the matrix is symmetric)*

| | A | B | C | D | E | 
| --- | --- | --- | --- | --- | --- |
| A | 0 | 1 | 1 | 1 | 2 |
| B | | 0 | 2 | 2 | 1 |
| C | | | 0 | 2 | 1 |
| D | | | | 0 | 1 |
| E | | | | | 0 |
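The step from the adjacency matrix to the shortest-path distances can be checked with a short Python sketch. It uses the Floyd-Warshall algorithm, which is one possible choice (a breadth-first search from each node would also work on an unweighted graph); the function name `shortest_path_lengths` is illustrative.

```python
INF = float("inf")

nodes = ["A", "B", "C", "D", "E"]
adjacency = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
]

def shortest_path_lengths(adj):
    """All-pairs shortest-path lengths of an unweighted graph (Floyd-Warshall)."""
    n = len(adj)
    # Start with 0 on the diagonal, 1 for each edge, infinity otherwise.
    dist = [
        [0 if i == j else (1 if adj[i][j] else INF) for j in range(n)]
        for i in range(n)
    ]
    # Allow each node k in turn as an intermediate stop.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

D = shortest_path_lengths(adjacency)
for name, row in zip(nodes, D):
    print(name, row)
```

The upper triangle of the resulting matrix matches the table above; this matrix is what MDS then takes as input distances.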

MDS offers a simple solution to visualize networks:

- (+) easy to apply, and the deformations can be quantified
- (-) no way to choose the appropriate dimensionality for the visualization space ($d\in\{2, 3\}$)
- (-) it is a deterministic approach that does not take into account the possible uncertainty in the data (particularly in the edges)
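As a rough illustration of the exercise, the sketch below places the five nodes in 2-D by minimizing the MDS objective with plain gradient descent. Gradient descent, the step size, the number of iterations, the random seed and the function names are assumptions made for this example; dedicated MDS solvers typically use other algorithms.

```python
import math
import random

# Shortest-path distances of the 5-node network (order: A, B, C, D, E).
D = [
    [0, 1, 1, 1, 2],
    [1, 0, 2, 2, 1],
    [1, 2, 0, 2, 1],
    [1, 2, 2, 0, 1],
    [2, 1, 1, 1, 0],
]

def stress(D, Z):
    """Sum over ordered pairs i != j of (D[i][j] - ||z_i - z_j||)^2."""
    n = len(D)
    return sum(
        (D[i][j] - math.dist(Z[i], Z[j])) ** 2
        for i in range(n)
        for j in range(n)
        if i != j
    )

def mds_gradient_descent(D, dim=2, lr=0.02, n_iter=1000, seed=0):
    """Minimize the stress by gradient descent from a random start."""
    rng = random.Random(seed)
    n = len(D)
    Z = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
    history = [stress(D, Z)]
    for _ in range(n_iter):
        grad = [[0.0] * dim for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = max(math.dist(Z[i], Z[j]), 1e-12)  # avoid division by zero
                # Pairs (i, j) and (j, i) both depend on z_i, hence the factor 4.
                coef = 4.0 * (d - D[i][j]) / d
                for k in range(dim):
                    grad[i][k] += coef * (Z[i][k] - Z[j][k])
        Z = [[Z[i][k] - lr * grad[i][k] for k in range(dim)] for i in range(n)]
        history.append(stress(D, Z))
    return Z, history

Z_hat, history = mds_gradient_descent(D)
print("initial stress:", history[0])
print("final stress:  ", history[-1])
for name, z in zip("ABCDE", Z_hat):
    print(name, [round(c, 3) for c in z])
```

The final stress is the quantified deformation mentioned in the first bullet. It cannot reach zero for this network: B, C and D are pairwise at distance 2, and no point of a Euclidean space lies at distance 1 from all three of them.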

<hr>

### The Latent Space Model (LSM, Hoff, Raftery and Handcock, 2001)

LSM is one of the first statistical models proposed to model and visualize networks.

One of the key features of LSM is its ability to **model the uncertainty on the observed edges**. The goal of the LSM is to provide a **latent representation of the data** such that:

- Two points that are close in the latent space are very likely to be connected
- Two points that are far apart in the latent space have a low probability of being connected

Let's first consider the random variable $X$ such that:

\begin{align}
X_{i,j} &= 1 \text{ if i is connected to j}\\
X_{i,j} &= 0 \text{ if they are not connected}
\end{align}

The LSM assumes that the connection probability of the pair $(i,j)$ satisfies:

$$\text{logit}(P(X_{i,j}=1\mid\theta)) = \alpha + \beta Y_{i,j} - ||Z_i-Z_j||^2$$

Where $Z_i$ and $Z_j$ are the coordinates of nodes $i$ and $j$ in the latent space, $Y_{i,j}$ is a covariate describing the pair $(i, j)$, and $\theta = \{\alpha, \beta, Z_1, Z_2, ..., Z_n\}$.

With this model, **the LSM places close together the nodes that have a high probability of being connected**.
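As a small numerical illustration of the logit formula above, the following Python sketch inverts the logit to obtain the connection probability of a pair of nodes. The function name `connection_probability` and all parameter values are hypothetical, chosen only to show the effect of the latent distance.

```python
import math

def connection_probability(z_i, z_j, alpha, beta=0.0, y_ij=0.0):
    """P(X_ij = 1) when logit(P) = alpha + beta * y_ij - ||z_i - z_j||^2."""
    d2 = sum((a - b) ** 2 for a, b in zip(z_i, z_j))  # squared latent distance
    eta = alpha + beta * y_ij - d2
    return 1.0 / (1.0 + math.exp(-eta))  # inverse of the logit function

alpha = 1.0  # hypothetical intercept

p_close = connection_probability((0.0, 0.0), (0.1, 0.0), alpha)
p_far = connection_probability((0.0, 0.0), (3.0, 0.0), alpha)

print("close pair:", p_close)
print("far pair:  ", p_far)
```

With the same $\alpha$, the pair at latent distance 0.1 gets a much higher connection probability than the pair at latent distance 3.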

The covariate $Y_{i,j}$ may be, for instance, some (possibly multivariate) information about the pair:

> $Y_{i,j}$ is the number of years two people have known each other

In this model, the data are $(X_{i,j})_{i,j\in\{1,...,n\}}$ (and possibly the $Y_{i,j}$) and the model parameters are $\theta=\{\alpha, \beta, Z_1, ..., Z_n\}$.

The graphical model for the LSM is:

![lsm_latent](lsm_latent.png)

\begin{align}
Y &= \beta^tX+\epsilon\\
\epsilon &\sim \mathcal{N}(0, \sigma^2)
\end{align}

<u>Estimating the model parameters:</u>

It is possible to use the Maximum Likelihood approach to estimate $\theta=\{\alpha, \beta, Z_1, ..., Z_n\}$ from the data:

$$\log\mathcal{L}(X,\theta)=\sum_{(i,j),\, i\neq j}\big[X_{i,j}(\alpha+\beta Y_{i,j}-d_{i,j}^2)-\log(1+\exp(\alpha + \beta Y_{i,j} - d_{i,j}^2))\big]$$

Where $d^2_{i,j}=||Z_i - Z_j||^2$.

Unfortunately, there is no closed-form solution for $\hat{\theta}_{ML}$, so this function has to be maximized numerically.
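The numerical optimization can be sketched in a few lines of Python. The example below is a minimal illustration under simplifying assumptions: no covariate ($\beta = 0$), a 2-D latent space, plain gradient ascent with a fixed step size, and the 5-node network of the MDS exercise as data. The function names, step size, iteration count and seed are illustrative; this is not how any particular package fits the model.

```python
import math
import random

# Adjacency matrix of the 5-node network from the MDS exercise (A, B, C, D, E).
X = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
]

def eta(alpha, Z, i, j):
    """Linear predictor alpha - ||Z_i - Z_j||^2 (assumption: beta = 0)."""
    return alpha - sum((a - b) ** 2 for a, b in zip(Z[i], Z[j]))

def log_likelihood(X, alpha, Z):
    """Sum over ordered pairs i != j of X_ij * eta_ij - log(1 + exp(eta_ij))."""
    n = len(X)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                e = eta(alpha, Z, i, j)
                total += X[i][j] * e - math.log1p(math.exp(e))
    return total

def fit_lsm(X, dim=2, lr=0.005, n_iter=2000, seed=0):
    """Gradient ascent on the log-likelihood from a random start."""
    rng = random.Random(seed)
    n = len(X)
    alpha = 0.0
    Z = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(n)]
    history = [log_likelihood(X, alpha, Z)]
    for _ in range(n_iter):
        g_alpha = 0.0
        g_Z = [[0.0] * dim for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                p = 1.0 / (1.0 + math.exp(-eta(alpha, Z, i, j)))
                r = X[i][j] - p  # residual X_ij - P(X_ij = 1)
                g_alpha += r     # d eta / d alpha = 1
                for k in range(dim):
                    diff = Z[i][k] - Z[j][k]
                    g_Z[i][k] -= 2.0 * r * diff  # d eta / d Z_i = -2 (Z_i - Z_j)
                    g_Z[j][k] += 2.0 * r * diff  # d eta / d Z_j = +2 (Z_i - Z_j)
        alpha += lr * g_alpha
        for i in range(n):
            for k in range(dim):
                Z[i][k] += lr * g_Z[i][k]
        history.append(log_likelihood(X, alpha, Z))
    return alpha, Z, history

alpha_hat, Z_hat, history = fit_lsm(X)
print("log-likelihood at start:", history[0])
print("log-likelihood at end:  ", history[-1])
```

Note that the log-likelihood depends on the $Z_i$ only through the distances $d_{i,j}$, so the estimated positions are defined only up to rotation, translation and reflection of the latent space, and a different random start may give a different but equivalent layout.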

- Adding covariates $Y_{i,j}$:
    - the number of years the two people have spent together in the same organization
    - a type of relationship (categorical variable): $Y_{i,j}\in\{1,\ldots,k\}$ is one-hot encoded, e.g. with $k=5$, $Y_{i,j}=3$ becomes $Y_{i,j}=(0,0,1,0,0)$. $\beta$ is then a vector
    
- Choice of the distance $||Z_i - Z_j||^2$:
    - it could be the Euclidean distance $||.||_2$ (the most natural choice), but any other distance could be used
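The categorical covariate case can be illustrated with a short Python sketch: the category is one-hot encoded and enters the linear predictor through a dot product with the vector $\beta$. The function names `one_hot` and `linear_predictor` and all numerical values are hypothetical.

```python
def one_hot(category, k):
    """Encode a category in {1, ..., k} as a 0/1 vector of length k."""
    if not 1 <= category <= k:
        raise ValueError("category must be in 1..k")
    return [1 if c == category else 0 for c in range(1, k + 1)]

def linear_predictor(alpha, beta, y_ij, d2):
    """alpha + beta^T y_ij - d2, where d2 is the squared latent distance."""
    return alpha + sum(b * y for b, y in zip(beta, y_ij)) - d2

y = one_hot(3, 5)                  # relationship type 3 out of k = 5
beta = [0.2, -0.1, 0.8, 0.0, 0.3]  # hypothetical coefficients, one per type

eta = linear_predictor(alpha=1.0, beta=beta, y_ij=y, d2=0.5)
print(y)
print(eta)
```

With a one-hot vector, the dot product simply selects the coefficient of the observed category, here the third entry of `beta`.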

<u>In R</u>

In R, the LSM is available in the "latentnet" package, which implements the original approach of Hoff et al. (2001), i.e. MCMC for a Bayesian version of the LSM.

- LSM with a Gaussian prior on the latent positions: $Z_i\sim \mathcal{N}(\mu, \Sigma)$

The "VBLPCM" package implements a variational Bayes EM (VBEM) algorithm for this model.