# Basic use of the mview module

In [1]:
import sys; sys.path.insert(0,"../")
import numpy as np
%matplotlib notebook

import mview

The mview module contains many ways to set up and solve MPSE problems. Here, we discuss the simplest way to use the module to set up and solve the most basic MPSE problem.

In this notebook, we consider the following set of dissimilarities: 32 nodes are assigned random 3D coordinates, and 3 distance matrices are computed, one for each of the standard orthogonal projections, by measuring the pairwise distances between the nodes after projecting them onto a 2D image.

In [2]:
N = 32 #number of points
X = np.random.randn(N,3) #positions in 3D

Y1 = X[:,[1,2]] #projection of data into 2D, viewed from x-direction
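Y2 = X[:,[2,0]] #projection of data into 2D, viewed from y-direction
Y3 = X[:,[0,1]] #projection of data into 2D, viewed from z-direction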

from scipy.spatial import distance_matrix
D1 = distance_matrix(Y1,Y1) #pairwise distance matrix of Y1
D2 = distance_matrix(Y2,Y2)
D3 = distance_matrix(Y3,Y3)
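
The column selections above are equivalent to multiplying by 2x3 projection matrices with orthonormal rows, which is the form the computed projections take later in this notebook. The following is a minimal sketch of that equivalence; the names `P1`, `P2`, `P3`, `X_demo`, `pairwise_distances` and the fixed seed are illustrative assumptions and not part of mview.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so this sketch is reproducible
X_demo = rng.standard_normal((32, 3))

# Standard orthogonal projections written as 2x3 matrices (illustrative names)
P1 = np.array([[0., 1., 0.], [0., 0., 1.]])  # keeps (y, z): viewed from x-direction
P2 = np.array([[0., 0., 1.], [1., 0., 0.]])  # keeps (z, x): viewed from y-direction
P3 = np.array([[1., 0., 0.], [0., 1., 0.]])  # keeps (x, y): viewed from z-direction

# Projecting with the matrices matches the column selections used above
same_images = (np.allclose(X_demo @ P1.T, X_demo[:, [1, 2]])
               and np.allclose(X_demo @ P2.T, X_demo[:, [2, 0]])
               and np.allclose(X_demo @ P3.T, X_demo[:, [0, 1]]))

def pairwise_distances(Y):
    """Euclidean distances between all rows of Y (NumPy-only helper)."""
    return np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)

# Each resulting distance matrix is symmetric with a zero diagonal
D_demo = pairwise_distances(X_demo @ P1.T)
is_symmetric = np.allclose(D_demo, D_demo.T)
zero_diagonal = np.allclose(np.diag(D_demo), 0.0)
print(same_images, is_symmetric, zero_diagonal)
```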

## Using mview.basic()

The simplest way to use the mview module is via the function mview.basic. 

In its most basic form, this function does the following:

1) takes as input a list of distance/dissimilarity matrices $D=[D_1,D_2,\dots,D_K]$.

2) computes a random initial embedding and a random initial projection matrix for each dissimilarity matrix.

3) runs (an adaptive scheme of) gradient descent to find an embedding $X$ and projection parameters $Q = [Q_1,Q_2,\dots,Q_K]$ that minimize the MPSE stress function.

4) returns an instance of the MPSE class, which contains results, along with methods for plotting (or continuing computations).

The list of defaults is very large, so I'll cover them as more optional parameters are introduced. For now, it is enough to know that by default mview.basic tries to find a 3D embedding by assuming that the given dissimilarity matrices correspond to distances between points in 2D images, each obtained by an orthogonal projection from 3D to 2D.
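
To make the objective in step 3 concrete, here is a rough sketch of what an MPSE-style stress could look like. It is not taken from the mview source: the normalization inside `mds_stress` and the root-mean-square combination across perspectives in `mpse_stress` are assumptions, and all function and variable names are hypothetical.

```python
import numpy as np

def pairwise_distances(Y):
    """Euclidean distances between all rows of Y."""
    return np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)

def mds_stress(D, Y):
    """Normalized MDS stress of image Y against target distances D.
    The normalization by sum(D**2) is an assumption of this sketch."""
    i, j = np.triu_indices(len(D), k=1)
    residual = D[i, j] - pairwise_distances(Y)[i, j]
    return np.sqrt(np.sum(residual**2) / np.sum(D[i, j]**2))

def mpse_stress(D_list, X, Q_list):
    """Combine the per-perspective stresses of the images X @ Q_k.T.
    Using the root-mean-square here is an assumption of this sketch."""
    stresses = [mds_stress(D, X @ Q.T) for D, Q in zip(D_list, Q_list)]
    return np.sqrt(np.mean(np.square(stresses)))

rng = np.random.default_rng(0)
X_true = rng.standard_normal((32, 3))
Q_true = [np.eye(3)[[1, 2]], np.eye(3)[[2, 0]], np.eye(3)[[0, 1]]]
D_list = [pairwise_distances(X_true @ Q.T) for Q in Q_true]

stress_at_truth = mpse_stress(D_list, X_true, Q_true)   # zero at the true embedding
stress_at_random = mpse_stress(D_list, rng.standard_normal((32, 3)), Q_true)
print(stress_at_truth, stress_at_random)
```

The stress vanishes when every projected image reproduces its distance matrix exactly, and is positive for a generic random embedding, which is the quantity the gradient descent scheme tries to drive down.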

In [3]:
mv = mview.basic([D1,D2,D3],verbose=2) #most basic run of mview.basic

mview.MPSE():
  data details:
    number of perspectives : 3
    number of samples : 32
  visualization details:
    embedding dimension : 3
    image dimension : 2
    fixed embedding : False
    fixed projections : False
    visualization type : mds
  setup visualization instance for perspective 1 :
    mview.MDS():
      data details:
        number of samples : 32
      embedding details:
        embedding dimension : 2
    initial embedding : random
    initial stress : 5.74e-02
  setup visualization instance for perspective 2 :
    mview.MDS():
      data details:
        number of samples : 32
      embedding details:
        embedding dimension : 2
    initial embedding : random
    initial stress : 5.60e-02
  setup visualization instance for perspective 3 :
    mview.MDS():
      data details:
        number of samples : 32
      embedding details:
        embedding dimension : 2
    initial embedding : random
    initial stress : 6.42e-02


UnboundLocalError: local variable 'X0' referenced before assignment

Here, the output mv is an instance of the MPSE class, containing all the results of the experiment. A list of its attributes is shown below (only some are relevant for the purposes of this introduction).

The most relevant of these are the final embedding 'X', the final projection parameters 'Q', the final cost 'cost', the total time 'time', and the computation dictionary 'H', containing the cost history 'H["costs"]'.

In [4]:
print(mv.__dict__.keys())

dict_keys(['verbose', 'title', 'level', 'DD', 'D', 'K', 'N', 'd1', 'd2', 'family', 'constraint', 'proj', 'X', 'X_is_fixed', 'Q', 'Q_is_fixed', 'visualization_method', 'visualization', 'cost_function', 'cost_function_all', 'F', 'FX', 'FQ', 'time', 'X0', 'Q0', 'Y', 'cost', 'individual_cost', 'H'])


The final embedding (after running the gradient descent scheme) is given by attribute 'X', as shown here:

In [5]:
print(mv.X)

[[ 1.44718637e+00 -6.03087381e-01  1.38024893e+00]
 [ 7.25359189e-01  1.55308728e-01  9.49555139e-01]
 [-3.70295824e-01  6.52961357e-01  7.38894154e-01]
 [-1.15500803e+00 -4.19361208e-01 -4.23582329e-01]
 [-4.03335498e-01 -3.88879692e-01 -1.26914297e+00]
 [ 1.55138609e+00 -1.29642013e-01  1.93440593e+00]
 [ 1.31784196e+00  7.96010600e-01 -1.36739844e+00]
 [-4.66251199e-01  1.08529845e+00 -4.53206855e-01]
 [ 4.97375061e-01 -2.62580943e-01 -7.97870153e-01]
 [ 1.13692116e+00  9.70546906e-01 -4.38779819e-01]
 [ 1.64157193e-01  1.25975713e+00  8.79528933e-01]
 [ 1.30945953e-01 -2.44481220e+00  9.01589073e-01]
 [-1.65290018e+00  2.28989755e-01 -2.39723075e-01]
 [ 3.92137726e-01 -1.98251278e-01  5.47753412e-01]
 [-3.10849954e+00 -3.29756347e-01  6.11708305e-01]
 [ 8.38376878e-01 -6.44764293e-01  1.41765816e-01]
 [-1.31702880e-01 -1.08842456e+00  1.20161271e+00]
 [ 1.04195906e+00  4.96660405e-01  1.21197725e-01]
 [ 9.45149555e-01  1.43985772e+00 -1.98946839e+00]
 [ 3.97355342e-02  3.78836399e-

The final projection matrices are given by the attribute 'Q'. Note that in this setup, these are 2x3 matrices with orthonormal rows (since they are orthogonal projections from 3D to 2D).

In [6]:
print(mv.Q)

[[[ 0.99165312 -0.01476832  0.12808583]
  [-0.12222079 -0.42405383  0.8973519 ]]

 [[-0.02516483 -0.91419585 -0.40449064]
  [ 0.95735609  0.09445274 -0.27303477]]

 [[ 0.25959201 -0.08499167  0.9619711 ]
  [ 0.13178491 -0.98368347 -0.12247272]]]


Even if the global minimum of the MPSE stress function is found, these projection matrices will not be the same as the standard projection matrices (the ones we used to produce the original dissimilarities), since the MPSE stress function is invariant under 'rotations' applied jointly to the embedding and the projections.
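
This invariance can be checked numerically. In the sketch below (all names are illustrative and independent of mview), an embedding and a projection are both multiplied by the same 3x3 orthogonal matrix `R`; the projected image is unchanged because $(XR)(QR)^T = X R R^T Q^T = X Q^T$.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.standard_normal((32, 3))
Q_demo = np.eye(3)[[1, 2]]  # a 2x3 projection with orthonormal rows

# Random 3x3 orthogonal matrix obtained from a QR decomposition
R, _ = np.linalg.qr(rng.standard_normal((3, 3)))

X_rot = X_demo @ R  # transformed embedding
Q_rot = Q_demo @ R  # transformed projection

# The 2D image is identical, so every pairwise distance (and the stress) is too
images_match = np.allclose(X_rot @ Q_rot.T, X_demo @ Q_demo.T)
# The transformed projection still has orthonormal rows
rows_orthonormal = np.allclose(Q_rot @ Q_rot.T, np.eye(2))
print(images_match, rows_orthonormal)
```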

We can still check that each of these matrices has orthonormal rows, and check whether the projections are orthogonal to each other (which may not be the case if the global minimum was not found).

In [7]:
#Each product should be the 2x2 identity matrix if the rows of the computed
#projection matrix are orthonormal:
with np.printoptions(precision=3, suppress=True):
    for k in range(3):
        print(mv.Q[k] @ mv.Q[k].T)

[[1. 0.]
 [0. 1.]]
[[ 1. -0.]
 [-0.  1.]]
[[ 1. -0.]
 [-0.  1.]]


In [8]:
#These should be 0 if the normals to the projection planes are orthogonal to each other:
print(np.dot(np.cross(mv.Q[0][0],mv.Q[0][1]),
             np.cross(mv.Q[1][0],mv.Q[1][1])))
print(np.dot(np.cross(mv.Q[0][0],mv.Q[0][1]),
             np.cross(mv.Q[2][0],mv.Q[2][1])))
print(np.dot(np.cross(mv.Q[1][0],mv.Q[1][1]),
             np.cross(mv.Q[2][0],mv.Q[2][1])))

7.909800068350847e-05
-0.001188418088072074
-0.0002546949056574238
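
The same check can be written as a loop over all pairs of perspectives. The sketch below uses the standard projection matrices as stand-in input, since they are known to have mutually orthogonal normals; `Q_demo` and `normal` are hypothetical names, and with a computed result one would pass `mv.Q` instead.

```python
import numpy as np
from itertools import combinations

# Stand-in for mv.Q: the three standard 2x3 projections (illustrative input)
Q_demo = np.array([np.eye(3)[[1, 2]], np.eye(3)[[2, 0]], np.eye(3)[[0, 1]]])

def normal(Q):
    """Unit normal to the plane spanned by the two orthonormal rows of a 2x3 matrix."""
    return np.cross(Q[0], Q[1])

# Dot products between the normals of every pair of perspectives
normal_dots = {(a, b): float(np.dot(normal(Q_demo[a]), normal(Q_demo[b])))
               for a, b in combinations(range(len(Q_demo)), 2)}
for pair, value in normal_dots.items():
    print(pair, value)
```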


The final normalized MPSE (MDS) cost is given by the attribute 'cost'.

In [9]:
print(mv.cost)

0.00110920168443455


The total computation time (in seconds) is given by attribute 'time'.

In [10]:
print(mv.time)

4.734323978424072


The computation history is given by attribute 'H'. This is a dictionary recording the quantities tracked during the computation, as shown below.

In [11]:
print(mv.H.keys())

dict_keys(['iterations', 'markers', 'costs', 'X_iters', 'X_steps', 'X_grads', 'X_lrs', 'Q_iters', 'Q_steps', 'Q_grads', 'Q_lrs'])


For example, the computed/estimated costs at each iteration are given by key 'costs' in this dictionary.

In [12]:
print(mv.H['costs'])

[4.90569735e-01 5.95938608e-01 4.87701292e-01 4.47024359e-01
 3.45823580e-01 3.00820506e-01 2.87016288e-01 3.01154988e-01
 2.61409792e-01 2.51977868e-01 2.41060588e-01 2.26101003e-01
 2.07597218e-01 1.84071336e-01 1.59034910e-01 1.43725134e-01
 1.33634412e-01 1.21444886e-01 1.13652093e-01 1.09498400e-01
 1.03928035e-01 9.53957684e-02 8.67939108e-02 8.64178512e-02
 8.57236718e-02 8.48413798e-02 8.43310897e-02 8.41437067e-02
 8.39008195e-02 8.35811510e-02 8.33072968e-02 8.31131971e-02
 8.29733458e-02 8.29827777e-02 8.28677026e-02 8.27796389e-02
 8.26656088e-02 8.26276177e-02 8.25830667e-02 8.25412747e-02
 8.25013859e-02 8.24643780e-02 8.24340672e-02 8.24093142e-02
 8.24355993e-02 8.24629381e-02 8.24876090e-02 8.23774825e-02
 8.23467291e-02 8.23366214e-02 8.23303139e-02 8.23237164e-02
 8.23163244e-02 8.23078045e-02 8.22981832e-02 8.22938575e-02
 8.23069612e-02 8.23366300e-02 8.23240202e-02 8.22806680e-02
 8.22594059e-02 8.22535509e-02 8.22472954e-02 8.22368785e-02
 8.22138034e-02 8.213776

## Quick plots

To plot the final embedding, we can use the MPSE method below.

Note: The 'axes' shown are the normals to the computed projection planes (so in this case they should be orthogonal to each other).

In [13]:
mv.figureX()


To plot the final projected images, use the MPSE method below.

Note: the colors are added automatically; the more gradual the change in the colors, the better the embedding (at least in this default setting).

In [14]:
mv.figureY()


To plot the computation history, use the MPSE method below. The following quantities are plotted:

1) cost : stress (computed or estimated at each iteration)

2) gradient size : size of the gradient (root-mean-square of entries by default)

3) learning rate : learning rate used at each iteration (varies if using an adaptive scheme)

4) step size : the size of the step taken between iterations (root-mean-square of entries by default)

The last 3 quantities are given separately for the embedding $X$ and the projection parameters $Q$ (which we think of as a $K\times 2\times 3$ array containing all of the projection matrices).
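
For reference, the root-mean-square size mentioned above can be computed as follows. This is a generic sketch of the definition, not mview's internal code, and the helper name `rms` is hypothetical.

```python
import numpy as np

def rms(A):
    """Root-mean-square of the entries of an array of any shape."""
    A = np.asarray(A, dtype=float)
    return np.sqrt(np.mean(A**2))

# Example: every coordinate of a 32x3 embedding moves by 0.5 in one iteration
X_old = np.zeros((32, 3))
X_new = np.full((32, 3), 0.5)
step_size = rms(X_new - X_old)
print(step_size)  # 0.5
```

Because the measure averages over entries, it is comparable between $X$ and $Q$ even though the two arrays have different shapes.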


In [15]:
mv.figureH()


If this run of mview.basic didn't reach the global minimum, run the whole notebook again!

## Other attributes of interest

The other attributes of the resulting MPSE object are not immediately of use to us, but I'll describe them as the need arises. Other methods can be used on the MPSE object too (such as running a different GD scheme from where we left off), and I'll describe those elsewhere.