# Mass-multivariate analysis of 2D <span style="color:#b18bbb">landmark</span> data in <span style="color:#8476b5">Python</span>


**This notebook**...
* ... provides an overview of mass-multivariate, two-sample hypothesis testing of 2D landmark data in Python
* ... explains the details of script `contours_massmv_single.py` which appears in this repository
* ... is limited to mass-multivariate analysis (i.e., multivariate test statistics calculated at each point, with inference conducted in an omnibus sense over all points)
* ... is directed at novice Python users, who may be using Python for the first time.
* ... is likely not useful for intermediate or advanced Python users; please refer instead to the scripts in `./lmfree2d/Python/`
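
To make the phrase "mass-multivariate" concrete, the sketch below computes a two-sample Hotelling's T² statistic at each landmark, then judges the largest of those statistics against a permutation distribution of maxima. That single maximum is the omnibus step. It is a minimal illustration on synthetic data, not the implementation used by `lmfree2d` or `spm1d`; all function and variable names (`hotellings_T2`, `max_T2_permutation_test`, `demo_r0`, `demo_r1`) are hypothetical.

```python
import numpy as np

def hotellings_T2(y0, y1):
    """Two-sample Hotelling's T2 for arrays of shape (n0, 2) and (n1, 2)."""
    n0, n1 = len(y0), len(y1)
    d = y0.mean(axis=0) - y1.mean(axis=0)            # mean difference vector
    # pooled covariance matrix
    W = ((n0 - 1) * np.cov(y0, rowvar=False)
         + (n1 - 1) * np.cov(y1, rowvar=False)) / (n0 + n1 - 2)
    return n0 * n1 / (n0 + n1) * d @ np.linalg.solve(W, d)

def T2_all_landmarks(r0, r1):
    """One T2 value per landmark; inputs have shape (nshapes, nlandmarks, 2)."""
    return np.array([hotellings_T2(r0[:, i], r1[:, i])
                     for i in range(r0.shape[1])])

def max_T2_permutation_test(r0, r1, n_perm=200, seed=0):
    """Omnibus p-value from the permutation distribution of the maximum T2."""
    rng = np.random.default_rng(seed)
    r = np.concatenate([r0, r1])
    n0 = len(r0)
    T2 = T2_all_landmarks(r0, r1)
    T2max_null = np.empty(n_perm)
    for k in range(n_perm):
        idx = rng.permutation(len(r))                 # shuffle group labels
        T2max_null[k] = T2_all_landmarks(r[idx[:n0]], r[idx[n0:]]).max()
    p = (1 + np.sum(T2max_null >= T2.max())) / (1 + n_perm)
    return T2, p

# Synthetic data (an assumption for illustration): 5 + 5 shapes, 8 landmarks each
rng = np.random.default_rng(0)
demo_r0 = rng.normal(size=(5, 8, 2))
demo_r1 = rng.normal(size=(5, 8, 2))
demo_T2, demo_p = max_T2_permutation_test(demo_r0, demo_r1)
print(demo_T2.shape, demo_p)
```

Using the maximum statistic across landmarks yields one p-value for the whole shape, which is what "inference in an omnibus sense" refers to here.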

**Dependencies** (alphabetical order):
* [lmfree2d](https://github.com/0todd0000/lmfree2d) &nbsp; &nbsp; (the `lmfree2d.py` module in this repository)
* [numpy](https://numpy.org)
* [pandas](https://pandas.pydata.org)
* [scipy](https://scipy.org)
* [spm1d](http://www.spm1d.org)

___

## Install software

See the `contours_massmv` notebook for installation details.

Note that the **scipy** package is included by default in the Anaconda distribution.

___

## Prepare the workspace

Import all of the packages we'll need for this notebook.

In [11]:
import os
import numpy as np
from scipy import spatial
import lmfree2d as lm

___

## Load data



In [7]:
dirREPO  = lm.get_repository_path()
name     = 'Bell'
fname    = os.path.join(dirREPO, 'Data', name, 'landmarks.csv')
a        = np.loadtxt(fname, delimiter=',', skiprows=1)

print( a.shape )

(80, 4)


The variable `a` is an (80 x 4) array that contains the contents of the `landmarks.csv` file. The four columns are:

* Shape :  integers that identify shapes (1 to 10)
* Landmark :  integers that identify landmarks (1 to 8)
* X : the landmarks' X coordinates
* Y : the landmarks' Y coordinates

Let's assign the columns of `a` to variables with meaningful names:

In [9]:
shape    = a[:,0]
landmark = a[:,1]
r        = a[:,[2,3]]  # XY coordinates

print( r.shape )

(80, 2)
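
For many analyses it is convenient to hold the coordinates as a 3D array of shape (nshapes, nlandmarks, 2) rather than as a long table. The sketch below shows one way to do this with a plain NumPy reshape. It uses a small synthetic array in place of `a` and assumes the rows are sorted by shape and then by landmark; the `demo_*` names are hypothetical.

```python
import numpy as np

# Synthetic stand-in for `a`: 3 shapes x 4 landmarks,
# columns = (shape, landmark, x, y), sorted by shape then landmark
demo_shape    = np.repeat(np.arange(1, 4), 4)
demo_landmark = np.tile(np.arange(1, 5), 3)
demo_xy       = np.arange(24, dtype=float).reshape(12, 2)
demo_a        = np.column_stack([demo_shape, demo_landmark, demo_xy])

n_shapes    = int(demo_a[:, 0].max())
n_landmarks = int(demo_a[:, 1].max())

# Reshape the XY columns into (n_shapes, n_landmarks, 2)
demo_r3 = demo_a[:, 2:].reshape(n_shapes, n_landmarks, 2)

print(demo_r3.shape)
```

With this layout, `demo_r3[i]` holds all landmarks of shape `i + 1`, which matches what boolean indexing such as `demo_a[demo_a[:, 0] == i + 1][:, 2:]` would return.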


___

## Spatially align the landmarks

Landmarks are usually aligned using Generalized Procrustes Analysis (GPA), as demonstrated in the `landmarks_uv` notebook for R. Below, a simpler method from **scipy** is used so that all processing can be done in Python: `scipy.spatial.procrustes` aligns one shape to another, rather than aligning all shapes jointly as GPA does. For general analysis, it would be better to use GPA in R.

In [13]:
r1 = r[shape==1]
r2 = r[shape==2]

# spatial.procrustes returns a tuple: (standardized r1, r2 fitted to r1, disparity)
r11, r22, disparity = spatial.procrustes(r1, r2)

print(r22.shape)

(8, 2)
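
The sketch below illustrates what `scipy.spatial.procrustes` does, using synthetic landmarks: a shape that has been rotated, scaled and translated is mapped back onto its template, so the reported disparity is effectively zero. It then aligns every shape in a stack to the first one. This pairwise, template-based approach is a simplification and is not GPA; the helper `align_to_first` and all `demo_*` names are hypothetical.

```python
import numpy as np
from scipy import spatial

rng = np.random.default_rng(1)
demo_template = rng.normal(size=(8, 2))          # 8 synthetic landmarks

# Rotate by 30 degrees, scale by 2 and translate the template
theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
demo_moved = 2.0 * demo_template @ R.T + np.array([5.0, -3.0])

# Returns (standardized template, moved shape fitted to it, sum of squared differences)
mtx1, mtx2, demo_disparity = spatial.procrustes(demo_template, demo_moved)
print(mtx2.shape, demo_disparity)

def align_to_first(r):
    """Align each shape in r (nshapes, nlandmarks, 2) to the first shape."""
    return np.array([spatial.procrustes(r[0], ri)[1] for ri in r])

demo_stack   = rng.normal(size=(4, 8, 2))
demo_aligned = align_to_first(demo_stack)
print(demo_aligned.shape)
```

Note that `spatial.procrustes` standardizes its inputs (centering and scaling), so the aligned coordinates are not in the original units.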

In [4]:
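import pandas as pd

df       = pd.read_csv(fname, sep=',')
# convert to 3D array (nshapes, nlandmarks, 2); assumes rows are sorted by shape, then landmark
nshapes  = df['SHAPE'].max()
nlm      = df['LANDMARK'].max()
r        = np.reshape( df[['X','Y']].values, (nshapes,nlm,2) )
# separate into groups (first five shapes vs. the remaining shapes):
r0,r1    = r[:5], r[5:]
# run nonparametric permutation test
# (two_sample_mass_multivariate_test is not defined in this notebook; define or import it before running this cell)
res      = two_sample_mass_multivariate_test(r0, r1)
print(res)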


