# Attempt to recreate cross-validated MANOVA in Python
Based on the paper by [Allefeld & Haynes (2014)](http://www.sciencedirect.com/science/article/pii/S1053811913011920).

# GLM
First, estimate the parameters $\beta$ from the design matrix $X$ and the dependent variables $y$ by ordinary least squares:

\begin{align}
\beta = (X'X)^{-1}X'y
\end{align}

In [4]:
from sklearn.datasets import load_iris
from sklearn.preprocessing import OneHotEncoder
import numpy as np

y, categories = load_iris(return_X_y=True)
# scikit-learn >= 1.2 names this argument sparse_output; older releases use sparse=False
ohe = OneHotEncoder(sparse_output=False)
X = ohe.fit_transform(categories[:, np.newaxis]).astype(float)

In [5]:
betas = np.linalg.pinv(X.T.dot(X)).dot(X.T).dot(y)
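
As a sanity check on the normal-equation estimate, the sketch below compares it with `np.linalg.lstsq` and with the per-class means, which is what $\beta$ reduces to when $X$ is a one-hot design. The data are a synthetic stand-in (an assumption, not the iris arrays), and `labels`, `betas_lstsq` and `class_means` are names introduced only for this illustration.

```python
import numpy as np

# Synthetic stand-in for the iris arrays: 3 classes, 4 dependent variables.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 10)             # class label per observation
X = np.eye(3)[labels]                            # one-hot design matrix, shape (30, 3)
y = rng.normal(size=(30, 4)) + labels[:, None]   # shift each class by its label

# Normal-equation estimate, written as in the cell above.
betas = np.linalg.pinv(X.T.dot(X)).dot(X.T).dot(y)

# Least-squares solver applied directly to X and y.
betas_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# With a one-hot design, each row of betas is the mean of y within one class.
class_means = np.vstack([y[labels == k].mean(axis=0) for k in range(3)])

print(np.allclose(betas, betas_lstsq))
print(np.allclose(betas, class_means))
```

Both comparisons are expected to hold up to floating-point tolerance, so `betas` can be read as a (3, 4) table of class means.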

The within-class covariance $\Sigma$, given the residuals $\Xi = y - X\beta$, is:

\begin{align}
\Sigma = \frac{1}{N}(\Xi'\Xi)
\end{align}

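The between-class covariance, where $\beta_{\Delta}$ denotes the part of $\beta$ that lies in the space spanned by the contrast of interest, is: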

\begin{align}
\frac{1}{N}\beta'_{\Delta}X'X\beta_{\Delta}
\end{align}
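
For reference, the code in the next cell obtains $\beta_{\Delta}$ by projecting $\beta$ onto the column space of the contrast matrix. Writing $C$ for the transpose of the `contrast` array (so each column of $C$ is one contrast) and $C^{+}$ for its Moore-Penrose pseudoinverse, this amounts to the following. The symbols $C$ and $C^{+}$ are introduced here only for illustration, and the closed form for $C^{+}$ assumes $C$ has full column rank, as it does for the two contrasts used below.

\begin{align}
\beta_{\Delta} = C C^{+} \beta, \qquad C^{+} = (C'C)^{-1}C'
\end{align}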

In [27]:
contrast = np.array([
    [1, 0, -1],
    [0, 1, -1]
])

# Pseudoinverse of the contrast matrix, shape (3, 2); equals np.linalg.pinv(contrast)
c_bar = np.linalg.pinv(contrast.T.dot(contrast)).dot(contrast.T)
bdelta = contrast.T.dot(c_bar.T).dot(betas)
bdelta.shape

(3, 4)
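
The construction of `c_bar` can be checked in isolation. The sketch below, which uses only the `contrast` array from the cell above, confirms that `c_bar` matches `np.linalg.pinv(contrast)` and that the matrix applied to `betas` behaves like an orthogonal projector. The name `projector` is introduced here for illustration.

```python
import numpy as np

contrast = np.array([
    [1, 0, -1],
    [0, 1, -1]
])

# c_bar as computed in the cell above.
c_bar = np.linalg.pinv(contrast.T.dot(contrast)).dot(contrast.T)

# The same matrix obtained directly as the pseudoinverse of contrast.
print(np.allclose(c_bar, np.linalg.pinv(contrast)))

# The (3, 3) matrix that multiplies betas is symmetric and idempotent,
# i.e. an orthogonal projector onto the space spanned by the contrasts.
projector = contrast.T.dot(c_bar.T)
print(np.allclose(projector, projector.T))
print(np.allclose(projector.dot(projector), projector))

# A pattern with identical class means has no component along either contrast.
print(np.allclose(projector.dot(np.ones(3)), 0))
```

In other words, `bdelta` keeps only the differences between class means that the contrasts pick out and discards the overall mean.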

In [28]:
resids = y - X.dot(betas)
# Unnormalized form of the within-class covariance: the 1/N factor from the formula is omitted
error_cov = resids.T.dot(resids)
error_cov.shape

(4, 4)

In [29]:
# Unnormalized form of the between-class covariance: the 1/N factor from the formula is omitted
beta_cov = bdelta.T.dot(X.T).dot(X).dot(bdelta)
beta_cov.shape

(4, 4)

In [30]:
# Hotelling-Lawley trace; the omitted 1/N factors cancel in this product
hotellings_T = np.trace(beta_cov.dot(np.linalg.pinv(error_cov)))
hotellings_T

32.549524663569905
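
The formulas above carry a factor of $\frac{1}{N}$ while the code works with plain sums of squares. The sketch below illustrates why this does not change the trace: scaling both matrices by the same constant cancels once one of them is inverted. It runs on a synthetic, balanced stand-in dataset (an assumption, not the iris arrays), and `error_ss`, `beta_ss`, `trace_unscaled` and `trace_scaled` are names made up for this example.

```python
import numpy as np

# Synthetic stand-in data: 3 balanced classes, 4 dependent variables.
rng = np.random.default_rng(1)
labels = np.repeat(np.arange(3), 20)
X = np.eye(3)[labels]
y = rng.normal(size=(60, 4)) + labels[:, None]
N = y.shape[0]

# Same steps as in the cells above, written with pinv shortcuts.
betas = np.linalg.pinv(X).dot(y)
contrast = np.array([[1, 0, -1], [0, 1, -1]])
bdelta = np.linalg.pinv(contrast).dot(contrast).dot(betas)

resids = y - X.dot(betas)
error_ss = resids.T.dot(resids)                    # unnormalized within-class matrix
beta_ss = bdelta.T.dot(X.T).dot(X).dot(bdelta)     # unnormalized between-class matrix

# Trace computed from sums of squares and from the 1/N-scaled covariances.
trace_unscaled = np.trace(beta_ss.dot(np.linalg.pinv(error_ss)))
trace_scaled = np.trace((beta_ss / N).dot(np.linalg.pinv(error_ss / N)))

print(np.isclose(trace_unscaled, trace_scaled))
```

The same cancellation would apply to any other common normalizer, so the choice of denominator only matters when the two matrices are scaled differently.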

In [32]:
from statsmodels.multivariate.manova import MANOVA
mod = MANOVA(endog=y, exog=X)
res = mod.mv_test([('a', contrast)])
res.summary()

| a | Value | Num DF | Den DF | F Value | Pr > F |
|---|---|---|---|---|---|
| Wilks' lambda | 0.0235 | 8.0000 | 288.0000 | 198.7110 | 0.0000 |
| Pillai's trace | 1.1872 | 8.0000 | 290.0000 | 52.9486 | 0.0000 |
| Hotelling-Lawley trace | 32.5495 | 8.0000 | 203.4024 | 583.4914 | 0.0000 |
| Roy's greatest root | 32.2720 | 4.0000 | 145.0000 | 1169.8585 | 0.0000 |
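
The Hotelling-Lawley trace in this output matches the value computed by hand above. All four statistics can be written in terms of the eigenvalues of $E^{-1}H$, where $H$ and $E$ are the unnormalized between-class and within-class matrices. The sketch below uses the textbook definitions on a synthetic, balanced stand-in dataset (an assumption, not the iris arrays). It is not a reproduction of the statsmodels internals, and the names `H`, `E` and `eigvals` are chosen for this example.

```python
import numpy as np

# Synthetic stand-in data: 3 balanced classes, 4 dependent variables.
rng = np.random.default_rng(2)
labels = np.repeat(np.arange(3), 20)
X = np.eye(3)[labels]
y = rng.normal(size=(60, 4)) + labels[:, None]

betas = np.linalg.pinv(X).dot(y)
contrast = np.array([[1, 0, -1], [0, 1, -1]])
bdelta = np.linalg.pinv(contrast).dot(contrast).dot(betas)
resids = y - X.dot(betas)

E = resids.T.dot(resids)                     # within-class (error) matrix
H = bdelta.T.dot(X.T).dot(X).dot(bdelta)     # between-class (hypothesis) matrix

# Eigenvalues of inv(E) H. They are real and non-negative in exact arithmetic,
# so drop round-off imaginary parts and clip tiny negative values.
eigvals = np.linalg.eigvals(np.linalg.solve(E, H)).real
eigvals = np.clip(eigvals, 0, None)

wilks = np.prod(1 / (1 + eigvals))           # Wilks' lambda
pillai = np.sum(eigvals / (1 + eigvals))     # Pillai's trace
hotelling_lawley = np.sum(eigvals)           # Hotelling-Lawley trace
roy = np.max(eigvals)                        # Roy's greatest root (largest eigenvalue)

print(wilks, pillai, hotelling_lawley, roy)
```

Read this way, the table is internally consistent: Roy's greatest root (32.2720) is the largest eigenvalue, and the Hotelling-Lawley trace (32.5495) is that value plus the remaining, much smaller eigenvalues.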
