---------------------
#### Multivariate Analysis of Variance (MANOVA) 
- MANOVA is an extension of ANOVA that tests for group differences on several dependent variables simultaneously.
----------------------

In [46]:
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.multivariate.manova import MANOVA
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

In [47]:
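# Load the Iris dataset into a DataFrame whose column names are valid in a formula
# (the original names such as 'sepal length (cm)' contain spaces and parentheses)
iris = load_iris()
iris_df = pd.DataFrame(iris['data'], columns=['sepal_length_cm', 'sepal_width_cm', 'petal_length_cm', 'petal_width_cm'])
iris_df['target'] = iris['target'].astype(float)
iris_df['species'] = iris['target_names'][iris['target']]  # map class codes to species names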

In [48]:
iris_df

Unnamed: 0,sepal_length_cm,sepal_width_cm,petal_length_cm,petal_width_cm,target,species
0,5.1,3.5,1.4,0.2,0.0,setosa
1,4.9,3.0,1.4,0.2,0.0,setosa
2,4.7,3.2,1.3,0.2,0.0,setosa
3,4.6,3.1,1.5,0.2,0.0,setosa
4,5.0,3.6,1.4,0.2,0.0,setosa
...,...,...,...,...,...,...
145,6.7,3.0,5.2,2.3,2.0,virginica
146,6.3,2.5,5.0,1.9,2.0,virginica
147,6.5,3.0,5.2,2.0,2.0,virginica
148,6.2,3.4,5.4,2.3,2.0,virginica


In [54]:
fit = MANOVA.from_formula('sepal_length_cm+sepal_width_cm+petal_length_cm+petal_width_cm ~ species', 
                          data=iris_df)

In [55]:
print(fit.mv_test())

                   Multivariate linear model
                                                                
----------------------------------------------------------------
       Intercept         Value  Num DF  Den DF   F Value  Pr > F
----------------------------------------------------------------
          Wilks' lambda  0.0170 4.0000 144.0000 2086.7720 0.0000
         Pillai's trace  0.9830 4.0000 144.0000 2086.7720 0.0000
 Hotelling-Lawley trace 57.9659 4.0000 144.0000 2086.7720 0.0000
    Roy's greatest root 57.9659 4.0000 144.0000 2086.7720 0.0000
----------------------------------------------------------------
                                                                
----------------------------------------------------------------
        species          Value  Num DF  Den DF   F Value  Pr > F
----------------------------------------------------------------
          Wilks' lambda  0.0234 8.0000 288.0000  199.1453 0.0000
         Pillai's trace  1.1919 8.0000 290.00

#### Interpretation

- Null hypothesis: no group differences, i.e. every group has the same mean vector across the dependent variables.

Multivariate Tests:

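- Wilks' lambda (Λ): the ratio det(E) / det(E + H), where E is the within-group (error) sums-of-squares-and-cross-products matrix and H is the between-group (hypothesis) one. It ranges from 0 to 1, and values near 0 mean the groups account for most of the variation. Significance is judged from the associated F approximation, not from the raw value.
- Pillai's trace: the trace of H(E + H)⁻¹. Values close to 0 indicate little or no group difference, while larger values suggest stronger differences. It is generally regarded as the most robust of the four statistics.
- Hotelling-Lawley trace: the sum of the eigenvalues of E⁻¹H. Larger values indicate stronger group differences. It is generally considered less robust than Pillai's trace when the MANOVA assumptions are violated.
- Roy's greatest root: the largest eigenvalue of E⁻¹H (the convention matching the output above, where it equals the Hotelling-Lawley trace in the `Intercept` block). It reflects only the single strongest dimension of group separation.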
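
To make the relationship between these statistics concrete, here is a minimal sketch that computes all four from the eigenvalues of E⁻¹H for a one-way design. The data are synthetic and made up for illustration (they are not the Iris measurements), and the variable names are hypothetical. It covers only the statistic values, not the F approximations or p-values that `mv_test()` reports.

```python
import numpy as np

# Synthetic toy data (an assumption for illustration): 3 groups, 2 dependent variables.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2], 20)
shifts = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]])
Y = rng.normal(size=(60, 2)) + shifts[groups]

grand_mean = Y.mean(axis=0)
H = np.zeros((2, 2))  # between-group (hypothesis) SSCP matrix
E = np.zeros((2, 2))  # within-group (error) SSCP matrix
for g in np.unique(groups):
    Yg = Y[groups == g]
    d = (Yg.mean(axis=0) - grand_mean)[:, None]
    H += len(Yg) * (d @ d.T)
    R = Yg - Yg.mean(axis=0)
    E += R.T @ R

# All four statistics are functions of the eigenvalues of E^{-1} H.
eig = np.linalg.eigvals(np.linalg.solve(E, H)).real
eig = np.clip(eig, 0.0, None)  # guard against tiny negative round-off

wilks = np.prod(1.0 / (1.0 + eig))   # equals det(E) / det(E + H)
pillai = np.sum(eig / (1.0 + eig))   # equals trace(H (E + H)^{-1})
hotelling_lawley = np.sum(eig)       # equals trace(E^{-1} H)
roy = np.max(eig)                    # largest eigenvalue

print(f"Wilks' lambda:          {wilks:.4f}")
print(f"Pillai's trace:         {pillai:.4f}")
print(f"Hotelling-Lawley trace: {hotelling_lawley:.4f}")
print(f"Roy's greatest root:    {roy:.4f}")
```

When only one eigenvalue is non-zero, the Hotelling-Lawley trace and Roy's greatest root coincide, which is why both show 57.9659 in the `Intercept` block of the output above.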


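Interpretation: Judge significance by the p-value (`Pr > F`) reported for each test rather than by the raw statistic. If the p-value is below your chosen significance level (e.g., 0.05), you reject the null hypothesis of no group differences. In the output above, Wilks' lambda for `species` is 0.0234 with F = 199.1453 and a p-value that prints as 0.0000, so the null hypothesis is rejected: the species differ on the four measurements taken together.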
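
The decision rule can also be applied programmatically instead of reading the printed table. The sketch below assumes the object returned by `mv_test()` exposes a `results` dictionary keyed by term name, each entry holding a `'stat'` DataFrame with a `Pr > F` column, as in recent statsmodels versions. The data frame `toy` and its columns `y1`, `y2`, and `group` are hypothetical synthetic data, not the Iris set.

```python
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

# Synthetic toy data (an assumption for illustration): 3 groups with shifted means.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 30),
    "y1": rng.normal(size=90) + np.repeat([0.0, 2.0, 4.0], 30),
    "y2": rng.normal(size=90) + np.repeat([0.0, 1.0, 2.0], 30),
})

result = MANOVA.from_formula("y1 + y2 ~ group", data=toy).mv_test()

# Table of the four test statistics for the 'group' term.
stat = result.results["group"]["stat"]

alpha = 0.05  # chosen significance level
p_values = {name: float(p) for name, p in stat["Pr > F"].items()}
rejected = {name: p < alpha for name, p in p_values.items()}

for name, p in p_values.items():
    decision = "reject H0" if rejected[name] else "fail to reject H0"
    print(f"{name}: p = {p:.3g} -> {decision}")
```

For the notebook's own model, the same lookup would use the term name `species` on the result of `fit.mv_test()`.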