## ANOVA decomposition with interaction terms

Let $X=(X_{1},X_{2})$ denote the input vector and 
\begin{equation*}
    m(X) = \beta_{0} + \beta_{1} X_{1} + \beta_{2} X_{2} + \beta_{3} X_{1} X_{2}. 
\end{equation*}
For now, we ignore the error term $\varepsilon$. 

Assuming $X_{1}$ and $X_{2}$ are independent, the grand mean is
\begin{equation*}
    \mu =  \beta_{0} + \beta_{1} E_{X_{1}} \left( X_{1} \right)
    + \beta_{2} E_{X_{2}} \left( X_{2} \right) +
    \beta_{3} E_{X_{1}} \left( X_{1} \right) E_{X_{2}} \left( X_{2} \right).
\end{equation*}
The main effects are
\begin{equation*}
    m_{\{i\}} \left(X\right)
    = \beta_{i} \left(X_{i} - E_{X_{i}} \left(X_{i}\right)\right) + \beta_{3} \left(X_{i} - E_{X_{i}} \left(X_{i}\right)\right) E_{X_{-i}}
    \left(X_{-i}\right), \ i \in \{1,2\}. 
\end{equation*}
The interaction effect of $S=\{1,2\}$ is
\begin{equation*}
\begin{aligned}
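m_{\{1,2\}} \left(X\right) &= m (X) - m_{\{1\}} (X) - m_{\{2\}} (X) - \mu \\
&= \beta_{3} X_{1} X_{2} - \beta_{3} X_1 E_{X_{2}} \left(X_{2}\right)  - \beta_{3} X_2 E_{X_{1}} \left(X_{1}\right) + \beta_{3} E_{X_{1}} \left( X_{1} \right) E_{X_{2}} \left( X_{2} \right) \\
&= \beta_{3} \left(X_{1} - E_{X_{1}} \left(X_{1}\right)\right) \left(X_{2} - E_{X_{2}} \left(X_{2}\right)\right).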
\end{aligned}
\end{equation*}
The resulting decomposition is
\begin{equation*}
    m \left(X \right) = \mu + m_{\{1\}} \left(X\right) + m_{\{2\}} \left(X\right) +  m_{\{1,2\}} \left(X\right).
\end{equation*}
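
As a worked consequence, and assuming $X_{1}$ and $X_{2}$ are independent (so that the mean-zero components are pairwise uncorrelated), the variance of $m(X)$ should split into one term per component:
\begin{equation*}
\begin{aligned}
\operatorname{Var} \left( m(X) \right) &= \operatorname{Var} \left( m_{\{1\}}(X) \right) + \operatorname{Var} \left( m_{\{2\}}(X) \right) + \operatorname{Var} \left( m_{\{1,2\}}(X) \right) \\
&= \left( \beta_{1} + \beta_{3} E_{X_{2}} \left( X_{2} \right) \right)^{2} \operatorname{Var} \left( X_{1} \right)
+ \left( \beta_{2} + \beta_{3} E_{X_{1}} \left( X_{1} \right) \right)^{2} \operatorname{Var} \left( X_{2} \right)
+ \beta_{3}^{2} \operatorname{Var} \left( X_{1} \right) \operatorname{Var} \left( X_{2} \right).
\end{aligned}
\end{equation*}
Under this assumption each term is non-negative, so every share of the total variance lies between 0 and 1.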

In [1]:
from anova_class import *

In [2]:
n = 10000 # number of observations
d = 3 # number of input columns drawn (only the first two enter Y)

beta1 = 3
beta2 = 5
beta3 = 7

In [3]:
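# Create sample: d independent U(0, 1) input columns

hsample = pd.DataFrame(np.random.uniform(0, 1, size=(n, d)))

# Note: np.random.normal(0, 1) without a size argument draws one scalar, so the
# same offset is added to every row; pass size=n for per-observation noise.
hsample['Y'] = beta1 * hsample[0] + beta2 * hsample[1] + beta3 * hsample[0] * hsample[1] + np.random.normal(0, 1)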



In [4]:
np.cov(hsample[0],hsample[1])

array([[0.08249569, 0.00062017],
       [0.00062017, 0.0819782 ]])
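
The decomposition formulas can be checked pointwise on a simulated sample. The sketch below is independent of `anova_class`; it assumes $\beta_{0}=0$, uses sample means in place of the expectations, and fixes a seed purely for reproducibility. The names `m1`, `m2`, `m12` and `max_abs_error` are illustrative, not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (assumption, for reproducibility)
n = 10_000
beta0, beta1, beta2, beta3 = 0.0, 3.0, 5.0, 7.0  # beta0 = 0 is an assumption

# Two independent U(0, 1) inputs and the noise-free model m(X)
x1 = rng.uniform(0, 1, n)
x2 = rng.uniform(0, 1, n)
m = beta0 + beta1 * x1 + beta2 * x2 + beta3 * x1 * x2

# Sample means stand in for E(X1) and E(X2)
e1, e2 = x1.mean(), x2.mean()

mu = beta0 + beta1 * e1 + beta2 * e2 + beta3 * e1 * e2  # grand mean
m1 = (beta1 + beta3 * e2) * (x1 - e1)                   # main effect of X1
m2 = (beta2 + beta3 * e1) * (x2 - e2)                   # main effect of X2
m12 = beta3 * (x1 - e1) * (x2 - e2)                     # interaction effect

# The four components should reproduce m(X) up to floating-point error
max_abs_error = np.max(np.abs(m - (mu + m1 + m2 + m12)))
print(max_abs_error)
```

The identity holds algebraically for any centering constants, so the reconstruction error reflects floating-point rounding only; whether the component variances add up is a separate question that depends on the inputs being uncorrelated.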

### ANOVA decomposition

Problems: The variance components do not sum to the total variance, although the decomposition precision is good. 

Further, $\frac{\sigma_{2}^{2}}{\sigma^{2}} = 1.0710562747914085$. From a theoretical perspective this is impossible: a single component's share of the total variance cannot exceed 1. 
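
For reference, the theoretical shares can be computed in closed form. This is a minimal sketch assuming independent $U(0,1)$ inputs (mean $1/2$, variance $1/12$) and the coefficients used in the sample; it does not use `anova_class`, and the variable names are illustrative.

```python
import numpy as np

beta1, beta2, beta3 = 3.0, 5.0, 7.0
mean_x, var_x = 0.5, 1.0 / 12.0  # moments of U(0, 1)

# Variance of each component under independence
var_1 = (beta1 + beta3 * mean_x) ** 2 * var_x   # main effect of X1
var_2 = (beta2 + beta3 * mean_x) ** 2 * var_x   # main effect of X2
var_12 = beta3 ** 2 * var_x * var_x             # interaction effect
total = var_1 + var_2 + var_12

# Shares of the total variance; by construction they sum to 1
shares = np.array([var_1, var_2, var_12]) / total
print(shares)
```

Under these assumptions the shares come out at roughly 0.356, 0.609 and 0.034, which gives a benchmark for judging the sample-based estimates.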

In [5]:
ANOVA(hsample).linear_anova(beta1,beta2,beta3)

Decomposition precision:0.0
Percentage of total variance : $\sigma$,$\sigma_{1}$,$\sigma_{2}$,$\sigma_{12}$


[1.0, 0.3563066853413444, 0.605666359240587, 0.03396455237777819]