From 230df1f5fd245b0ecf1bfc8b47897b63c2da0be8 Mon Sep 17 00:00:00 2001 From: Joram Soch Date: Sun, 6 Nov 2022 16:04:02 +0100 Subject: [PATCH 1/3] added 2 definitions --- D/anova1.md | 59 ++++++++++++++++++++++++++++++++++++ D/anova2.md | 86 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 145 insertions(+) create mode 100644 D/anova1.md create mode 100644 D/anova2.md diff --git a/D/anova1.md b/D/anova1.md new file mode 100644 index 00000000..97d69257 --- /dev/null +++ b/D/anova1.md @@ -0,0 +1,59 @@ +--- +layout: definition +mathjax: true + +author: "Joram Soch" +affiliation: "BCCN Berlin" +e_mail: "joram.soch@bccn-berlin.de" +date: 2022-11-06 10:23:00 + +title: "One-way analysis of variance" +chapter: "Statistical Models" +section: "Univariate normal data" +topic: "Analysis of variance" +definition: "One-way ANOVA" + +sources: + - authors: "Bortz, Jürgen" + year: 1977 + title: "Einfaktorielle Varianzanalyse" + in: "Lehrbuch der Statistik. Für Sozialwissenschaftler" + pages: "ch. 12.1, pp. 528ff." + url: "https://books.google.de/books?id=lNCyBgAAQBAJ" + - authors: "Denziloe" + year: 2018 + title: "Derive the distribution of the ANOVA F-statistic under the alternative hypothesis" + in: "StackExchange CrossValidated" + pages: "retrieved on 2022-11-06" + url: "https://stats.stackexchange.com/questions/355594/derive-the-distribution-of-the-anova-f-statistic-under-the-alternative-hypothesi" + +def_id: "D181" +shortcut: "anova1" +username: "JoramSoch" +--- + + +**Definition:** Consider measurements $y_{ij} \in \mathbb{R}$ from distinct objects $j = 1, \ldots, n_i$ in separate groups $i = 1, \ldots, k$.
+ +Then, in one-way analysis of variance (ANOVA), these measurements are assumed to come from [normal distributions](/D/norm) + +$$ \label{eq:anova1} +y_{ij} \sim \mathcal{N}(\mu_i, \sigma^2) \quad \text{for all} \quad i = 1, \ldots, k \quad \text{and} \quad j = 1, \dots, n_i +$$ + +where + +* $\mu_i$ is the [expected value](/D/mean) in group $i$ and + +* $\sigma^2$ is the common [variance](/D/var) across groups. + +Alternatively, the model may be written as + +$$ \label{eq:anova1-alt} +\begin{split} +y_{ij} &= \mu_i + \varepsilon_{ij} \\ +\varepsilon_{ij} &\overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2) +\end{split} +$$ + +where $\varepsilon_{ij}$ is the [error term](/D/slr) belonging to observation $j$ in category $i$ and the $\varepsilon_{ij}$ are [independent and identically distributed](/D/iid). \ No newline at end of file diff --git a/D/anova2.md b/D/anova2.md new file mode 100644 index 00000000..bcca0847 --- /dev/null +++ b/D/anova2.md @@ -0,0 +1,86 @@ +--- +layout: definition +mathjax: true + +author: "Joram Soch" +affiliation: "BCCN Berlin" +e_mail: "joram.soch@bccn-berlin.de" +date: 2022-11-06 13:41:00 + +title: "Two-way analysis of variance" +chapter: "Statistical Models" +section: "Univariate normal data" +topic: "Analysis of variance" +definition: "Two-way ANOVA" + +sources: + - authors: "Bortz, Jürgen" + year: 1977 + title: "Zwei- und mehrfaktorielle Varianzanalyse" + in: "Lehrbuch der Statistik. Für Sozialwissenschaftler" + pages: "ch. 12.2, pp. 538ff."
+ url: "https://books.google.de/books?id=lNCyBgAAQBAJ" + - authors: "ttd" + year: 2021 + title: "Proof on SSAB/s2~chi2(I-1)(J-1) under the null hypothesis HAB: dij=0 for i=1,...,I and j=1,...,J" + in: "StackExchange CrossValidated" + pages: "retrieved on 2022-11-06" + url: "https://stats.stackexchange.com/questions/545807/proof-on-ss-ab-sigma2-sim-chi2-i-1j-1-under-the-null-hypothesis" + +def_id: "D182" +shortcut: "anova2" +username: "JoramSoch" +--- + + +**Definition:** Let there be two factors $A$ and $B$ with levels $i = 1, \ldots, a$ and $j = 1, \ldots, b$ that are used to group measurements $y_{ijk} \in \mathbb{R}$ from distinct objects $k = 1, \ldots, n_{ij}$ into $a \cdot b$ categories $(i,j) \in \left\lbrace 1, \ldots, a \right\rbrace \times \left\lbrace 1, \ldots, b \right\rbrace$. + +Then, in two-way analysis of variance (ANOVA), these measurements are assumed to come from [normal distributions](/D/norm) + +$$ \label{eq:anova2-p1} +y_{ijk} \sim \mathcal{N}(\mu_{ij}, \sigma^2) \quad \text{for all} \quad i = 1, \ldots, a, \quad j = 1, \ldots, b, \quad \text{and} \quad k = 1, \dots, n_{ij} +$$ + +with + +$$ \label{eq:anova2-p2} +\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij} +$$ + +where + +* $\mu$ is called the "grand mean"; + +* $\alpha_i$ is the additive "main effect" of the $i$-th level of factor $A$; + +* $\beta_j$ is the additive "main effect" of the $j$-th level of factor $B$; + +* $\gamma_{ij}$ is the non-additive "interaction effect" of category $(i,j)$; + +* $\mu_{ij}$ is the [expected value](/D/mean) in category $(i,j)$; and + +* $\sigma^2$ is the common [variance](/D/var) across all categories.
+ +Alternatively, the model may be written as + +$$ \label{eq:anova2-alt} +\begin{split} +y_{ijk} &= \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk} \\ +\varepsilon_{ijk} &\overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2) +\end{split} +$$ + +where $\varepsilon_{ijk}$ is the [error term](/D/slr) corresponding to observation $k$ belonging to the $i$-th level of $A$ and the $j$-th level of $B$. + +As the two-way ANOVA model is underdetermined, the parameters of the model are additionally subject to the constraints + +$$ \label{eq:anova2-cons} +\begin{split} +\sum_{i=1}^{a} w_{ij} \alpha_i &= 0 \quad \text{for all} \quad j = 1, \ldots, b \\ +\sum_{j=1}^{b} w_{ij} \beta_j &= 0 \quad \text{for all} \quad i = 1, \ldots, a \\ +\sum_{i=1}^{a} w_{ij} \gamma_{ij} &= 0 \quad \text{for all} \quad j = 1, \ldots, b \\ +\sum_{j=1}^{b} w_{ij} \gamma_{ij} &= 0 \quad \text{for all} \quad i = 1, \ldots, a +\end{split} +$$ + +where the weights are $w_{ij} = n_{ij}/n$ and the total sample size is $n = \sum_{i=1}^{a} \sum_{j=1}^{b} n_{ij}$.
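To make the decomposition concrete, here is a minimal numerical sketch (not part of the definition; it assumes NumPy, simulated data and a balanced design with all cell sizes $n_{ij} = m$, in which case the weighted constraints reduce to plain zero sums over each factor's levels):

```python
import numpy as np

# Simulate a balanced two-way layout and form the estimates
# mu^ = grand mean, alpha^_i, beta^_j, gamma^_ij from marginal and cell means.
rng = np.random.default_rng(0)
a, b, m = 3, 4, 5
y = rng.normal(loc=0.0, scale=1.0, size=(a, b, m))   # y[i, j, k]

ybar    = y.mean()             # grand mean
ybar_i  = y.mean(axis=(1, 2))  # level means of factor A
ybar_j  = y.mean(axis=(0, 2))  # level means of factor B
ybar_ij = y.mean(axis=2)       # cell means

mu_hat    = ybar
alpha_hat = ybar_i - ybar
beta_hat  = ybar_j - ybar
gamma_hat = ybar_ij - ybar_i[:, None] - ybar_j[None, :] + ybar

# In the balanced case, w_ij = m/n for all (i,j), so the weighted constraints
# reduce to plain zero sums:
print(np.isclose(alpha_hat.sum(), 0.0))            # → True
print(np.isclose(beta_hat.sum(), 0.0))             # → True
print(np.allclose(gamma_hat.sum(axis=0), 0.0))     # → True
print(np.allclose(gamma_hat.sum(axis=1), 0.0))     # → True

# The decomposition reproduces the cell means exactly:
fitted = mu_hat + alpha_hat[:, None] + beta_hat[None, :] + gamma_hat
print(np.allclose(fitted, ybar_ij))                # → True
```

For unbalanced designs the constraints retain their weighted form and the check would need the $w_{ij}$ explicitly.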
\ No newline at end of file From 20144f0a13a07f00893c10469fdf26258df28200 Mon Sep 17 00:00:00 2001 From: Joram Soch Date: Sun, 6 Nov 2022 16:04:54 +0100 Subject: [PATCH 2/3] added 3 proofs --- P/anova1-f.md | 191 ++++++++++++++++++++++++++++++++++++++++++++++++ P/anova1-ols.md | 67 +++++++++++++++++ P/anova2-ols.md | 154 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 412 insertions(+) create mode 100644 P/anova1-f.md create mode 100644 P/anova1-ols.md create mode 100644 P/anova2-ols.md diff --git a/P/anova1-f.md b/P/anova1-f.md new file mode 100644 index 00000000..4520ad6e --- /dev/null +++ b/P/anova1-f.md @@ -0,0 +1,191 @@ +--- +layout: proof +mathjax: true + +author: "Joram Soch" +affiliation: "BCCN Berlin" +e_mail: "joram.soch@bccn-berlin.de" +date: 2022-11-06 13:05:00 + +title: "F-test for main effect in one-way analysis of variance" +chapter: "Statistical Models" +section: "Univariate normal data" +topic: "Analysis of variance" +theorem: "F-test for main effect in one-way ANOVA" + +sources: + - authors: "Denziloe" + year: 2018 + title: "Derive the distribution of the ANOVA F-statistic under the alternative hypothesis" + in: "StackExchange CrossValidated" + pages: "retrieved on 2022-11-06" + url: "https://stats.stackexchange.com/questions/355594/derive-the-distribution-of-the-anova-f-statistic-under-the-alternative-hypothesi" + +proof_id: "P370" +shortcut: "anova1-f" +username: "JoramSoch" +--- + + +**Theorem:** Assume the [one-way analysis of variance](/D/anova1) model + +$$ \label{eq:anova1} +y_{ij} = \mu_i + \varepsilon_{ij}, \; \varepsilon_{ij} \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2), \; i = 1, \ldots, k, \; j = 1, \dots, n_i \; , +$$ + +and consider the [null](/D/h0) and [alternative](/D/h1) hypotheses + +$$ \label{eq:anova1-h0} +\begin{split} +H_0: &\; \mu_1 = \ldots = \mu_k \\ +H_1: &\; \mu_i \neq \mu_j \quad \text{for at least one} \quad i,j \in \left\lbrace 1, \ldots, k \right\rbrace, \; i \neq j \; .
+\end{split} +$$ + +Then, the [test statistic](/D/tstat) + +$$ \label{eq:anova1-f} +F = \frac{\frac{1}{k-1} \sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y})^2}{\frac{1}{n-k} \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2} +$$ + +follows an [F-distribution](/D/f) under the null hypothesis: + +$$ \label{eq:anova1-f-h0} +F \sim \mathrm{F}(k-1, n-k), \; \text{if} \; H_0 \; . +$$ + + +**Proof:** Let $\mu$ be the common [mean](/D/mean) under the [null hypothesis](/D/h0) $\mu_1 = \ldots = \mu_k = \mu$. Under $H_0$, we have: + +$$ \label{eq:yij-h0} +y_{ij} \sim \mathcal{N}(\mu, \sigma^2) \quad \text{for all} \quad i = 1, \ldots, k, \; j = 1, \ldots, n_i \; . +$$ + +Thus, the [random variable](/D/rvar) $U_{ij} = (y_{ij} - \mu)/\sigma$ [follows a standard normal distribution](/P/norm-snorm) + +$$ \label{eq:Uij-h0} +U_{ij} = \frac{y_{ij} - \mu}{\sigma} \sim \mathcal{N}(0, 1) \; . +$$ + +Now consider the following sum: + +$$ \label{eq:sum-Uij-s1} +\begin{split} +\sum_{i=1}^{k} \sum_{j=1}^{n_i} U_{ij}^2 &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( \frac{y_{ij} - \mu}{\sigma} \right)^2 \\ +&= \frac{1}{\sigma^2} \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( (y_{ij} - \bar{y}_i) + (\bar{y}_i - \bar{y}) + (\bar{y} - \mu) \right)^2 \\ +&= \frac{1}{\sigma^2} \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left[ (y_{ij} - \bar{y}_i)^2 + (\bar{y}_i - \bar{y})^2 + (\bar{y} - \mu)^2 + 2 (y_{ij} - \bar{y}_i) (\bar{y}_i - \bar{y}) + 2 (y_{ij} - \bar{y}_i) (\bar{y} - \mu) + 2 (\bar{y}_i - \bar{y}) (\bar{y} - \mu) \right] \; .
+\end{split} +$$ + +Because the following sum over $j$ is zero for all $i$ + +$$ \label{eq:sum-yij} +\begin{split} +\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i) &= \sum_{j=1}^{n_i} y_{ij} - n_i \bar{y}_i \\ +&= \sum_{j=1}^{n_i} y_{ij} - n_i \cdot \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij} \\ +&= 0, \; i = 1, \ldots, k +\end{split} +$$ + +and the following sum over $i$ and $j$ is also zero + +$$ \label{eq:sum-yib} +\begin{split} +\sum_{i=1}^{k} \sum_{j=1}^{n_i} (\bar{y}_i - \bar{y}) &= \sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y}) \\ +&= \sum_{i=1}^{k} n_i \bar{y}_i - \bar{y} \sum_{i=1}^{k} n_i \\ +&= \sum_{i=1}^{k} n_i \cdot \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij} - n \cdot \frac{1}{n} \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij} \\ +&= 0 \; , +\end{split} +$$ + +where $n = \sum_{i=1}^{k} n_i$, the sum in \eqref{eq:sum-Uij-s1} reduces to + +$$ \label{eq:sum-Uij-s2} +\begin{split} +\sum_{i=1}^{k} \sum_{j=1}^{n_i} U_{ij}^2 &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left[ \left( \frac{y_{ij} - \bar{y}_i}{\sigma} \right)^2 + \left( \frac{\bar{y}_i - \bar{y}}{\sigma} \right)^2 + \left( \frac{\bar{y} - \mu}{\sigma} \right)^2 \right] \\ +&= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( \frac{y_{ij} - \bar{y}_i}{\sigma} \right)^2 + \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( \frac{\bar{y}_i - \bar{y}}{\sigma} \right)^2 + \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( \frac{\bar{y} - \mu}{\sigma} \right)^2 \; . 
+\end{split} +$$ + +[Cochran's theorem](/P/snorm-cochran) states that, if a sum of squared [standard normal](/D/snorm) [random variables](/D/rvar) can be written as a sum of quadratic forms + +$$ \label{eq:cochran-p1} +\begin{split} +\sum_{i=1}^{n} U_i^2 = \sum_{j=1}^{m} Q_j \quad &\text{where} \quad Q_j = U^\mathrm{T} B^{(j)} U \\ +&\text{with} \quad \sum_{j=1}^{m} B^{(j)} = I_n \\ +&\text{and} \quad r_j = \mathrm{rank}(B^{(j)}) \; , +\end{split} +$$ + +then the terms $Q_j$ are [independent](/D/ind) and each term $Q_j$ follows a [chi-squared distribution](/D/chi2) with $r_j$ degrees of freedom: + +$$ \label{eq:cochran-p2} +Q_j \sim \chi^2(r_j), \; j = 1, \ldots, m \; . +$$ + +Let $U$ be the $n \times 1$ column vector of all standardized observations + +$$ \label{eq:U} +U = \left[ \begin{matrix} u_1 \\ \vdots \\ u_k \end{matrix} \right] +$$ + +where the group-wise $n_i \times 1$ column vectors are + +$$ \label{eq:yi} +u_1 = \left[ \begin{matrix} (y_{1,1}-\mu)/\sigma \\ \vdots \\ (y_{1,n_1}-\mu)/\sigma \end{matrix} \right], \quad \ldots, \quad u_k = \left[ \begin{matrix} (y_{k,1}-\mu)/\sigma \\ \vdots \\ (y_{k,n_k}-\mu)/\sigma \end{matrix} \right] \; . +$$ + +Then, we observe that the sum in \eqref{eq:sum-Uij-s2} can be represented in the form of \eqref{eq:cochran-p1} using the matrices + +$$ \label{eq:sum-Uij-s3-Bj} +\begin{split} +B^{(1)} &= I_n - \mathrm{diag}\left( \frac{1}{n_1} J_{n_1}, \; \ldots, \; \frac{1}{n_k} J_{n_k} \right) \\ +B^{(2)} &= \mathrm{diag}\left( \frac{1}{n_1} J_{n_1}, \; \ldots, \; \frac{1}{n_k} J_{n_k} \right) - \frac{1}{n} J_n \\ +B^{(3)} &= \frac{1}{n} J_n +\end{split} +$$ + +where $J_n$ is an $n \times n$ matrix of ones and $\mathrm{diag}\left( A_1, \ldots, A_n \right)$ denotes a block-diagonal matrix composed of $A_1, \ldots, A_n$.
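As a quick numerical sanity check (a sketch, not part of the proof; the group sizes $n_i$ are chosen arbitrarily and NumPy is assumed), these matrices can be constructed explicitly:

```python
import numpy as np

# Build B^(1), B^(2), B^(3) for small groups and verify that they
# sum to the identity and have ranks n-k, k-1 and 1.
n_i = [3, 4, 5]
k, n = len(n_i), sum(n_i)

# block-diagonal matrix diag(J_{n_1}/n_1, ..., J_{n_k}/n_k)
D = np.zeros((n, n))
start = 0
for m in n_i:
    D[start:start + m, start:start + m] = 1.0 / m
    start += m

B1 = np.eye(n) - D              # within-group deviations
B2 = D - np.ones((n, n)) / n    # between-group deviations
B3 = np.ones((n, n)) / n        # grand-mean term

print(np.allclose(B1 + B2 + B3, np.eye(n)))                   # → True
print([int(np.linalg.matrix_rank(B)) for B in (B1, B2, B3)])  # → [9, 2, 1]
```

With $n = 12$ and $k = 3$, the printed ranks match $n-k$, $k-1$ and $1$.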
The matrices in \eqref{eq:sum-Uij-s3-Bj} fulfill $B^{(1)} + B^{(2)} + B^{(3)} = I_n$ and their ranks are given by: + +$$ \label{eq:sum-Uij-s3-Bj-rk} +\begin{split} +\mathrm{rank}\left( B^{(1)} \right) &= n-k \\ +\mathrm{rank}\left( B^{(2)} \right) &= k-1 \\ +\mathrm{rank}\left( B^{(3)} \right) &= 1 \; . +\end{split} +$$ + +Let's write down the [explained sum of squares](/D/ess) and the [residual sum of squares](/D/rss) for [one-way analysis of variance](/D/anova1) as + +$$ \label{eq:ess-rss} +\begin{split} +\mathrm{ESS} &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( \bar{y}_i - \bar{y} \right)^2 \\ +\mathrm{RSS} &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( y_{ij} - \bar{y}_i \right)^2 \; . +\end{split} +$$ + +Then, using \eqref{eq:sum-Uij-s2}, \eqref{eq:cochran-p1}, \eqref{eq:cochran-p2}, \eqref{eq:sum-Uij-s3-Bj} and \eqref{eq:sum-Uij-s3-Bj-rk}, we find that + +$$ \label{eq:ess-rss-dist} +\begin{split} +\frac{\mathrm{ESS}}{\sigma^2} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( \frac{\bar{y}_i - \bar{y}}{\sigma} \right)^2 &= Q_2 = U^\mathrm{T} B^{(2)} U \sim \chi^2(k-1) \\ +\frac{\mathrm{RSS}}{\sigma^2} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left( \frac{y_{ij} - \bar{y}_i}{\sigma} \right)^2 &= Q_1 = U^\mathrm{T} B^{(1)} U \sim \chi^2(n-k) \; . 
+\end{split} +$$ + +Because $\mathrm{ESS}/\sigma^2$ and $\mathrm{RSS}/\sigma^2$ are also independent by \eqref{eq:cochran-p2}, the F-statistic from \eqref{eq:anova1-f} is equal to the ratio of two independent [chi-squared distributed](/D/chi2) [random variables](/D/rvar) divided by their degrees of freedom + +$$ \label{eq:anova1-f-ess-tss} +\begin{split} +F &= \frac{(\mathrm{ESS}/\sigma^2)/(k-1)}{(\mathrm{RSS}/\sigma^2)/(n-k)} \\ +&= \frac{\mathrm{ESS}/(k-1)}{\mathrm{RSS}/(n-k)} \\ +&= \frac{\frac{1}{k-1} \sum_{i=1}^{k} \sum_{j=1}^{n_i} (\bar{y}_i - \bar{y})^2}{\frac{1}{n-k} \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2} \\ +&= \frac{\frac{1}{k-1} \sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y})^2}{\frac{1}{n-k} \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2} +\end{split} +$$ + +which, [by definition of the F-distribution](/D/f), is distributed as: + +$$ \label{eq:anova1-f-qed} +F \sim \mathrm{F}(k-1, n-k), \; \text{if} \; H_0 \; . +$$ \ No newline at end of file diff --git a/P/anova1-ols.md b/P/anova1-ols.md new file mode 100644 index 00000000..91f1122b --- /dev/null +++ b/P/anova1-ols.md @@ -0,0 +1,67 @@ +--- +layout: proof +mathjax: true + +author: "Joram Soch" +affiliation: "BCCN Berlin" +e_mail: "joram.soch@bccn-berlin.de" +date: 2022-11-06 11:18:00 + +title: "Ordinary least squares for one-way analysis of variance" +chapter: "Statistical Models" +section: "Univariate normal data" +topic: "Analysis of variance" +theorem: "Ordinary least squares for one-way ANOVA" + +sources: + +proof_id: "P369" +shortcut: "anova1-ols" +username: "JoramSoch" +--- + + +**Theorem:** Given the [one-way analysis of variance](/D/anova1) assumption + +$$ \label{eq:anova1} +y_{ij} = \mu_i + \varepsilon_{ij}, \; \varepsilon_{ij} \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2), \; i = 1, \ldots, k, \; j = 1, \dots, n_i \; , +$$ + +the parameters minimizing the [residual sum of squares](/D/rss) are given by + +$$ \label{eq:anova1-ols} +\hat{\mu}_i = \bar{y}_i +$$ + 
+where $\bar{y}_i$ is the [sample mean](/D/mean-samp) of all observations in [group](/D/anova1) $i$: + +$$ \label{eq:mean-samp} +\hat{\mu}_i = \bar{y}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij} \; . +$$ + + +**Proof:** The [residual sum of squares](/D/rss) for this model is + +$$ \label{eq:rss} +\mathrm{RSS}(\mu) = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \varepsilon_{ij}^2 = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \mu_i)^2 +$$ + +and the derivatives of $\mathrm{RSS}$ with respect to $\mu_i$ are + +$$ \label{eq:rss-der} +\begin{split} +\frac{\mathrm{d}\mathrm{RSS}(\mu)}{\mathrm{d}\mu_i} &= \sum_{j=1}^{n_i} \frac{\mathrm{d}}{\mathrm{d}\mu_i} (y_{ij} - \mu_i)^2 \\ +&= \sum_{j=1}^{n_i} 2 (y_{ij} - \mu_i) (-1) \\ +&= 2 \sum_{j=1}^{n_i} (\mu_i - y_{ij}) \\ +&= 2 n_i \mu_i - 2 \sum_{j=1}^{n_i} y_{ij} \quad \text{for} \quad i = 1, \ldots, k \; . +\end{split} +$$ + +Setting these derivatives to zero, we obtain the estimates of $\mu_i$: + +$$ \label{eq:rss-der-zero} +\begin{split} +0 &= 2 n_i \hat{\mu}_i - 2 \sum_{j=1}^{n_i} y_{ij} \\ +\hat{\mu}_i &= \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij} \quad \text{for} \quad i = 1, \ldots, k \; . 
+\end{split} +$$ \ No newline at end of file diff --git a/P/anova2-ols.md b/P/anova2-ols.md new file mode 100644 index 00000000..8d321c0c --- /dev/null +++ b/P/anova2-ols.md @@ -0,0 +1,154 @@ +--- +layout: proof +mathjax: true + +author: "Joram Soch" +affiliation: "BCCN Berlin" +e_mail: "joram.soch@bccn-berlin.de" +date: 2022-11-06 15:55:00 + +title: "Ordinary least squares for two-way analysis of variance" +chapter: "Statistical Models" +section: "Univariate normal data" +topic: "Analysis of variance" +theorem: "Ordinary least squares for two-way ANOVA" + +sources: + +proof_id: "P371" +shortcut: "anova2-ols" +username: "JoramSoch" +--- + + +**Theorem:** Given the [two-way analysis of variance](/D/anova2) assumption + +$$ \label{eq:anova2} +\begin{split} +y_{ijk} &= \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk} \\ +\varepsilon_{ijk} &\overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2), \; i = 1, \ldots, a, \; j = 1, \ldots, b, \; k = 1, \dots, n_{ij} \; , +\end{split} +$$ + +the parameters minimizing the [residual sum of squares](/D/rss) and satisfying the [constraints for the model parameters](/D/anova2) are given by + +$$ \label{eq:anova2-ols} +\begin{split} +\hat{\mu} &= \bar{y}_{\bullet \bullet \bullet} \\ +\hat{\alpha}_i &= \bar{y}_{i \bullet \bullet} - \bar{y}_{\bullet \bullet \bullet} \\ +\hat{\beta}_j &= \bar{y}_{\bullet j \bullet} - \bar{y}_{\bullet \bullet \bullet} \\ +\hat{\gamma}_{ij} &= \bar{y}_{i j \bullet} - \bar{y}_{i \bullet \bullet} - \bar{y}_{\bullet j \bullet} + \bar{y}_{\bullet \bullet \bullet} +\end{split} +$$ + +where $\bar{y}_{\bullet \bullet \bullet}$, $\bar{y}_{i \bullet \bullet}$, $\bar{y}_{\bullet j \bullet}$ and $\bar{y}_{i j \bullet}$ are the following [sample means](/D/mean-samp): + +$$ \label{eq:mean-samp} +\begin{split} +\bar{y}_{\bullet \bullet \bullet} &= \frac{1}{n} \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} \\ +\bar{y}_{i \bullet \bullet} &= \frac{1}{n_{i \bullet}} \sum_{j=1}^{b} 
\sum_{k=1}^{n_{ij}} y_{ijk} \\ +\bar{y}_{\bullet j \bullet} &= \frac{1}{n_{\bullet j}} \sum_{i=1}^{a} \sum_{k=1}^{n_{ij}} y_{ijk} \\ +\bar{y}_{i j \bullet} &= \frac{1}{n_{ij}} \sum_{k=1}^{n_{ij}} y_{ijk} +\end{split} +$$ + +with the sample size numbers + +$$ \label{eq:samp-size} +\begin{split} +n_{ij} &- \text{number of samples in category} \; (i,j) \\ +n_{i \bullet} &= \sum_{j=1}^{b} n_{ij} \\ +n_{\bullet j} &= \sum_{i=1}^{a} n_{ij} \\ +n &= \sum_{i=1}^{a} \sum_{j=1}^{b} n_{ij} \; . +\end{split} +$$ + + +**Proof:** In two-way ANOVA, model parameters [are subject to the constraints](/D/anova2) + +$$ \label{eq:anova2-cons} +\begin{split} +\sum_{i=1}^{a} w_{ij} \alpha_i &= 0 \quad \text{for all} \quad j = 1, \ldots, b \\ +\sum_{j=1}^{b} w_{ij} \beta_j &= 0 \quad \text{for all} \quad i = 1, \ldots, a \\ +\sum_{i=1}^{a} w_{ij} \gamma_{ij} &= 0 \quad \text{for all} \quad j = 1, \ldots, b \\ +\sum_{j=1}^{b} w_{ij} \gamma_{ij} &= 0 \quad \text{for all} \quad i = 1, \ldots, a +\end{split} +$$ + +where $w_{ij} = n_{ij}/n$. 
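Before proceeding, the sample means and sample-size numbers from the theorem can be computed for an unbalanced layout as a minimal sketch (assuming NumPy; the nested list `y` is hypothetical bookkeeping for the ragged cells, not notation from the theorem):

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 2, 3
n_ij = np.array([[4, 6, 2], [5, 3, 7]])     # cell sample sizes n_ij
y = [[rng.normal(size=n_ij[i, j]) for j in range(b)] for i in range(a)]

n   = n_ij.sum()                            # total sample size
n_i = n_ij.sum(axis=1)                      # row totals n_{i.}
n_j = n_ij.sum(axis=0)                      # column totals n_{.j}

ybar_all = sum(c.sum() for row in y for c in row) / n                                 # grand mean
ybar_i   = np.array([sum(c.sum() for c in y[i]) / n_i[i] for i in range(a)])          # row means
ybar_j   = np.array([sum(y[i][j].sum() for i in range(a)) / n_j[j] for j in range(b)])  # column means
ybar_ij  = np.array([[y[i][j].mean() for j in range(b)] for i in range(a)])           # cell means

# Each weighted marginal mean recovers the grand mean:
print(np.isclose((n_i * ybar_i).sum() / n, ybar_all))    # → True
print(np.isclose((n_ij * ybar_ij).sum() / n, ybar_all))  # → True
```

The two printed identities are exact consequences of the definitions, since each weighted sum of means reproduces the total sum of observations.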
The [residual sum of squares](/D/rss) for this model is + +$$ \label{eq:rss} +\mathrm{RSS}(\mu,\alpha,\beta,\gamma) = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} \varepsilon_{ijk}^2 = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} (y_{ijk} - \mu - \alpha_i - \beta_j - \gamma_{ij})^2 +$$ + +and the derivatives of $\mathrm{RSS}$ with respect to $\mu$, $\alpha$, $\beta$ and $\gamma$ are + +$$ \label{eq:rss-der-mu} +\begin{split} +\frac{\mathrm{d}\mathrm{RSS}}{\mathrm{d}\mu} &= \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} \frac{\mathrm{d}}{\mathrm{d}\mu} (y_{ijk} - \mu - \alpha_i - \beta_j - \gamma_{ij})^2 \\ +&= \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} -2 (y_{ijk} - \mu - \alpha_i - \beta_j - \gamma_{ij}) \\ +&= \sum_{i=1}^{a} \sum_{j=1}^{b} \left( 2 n_{ij} \mu + 2 n_{ij} (\alpha_i + \beta_j + \gamma_{ij}) - 2 \sum_{k=1}^{n_{ij}} y_{ijk} \right) \\ +&= 2 n \mu + 2 \left( \sum_{i=1}^{a} n_{i \bullet} \alpha_i + \sum_{j=1}^{b} n_{\bullet j} \beta_j + \sum_{i=1}^{a} \sum_{j=1}^{b} n_{ij} \gamma_{ij} \right) - 2 \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} +\end{split} +$$ + +$$ \label{eq:rss-der-alpha} +\begin{split} +\frac{\mathrm{d}\mathrm{RSS}}{\mathrm{d}\alpha_i} &= \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} \frac{\mathrm{d}}{\mathrm{d}\alpha_i} (y_{ijk} - \mu - \alpha_i - \beta_j - \gamma_{ij})^2 \\ +&= \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} -2 (y_{ijk} - \mu - \alpha_i - \beta_j - \gamma_{ij}) \\ +&= 2 n_{i \bullet} \mu + 2 n_{i \bullet} \alpha_i + 2 \left( \sum_{j=1}^{b} n_{ij} \beta_j + \sum_{j=1}^{b} n_{ij} \gamma_{ij} \right) - 2 \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} +\end{split} +$$ + +$$ \label{eq:rss-der-beta} +\begin{split} +\frac{\mathrm{d}\mathrm{RSS}}{\mathrm{d}\beta_j} &= \sum_{i=1}^{a} \sum_{k=1}^{n_{ij}} \frac{\mathrm{d}}{\mathrm{d}\beta_j} (y_{ijk} - \mu - \alpha_i - \beta_j - \gamma_{ij})^2 \\ +&= \sum_{i=1}^{a} \sum_{k=1}^{n_{ij}} -2 (y_{ijk} - \mu - \alpha_i - \beta_j - \gamma_{ij}) \\ +&= 2 n_{\bullet j} \mu + 2
n_{\bullet j} \beta_j + 2 \left( \sum_{i=1}^{a} n_{ij} \alpha_i + \sum_{i=1}^{a} n_{ij} \gamma_{ij} \right) - 2 \sum_{i=1}^{a} \sum_{k=1}^{n_{ij}} y_{ijk} +\end{split} +$$ + +$$ \label{eq:rss-der-gamma} +\begin{split} +\frac{\mathrm{d}\mathrm{RSS}}{\mathrm{d}\gamma_{ij}} &= \sum_{k=1}^{n_{ij}} \frac{\mathrm{d}}{\mathrm{d}\gamma_{ij}} (y_{ijk} - \mu - \alpha_i - \beta_j - \gamma_{ij})^2 \\ +&= \sum_{k=1}^{n_{ij}} -2 (y_{ijk} - \mu - \alpha_i - \beta_j - \gamma_{ij}) \\ +&= 2 n_{ij} (\mu + \alpha_i + \beta_j + \gamma_{ij}) - 2 \sum_{k=1}^{n_{ij}} y_{ijk} \; . +\end{split} +$$ + +Setting these derivatives to zero, we obtain the estimates of $\mu$, $\alpha_i$, $\beta_j$ and $\gamma_{ij}$: + +$$ \label{eq:rss-der-mu-zero} +\begin{split} +0 &= 2 n \hat{\mu} + 2 \left( \sum_{i=1}^{a} n_{i \bullet} \alpha_i + \sum_{j=1}^{b} n_{\bullet j} \beta_j + \sum_{i=1}^{a} \sum_{j=1}^{b} n_{ij} \gamma_{ij} \right) - 2 \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} \\ +\hat{\mu} &= \frac{1}{n} \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} - \sum_{i=1}^{a} \frac{n_{i \bullet}}{n} \alpha_i - \sum_{j=1}^{b} \frac{n_{\bullet j}}{n} \beta_j - \sum_{i=1}^{a} \sum_{j=1}^{b} \frac{n_{ij}}{n} \gamma_{ij} \\ +\hat{\mu} &\overset{\eqref{eq:samp-size}}{=} \frac{1}{n} \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} - \sum_{j=1}^{b} \sum_{i=1}^{a} \frac{n_{ij}}{n} \alpha_i - \sum_{i=1}^{a} \sum_{j=1}^{b} \frac{n_{ij}}{n} \beta_j - \sum_{i=1}^{a} \sum_{j=1}^{b} \frac{n_{ij}}{n} \gamma_{ij} \\ +\hat{\mu} &\overset{\eqref{eq:anova2-cons}}{=} \frac{1}{n} \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} +\end{split} +$$ + +$$ \label{eq:rss-der-alpha-zero} +\begin{split} +0 &= 2 n_{i \bullet} \hat{\mu} + 2 n_{i \bullet} \hat{\alpha}_i + 2 \left( \sum_{j=1}^{b} n_{ij} \beta_j + \sum_{j=1}^{b} n_{ij} \gamma_{ij} \right) - 2 \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} \\ +\hat{\alpha}_i &= \frac{1}{n_{i \bullet}} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} - \hat{\mu} - \sum_{j=1}^{b} \frac{n_{ij}}{n_{i
\bullet}} \beta_j - \sum_{j=1}^{b} \frac{n_{ij}}{n_{i \bullet}} \gamma_{ij} \\ +\hat{\alpha}_i &= \frac{1}{n_{i \bullet}} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} - \hat{\mu} - \frac{n}{n_{i \bullet}} \sum_{j=1}^{b} \frac{n_{ij}}{n} \beta_j - \frac{n}{n_{i \bullet}} \sum_{j=1}^{b} \frac{n_{ij}}{n} \gamma_{ij} \\ +\hat{\alpha}_i &\overset{\eqref{eq:anova2-cons}}{=} \frac{1}{n_{i \bullet}} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} - \frac{1}{n} \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} +\end{split} +$$ + +$$ \label{eq:rss-der-beta-zero} +\begin{split} +0 &= 2 n_{\bullet j} \hat{\mu} + 2 n_{\bullet j} \hat{\beta}_j + 2 \left( \sum_{i=1}^{a} n_{ij} \alpha_i + \sum_{i=1}^{a} n_{ij} \gamma_{ij} \right) - 2 \sum_{i=1}^{a} \sum_{k=1}^{n_{ij}} y_{ijk} \\ +\hat{\beta}_j &= \frac{1}{n_{\bullet j}} \sum_{i=1}^{a} \sum_{k=1}^{n_{ij}} y_{ijk} - \hat{\mu} - \sum_{i=1}^{a} \frac{n_{ij}}{n_{\bullet j}} \alpha_i - \sum_{i=1}^{a} \frac{n_{ij}}{n_{\bullet j}} \gamma_{ij} \\ +\hat{\beta}_j &= \frac{1}{n_{\bullet j}} \sum_{i=1}^{a} \sum_{k=1}^{n_{ij}} y_{ijk} - \hat{\mu} - \frac{n}{n_{\bullet j}} \sum_{i=1}^{a} \frac{n_{ij}}{n} \alpha_i - \frac{n}{n_{\bullet j}} \sum_{i=1}^{a} \frac{n_{ij}}{n} \gamma_{ij} \\ +\hat{\beta}_j &\overset{\eqref{eq:anova2-cons}}{=} \frac{1}{n_{\bullet j}} \sum_{i=1}^{a} \sum_{k=1}^{n_{ij}} y_{ijk} - \frac{1}{n} \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} +\end{split} +$$ + +$$ \label{eq:rss-der-gamma-zero} +\begin{split} +0 &= 2 n_{ij} (\hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\gamma}_{ij}) - 2 \sum_{k=1}^{n_{ij}} y_{ijk} \\ +\hat{\gamma}_{ij} &= \frac{1}{n_{ij}} \sum_{k=1}^{n_{ij}} y_{ijk} - \hat{\alpha}_i - \hat{\beta}_j - \hat{\mu} \\ +\hat{\gamma}_{ij} &= \frac{1}{n_{ij}} \sum_{k=1}^{n_{ij}} y_{ijk} - \frac{1}{n_{i \bullet}} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} - \frac{1}{n_{\bullet j}} \sum_{i=1}^{a} \sum_{k=1}^{n_{ij}} y_{ijk} + \frac{1}{n} \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} y_{ijk} \; .
+\end{split} +$$ \ No newline at end of file From 82acf48c55ec4b51048ad3664fce92be7f36d7f9 Mon Sep 17 00:00:00 2001 From: Joram Soch Date: Sun, 6 Nov 2022 16:10:31 +0100 Subject: [PATCH 3/3] edited table of contents --- I/ToC.md | 119 +++++++++++++++++++++++++++++-------------------------- 1 file changed, 63 insertions(+), 56 deletions(-) diff --git a/I/ToC.md b/I/ToC.md index e61e0f68..1623bf03 100644 --- a/I/ToC.md +++ b/I/ToC.md @@ -563,61 +563,68 @@ title: "Table of Contents"    1.2.13. **[Cross-validated log Bayes factor](/P/ugkv-cvlbf)**
   1.2.14. **[Expectation of cross-validated log Bayes factor](/P/ugkv-cvlbfmean)**
- 1.3. Simple linear regression
-    1.3.1. *[Definition](/D/slr)*
-    1.3.2. **[Special case of multiple linear regression](/P/slr-mlr)**
-    1.3.3. **[Ordinary least squares](/P/slr-ols)** (1)
-    1.3.4. **[Ordinary least squares](/P/slr-ols2)** (2)
-    1.3.5. **[Expectation of estimates](/P/slr-olsmean)**
-    1.3.6. **[Variance of estimates](/P/slr-olsvar)**
-    1.3.7. **[Distribution of estimates](/P/slr-olsdist)**
-    1.3.8. **[Correlation of estimates](/P/slr-olscorr)**
-    1.3.9. **[Effects of mean-centering](/P/slr-meancent)**
-    1.3.10. *[Regression line](/D/regline)*
-    1.3.11. **[Regression line includes center of mass](/P/slr-comp)**
-    1.3.12. **[Projection of data point to regression line](/P/slr-proj)**
-    1.3.13. **[Sums of squares](/P/slr-sss)**
-    1.3.14. **[Transformation matrices](/P/slr-mat)**
-    1.3.15. **[Weighted least squares](/P/slr-wls)** (1)
-    1.3.16. **[Weighted least squares](/P/slr-wls2)** (2)
-    1.3.17. **[Maximum likelihood estimation](/P/slr-mle)** (1)
-    1.3.18. **[Maximum likelihood estimation](/P/slr-mle2)** (2)
-    1.3.19. **[Sum of residuals is zero](/P/slr-ressum)**
-    1.3.20. **[Correlation with covariate is zero](/P/slr-rescorr)**
-    1.3.21. **[Residual variance in terms of sample variance](/P/slr-resvar)**
-    1.3.22. **[Correlation coefficient in terms of slope estimate](/P/slr-corr)**
-    1.3.23. **[Coefficient of determination in terms of correlation coefficient](/P/slr-rsq)**
- - 1.4. Multiple linear regression
-    1.4.1. *[Definition](/D/mlr)*
-    1.4.2. **[Special case of general linear model](/P/mlr-glm)**
-    1.4.3. **[Ordinary least squares](/P/mlr-ols)** (1)
-    1.4.4. **[Ordinary least squares](/P/mlr-ols2)** (2)
-    1.4.5. *[Total sum of squares](/D/tss)*
-    1.4.6. *[Explained sum of squares](/D/ess)*
-    1.4.7. *[Residual sum of squares](/D/rss)*
-    1.4.8. **[Total, explained and residual sum of squares](/P/mlr-pss)**
-    1.4.9. *[Estimation matrix](/D/emat)*
-    1.4.10. *[Projection matrix](/D/pmat)*
-    1.4.11. *[Residual-forming matrix](/D/rfmat)*
-    1.4.12. **[Estimation, projection and residual-forming matrix](/P/mlr-mat)**
-    1.4.13. **[Idempotence of projection and residual-forming matrix](/P/mlr-idem)**
-    1.4.14. **[Weighted least squares](/P/mlr-wls)** (1)
-    1.4.15. **[Weighted least squares](/P/mlr-wls2)** (2)
-    1.4.16. **[Maximum likelihood estimation](/P/mlr-mle)**
-    1.4.17. **[Maximum log-likelihood](/P/mlr-mll)**
-    1.4.18. **[Deviance function](/P/mlr-dev)**
-    1.4.19. **[Akaike information criterion](/P/mlr-aic)**
-    1.4.20. **[Bayesian information criterion](/P/mlr-bic)**
-    1.4.21. **[Corrected Akaike information criterion](/P/mlr-aicc)**
- - 1.5. Bayesian linear regression
-    1.5.1. **[Conjugate prior distribution](/P/blr-prior)**
-    1.5.2. **[Posterior distribution](/P/blr-post)**
-    1.5.3. **[Log model evidence](/P/blr-lme)**
-    1.5.4. **[Deviance information criterion](/P/blr-dic)**
-    1.5.5. **[Posterior probability of alternative hypothesis](/P/blr-pp)**
-    1.5.6. **[Posterior credibility region excluding null hypothesis](/P/blr-pcr)**
+ 1.3. Analysis of variance
+    1.3.1. *[One-way ANOVA](/D/anova1)*
+    1.3.2. **[Ordinary least squares for one-way ANOVA](/P/anova1-ols)**
+    1.3.3. **[F-test for main effect in one-way ANOVA](/P/anova1-f)**
+    1.3.4. *[Two-way ANOVA](/D/anova2)*
+    1.3.5. **[Ordinary least squares for two-way ANOVA](/P/anova2-ols)**
+ + 1.4. Simple linear regression
+    1.4.1. *[Definition](/D/slr)*
+    1.4.2. **[Special case of multiple linear regression](/P/slr-mlr)**
+    1.4.3. **[Ordinary least squares](/P/slr-ols)** (1)
+    1.4.4. **[Ordinary least squares](/P/slr-ols2)** (2)
+    1.4.5. **[Expectation of estimates](/P/slr-olsmean)**
+    1.4.6. **[Variance of estimates](/P/slr-olsvar)**
+    1.4.7. **[Distribution of estimates](/P/slr-olsdist)**
+    1.4.8. **[Correlation of estimates](/P/slr-olscorr)**
+    1.4.9. **[Effects of mean-centering](/P/slr-meancent)**
+    1.4.10. *[Regression line](/D/regline)*
+    1.4.11. **[Regression line includes center of mass](/P/slr-comp)**
+    1.4.12. **[Projection of data point to regression line](/P/slr-proj)**
+    1.4.13. **[Sums of squares](/P/slr-sss)**
+    1.4.14. **[Transformation matrices](/P/slr-mat)**
+    1.4.15. **[Weighted least squares](/P/slr-wls)** (1)
+    1.4.16. **[Weighted least squares](/P/slr-wls2)** (2)
+    1.4.17. **[Maximum likelihood estimation](/P/slr-mle)** (1)
+    1.4.18. **[Maximum likelihood estimation](/P/slr-mle2)** (2)
+    1.4.19. **[Sum of residuals is zero](/P/slr-ressum)**
+    1.4.20. **[Correlation with covariate is zero](/P/slr-rescorr)**
+    1.4.21. **[Residual variance in terms of sample variance](/P/slr-resvar)**
+    1.4.22. **[Correlation coefficient in terms of slope estimate](/P/slr-corr)**
+    1.4.23. **[Coefficient of determination in terms of correlation coefficient](/P/slr-rsq)**
+ + 1.5. Multiple linear regression
+    1.5.1. *[Definition](/D/mlr)*
+    1.5.2. **[Special case of general linear model](/P/mlr-glm)**
+    1.5.3. **[Ordinary least squares](/P/mlr-ols)** (1)
+    1.5.4. **[Ordinary least squares](/P/mlr-ols2)** (2)
+    1.5.5. *[Total sum of squares](/D/tss)*
+    1.5.6. *[Explained sum of squares](/D/ess)*
+    1.5.7. *[Residual sum of squares](/D/rss)*
+    1.5.8. **[Total, explained and residual sum of squares](/P/mlr-pss)**
+    1.5.9. *[Estimation matrix](/D/emat)*
+    1.5.10. *[Projection matrix](/D/pmat)*
+    1.5.11. *[Residual-forming matrix](/D/rfmat)*
+    1.5.12. **[Estimation, projection and residual-forming matrix](/P/mlr-mat)**
+    1.5.13. **[Idempotence of projection and residual-forming matrix](/P/mlr-idem)**
+    1.5.14. **[Weighted least squares](/P/mlr-wls)** (1)
+    1.5.15. **[Weighted least squares](/P/mlr-wls2)** (2)
+    1.5.16. **[Maximum likelihood estimation](/P/mlr-mle)**
+    1.5.17. **[Maximum log-likelihood](/P/mlr-mll)**
+    1.5.18. **[Deviance function](/P/mlr-dev)**
+    1.5.19. **[Akaike information criterion](/P/mlr-aic)**
+    1.5.20. **[Bayesian information criterion](/P/mlr-bic)**
+    1.5.21. **[Corrected Akaike information criterion](/P/mlr-aicc)**
+ + 1.6. Bayesian linear regression
+    1.6.1. **[Conjugate prior distribution](/P/blr-prior)**
+    1.6.2. **[Posterior distribution](/P/blr-post)**
+    1.6.3. **[Log model evidence](/P/blr-lme)**
+    1.6.4. **[Deviance information criterion](/P/blr-dic)**
+    1.6.5. **[Posterior probability of alternative hypothesis](/P/blr-pp)**
+    1.6.6. **[Posterior credibility region excluding null hypothesis](/P/blr-pcr)**
2. Multivariate normal data @@ -773,4 +780,4 @@ title: "Table of Contents" 3.5. Bayesian model averaging
   3.5.1. *[Definition](/D/bma)*
   3.5.2. **[Derivation](/P/bma-der)**
-    3.5.3. **[Calculation from log model evidences](/P/bma-lme)**
\ No newline at end of file +    3.5.3. **[Calculation from log model evidences](/P/bma-lme)**