# ONEWAY_ANOVA

## Overview
The `ONEWAY_ANOVA` function performs a one-way Analysis of Variance (ANOVA) test on two or three groups of numerical data. ANOVA is a statistical method used to determine whether there are any statistically significant differences between the means of independent (unrelated) groups. This is useful for comparing the means of different categories or treatments to see if at least one group mean is different from the others.

One-way ANOVA tests the null hypothesis that all group means are equal, i.e., $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$, against the alternative that at least one mean is different. The test statistic is computed as:

```math
F = \frac{\text{Between-group variance}}{\text{Within-group variance}} = \frac{\frac{\sum_{i=1}^k n_i (\bar{x}_i - \bar{x})^2}{k-1}}{\frac{\sum_{i=1}^k \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2}{N-k}}
```

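where $k$ is the number of groups, $n_i$ is the sample size of group $i$, $x_{ij}$ is the $j$-th observation in group $i$, $\bar{x}_i$ is the mean of group $i$, $\bar{x}$ is the overall mean, and $N$ is the total number of observations. Under the null hypothesis, $F$ follows an $F$-distribution with $k-1$ and $N-k$ degrees of freedom; the p-value is the probability of observing a value at least as large as the computed $F$.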
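
As a rough illustration of the formula above, the following sketch computes $F$ by hand with NumPy and reads the p-value from the upper tail of the $F$-distribution, then compares both against `scipy.stats.f_oneway`. The helper name `f_statistic` is hypothetical and is not part of `ONEWAY_ANOVA`; the data are the three-class test scores used in the Examples section.

```python
import numpy as np
from scipy.stats import f as f_dist, f_oneway

def f_statistic(*groups):
    """Hypothetical helper: (F, p-value) for 1D groups via the textbook formula."""
    arrays = [np.asarray(g, dtype=float) for g in groups]
    k = len(arrays)                              # number of groups
    n_total = sum(a.size for a in arrays)        # N, total observations
    grand_mean = np.concatenate(arrays).mean()   # overall mean
    # Numerator and denominator sums of squares from the formula
    ss_between = sum(a.size * (a.mean() - grand_mean) ** 2 for a in arrays)
    ss_within = sum(((a - a.mean()) ** 2).sum() for a in arrays)
    f_value = (ss_between / (k - 1)) / (ss_within / (n_total - k))
    # Upper-tail probability of F with (k - 1, N - k) degrees of freedom
    p_value = f_dist.sf(f_value, k - 1, n_total - k)
    return f_value, p_value

scores = ([85, 88, 90, 87], [78, 74, 80, 76], [92, 95, 91, 89])
f_value, p_value = f_statistic(*scores)
reference = f_oneway(*scores)
print(f_value, p_value)
print(reference.statistic, reference.pvalue)
```

Both lines should print the same pair of numbers up to floating-point rounding.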

Assumptions:
- Observations are independent, both within and between groups.
- Each group is drawn from a normally distributed population.
- All groups have equal population variances (homogeneity of variance).
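
The normality and equal-variance assumptions can be checked informally before running the test. The sketch below is one possible approach, not something `ONEWAY_ANOVA` does itself: it uses `scipy.stats.shapiro` per group and `scipy.stats.levene` across groups. With only four observations per group these checks have little power, and independence has to be judged from how the data were collected rather than from the values.

```python
from scipy.stats import levene, shapiro

groups = [[85, 88, 90, 87], [78, 74, 80, 76], [92, 95, 91, 89]]

# Shapiro-Wilk per group: a small p-value is evidence against normality.
normality_p = [shapiro(g).pvalue for g in groups]

# Levene's test across groups: a small p-value is evidence against equal variances.
variance_p = levene(*groups).pvalue

print(normality_p)
print(variance_p)
```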

## Usage
To use the `ONEWAY_ANOVA` function in Excel, enter it as a formula in a cell and pass each data group as a range:

```excel
=ONEWAY_ANOVA(group_one, group_two, [group_three])
```
Each group must be provided as a 2D list (an Excel range, e.g., a single column or row of values). Values are flattened to one dimension before the test is run, so the orientation of the range does not matter.

## Arguments
| Argument    | Type     | Required | Description                                 | Example         |
|:-----------|:---------|:---------|:---------------------------------------------|:---------------|
| group_one  | 2D list  | Yes      | The first group of numerical data            | [[85],[88],[90],[87]] |
| group_two  | 2D list  | Yes      | The second group of numerical data           | [[78],[74],[80],[76]] |
| group_three| 2D list  | No       | The third group of numerical data            | [[92],[95],[91],[89]] |
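
To make the "2D list" type concrete, the sketch below shows how a column range and a row range would look as Python lists, and how flattening with NumPy (the approach the implementation further down takes) reduces both to the same 1D sample. The cell address `A1:D1` is only an illustrative assumption.

```python
import numpy as np

column_range = [[85], [88], [90], [87]]   # e.g. A1:A4, four rows and one column
row_range = [[85, 88, 90, 87]]            # e.g. A1:D1, one row and four columns

# Both shapes collapse to the same 1D array of floats.
column_values = np.array(column_range, dtype=float).flatten()
row_values = np.array(row_range, dtype=float).flatten()

print(column_values)
print(row_values)
```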

## Returns
| Returns | Type  | Description                                   | Example |
|:--------|:------|:-----------------------------------------------|:--------|
| p-value | float | The p-value from the one-way ANOVA test        | 0.032   |
| Error   | str   | Error message if calculation fails              | "Error: Each group must be a 2D list." |

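- If the p-value is less than the chosen significance level (conventionally 0.05), the null hypothesis is rejected: at least one group mean differs significantly from the others. The test does not identify which one.
- If the p-value is greater than or equal to the significance level, the data do not show a significant difference between group means. This is not evidence that the means are equal.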
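
Because the function returns either a float or an error string, a caller has to check the type before comparing against a threshold. A minimal sketch of that pattern follows; `describe_result` and its `alpha` parameter are hypothetical names introduced only for illustration.

```python
def describe_result(result, alpha=0.05):
    """Hypothetical helper: turn an ONEWAY_ANOVA-style return value into a sentence."""
    if isinstance(result, str):
        # Errors come back as strings rather than raised exceptions.
        return result
    if result < alpha:
        return f"p = {result:.3g} < {alpha}: reject H0; at least one group mean differs."
    return f"p = {result:.3g} >= {alpha}: fail to reject H0."

print(describe_result(0.032))
print(describe_result(0.2))
print(describe_result("Error: Each group must be a 2D list."))
```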

## Examples

### Compare Test Scores Across Three Classes
Suppose you have test scores for three different classes in columns A, B, and C:

|   A   |   B   |   C   |
|-------|-------|-------|
| 85    | 78    | 92    |
| 88    | 74    | 95    |
| 90    | 80    | 91    |
| 87    | 76    | 89    |

To test if there is a significant difference in mean scores among the three classes:

```excel
=ONEWAY_ANOVA(A1:A4, B1:B4, C1:C4)
```
**Sample Input:**
- group_one: [[85],[88],[90],[87]]
- group_two: [[78],[74],[80],[76]]
- group_three: [[92],[95],[91],[89]]

**Expected Output:**
Approximately 0.000033 ($F \approx 40.1$ with 2 and 9 degrees of freedom). A p-value less than 0.05 suggests a significant difference.

### Compare Sales Performance by Region
Suppose you have quarterly sales data for two regions:

|   A   |   B   |
|-------|-------|
| 120   | 130   |
| 125   | 128   |
| 123   | 135   |
| 127   | 132   |

To test if the mean sales differ between the two regions:

```excel
=ONEWAY_ANOVA(A1:A4, B1:B4)
```
**Sample Input:**
- group_one: [[120],[125],[123],[127]]
- group_two: [[130],[128],[135],[132]]
- group_three: None

**Expected Output:**
Approximately 0.012 ($F \approx 12.6$ with 1 and 6 degrees of freedom). A p-value less than 0.05 suggests a significant difference between the two regions.
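
With exactly two groups, one-way ANOVA is equivalent to a two-sided, equal-variance two-sample t-test: the $F$ statistic equals $t^2$ and the p-values coincide. The sketch below checks this on the two-region data using `scipy.stats.f_oneway` and `scipy.stats.ttest_ind`; the variable names are illustrative.

```python
from scipy.stats import f_oneway, ttest_ind

region_a = [120, 125, 123, 127]
region_b = [130, 128, 135, 132]

anova = f_oneway(region_a, region_b)
# equal_var=True selects the pooled-variance t-test, matching the ANOVA assumption.
ttest = ttest_ind(region_a, region_b, equal_var=True)

print(anova.statistic, anova.pvalue)
print(ttest.statistic ** 2, ttest.pvalue)
```

Both lines should print the same pair of numbers up to floating-point rounding.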

## References
- [scipy.stats.f_oneway documentation](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html)
- [Wikipedia: ANOVA](https://en.wikipedia.org/wiki/Analysis_of_variance)

```python
import numpy as np
from scipy.stats import f_oneway

def oneway_anova(group_one, group_two, group_three=None):
    """
    Performs a one-way ANOVA test on two or three groups of numerical data.

    Args:
        group_one: 2D list of numerical data for the first group.
        group_two: 2D list of numerical data for the second group.
        group_three: (Optional) 2D list of numerical data for the third group.
    Returns:
        float: The p-value from the one-way ANOVA test, or str error message if input is invalid.
    """
    # Collect the groups that were supplied and validate their shape.
    groups = []
    for g in [group_one, group_two, group_three]:
        if g is not None:
            if not (isinstance(g, list) and all(isinstance(row, list) for row in g)):
                return "Error: Each group must be a 2D list."
            groups.append(g)
    if len(groups) < 2:
        return "Error: At least two groups are required."
    try:
        # Flatten each 2D list into a 1D float array.
        processed = []
        for group in groups:
            arr = np.array(group, dtype=float).flatten()
            if arr.size == 0:
                return "Error: All groups must contain at least one value."
            processed.append(arr)
        result = f_oneway(*processed)
        # f_oneway yields NaN for degenerate input, e.g. zero within-group variance.
        if np.isnan(result.pvalue):
            return "Error: Could not compute ANOVA p-value. Check input data."
        return float(result.pvalue)
    except Exception as e:
        # Non-numeric or ragged input ends up here.
        return f"Error: {e}"
```

```python
import ipytest
ipytest.autoconfig()

def test_demo_three_groups():
    group_one = [[85], [88], [90], [87]]
    group_two = [[78], [74], [80], [76]]
    group_three = [[92], [95], [91], [89]]
    result = oneway_anova(group_one, group_two, group_three)
    assert isinstance(result, float)
    assert 0.0 <= result <= 1.0

def test_demo_two_groups():
    group_one = [[120], [125], [123], [127]]
    group_two = [[130], [128], [135], [132]]
    result = oneway_anova(group_one, group_two)
    assert isinstance(result, float)
    assert 0.0 <= result <= 1.0

ipytest.run()
```

```python
# Interactive Demo
import gradio as gr

demo_cases = [
    [
        [[85], [88], [90], [87]],
        [[78], [74], [80], [76]],
        [[92], [95], [91], [89]]
    ],
    [
        [[120], [125], [123], [127]],
        [[130], [128], [135], [132]],
        [[110], [140], [125], [138]]
    ]
]

demo = gr.Interface(
    fn=oneway_anova,
    inputs=[
        gr.Dataframe(headers=["Value"], label="Group 1", row_count=4, col_count=1, type="array", value=[[85],[88],[90],[87]]),
        gr.Dataframe(headers=["Value"], label="Group 2", row_count=4, col_count=1, type="array", value=[[78],[74],[80],[76]]),
        gr.Dataframe(headers=["Value"], label="Group 3", row_count=4, col_count=1, type="array", value=[[92],[95],[91],[89]]),
    ],
    outputs=gr.Textbox(label="Result"),
    examples=demo_cases,
    flagging_mode="never",
    fill_width=True
)
demo.launch()
```