This notebook contains Python code for running a one-way analysis of variance (ANOVA) test on data from CSV files. I prepared this code to be reused in future scientific studies, both by myself and by anyone else who finds it.

Import the necessary libraries

In [None]:
import pandas as pd
import numpy as np
from scipy import stats
from scipy.stats import levene
from statsmodels.formula.api import ols
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm
from scipy.stats import ttest_ind

Load the CSV file into a pandas DataFrame, setting the file path as described in the comments

In [None]:
# Specify the path to your CSV file
data_path = 'Lettuce 1st Iteration Nutrient Data.csv'

# Read the data from the CSV file
data = pd.read_csv(data_path)
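
Before running the analysis, it can help to confirm that the loaded table has the columns the ANOVA step expects and to see how many rows each group contributes. The following is a minimal sketch, not part of the original workflow: the inline CSV text and its values are made up for illustration, and the column names SampleType, VariableA and VariableB are assumed to match the placeholders used later in this notebook.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real CSV file; values are invented for illustration.
csv_text = """SampleType,VariableA,VariableB
Control,1.2,3.4
Control,1.4,3.1
Treated,2.0,
Treated,2.2,4.0
"""
sample = pd.read_csv(io.StringIO(csv_text))

# Columns the ANOVA step is assumed to need.
required = ["SampleType", "VariableA", "VariableB"]
missing = [c for c in required if c not in sample.columns]
if missing:
    raise KeyError(f"Missing columns: {missing}")

# Number of rows per group, and number of missing values per required column.
group_sizes = sample.groupby("SampleType").size()
missing_counts = sample[required].isna().sum()

print(group_sizes)
print(missing_counts)
```

With real data, replace the inline text with the DataFrame loaded above and adjust the list of required columns to your file.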

A one-way ANOVA test is performed for each column named in a list. Each listed column is treated in turn as the dependent variable, and the SampleType column serves as the grouping (independent) variable. The code prints a results summary, including effect sizes, for each variable.

In [None]:
import statsmodels.api as sm
import statsmodels.formula.api as smf
import pandas as pd

# Assuming 'data' is your DataFrame containing the dataset.
# 'grouping_variables' lists the columns to analyze. Each one is used in turn as the
# dependent variable, with 'SampleType' as the grouping/independent variable.

grouping_variables = ['VariableA', 'VariableB']

# Perform One-Way ANOVA for each listed variable
for grouping_variable in grouping_variables:
    print("\n{} Analysis:".format(grouping_variable))
    
    # Perform One-Way ANOVA
    model = smf.ols('{} ~ C(SampleType)'.format(grouping_variable), data=data).fit()
    anova_table = sm.stats.anova_lm(model, typ=2)

    # Get the total sum of squares (SS):
    ss_total = anova_table['sum_sq'].sum()

    # Calculate eta-squared (η²):
    eta_squared = anova_table['sum_sq'] / ss_total

    # Calculate partial eta-squared (η²_p) = SS_effect / (SS_effect + SS_residual).
    # In a one-way design this equals eta-squared for the group row.
    ss_residual = anova_table.loc['Residual', 'sum_sq']
    eta_squared_partial = anova_table['sum_sq'] / (anova_table['sum_sq'] + ss_residual)
    # Partial eta-squared is not defined for the residual row itself
    eta_squared_partial.loc['Residual'] = float('nan')

    # Combine ANOVA results with eta-squared and partial eta-squared:
    results_combined = pd.concat([anova_table, eta_squared.rename('eta_squared'), eta_squared_partial.rename('partial_eta_squared')], axis=1)

    # Print the combined results:
    print(results_combined)
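
To make the effect-size columns easier to check, the sums of squares can also be computed by hand and compared against an independent F statistic. The following is a minimal sketch under stated assumptions: the three groups and their values are invented for illustration, and scipy.stats.f_oneway is used only as a cross-check rather than as part of the workflow above.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for three sample types (made-up numbers).
groups = {
    "A": np.array([4.1, 3.9, 4.3, 4.0]),
    "B": np.array([5.0, 5.2, 4.8, 5.1]),
    "C": np.array([6.1, 5.9, 6.3, 6.0]),
}

all_values = np.concatenate(list(groups.values()))
grand_mean = all_values.mean()

# Between-group, within-group (residual) and total sums of squares.
ss_between = sum(len(v) * (v.mean() - grand_mean) ** 2 for v in groups.values())
ss_within = sum(((v - v.mean()) ** 2).sum() for v in groups.values())
ss_total = ((all_values - grand_mean) ** 2).sum()

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# F statistic from the mean squares.
f_manual = (ss_between / df_between) / (ss_within / df_within)

# Effect sizes: with a single factor, the two definitions coincide.
eta_squared = ss_between / ss_total
partial_eta_squared = ss_between / (ss_between + ss_within)

# Cross-check the F statistic against SciPy's one-way ANOVA.
f_scipy, p_scipy = stats.f_oneway(*groups.values())

print(f_manual, f_scipy, eta_squared, partial_eta_squared)
```

Because the total sum of squares splits into the between-group and within-group parts, eta-squared and partial eta-squared are the same number in a one-way design. They differ only when the model contains more than one factor.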