Suppose you run an experiment in which 4 fertilizers (treatments) are applied to crops in 5 fields (blocks), with each fertilizer used once in every field. You want to test whether fertilizer type has a significant effect on crop yield while accounting for the variability between fields.
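For reference, the usual additive model for a randomized block design can be written as follows. The symbols are notation introduced here for illustration and do not appear in the code: Y_ij is the yield for fertilizer i in field j, mu is the overall mean, tau_i is the effect of fertilizer i, beta_j is the effect of field j, and epsilon_ij is random error.

```latex
Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij},
\qquad i = 1, \dots, 4, \quad j = 1, \dots, 5
```

The null hypothesis for the treatment test is that all tau_i are equal, meaning the fertilizers do not differ in mean yield.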


In [10]:
pip install statsmodels pandas

Note: you may need to restart the kernel to use updated packages.


In [11]:


import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Set random seed for reproducibility
np.random.seed(42)

# Create a synthetic dataset
blocks = np.tile(np.arange(1, 6), 4)  # 5 blocks (fields), repeated for each treatment
treatments = np.repeat(['Fertilizer_A', 'Fertilizer_B', 'Fertilizer_C', 'Fertilizer_D'], 5)

# Generate synthetic crop yields: normal noise around 50 plus a fixed offset per fertilizer.
# No block (field) effect is simulated here.
yields = (
    np.random.normal(loc=50, scale=5, size=20)
    + np.repeat([10, 20, 30, 40], 5)  # Different base yields for different fertilizers
)

data = pd.DataFrame({
    'Block': blocks,
    'Treatment': treatments,
    'Yield': yields
})

# Fit the ANOVA model
model = ols('Yield ~ C(Treatment) + C(Block)', data=data).fit()
anova_table = sm.stats.anova_lm(model, typ=2)

print(anova_table)

                   sum_sq    df          F    PR(>F)
C(Treatment)  1486.236589   3.0  36.752050  0.000003
C(Block)        82.810038   4.0   1.535811  0.253786
Residual       161.758224  12.0        NaN       NaN


Interpretation of Results
C(Treatment): This row shows the variation due to different treatments (fertilizers). The F-value and p-value indicate whether there are significant differences between the treatment means.
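C(Block): This row shows the variation due to different blocks (fields). Including it in the model separates field-to-field variability from the residual. Here its p-value (about 0.25) gives no evidence of a block effect, which is expected because the synthetic yields were generated without one.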
Residual: This shows the remaining variation not explained by the treatments or blocks.
If the p-value for C(Treatment) is less than the significance level (e.g., 0.05), you can conclude that the treatment means are not all equal. In this output the p-value is about 0.000003, so the fertilizers differ significantly in mean yield.
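To see where the F-values in the table come from, the sketch below recomputes them from the printed sums of squares and degrees of freedom. The numbers are copied by hand from the output above, and the variable names are illustrative only.

```python
# Values copied from the ANOVA table printed above
sum_sq = {"C(Treatment)": 1486.236589, "C(Block)": 82.810038, "Residual": 161.758224}
df = {"C(Treatment)": 3, "C(Block)": 4, "Residual": 12}

# Mean square = sum of squares / degrees of freedom
mean_sq = {term: sum_sq[term] / df[term] for term in sum_sq}

# F = mean square of the term / residual mean square
f_values = {
    term: mean_sq[term] / mean_sq["Residual"]
    for term in ("C(Treatment)", "C(Block)")
}

for term, f in f_values.items():
    print(f"{term}: F = {f:.3f}")
```

The results should agree with the F column of the table up to rounding.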
This example demonstrates how to perform ANOVA for a randomized block design using Python. For real-world applications, replace the data generation step with your own measurements, keeping one row per observation with Block, Treatment, and Yield columns.
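Real data is often recorded with one row per field and one column per fertilizer. The sketch below shows one possible way to reshape such a table into the one-row-per-observation layout used in the model above. The inline CSV and its numbers are made up for illustration, and only two fertilizers and three fields are shown to keep it short.

```python
import io
import pandas as pd

# Hypothetical wide-format data: one row per field, one column per fertilizer.
# These values are invented for illustration, not real measurements.
raw = io.StringIO(
    "Block,Fertilizer_A,Fertilizer_B\n"
    "1,61.2,70.5\n"
    "2,59.8,72.1\n"
    "3,63.0,69.4\n"
)
wide = pd.read_csv(raw)

# Reshape to long format: one row per (Block, Treatment) observation
data_long = wide.melt(id_vars="Block", var_name="Treatment", value_name="Yield")

print(data_long)
```

The resulting data_long frame has the same Block, Treatment, and Yield columns as the synthetic dataset, so it could be passed to ols in the same way.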