### Replicating Fisher’s Analysis of Mendel’s Pea Data
> Gregor Mendel’s classic pea‐plant experiments produced strikingly neat ratios (e.g. ~3:1 phenotypes in monohybrid crosses).

> In 1936 R.A. Fisher re‐examined Mendel’s published counts and found the fit to expectation “too good to be true,” suggesting that the data may have been biased toward the expected ratios.

> Below we reconstruct Mendel’s key datasets and perform step-by-step χ² tests for each:
>> Monohybrid Crosses (Phenotypes 3:1)
>>> Mendel’s monohybrid crosses each involve a single trait with a dominant and a recessive phenotype. He reported seven traits (seed shape, seed color, flower color, pod shape, pod color, flower position, stem length), each showing a ~3:1 ratio.
>>> The observed counts and the counts expected under a 3:1 ratio (rounded to whole numbers) are:

In [7]:
import pandas as pd

data = {
    'Trait': [
        'Seed Shape (round/wrinkle)',
        'Seed Color (yellow/green)',
        'Flower Color (violet/white)',
        'Pod Shape (inflated/constricted)',
        'Pod Color (green/yellow unripe)',
        'Flower Position (axial/terminal)',
        'Stem Length (tall/dwarf)'
    ],
    'Total': [7324, 8023, 929, 1181, 580, 858, 1064],
    'Observed Dominant': [5474, 6022, 705, 882, 428, 651, 787],
    'Observed Recessive': [1850, 2001, 224, 299, 152, 207, 277],
    'Expected Dominant (3:1)': [5493, 6017, 697, 886, 435, 644, 798],
    'Expected Recessive (3:1)': [1831, 2006, 232, 295, 145, 214, 266]
}


df = pd.DataFrame(data)

df

|   | Trait | Total | Observed Dominant | Observed Recessive | Expected Dominant (3:1) | Expected Recessive (3:1) |
|---|-------|-------|-------------------|--------------------|-------------------------|--------------------------|
| 0 | Seed Shape (round/wrinkle) | 7324 | 5474 | 1850 | 5493 | 1831 |
| 1 | Seed Color (yellow/green) | 8023 | 6022 | 2001 | 6017 | 2006 |
| 2 | Flower Color (violet/white) | 929 | 705 | 224 | 697 | 232 |
| 3 | Pod Shape (inflated/constricted) | 1181 | 882 | 299 | 886 | 295 |
| 4 | Pod Color (green/yellow unripe) | 580 | 428 | 152 | 435 | 145 |
| 5 | Flower Position (axial/terminal) | 858 | 651 | 207 | 644 | 214 |
| 6 | Stem Length (tall/dwarf) | 1064 | 787 | 277 | 798 | 266 |
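
> As a quick sanity check on the table above, the observed dominant:recessive ratio can be computed for each cross and for the pooled counts. This is a minimal sketch using only the counts listed in the table; the names `crosses`, `ratios`, and `pooled_ratio` are illustrative and not part of the original notebook.

```python
# Observed (dominant, recessive) counts, copied from the table above.
crosses = [
    ("Seed shape", 5474, 1850),
    ("Seed color", 6022, 2001),
    ("Flower color", 705, 224),
    ("Pod shape", 882, 299),
    ("Pod color", 428, 152),
    ("Flower position", 651, 207),
    ("Stem length", 787, 277),
]

# Dominant:recessive ratio for each cross; 3.0 would be a perfect 3:1 split.
ratios = {name: dom / rec for name, dom, rec in crosses}

# Pooled ratio across all seven crosses.
pooled_dom = sum(dom for _, dom, _ in crosses)
pooled_rec = sum(rec for _, _, rec in crosses)
pooled_ratio = pooled_dom / pooled_rec

for name, ratio in ratios.items():
    print(f"{name:16s} {ratio:.3f} : 1")
print(f"{'Pooled':16s} {pooled_ratio:.3f} : 1")
```

> Every per-cross ratio falls between roughly 2.8 and 3.2, and the pooled ratio is about 2.98 : 1.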


In [None]:
import numpy as np

data = np.array([
    [5474, 1850, 7324],  # Expt 1 (seed shape)
    [6022, 2001, 8023],  # Expt 2 (seed color)
    [705,  224,  929],   # Expt 3 (flower color)
    [882,  299, 1181],   # Expt 4 (pod shape)
    [428,  152,  580],   # Expt 5 (pod color)
    [651,  207,  858],   # Expt 6 (flower position)
    [787,  277, 1064],   # Expt 7 (stem length)
])

chi2_vals = []
for dom, rec, total in data:
    # Expected counts under a 3:1 dominant:recessive ratio
    exp_dom = total * 3/4
    exp_rec = total * 1/4
    # Pearson chi-squared: sum of (observed - expected)^2 / expected
    chi2 = (dom-exp_dom)**2/exp_dom + (rec-exp_rec)**2/exp_rec
    chi2_vals.append(chi2)
chi2_total = np.sum(chi2_vals)  # combined statistic across the 7 crosses

print(data)
print("Chi-squared values for each experiment:", chi2_vals)
print(f"Total chi-squared: {chi2_total:.2f}")

[[5474 1850 7324]
 [6022 2001 8023]
 [ 705  224  929]
 [ 882  299 1181]
 [ 428  152  580]
 [ 651  207  858]
 [ 787  277 1064]]
Chi-squared values for each experiment: [0.2628800291279811, 0.01499854584735552, 0.3907427341227126, 0.06350550381033022, 0.45057471264367815, 0.34965034965034963, 0.606516290726817]
Total chi-squared: 2.14


> The code above computes each cross’s χ² and sums them. For example,
> for seed shape the expected counts are 3/4 × 7324 = 5493 round and 1/4 × 7324 = 1831 wrinkled,
> so χ² = $\frac{(5474-5493)^2}{5493} + \frac{(1850-1831)^2}{1831} \approx 0.26$.
> Repeating for all 7 crosses gives a total χ² ≈ 2.14 (df = 7), p ≈ 0.95. Individually,
> each χ² is small (all below 0.61), showing very close agreement with the 3:1 expectation.
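
> The p-values quoted above can be reproduced from the χ² statistics. The sketch below uses only the standard library: `chi2_sf` is a hypothetical helper (not from the original notebook) that evaluates the upper-tail χ² probability for integer degrees of freedom via the standard recurrence on df. In practice `scipy.stats.chi2.sf(x, df)` is the usual way to get the same number.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    half = x / 2.0
    # Closed-form starting points: df = 1 (odd case) or df = 2 (even case).
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    # Recurrence: Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return q

# Observed (dominant, recessive) counts for the seven monohybrid crosses.
counts = [(5474, 1850), (6022, 2001), (705, 224), (882, 299),
          (428, 152), (651, 207), (787, 277)]

chi2_vals = []
for dom, rec in counts:
    total = dom + rec
    exp_dom, exp_rec = 0.75 * total, 0.25 * total
    chi2_vals.append((dom - exp_dom) ** 2 / exp_dom + (rec - exp_rec) ** 2 / exp_rec)

p_each = [chi2_sf(c, 1) for c in chi2_vals]       # one test per cross, df = 1
chi2_total = sum(chi2_vals)
p_total = chi2_sf(chi2_total, len(chi2_vals))     # combined test, df = 7

for c, p in zip(chi2_vals, p_each):
    print(f"chi2 = {c:.4f}, p = {p:.3f}")
print(f"total chi2 = {chi2_total:.4f}, df = {len(chi2_vals)}, p = {p_total:.3f}")
```

> The combined p-value comes out near 0.95, consistent with the figure quoted above, and none of the individual crosses comes close to rejecting 3:1.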

> Step-by-step χ² (monohybrid ex. 1):
>> For seed shape, χ² = (5474 − 5493)²/5493 + (1850 − 1831)²/1831 ≈ 0.26, with df = 1 (two categories minus one). Each of the seven experiments individually gives a similarly small χ² (between 0.01 and 0.61).
>>
>> Degrees of freedom: each monohybrid cross (binary phenotype) has df = 1, so for all 7 crosses df_total = 7.
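
> Written out term by term, the arithmetic for the seed-shape cross and for the combined statistic looks as follows. Per-cross values in the last line are rounded to three decimals, and adding the statistics (and their degrees of freedom) assumes the seven crosses are independent.

$$
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}
$$

$$
\chi^2_{\text{seed shape}} = \frac{(5474-5493)^2}{5493} + \frac{(1850-1831)^2}{1831} = \frac{361}{5493} + \frac{361}{1831} \approx 0.0657 + 0.1972 = 0.2629
$$

$$
\chi^2_{\text{total}} \approx 0.263 + 0.015 + 0.391 + 0.064 + 0.451 + 0.350 + 0.607 \approx 2.14, \qquad \text{df}_{\text{total}} = 7 \times 1 = 7
$$
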
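> Interpretation: A high p-value (p ≈ 0.95 combined) means there is no evidence against the 3:1 ratio. Equivalently, a total χ² this small or smaller would arise only about 5% of the time under true 3:1 segregation, which is the sense in which Mendel’s phenotypic ratios fit theory almost too well.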
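
> One way to make this interpretation concrete is a small Monte Carlo check: simulate the seven crosses many times under exact 3:1 segregation and count how often the total χ² is as small as the observed one. This is an illustrative sketch, not part of Fisher’s analysis; the number of simulations `n_sims`, the seed, and the helper `total_chi2` are arbitrary choices made here.

```python
import numpy as np

# Observed (dominant, recessive) counts for the seven monohybrid crosses.
counts = np.array([
    [5474, 1850], [6022, 2001], [705, 224], [882, 299],
    [428, 152], [651, 207], [787, 277],
])
totals = counts.sum(axis=1)

def total_chi2(dom, totals):
    """Sum of per-cross Pearson chi-squared statistics against a 3:1 ratio."""
    exp_dom, exp_rec = 0.75 * totals, 0.25 * totals
    rec = totals - dom
    per_cross = (dom - exp_dom) ** 2 / exp_dom + (rec - exp_rec) ** 2 / exp_rec
    return per_cross.sum(axis=-1)

observed_total = total_chi2(counts[:, 0], totals)

# Simulate dominant counts under the null: each plant is dominant with p = 3/4.
rng = np.random.default_rng(0)   # fixed seed (assumption) for reproducibility
n_sims = 20_000                  # assumed simulation size
sim_dom = rng.binomial(totals, 0.75, size=(n_sims, len(totals)))
sim_totals = total_chi2(sim_dom, totals)

# Fraction of simulated experiments that fit 3:1 at least as well as Mendel's data.
frac_as_small = np.mean(sim_totals <= observed_total)
print(f"observed total chi2 = {observed_total:.4f}")
print(f"fraction of simulations with total chi2 <= observed: {frac_as_small:.3f}")
```

> The simulated fraction should land near 0.05, matching the lower-tail probability implied by p ≈ 0.95.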