In 1975, an experiment was conducted to see if cloud seeding produced rainfall.
26 clouds were seeded with silver nitrate and 26 were not.
The decision to seed or not was made at random.
Get the data from https://www.stat.cmu.edu/~larry/all-of-statistics/=data/clouds.dat.
Let $\theta$ be the difference in the mean precipitation from the two groups.
Estimate $\theta$. Estimate the standard error of the estimate and produce a 95 percent confidence interval.
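
One standard way to set this up is sketched here. The notation is an assumption introduced for illustration: $X_1, \dots, X_n$ are the seeded measurements, $Y_1, \dots, Y_m$ are the unseeded ones (here $n = m = 26$), and $s_X^2$, $s_Y^2$ are the sample variances. The plug-in estimate is the difference of the sample means:

$$\hat{\theta} = \bar{X} - \bar{Y}$$

Because the two groups are independent, the variances of the two sample means add, and each sample mean has variance equal to the group variance divided by the group size:

$$\widehat{\mathrm{se}} = \sqrt{\frac{s_X^2}{n} + \frac{s_Y^2}{m}}$$

A normal-based $1 - \alpha$ confidence interval is then

$$\hat{\theta} \pm z_{\alpha/2}\,\widehat{\mathrm{se}}, \qquad z_{0.025} \approx 1.96 .$$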

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from scipy.stats import norm

In [2]:
# Raw string so that '\s+' is read as a regex, not an invalid escape sequence.
pd_data = pd.read_csv('./data/clouds_clean.dat', sep=r'\s+')
data_unseeded = pd_data['Unseeded_Clouds'].to_numpy()
data_seeded = pd_data['Seeded_Clouds'].to_numpy()

In [3]:
pd_data

Unnamed: 0,Unseeded_Clouds,Seeded_Clouds
0,1202.6,2745.6
1,830.1,1697.8
2,372.4,1656.0
3,345.5,978.0
4,321.2,703.4
5,244.3,489.1
6,163.0,430.0
7,147.8,334.1
8,95.0,302.8
9,87.0,274.7


In [4]:
mean_est = np.mean(data_seeded) - np.mean(data_unseeded)

# Standard error of a difference of independent sample means:
# each group's variance is divided by its sample size.
std_err_est = np.sqrt(
    np.var(data_seeded, ddof=1) / data_seeded.size
    + np.var(data_unseeded, ddof=1) / data_unseeded.size
)

alpha = 0.05
z = norm.isf(alpha/2)

lowerbound = mean_est - z*std_err_est
upperbound = mean_est + z*std_err_est

In [5]:
print(f"The estimated difference in means is {mean_est:.0f} acre-feet.")
print(f"The estimated standard error is {std_err_est:.0f} acre-feet.")
print(
    "With 95 percent confidence, the additional precipitation\nobtained by seeding is between "
    f"{lowerbound:.0f} and {upperbound:.0f} acre-feet."
)

The estimated difference in means is 277 acre-feet.
The estimated standard error is 139 acre-feet.
With 95 percent confidence, the additional precipitation
obtained by seeding is between 5 and 549 acre-feet.
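
The same calculation can be wrapped in a small reusable function. The sketch below is not part of the original notebook: `diff_means_ci` is a hypothetical helper name, and the toy inputs are made up for illustration. It assumes two independent samples and uses the standard error $\sqrt{s_x^2/n + s_y^2/m}$ for a difference of sample means, together with a normal approximation for the interval.

```python
import numpy as np
from scipy.stats import norm


def diff_means_ci(x, y, alpha=0.05):
    """Normal-approximation CI for mean(x) - mean(y), independent samples.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    theta_hat = x.mean() - y.mean()
    # Each sample mean has variance s^2 / n; the variances add because
    # the two groups are independent.
    se_hat = np.sqrt(x.var(ddof=1) / x.size + y.var(ddof=1) / y.size)
    z = norm.isf(alpha / 2)
    return theta_hat, se_hat, (theta_hat - z * se_hat, theta_hat + z * se_hat)


# Toy data (made up, not the cloud-seeding measurements).
theta_hat, se_hat, (lo, hi) = diff_means_ci([3.0, 5.0, 7.0], [1.0, 2.0, 3.0])
print(f"theta_hat={theta_hat:.3f}, se_hat={se_hat:.3f}, CI=({lo:.3f}, {hi:.3f})")
```

With the notebook's arrays, the call would be `diff_means_ci(data_seeded, data_unseeded)`. With only 26 observations per group and heavily skewed rainfall values, the normal approximation is rough, so the interval is best read as approximate.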
