# Test of Hypotheses Based on One Sample

In [1]:
import math
from pyreadr import read_r
from scipy import stats
from statsmodels.stats.proportion \
    import proportion_confint, proportions_ztest

## 1) Radon Detectors

A sample of 12 radon detectors of a certain type was selected, and each was exposed to 100 pCi/L of radon.
The resulting readings were as follows. (Data: ex08.32)

Does this data suggest that the population mean reading under these conditions differs from 100?

State and test the appropriate hypotheses using α = .05.

* $H_0: \mu = 100$
* $H_1: \mu \neq 100$

In [2]:
data = read_r('../data/devore7/ex08.32.rda')
df = data['ex08.32']

stat, pval = stats.ttest_1samp(a=df['C1'], popmean=100)
pval

0.37661608746499975

$pval > 0.05$ => do not reject $H_0$

## 2) Batteries

A manufacturer of nickel-hydrogen batteries randomly selects 100 nickel plates for test cells, cycles them a specified number of times, and determines that 14 of the plates have blistered. 

1. Does this provide compelling evidence for concluding that more than 10% of all plates blister under such circumstances? State and test the appropriate hypotheses using a significance level of α = .05.
2. In reaching your conclusion, what type of error might you have committed?

In [4]:
stat, pval = proportions_ztest(count=14, nobs=100, value=0.1,
                               alternative='larger')
pval

0.12450017622604997

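1. $pval > 0.05$ => do not reject $H_0$
2. Since $H_0$ was not rejected, the only error we might have committed is a type II (β) error.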
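
As far as I know, `proportions_ztest` by default estimates the variance from the sample proportion, whereas the textbook statistic uses the null value $p_0$ (which should correspond to passing `prop_var=0.1`). The sketch below recomputes the one-sided test by hand both ways, using only the standard library. The helper name `one_prop_ztest_larger` is made up for this illustration.

```python
import math

def one_prop_ztest_larger(count, nobs, p0, use_null_var=True):
    """Upper-tailed one-sample z test for a proportion (hypothetical helper)."""
    p_hat = count / nobs
    # Variance under H0 (textbook) or from the sample proportion
    p_var = p0 if use_null_var else p_hat
    std_err = math.sqrt(p_var * (1 - p_var) / nobs)
    z = (p_hat - p0) / std_err
    # Upper-tail probability of the standard normal distribution
    pval = 0.5 * math.erfc(z / math.sqrt(2))
    return z, pval

z_null, p_null = one_prop_ztest_larger(14, 100, 0.10, use_null_var=True)
z_samp, p_samp = one_prop_ztest_larger(14, 100, 0.10, use_null_var=False)
print(z_null, p_null)   # variance from p0
print(z_samp, p_samp)   # variance from the sample proportion
```

Both p-values are above .05, so the decision does not depend on which variance is used here.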

## 3) Organic Matter in Soil

A random sample of soil specimens was obtained, and the amount of organic matter (%) in the soil was determined for each specimen, resulting in the accompanying data (from “Engineering Properties of Soil,” Soil Science, 1998: 93–102). (Data: ex08.54)

1. Calculate the sample mean, sample standard deviation, and (estimated) standard error of the mean.
2. Does this data suggest that the true average percentage of organic matter in such soil is something other than 3%? Carry out a test of the appropriate hypotheses at significance level .10.
3. Would your conclusion be different if α = .05 had been used?

## 3) Organic Matter in Soil (cont.)

In [5]:
data = read_r('../data/devore7/ex08.54.rda')
df = data['ex08.54']

mean, std = df['percorg'].mean(), df['percorg'].std()
n = df['percorg'].size
std_err = std / math.sqrt(n)
print(mean, std, std_err)

2.481333333333333 1.615640650839065 0.2949742764289613


In [6]:
stat, pval = stats.ttest_1samp(a=df['percorg'], popmean=3)
pval

0.08923961541442524

* α = .10: $pval \approx 0.089 < 0.10$ => reject $H_0$
* α = .05: $pval \approx 0.089 > 0.05$ => do not reject $H_0$, so the conclusion would be different
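
The same two decisions can be reached from the summary statistics alone. This is a minimal sketch under two assumptions: the sample size n = 30 is inferred from the printed ratio std / std_err (it is not stated in the exercise text), and the critical values are the usual tabulated two-sided t values for 29 degrees of freedom.

```python
import math

# Summary statistics printed above; n = 30 is inferred, not given in the text
mean, std, n = 2.481333333333333, 1.615640650839065, 30
mu0 = 3.0

std_err = std / math.sqrt(n)
t_stat = (mean - mu0) / std_err

# Tabulated two-sided critical values t_{alpha/2, 29}
T_CRIT = {0.10: 1.699, 0.05: 2.045}

decisions = {}
for alpha, crit in T_CRIT.items():
    decisions[alpha] = abs(t_stat) > crit  # True means reject H0
    print(f"alpha={alpha}: |t|={abs(t_stat):.3f}, critical={crit}, "
          f"reject H0: {decisions[alpha]}")
```

The statistic falls between the two critical values, which is why the conclusion flips between α = .10 and α = .05.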

## 4) Drywall

With domestic sources of building supplies running low several years ago, roughly 60,000 homes were built with imported Chinese drywall.
According to the article “Report Links Chinese Drywall to Home Problems” (New York Times, Nov. 24, 2009),
federal investigators identified a strong association between chemicals in the drywall and electrical problems,
and there is also strong evidence of respiratory difficulties due to the emission of hydrogen sulfide gas.
An extensive examination of 51 homes found that 41 had such problems.
Suppose these 51 were randomly sampled from the population of all homes having Chinese drywall.

1. Does the data provide strong evidence for concluding that more than 50% of all homes with Chinese drywall have electrical/environmental problems? Carry out a test of hypotheses using α = .01.
2. Calculate a confidence interval using a confidence level of 99% for the percentage of all such homes that have electrical/environmental problems.

* $H_0: \pi \leq \pi_0 = 50\%$
* $H_1: \pi > \pi_0 = 50\%$

## 4) Drywall (cont.)

In [7]:
stat, pval = proportions_ztest(count=41, nobs=51, value=0.5,
                               alternative='larger')
pval

2.292517684598636e-08

$pval < 0.01$ => reject $H_0$

In [8]:
proportion_confint(count=41, nobs=51, alpha=0.01)

(0.6607180319569749, 0.9471251052979272)
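
The interval above matches the normal-approximation (Wald) interval $\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n}$. The sketch below recomputes it with the standard library only. The function name `wald_interval` is a placeholder for this illustration, not part of statsmodels.

```python
import math
from statistics import NormalDist

def wald_interval(count, nobs, alpha):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = count / nobs
    # Two-sided critical value z_{alpha/2}
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / nobs)
    return p_hat - half_width, p_hat + half_width

low, high = wald_interval(41, 51, 0.01)
print(low, high)
```

The lower bound lies well above 0.5, which agrees with the outcome of the test.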

## 5) Soil Heat

The article “Orchard Floor Management Utilizing Soil-Applied Coal Dust for Frost Protection” (Agri. and Forest Meteorology, 1988: 71–82) reports the following values for soil heat flux of eight plots covered with coal dust. (Data: ex08.66)
The mean soil heat flux for plots covered only with grass is 29.0.

Assuming that the heat-flux distribution is approximately normal,
does the data suggest that the coal dust is effective in increasing the mean heat flux over that for grass?

Test the appropriate hypotheses using α = .05. In reaching your conclusion, what type of error might you have committed? 

* $H_0: \mu \leq \mu_0 = 29.0$
* $H_1: \mu > \mu_0 = 29.0$

## 5) Soil Heat (cont.)

In [9]:
data = read_r('../data/devore7/ex08.66.rda')
df = data['ex08.66']

stat, pval = stats.ttest_1samp(df['SoilHeat'], popmean=29.,
                               alternative='greater')
pval

0.2320653906988781

$pval > 0.05$ => do not reject $H_0$, so the only error we might have committed is a type II (β) error.

## 6) Robots

Scientists think that robots will play a crucial role in factories in the next several decades.
Suppose that in an experiment to determine whether the use of robots to weave computer cables is feasible,
a robot was used to assemble 500 cables.
The cables were examined and there were 10 defectives.

If human assemblers have a defect rate of .035 (3.5%),
does this data support the hypothesis that the proportion of defectives is lower for robots than for humans?

Use an α = .05 significance level. Determine the type of possible error.

In [10]:
stat, pval = proportions_ztest(count=10, nobs=500, value=0.035,
                               alternative='smaller')
pval

0.008292359711399333
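
The decision, and the kind of error it exposes us to, follows mechanically from comparing the p-value with α: rejecting $H_0$ can only be a type I error, and not rejecting it can only be a type II error. A minimal sketch using the p-values computed in this notebook follows. The helper `decide` is made up for this illustration.

```python
def decide(pval, alpha=0.05):
    """Return (decision, possible error) for a test at level alpha."""
    if pval < alpha:
        return "reject H0", "type I"
    return "do not reject H0", "type II"

# p-values taken from the cells above
pvals = {
    "Batteries": 0.12450017622604997,
    "Soil Heat": 0.2320653906988781,
    "Robots": 0.008292359711399333,
}

results = {name: decide(p) for name, p in pvals.items()}
for name, (decision, error) in results.items():
    print(f"{name}: {decision}, possible {error} error")
```

For the robots, $H_0$ is rejected at α = .05, so the possible error is a type I (α) error.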

## 7) Flame Time

The accompanying observations on residual flame time (sec) for strips of treated children’s nightwear were given in the article “An Introduction to Some Precision and Accuracy of Measurement Problems” (J. of Testing and Eval., 1982: 132–140).
Suppose a true average flame time of at most 9.75 had been mandated.

Does the data suggest that this condition has not been met?
Carry out an appropriate two-tailed test using α = .05. (Data: ex08.70)

In [11]:
data = read_r('../data/devore7/ex08.70.rda')
df = data['ex08.70']

stat, pval = stats.ttest_1samp(df['time'], popmean=9.75)
pval

0.00013857175434982256

$pval < 0.05$ => reject $H_0: \mu = 9.75$