
# Hypothesis Testing

In [1]:
import numpy as np
import scipy.stats as st

1. Travel Times to Work 
Based on information from the U.S. Census Bureau, the mean travel time to work
in minutes for all workers 16 years old and older was 25.3 minutes. A large
company with offices in several states randomly sampled 100 of its workers to
ascertain their commuting times. The sample mean was 23.9 minutes, and the
population standard deviation is 6.39 minutes. At the 0.01 level of significance,
can it be concluded that the mean commuting time is less for this particular
company?


In [2]:
#H0 : μ = 25.3, Ha : μ < 25.3 (left-tailed test)
n = 100
xbar = 23.9
mu = 25.3
sigma = 6.39
alpha = 0.01

In [3]:
z_critical = abs(st.norm.ppf(alpha)) #Magnitude of the critical value; for this left-tailed test the rejection region is z < -z_critical
z_critical

2.3263478740408408

In [4]:
z = (xbar-mu)/(sigma/np.sqrt(n))
z

-2.1909233176838843

In [5]:
if (z < -z_critical): #Left-tailed test: reject only when z falls below -z_critical
    print("Reject null hypothesis")
else:
    print("Null hypothesis cannot be rejected")

Null hypothesis cannot be rejected
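
The same decision can also be reached with the P-value method used in the later problems. The sketch below is only an illustration under the numbers given in the problem statement; the helper name `left_tailed_z_test` is hypothetical and not part of the original notebook.

```python
import numpy as np
import scipy.stats as st

def left_tailed_z_test(xbar, mu0, sigma, n):
    """Return (z, p_value) for H0: mu = mu0 against Ha: mu < mu0 with known sigma."""
    z = (xbar - mu0) / (sigma / np.sqrt(n))
    p_value = st.norm.cdf(z)  # area to the left of z under the standard normal curve
    return z, p_value

z_stat, p_value = left_tailed_z_test(xbar=23.9, mu0=25.3, sigma=6.39, n=100)
decision = "Reject null hypothesis" if p_value < 0.01 else "Null hypothesis cannot be rejected"
print(round(z_stat, 2), round(p_value, 4), decision)
```

The P-value lands slightly above α = 0.01, which matches the critical-value conclusion.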


2. Time Until Indigestion Relief 
An advertisement claims that Fasto Stomach Calm will provide relief from
indigestion in less than 10 minutes. For a test of the claim, 35 randomly selected
individuals were given the product; the average time until relief was 9.25 minutes.
From past studies, the standard deviation of the population is known to be 2
minutes. Can you conclude that the claim is justified? Find the P-value, and let α = 0.05.

In [6]:
#H0 : μ >= 10 and Ha : μ < 10
n = 35
xbar = 9.25
mu = 10
sigma = 2
alpha = 0.05

In [7]:
z = (xbar-mu)/(sigma/np.sqrt(n))
z

-2.218529918662356

In [8]:
p_val = 1 - st.norm.cdf(abs(z)) #Left-tailed test: z is negative, so P(Z < z) = 1 - cdf(|z|) and the tail area is not doubled
p_val

0.013259360979715362

In [9]:
if (p_val > alpha):
    print("Null hypothesis cannot be rejected")
else:
    print("Reject null hypothesis")

Reject null hypothesis


3. 
A random sample of 1 Dollar Trifecta tickets at a local racetrack paid the following
amounts (in dollars and cents). Is there sufficient evidence to conclude that the
average Trifecta winnings exceed $50? Use α = 0.10. Assume the variable is
normally distributed.
8.90, 70.20, 15.00, 29.10, 141.00, 48.60, 83.00, 72.70, 75.30, 59.20, 32.40, 19.00, 190.10

In [10]:
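#H0 : μ = 50, Ha : μ > 50
data = np.array([8.90, 70.20, 15.00, 29.10, 141.00, 48.60, 83.00, 72.70, 75.30, 59.20, 32.40, 19.00, 190.10])
n = len(data)
degrees_of_freedom = n-1
xbar = data.mean() #Sample mean computed from the listed amounts (about 64.96)
mu = 50
s = data.std(ddof=1) #Sample standard deviation (about 51.93)
alpha = 0.10

In [11]:
t = (xbar - mu)/(s / np.sqrt(n))
round(t, 2)

1.04

In [12]:
p_val = (1 - st.t.cdf(abs(t), degrees_of_freedom)) #"1 - cdf" because it's a right-tailed test
round(p_val, 2)

0.16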

In [13]:
if (p_val > alpha):
    print("Null hypothesis cannot be rejected")
else:
    print("Reject null hypothesis")

Null hypothesis cannot be rejected
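
Typing summary statistics by hand is error-prone, so it can help to let SciPy derive everything from the raw data. The following is a minimal cross-check, assuming SciPy 1.6 or newer for the `alternative` argument of `st.ttest_1samp`; the variable names are illustrative.

```python
import numpy as np
import scipy.stats as st

# Payouts exactly as listed in the problem statement
payouts = np.array([8.90, 70.20, 15.00, 29.10, 141.00, 48.60, 83.00,
                    72.70, 75.30, 59.20, 32.40, 19.00, 190.10])

# One-sample t-test of H0: mu = 50 against Ha: mu > 50
# (the `alternative` keyword is assumed to be available, i.e. SciPy >= 1.6)
result = st.ttest_1samp(payouts, popmean=50, alternative="greater")

print("n =", payouts.size)
print("sample mean =", round(payouts.mean(), 2))
print("sample std  =", round(payouts.std(ddof=1), 2))
print("t =", round(result.statistic, 2), "p =", round(result.pvalue, 2))
```

Because the P-value is well above α = 0.10, this check also fails to reject the null hypothesis.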


4. Whooping Crane Eggs 
Once down to about 15, the world’s only wild flock of whooping cranes now
numbers a record 237 birds in its Texas Coastal Bend wintering ground. The
average whooping crane egg weighs 208 grams. A new batch of randomly selected
eggs was recently weighed, and their weights are listed below. At α = 0.01, is there
sufficient evidence to conclude that the weight is greater than 208 grams? Assume
the variable is normally distributed.
210,210.2,208.5,209,211.6,206.4,212,209.7,210.3


In [14]:
#H0 : μ <= 208, Ha : μ > 208
n = 9 #Nine egg weights are listed
degrees_of_freedom = n-1
xbar = 209.74
mu = 208
s = 1.6734
alpha = 0.01

In [15]:
t = (xbar - mu)/(s / np.sqrt(n))
round(t, 2)

3.12

In [16]:
t_critical = st.t.ppf(1 - alpha, degrees_of_freedom) #Right-tailed test: all of alpha sits in the upper tail
round(t_critical, 3)

2.896

In [18]:
if (t > t_critical):
    print("Reject null hypothesis")
else:
    print("Null hypothesis cannot be rejected")

Reject null hypothesis
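
As a sanity check on the sample size, mean, and standard deviation, the summary statistics can be recomputed from the listed weights with NumPy. This is a sketch rather than the notebook's own method; `weights`, `t_stat`, and `t_crit` are illustrative names.

```python
import numpy as np
import scipy.stats as st

# Egg weights exactly as listed in the problem statement
weights = np.array([210, 210.2, 208.5, 209, 211.6, 206.4, 212, 209.7, 210.3])

n = weights.size                    # number of eggs listed
xbar = weights.mean()               # sample mean
s = weights.std(ddof=1)             # sample standard deviation (n - 1 in the denominator)

t_stat = (xbar - 208) / (s / np.sqrt(n))
t_crit = st.t.ppf(1 - 0.01, n - 1)  # upper-tail critical value at alpha = 0.01

print(n, round(xbar, 2), round(s, 4))
print(round(t_stat, 2), round(t_crit, 3), t_stat > t_crit)
```

For a right-tailed test the null hypothesis is rejected when the statistic exceeds the upper-tail critical value, which is the comparison printed on the last line.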


5. Federal Prison Populations 
Nationally 60.2% of federal prisoners are serving time for drug offences. A warden
feels that in his prison the percentage is even higher. He surveys 400 randomly
selected inmates’ records and finds that 260 of the inmates are drug offenders. At
α = 0.05, is he correct?

In [19]:
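#H0 : p = 0.602 and Ha : p > 0.602
n = 400
xbar = 260/n #Sample proportion = 0.65
mu = 0.602
sigma = np.sqrt(mu*(1-mu)) #Standard deviation of a single Bernoulli trial under H0
alpha = 0.05

In [20]:
z = (xbar-mu)/(sigma/np.sqrt(n))
round(z, 2)

1.96

In [21]:
p_val = 1 - st.norm.cdf(z) #Right-tailed test, so the upper-tail area is not doubled
round(p_val, 3)

0.025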

In [22]:
if (p_val > alpha):
    print("Null hypothesis cannot be rejected")
else:
    print("Reject null hypothesis")

Reject null hypothesis
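
For a test on a proportion, the standard error under H0 is sqrt(p0(1 - p0)/n), where p0 is the hypothesized proportion. The helper below is a small sketch of that calculation using the normal approximation; the function name `one_proportion_z_test` and its `tail` argument are hypothetical conveniences, not part of SciPy.

```python
import numpy as np
import scipy.stats as st

def one_proportion_z_test(successes, n, p0, tail="right"):
    """z statistic and P-value for H0: p = p0 (normal approximation).

    tail is "left", "right" or "two" (hypothetical argument values).
    """
    p_hat = successes / n
    se = np.sqrt(p0 * (1 - p0) / n)      # standard error under H0
    z = (p_hat - p0) / se
    if tail == "left":
        p_value = st.norm.cdf(z)
    elif tail == "right":
        p_value = st.norm.sf(z)          # sf(z) = 1 - cdf(z)
    else:
        p_value = 2 * st.norm.sf(abs(z))
    return z, p_value

# Warden's survey: 260 drug offenders among 400 inmates, H0: p = 0.602
z_warden, p_warden = one_proportion_z_test(260, 400, 0.602, tail="right")
print(round(z_warden, 2), round(p_warden, 3))
```

The same helper covers the MP3 problem with `tail="two"` and the labour-force problem with `tail="right"`.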


6. MP3 Ownership
An MP3 manufacturer claims that 65% of teenagers 13 to 16 years old have their
own MP3 players. A researcher wishes to test the claim and selects a random
sample of 80 teenagers. She finds that 57 have their own MP3 players. At α = 0.05,
should the claim be rejected? Use the P-value method.

In [23]:
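#H0 : p = 0.65 and Ha : p != 0.65
n = 80
xbar = 57/n #Sample proportion = 0.7125
mu = 0.65
sigma = np.sqrt(mu*(1-mu)) #Standard deviation of a single Bernoulli trial under H0
alpha = 0.05

In [24]:
z = (xbar-mu)/(sigma/np.sqrt(n))
round(z, 2)

1.17

In [25]:
p_val = (1 - st.norm.cdf(abs(z))) * 2 #Two-tailed test, so the tail area is doubled
round(p_val, 2)

0.24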

In [26]:
if (p_val > alpha):
    print("Null hypothesis cannot be rejected")
else:
    print("Reject null hypothesis")

Null hypothesis cannot be rejected


7. Men Aged 65 and Over in the Labour Force 
Of men aged 65 and over, 20.5% are still in the U.S. labour force. A random sample
of 120 retired male teachers indicated that 38 were still working. Use both a
confidence interval and a hypothesis test. Test the claim that the proportion is
greater than 20.5% at α = 0.10.

In [27]:
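#H0 : p = 0.205 and Ha : p > 0.205
n = 120
xbar = 38/n #Sample proportion, about 0.317
mu = 0.205
sigma = np.sqrt(mu*(1-mu)) #Standard deviation of a single Bernoulli trial under H0
alpha = 0.10

In [28]:
z = (xbar-mu)/(sigma/np.sqrt(n))
round(z, 2)

3.03

In [29]:
p_val = 1 - st.norm.cdf(z) #Right-tailed test, so the upper-tail area is not doubled
round(p_val, 4)

0.0012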

In [30]:
if (p_val > alpha):
    print("Null hypothesis cannot be rejected")
else:
    print("Reject null hypothesis")

Reject null hypothesis
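
The problem also asks for a confidence interval. One possible sketch is shown below, assuming a two-sided 90% level (taken here as 1 - α, which is an assumption about what the exercise intends) and using the sample proportion in the standard error, as is usual for an interval estimate.

```python
import numpy as np
import scipy.stats as st

n = 120
p_hat = 38 / n                                  # sample proportion
confidence = 0.90                               # assumed level, taken as 1 - alpha

z_star = st.norm.ppf(1 - (1 - confidence) / 2)  # about 1.645 for a 90% interval
margin = z_star * np.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(round(lower, 3), round(upper, 3))
print("0.205 inside interval:", lower <= 0.205 <= upper)
```

The hypothesized value 0.205 falls below the lower limit of the interval, which is consistent with rejecting the null hypothesis.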


8. Movie Admission Prices 
The average movie admission price for a recent year was 7.18 Dollars. The
population variance was 3.81. A random sample of 15 theatre admission prices
had a mean of 8.02 Dollars with a standard deviation of $2.08. At α = 0.05, is there
sufficient evidence to conclude a difference from the population variance?
Assume the variable is normally distributed.
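
No solution cell accompanies this problem. One way to approach it is the chi-square test for a variance, with test statistic χ² = (n - 1)s²/σ² and n - 1 degrees of freedom. The sketch below uses the critical-value method and treats the question as two-tailed, which is an interpretation of "a difference from the population variance".

```python
import scipy.stats as st

n = 15
s = 2.08             # sample standard deviation
sigma_sq = 3.81      # hypothesized population variance
alpha = 0.05
df = n - 1

# H0: sigma^2 = 3.81, Ha: sigma^2 != 3.81 (two-tailed)
chi_sq = df * s**2 / sigma_sq
lower_crit = st.chi2.ppf(alpha / 2, df)      # left critical value
upper_crit = st.chi2.ppf(1 - alpha / 2, df)  # right critical value

print(round(chi_sq, 2), round(lower_crit, 3), round(upper_crit, 3))
if chi_sq < lower_crit or chi_sq > upper_crit:
    print("Reject null hypothesis")
else:
    print("Null hypothesis cannot be rejected")
```

The statistic lies between the two critical values, so under these assumptions there is not enough evidence of a difference in variance.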

9. Fuel Consumption 
The standard deviation of fuel consumption of a manufacturer’s sport utility
vehicle is hypothesized to be 3.3 miles per gallon. A random sample of 18 vehicles
has a standard deviation of 2.8 miles per gallon. At α = 0.10, is the claim
believable?
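
This problem has no solution cell either. A possible sketch with the P-value method follows, assuming a two-tailed test of H0: σ = 3.3 and a normally distributed variable (the problem does not state normality, so that part is an assumption). The two-tailed P-value is taken as twice the smaller tail area of the chi-square distribution.

```python
import scipy.stats as st

n = 18
s = 2.8          # sample standard deviation (mpg)
sigma0 = 3.3     # hypothesized standard deviation (mpg)
alpha = 0.10
df = n - 1

# H0: sigma = 3.3, Ha: sigma != 3.3
chi_sq = df * s**2 / sigma0**2

# Two-tailed P-value: double the smaller of the two tail areas
p_val = 2 * min(st.chi2.cdf(chi_sq, df), st.chi2.sf(chi_sq, df))

print(round(chi_sq, 2))
if p_val > alpha:
    print("Null hypothesis cannot be rejected")
else:
    print("Reject null hypothesis")
```

The P-value is far above α = 0.10, so under these assumptions the claimed standard deviation remains believable.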


10. Tire Inflation 
To see whether people are keeping their car tires inflated to the correct level of 35
pounds per square inch (psi), a tire company manager selects a random sample of
36 tires and checks the pressure. The mean of the sample is 33.5 psi, and the
population standard deviation is 3 psi. Are the tires properly inflated? Use α = 0.10.
Find the 90% confidence interval of the mean. Do the results agree? Explain.
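
A possible sketch for this problem runs a two-tailed z-test and builds the 90% confidence interval side by side. For a z-test with known σ, the test at level α rejects exactly when the hypothesized mean falls outside the (1 - α) interval, so the two results should agree. Variable names are illustrative.

```python
import numpy as np
import scipy.stats as st

n, xbar, mu0, sigma, alpha = 36, 33.5, 35, 3, 0.10
se = sigma / np.sqrt(n)                 # standard error of the mean

# H0: mu = 35, Ha: mu != 35 (two-tailed z-test)
z = (xbar - mu0) / se
p_val = 2 * st.norm.sf(abs(z))

# 90% confidence interval for the mean
z_star = st.norm.ppf(1 - alpha / 2)
lower, upper = xbar - z_star * se, xbar + z_star * se

print("z =", round(z, 2), "p =", round(p_val, 4))
print("CI =", (round(lower, 2), round(upper, 2)))
print("test rejects H0:", p_val < alpha, "| 35 outside CI:", not (lower <= mu0 <= upper))
```

Both checks point the same way: the test rejects H0 and 35 psi lies outside the interval, which suggests the tires are not inflated to the recommended level.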