# Finding the Confidence Interval of Polling Figures

You are running a political campaign and decide to run 30 focus groups with about 10 people in each group. You want to report to your candidate the number of people who would vote for them in a typical 10-person group. Because the results vary from group to group, you decide to report a 95% z-confidence interval for the mean rather than a single number. From past experience, you assume that the population standard deviation is 2.89.

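1. Import the random Python package together with the other libraries used below, then define the assumed standard deviation, the confidence level, and the number of focus groups. The seed (39809) is set in the next step, just before sampling, so that we get the same results every time we run the program: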

In [13]:
import random
from math import sqrt
import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt

In [14]:
std = 2.89  # assumed population standard deviation
cl = 0.95   # confidence level
n = 30      # number of focus groups

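2. Initialize the sample list and collect one result per focus group. Use random.randint(0, 10), which returns an integer from 0 to 10 inclusive, to simulate the number of supporters in each 10-person group: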

In [15]:
sample_list = []
random.seed(39809)
for i in range(n):
    sample_list.append(random.randint(0,10))

In [16]:
sample_mean = np.mean(sample_list)

sample_mean

5.0

3. Calculate the 95% z-confidence interval.
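
For reference, the cells in this step evaluate the standard two-sided interval for a mean when the population standard deviation is assumed known. In the notation below (chosen here for illustration), x̄ is `sample_mean`, σ is `std`, and α is 1 − `cl`:

$$
\bar{x} \pm z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}, \qquad \alpha = 1 - 0.95 = 0.05, \quad z_{0.975} \approx 1.96
$$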

In [17]:
# (1 - cl)/2 + cl equals 1 - alpha/2 = 0.975, the upper quantile for a two-sided interval
critic_value = stats.norm.ppf(((1-cl)/2) + cl)
critic_value

1.959963984540054

In [18]:
lower_limit = sample_mean - (critic_value * (std/sqrt(n)))
lower_limit

3.965845784931483

In [19]:
upper_limit = sample_mean + (critic_value * (std/sqrt(n)))
upper_limit

6.034154215068517

In [20]:
print(f"Your {cl} z confidence interval is ({lower_limit:.2f}, {upper_limit:.2f})")

Your 0.95 z confidence interval is (3.97, 6.03)


In [21]:
s=std/sqrt(n)

In [22]:
# Cross-check with scipy's built-in interval (loc = sample mean, scale = standard error)
stats.norm.interval(cl,sample_mean,s)

(3.965845784931483, 6.034154215068517)

4. If you did everything correctly, the following should be printed when you run your notebook:

    Your 0.95 z confidence interval is (3.97, 6.03)
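
The steps above can be collected into one small helper. This is only a sketch: the function name `z_confidence_interval` and its parameter names are not part of the original notebook, and it assumes, as the exercise does, that the population standard deviation is known.

```python
from math import sqrt

from scipy import stats


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided z interval for a mean, assuming the population sigma is known."""
    # Upper-tail quantile of the standard normal, e.g. 0.975 for 95% confidence.
    z = stats.norm.ppf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)  # critical value times the standard error
    return sample_mean - margin, sample_mean + margin


# Values taken from the focus-group exercise above.
lower, upper = z_confidence_interval(5.0, 2.89, 30)
print(f"({lower:.2f}, {upper:.2f})")  # (3.97, 6.03)
```

The result should agree with `stats.norm.interval(cl, sample_mean, s)` from the previous cell, which performs the same calculation internally.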

# Hypothesis Testing

Your boss asks you to conduct a hypothesis test about the mean dwell time of a new type of UAV. Before you arrived, an experiment was conducted on n = 5 UAVs (all of the new type), resulting in a sample mean dwell time of ȳ = 10.4 hours. The goal is to demonstrate conclusively, if possible, that the data support the manufacturer's claim that the mean dwell time is greater than 10 hours. Assume that the dwell times are normally distributed, that the sample standard deviation is s = 0.5 hours, and that the significance level is α = 0.01:

1. Write out the null and alternative hypotheses.

In [23]:
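# H0: mu = 10  (null hypothesis; equivalently H0: mu <= 10)
# H1: mu > 10  (alternative hypothesis, right-tailed test)
mu0 = 10      # hypothesized mean dwell time (hours)
s = 0.5       # sample standard deviation (hours)
n = 5         # sample size
x = 10.4      # observed sample mean dwell time (hours)
alpha = 0.01  # significance level
df = n - 1    # degrees of freedom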

2. Calculate the test statistic.

In [24]:
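# n < 30 and sigma is unknown; the observations are independent and assumed normal, so we use a one-sample t-test.
# With raw data available, the pieces would be computed as:
# s = np.std(data) * sqrt(n / df)  # sample standard deviation (same as np.std(data, ddof=1))
# sm = s / sqrt(n)                 # standard error of the mean
# t = (x - mu0) / sm               # test statistic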
sm=s/sqrt(n)
t=(x-mu0) / sm
t

1.7888543819998335
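
The exercise supplies only summary statistics, but the commented lines above describe how the same statistic would be built from raw measurements. The sketch below illustrates that path and checks it against `scipy.stats.ttest_1samp`. The `data` array is hypothetical, invented purely for illustration (chosen so that its mean is 10.4); its standard deviation is not 0.5, so the resulting statistic differs from the one above.

```python
from math import sqrt

import numpy as np
from scipy import stats

# Hypothetical dwell times in hours, invented for illustration only;
# they are NOT the measurements behind the summary statistics in the exercise.
data = np.array([9.9, 10.1, 10.4, 10.7, 10.9])
mu0 = 10  # hypothesized mean under H0

n = len(data)
s = np.std(data, ddof=1)   # sample standard deviation (divides by n - 1)
sm = s / sqrt(n)           # standard error of the mean
t_manual = (np.mean(data) - mu0) / sm

# scipy computes the same statistic directly from the data.
t_scipy = stats.ttest_1samp(data, mu0).statistic

print(t_manual, t_scipy)
```

Both values should match, which confirms that the manual formula and the library routine compute the same test statistic.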

In [25]:
tc = stats.t.ppf(1-alpha,df) # Critical t Value
tc

3.7469473879811366

3. Find the p-value and state the outcome.

In [26]:
# Right-tail p-value: P(T > t) under H0
pV=1-stats.t.cdf(t,df)
pV

0.07407407407407385

In [28]:
if pV<alpha:
    print('At {} level of significance, we can reject the null hypothesis in favor of H1.'.format(alpha))
else:
    print('At {} level of significance, we fail to reject the null hypothesis.'.format(alpha))

At 0.01 level of significance, we fail to reject the null hypothesis.
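
The same decision can be reached by comparing the test statistic with the critical value instead of comparing the p-value with α. The following sketch restates the exercise's values so that it runs on its own; the variable names `reject_by_p` and `reject_by_critical` are introduced here for illustration. It uses `stats.t.sf`, the survival function, which equals `1 - stats.t.cdf`.

```python
from math import sqrt

from scipy import stats

# Summary statistics from the exercise.
mu0, s, n, x, alpha = 10, 0.5, 5, 10.4, 0.01
df = n - 1

t = (x - mu0) / (s / sqrt(n))     # observed test statistic
tc = stats.t.ppf(1 - alpha, df)   # right-tail critical value
p_value = stats.t.sf(t, df)       # right-tail p-value, P(T > t)

reject_by_p = p_value < alpha     # p-value rule
reject_by_critical = t > tc       # critical-value rule

print(reject_by_p, reject_by_critical)  # False False
```

For a right-tailed test the two rules always agree, because the p-value falls below α exactly when t exceeds the critical value.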


In [29]:
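# Critical value of the sample mean: the smallest sample mean that would lead to rejecting H0.
# t = (x - mu0) / (s / sqrt(n))  =>  xc = mu0 + t_alpha * (s / sqrt(n)), where t_alpha = t.ppf(1 - alpha, df)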

mu0 + stats.t.ppf(1-alpha,df)*sm

10.83784290676411
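
Written out with the numbers from this exercise, the calculation in the last cell is (the symbol ȳ_c for the critical sample mean is notation introduced here):

$$
\bar{y}_c = \mu_0 + t_{1-\alpha,\,n-1}\,\frac{s}{\sqrt{n}} = 10 + 3.7469 \times \frac{0.5}{\sqrt{5}} \approx 10.84
$$

Since the observed sample mean of 10.4 hours is below 10.84 hours, this comparison leads to the same conclusion as the p-value: we fail to reject the null hypothesis at the 0.01 level.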