In [1]:
from scipy.stats import norm, t
import numpy as np

<hr style="color: #009933; border: solid 1px">
<span style="color: #009933;">known std, $\sigma$</span>

# <span style="color: #2455C3">Running speed</span>

A new training technique is supposed to <u>decrease</u> how long it takes professional sprinters to run 200-meter races (i.e. the technique should increase their speed). A trainer takes a random sample of female sprinters and records how long each of them takes to run 200 meters after the training.

The mean time for the population of female sprinters to run 200 meters is 22.965 seconds, with a standard deviation of 0.360 seconds. The trainer's sample ($n=16$) completed the sprint in a mean time of 22.793 seconds.

## <span style="color: #85100F">Solution</span>

#### <span style="color: #85100F">1. Stating the hypotheses</span>

If the time it takes to run 200 meters is the variable of interest, which of the following symbolizes the null and alternative hypotheses for a one-tailed (directional) test, where $\mu$ is the population mean and $\mu_I$ is the mean after the intervention?

$$\begin{array}{cl}
H_0: & \mu \leq \mu_I\\
H_1: & \mu > \mu_I\\
 & \text{after the training, sprinters will run 200 meters in less time}
\end{array}$$

#### <span style="color: #85100F">2. Analyzing sample data</span>

In [2]:
# known data
# one-sample left-tailed test
n = 16.0
mu = 22.965
sigma = 0.360
x_bar = 22.793
alpha = 0.05

In [3]:
SE = sigma / np.sqrt(n)
SE

0.089999999999999997

#### <span style="color: #85100F;">3. Test statistic calculation</span>

In [4]:
z_score = (x_bar - mu) / SE
z_score

-1.9111111111111179

#### <span style="color: #85100F;">4. Critical point determination</span>

In [5]:
z_critical = norm.ppf(alpha)
z_critical

-1.6448536269514729
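
The comparison behind the decision in the next step can be made explicit. The snippet below is a minimal, self-contained sketch that recomputes the statistic from the values given above and applies the left-tailed rejection rule; the name `reject_h0` is introduced here purely for illustration.

```python
from scipy.stats import norm
import numpy as np

# Values from the running-speed problem
n, mu, sigma, x_bar, alpha = 16, 22.965, 0.360, 22.793, 0.05

z_score = (x_bar - mu) / (sigma / np.sqrt(n))
z_critical = norm.ppf(alpha)  # left-tail critical value

# Left-tailed test: reject H0 when the statistic falls below the critical value
reject_h0 = z_score < z_critical  # illustrative name, not from the original
print("z = {:0.3f}, z_critical = {:0.3f}, reject H0: {}".format(
    z_score, z_critical, reject_h0))
```

Because the test is left-tailed, the rejection region lies below `z_critical`, so a more negative statistic is stronger evidence against the null hypothesis.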

#### <span style="color: #85100F;">5. Results interpretation</span>

<span style="color: #009933;">The <b>null hypothesis is rejected with $p<0.05$</b>, which means that <b>sprinters are faster after the training</b></span>

1) <u>Descriptive statistics:</u>
    $$\begin{array}{ccc}
    \mu & = & 22.965\\
    \bar{x} & = & 22.793\\
    \sigma & = & 0.360
    \end{array}$$

2) <u>Inferential statistics:</u>

z=-1.91, p=.03, one-tailed<br>
Confidence interval on the mean of 200-meter sprint time<br>
95% CI = (22.617 - 22.969)

3) <u>Effect size measures:</u>

* Cohen's d = -0.48

In [15]:
# P-value
P_value = norm.cdf(z_score)
print("P-value = {:0.2f}".format(P_value))

P-value = 0.03


In [20]:
# CI
z_char = norm.ppf(1-alpha/2)
me = z_char * SE
print("95% CI = ({:0.3f} - {:0.3f})".format(x_bar-me, x_bar+me))

95% CI = (22.617 - 22.969)


In [19]:
# Cohen's d
d = (x_bar - mu) / sigma
print("d = {:0.2f}".format(d))

d = -0.48


<hr style="color: #009933; border: solid 1px">
<span style="color: #009933;">unknown std, $s$</span>

# <span style="color: #2455C3">Food spend</span>

US families spent an average of \$151 per week on food in 2012. Food Now!, a food cooperative, wants to reduce the cost of food for its members ($n=25$). To do so, it implements some cost-saving programs. Food Now! wants to know whether its program really delivers the desired benefit.

## <span style="color: #85100F">Solution</span>

#### <span style="color: #85100F">1. Stating the hypotheses</span>

$$\begin{array}{cl}
H_0: & \mu_{\text{program}} \geq 151\\
 & \text{the program did not change the cost of food or even increased it}\\
H_A: & \mu_{\text{program}} < 151\\
 & \text{the program reduced the cost of food}
\end{array}$$

#### <span style="color: #85100F;">2. Analyzing sample data</span>

In [9]:
# known data
# one-sample left-tailed test
n = 25.0
mu = 151.0
s = 50.0             #given
x_bar = 126.0        #given
alpha = 0.05

In [10]:
SE = s / np.sqrt(n)
SE

10.0

#### <span style="color: #85100F;">3. Test statistic calculation</span>

In [12]:
t_score = (x_bar-mu) / SE
t_score

-2.5

#### <span style="color: #85100F;">4. Critical point determination</span>

In [4]:
t_critical = t.ppf(alpha, n-1)
t_critical

-1.7108820799094282
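
As with the z-test, the decision comes from comparing the statistic with the critical value, this time from the t distribution with $n-1$ degrees of freedom. The sketch below recomputes both from the given values; `df` and `reject_h0` are illustrative names, not part of the original cells.

```python
from scipy.stats import t
import numpy as np

# Values from the food-spend problem
n, mu, s, x_bar, alpha = 25, 151.0, 50.0, 126.0, 0.05
df = n - 1  # degrees of freedom for a one-sample t-test

t_score = (x_bar - mu) / (s / np.sqrt(n))
t_critical = t.ppf(alpha, df)  # left-tail critical value

# Left-tailed test: reject H0 when the statistic falls below the critical value
reject_h0 = t_score < t_critical  # illustrative name, not from the original
print("t({}) = {:0.2f}, t_critical = {:0.3f}, reject H0: {}".format(
    df, t_score, t_critical, reject_h0))
```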

#### <span style="color: #85100F;">5. Results interpretation</span>

<span style="color: #009933;">The <b>null hypothesis is rejected</b>, which means that <b>the program reduces the cost of food</b></span>

1) <u>Descriptive statistics:</u>
    $$\begin{array}{ccc}
    \mu & = & 151\\
    \bar{x} & = & 126\\
    s & = & 50
    \end{array}$$

2) <u>Inferential statistics:</u>

t(24)=-2.5, p=.01, one-tailed<br>
Confidence interval on the mean food expense per week<br>
95% CI = (105.36 - 146.64)

3) <u>Effect size measures:</u>

* Cohen's d = -0.50
* R^2 = .21

So about <b>21%</b> of the <b>variability in food expenses</b> in the sample of 25 members is <b>accounted for by</b> the <b>cost-saving program</b>.
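
For reference, this figure follows from the t statistic and its degrees of freedom ($df = n - 1 = 24$), using the standard conversion from $t$ to $r^2$:

$$r^2 = \frac{t^2}{t^2 + df} = \frac{(-2.5)^2}{(-2.5)^2 + 24} = \frac{6.25}{30.25} \approx 0.21$$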

In [17]:
# P-value
P_value = t.cdf(t_score, n-1)
print("P-value = {:0.2f}".format(P_value))

P-value = 0.01


In [25]:
# CI
t_char = t.ppf(1-alpha/2, n-1)
me = t_char * SE
print("95% CI = ({:0.2f} - {:0.2f})".format(x_bar-me, x_bar+me))

95% CI = (105.36 - 146.64)


In [18]:
# Cohen's d
d = (x_bar - mu) / s
print("d = {:0.2f}".format(d))

d = -0.50


In [21]:
# R^2
r_squared = t_score**2 / (t_score**2 + (n-1))
print("R^2 = {:0.2f}".format(r_squared))

R^2 = 0.21
