# One-sample z-test - Lab

## Introduction
In this lab, we will work through two quick tests to help you better understand the ideas behind hypothesis testing.

## Objectives
You will be able to:
* Understand and explain use cases for a one-sample z-test
* Set up null and alternative hypotheses
* Calculate the z-statistic and look up the corresponding probabilities using z-tables and CDF functions
* Calculate and interpret the p-value to assess the significance of results

## Exercise 1
A rental car company claims the mean time to rent a car on their website is 60 seconds with a standard deviation of 30 seconds. A random sample of 36 customers attempted to rent a car on the website. The mean time to rent was 75 seconds. Is this enough evidence to contradict the company's claim? 

<img src="http://www.guptatravelsjabalpur.com/wp-content/uploads/2016/04/car-rentalservice.jpg" width=400>

Follow the 5 steps shown in the previous lesson and use alpha = 0.05.

In [None]:
# Adding notes from previous lesson for reference:
'''
1. State Hypotheses
2. Specify Significance Level
    a. alpha = 0.05
3. Calculate Test Statistic
4. Calculate p-Value
5. Interpret p-Value
'''
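
The five steps above can be sketched as a single helper function. This is a minimal illustration, not code from the previous lesson: the function name `one_sample_z_test` is hypothetical, it assumes an upper-tailed test, and it uses the standard library's `statistics.NormalDist` in place of `scipy.stats.norm` so that it runs without extra dependencies. The inputs are the values from the previous lesson's example.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(x_bar, mu, sigma, n, alpha=0.05):
    """Upper-tailed one-sample z-test (hypothetical helper for illustration)."""
    # Step 3: test statistic, using the standard error of the mean
    standard_error = sigma / sqrt(n)
    z = (x_bar - mu) / standard_error
    # Step 4: p-value is the area in the upper tail beyond z
    p_value = 1 - NormalDist().cdf(z)
    # Step 5: compare the p-value with the significance level from step 2
    reject_null = p_value < alpha
    return z, p_value, reject_null


# Values from the previous lesson's example
z, p_value, reject_null = one_sample_z_test(x_bar=102, mu=100, sigma=16, n=50)
print(f"z = {z:.3f}, p = {p_value:.4f}, reject H0: {reject_null}")
```

Returning the decision alongside `z` and `p_value` keeps the interpretation step explicit rather than leaving it to the reader.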

Calculate Test Statistic:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

```py
import scipy.stats as stats
from math import sqrt
x_bar = 102  # sample mean
n = 50  # sample size (number of students)
sigma = 16  # population standard deviation
mu = 100  # population mean

z = (x_bar - mu)/(sigma/sqrt(n))
z
```

Calculate p-Value:
```py
stats.norm.cdf(z)
```
```
0.8116204410942089 
```

The area under the standard normal curve from negative infinity to a z-score of 0.88 is 81.2% (from the z-table and the calculation above). In other words, if the null hypothesis were true, about 81.2% of sample means from samples of this size would fall below the one we observed. For the result to be significant at alpha = 0.05, this cumulative probability would need to exceed 95%.

The p-value is the remaining area in the upper tail. We get it by subtracting the cumulative probability (not the z-value itself) from 1, since the total area under a normal distribution is always 1:

```py
pval = 1 - stats.norm.cdf(z)
pval
```
```
0.18837955890579106
```
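
The same decision can be reached by comparing the z-statistic with a critical value instead of comparing the p-value with alpha. The sketch below is an illustration under stated assumptions: it uses the standard library's `statistics.NormalDist` (its `inv_cdf` method plays the role of `stats.norm.ppf`), and the variable name `z_critical` is introduced here for convenience.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
# Values from the example above
x_bar, mu, sigma, n = 102, 100, 16, 50

z = (x_bar - mu) / (sigma / sqrt(n))

# z-score that leaves alpha (5%) of the area in the upper tail
z_critical = NormalDist().inv_cdf(1 - alpha)

print(f"z = {z:.3f}, critical z = {z_critical:.3f}")
print("Reject H0" if z > z_critical else "Fail to reject H0")
```

Because the z-statistic falls below the critical value, the conclusion matches the p-value approach: the null hypothesis is not rejected.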

In [4]:
import math
import scipy.stats as stats
import numpy as np

In [None]:
# State your null and alternative hypotheses
# Ho: The mean time to rent a car is 60 seconds.
# Ha: The mean time to rent a car is greater than 60 seconds.

In [6]:
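mu = 60  # claimed population mean (seconds)
sigma = 30  # population standard deviation (seconds)
x_bar = 75  # sample mean (seconds)
n = 36  # sample size

z = (x_bar - mu) / (sigma / math.sqrt(n))  # z-statistic
p = 1 - stats.norm.cdf(z)  # one-tailed (upper) p-value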

print(f'The p-Value is {p} and the z-statistic is {z}.')
# (p = 0.0013498980316301035, z = 3.0)

The p-Value is 0.0013498980316301035 and the z-statistic is 3.0.


In [None]:
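# Interpret the results in terms of the p-value obtained
# Because the p-value (about 0.0013) is less than alpha = 0.05, we reject the
# null hypothesis that the mean time to rent a car is 60 seconds. The sample
# gives significant evidence that the mean time is greater than 60 seconds.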
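
The arithmetic behind the z-statistic in Exercise 1 can also be checked by hand. The worked equation below uses only the values given in the exercise; the symbol $\Phi$ for the standard normal CDF is notation introduced here for convenience.

$$
z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{75 - 60}{30 / \sqrt{36}} = \frac{15}{5} = 3,
\qquad
p = 1 - \Phi(3) \approx 0.00135
$$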

## Exercise 2

Twenty-five students complete a preparation program for the SAT. Here are the SAT scores of the 25 students who completed the program:

```
434 694 457 534 720 400 484 478 610 641 425 636 454
514 563 370 499 640 501 625 612 471 598 509 531
```

<img src="http://falearningsolutions.com/wp-content/uploads/2015/09/FAcogtrain71FBimage.jpg" width=400>

We know that the population average for SAT scores is 500 with a standard deviation of 100.

The question is: are these students' SAT scores significantly greater than the population mean?

*Note that the maker of the SAT prep program claims that it will increase (and not decrease) your SAT score. So, you would be justified in conducting a one-directional test (alpha = 0.05).*



In [None]:
# State your hypotheses
# Ho: The prep program does not increase SAT scores (mean score is 500).
# Ha: The prep program increases SAT scores (mean score is greater than 500).

In [8]:
# Give your solution here
x = np.array([434, 694, 457, 534, 720, 400, 484, 478, 610, 641, 425, 636, 454,
              514, 563, 370, 499, 640, 501, 625, 612, 471, 598, 509, 531])
x_bar = x.mean()  # sample mean
mu = 500  # population mean
n = len(x)  # sample size
sigma = 100  # population standard deviation

z = (x_bar - mu) / (sigma / math.sqrt(n))  # z-statistic
p = 1 - stats.norm.cdf(z)  # one-tailed (upper) p-value

print(f'The p-Value is {p} and the z-statistic is {z}.')
# p = 0.03593031911292577, z = 1.8

The p-Value is 0.03593031911292577 and the z-statistic is 1.8.


In [None]:
# Interpret the results in terms of the p-value obtained
# The p-value (about 0.036) is less than alpha = 0.05, so we reject the null
# hypothesis that there is no increase in SAT score after the prep program.
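
The choice of a one-directional test matters for this conclusion. The sketch below is an illustration rather than part of the lab solution: it recomputes the z-statistic from the scores listed in Exercise 2 and compares the one-tailed p-value with the two-tailed one. It assumes the standard library's `statistics.NormalDist` as a stand-in for `scipy.stats.norm`, and the names `p_one_tailed` and `p_two_tailed` are introduced here.

```python
from math import sqrt
from statistics import NormalDist, mean

scores = [434, 694, 457, 534, 720, 400, 484, 478, 610, 641, 425, 636, 454,
          514, 563, 370, 499, 640, 501, 625, 612, 471, 598, 509, 531]
mu, sigma, alpha = 500, 100, 0.05

z = (mean(scores) - mu) / (sigma / sqrt(len(scores)))

# One-tailed: area above z only (Ha: mean is greater than mu)
p_one_tailed = 1 - NormalDist().cdf(z)
# Two-tailed: area beyond |z| in both tails (Ha: mean differs from mu)
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}")
print(f"one-tailed p = {p_one_tailed:.4f}, significant: {p_one_tailed < alpha}")
print(f"two-tailed p = {p_two_tailed:.4f}, significant: {p_two_tailed < alpha}")
```

With these scores the one-tailed p-value falls below 0.05 while the two-tailed p-value does not, so the direction of the test should be justified before looking at the data, as the note in the exercise does.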

## Summary

In this lab, we conducted two simple tests comparing a sample mean to a population mean, in an attempt to reject our null hypotheses. This gives you a foundation for moving on to more advanced tests and approaches in statistics.