## Problem Statement 1:
Blood glucose levels for obese patients have a mean of 100 with a standard deviation of 15. A researcher thinks that a diet high in raw cornstarch will have a positive effect on blood glucose levels. A sample of 36 patients who have tried the raw cornstarch diet has a mean glucose level of 108. Test whether the raw cornstarch diet had an effect.

In [1]:
# One sample z-test
# Null Hypothesis H0 : mu_diet = mu (mean glucose level on the diet equals the population mean; raw cornstarch had no effect on blood glucose levels)
# Alternative Hypothesis H1 : mu_diet != mu (mean glucose level on the diet differs from the population mean; raw cornstarch had an effect on blood glucose levels)
# Two-tailed test

import numpy as np
from scipy.stats import norm

mu = 100 # population mean
sd = 15 # population standard deviation
n = 36 # sample size
x = 108 # sample mean

# By the central limit theorem, the sample mean is approximately normally distributed with mean mu and standard deviation sd/sqrt(n)
# z-score of x
z_score = (x - mu)/(sd/np.sqrt(n))

# alpha = 5%
alpha = 0.05

# One-tailed P-value: P(Z > |z_score|); abs() keeps the test valid when the sample mean is below mu
p_value = norm.sf(abs(z_score))
print('P-value = %0.4f' %(p_value))

# Two-tailed test: reject H0 if the one-tailed P-value < alpha/2 (0.025),
# which is equivalent to rejecting when the two-tailed P-value < alpha
if p_value > alpha/2:
    print("P-value > 0.025, Fail to reject the null hypothesis - Conclusion: No significant evidence that raw cornstarch had an effect on blood glucose levels")
else:
    print("P-value < 0.025, Reject the null hypothesis - Conclusion: Raw cornstarch had an effect on blood glucose levels")


P-value = 0.0007
P-value < 0.025, Reject the null hypothesis - Conclusion: Raw cornstarch had an effect on blood glucose levels
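
The same decision can be cross-checked with an explicit two-tailed P-value and a critical value. The sketch below is a minimal illustration, not part of the original cell: it reuses the problem's numbers and the 5% significance level chosen above, and the names `p_two_tailed`, `z_critical`, and `reject_h0` are introduced here for illustration only.

```python
import numpy as np
from scipy.stats import norm

mu, sd, n, x = 100, 15, 36, 108   # values from the problem statement
alpha = 0.05                      # significance level (same assumption as the cell above)

# Standardized distance of the sample mean from the population mean
z_score = (x - mu) / (sd / np.sqrt(n))

# Two-tailed P-value: probability mass in both tails beyond |z_score|
p_two_tailed = 2 * norm.sf(abs(z_score))

# Two-tailed critical value: reject H0 when |z_score| exceeds it
z_critical = norm.ppf(1 - alpha / 2)
reject_h0 = abs(z_score) > z_critical

print('z = %0.2f, critical value = %0.2f' % (z_score, z_critical))
print('Two-tailed P-value = %0.4f, reject H0: %s' % (p_two_tailed, reject_h0))
```

Comparing the two-tailed P-value with alpha and comparing |z| with the critical value are two views of the same rule, so both should agree with the conclusion printed above.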


## Problem Statement 2:
In one state, 52% of the voters are Republicans, and 48% are Democrats. In a second state, 47% of the voters are Republicans, and 53% are Democrats. Suppose a simple random sample of 100 voters is surveyed from each state.

What is the probability that the survey will show a greater percentage of Republican voters in the second state than in the first state?

In [2]:
# P1 = 0.52 : the proportion of Republican voters in the first state
# P2 = 0.47 : the proportion of Republican voters in the second state
# Let p1 and p2 be the proportion of Republican voters in the samples from the first state and the second state respectively.
# Sample size n1 = n2 = n = 100
n = 100

# We need the probability that p1 is less than p2: P(p1 < p2) = P((p1 - p2) < 0)

# Mean of the difference in sample proportions
# mu = E(p1 - p2) = P1 - P2 = 0.52 - 0.47 = 0.05
mu = 0.05

# Standard deviation of the difference in sample proportions
# sd = sqrt( P1(1 - P1) / n1  +  P2(1 - P2) / n2 ) = sqrt( (0.52)(0.48) / 100  +  (0.47)(0.53) / 100 )
sd = np.sqrt( 0.52*0.48 / 100  +  0.47*0.53 / 100 )

# z-score of (p1 - p2) = 0
z_score = (0 - mu)/sd

# P((p1 - p2) < 0) = P(Z < z_score)
prob = norm.cdf(z_score)
print('P(p1 < p2) = %0.4f' %(prob))

P(p1 < p2) = 0.2395
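
As a rough sanity check on the normal approximation, the two surveys can be simulated directly. This is a sketch under stated assumptions: each survey is modeled as 100 independent draws (a binomial count), and the seed and number of trials are arbitrary choices made here. Because sample counts are discrete and ties are not counted as "greater", the simulated estimate is expected to come out somewhat below the approximation above rather than match it exactly.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed (arbitrary) so the run is repeatable
n = 100                          # voters surveyed in each state
trials = 200_000                 # number of simulated survey pairs (arbitrary)

# Number of Republicans observed in each simulated sample
x1 = rng.binomial(n, 0.52, size=trials)   # first state
x2 = rng.binomial(n, 0.47, size=trials)   # second state

# Fraction of simulated survey pairs where the second state shows
# a strictly greater Republican percentage than the first
estimate = np.mean(x2 > x1)
print('Simulated P(p1 < p2) = %0.4f' % estimate)
```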


## Problem Statement 3:
You take the SAT and score 1100. The mean score for the SAT is 1026 and the standard deviation is 209. How well did you score on the test compared to the average test taker?

In [3]:
mu = 1026 # population mean
sd = 209 # population standard deviation
x = 1100 # Individual Test score

# z-score of x: how many standard deviations the score lies above the population mean
z_score = (x - mu)/sd
print('Individual score on the test as compared to the average test taker : %0.4f'%(z_score))


Individual score on the test as compared to the average test taker : 0.3541
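
The z-score can also be expressed as a percentile. The sketch below assumes SAT scores are approximately normally distributed, which the problem statement does not say, so the result is only an approximation; the name `percentile` is introduced here for illustration.

```python
from scipy.stats import norm

mu, sd, x = 1026, 209, 1100   # values from the problem statement

# Number of standard deviations the score lies above the mean
z_score = (x - mu) / sd

# Share of test takers scoring below x, assuming normally distributed scores
percentile = norm.cdf(z_score) * 100
print('z-score = %0.4f, approximate percentile = %0.1f' % (z_score, percentile))
```

A z-score of about 0.35 means the score is roughly a third of a standard deviation above the mean, so it is better than average but not by a wide margin.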
