Conducting a z-test in Python
The p-value is 0.157, which is greater than the level of significance, α, of 0.05. You therefore fail to reject the null hypothesis. There is insufficient evidence that the mean personal income of loan applicants is not equal to $65,000.

Approach two: The confidence interval ranges from $63,668.44 to $65,331.56. The hypothesised population mean of $65,000 falls within this interval, therefore you fail to reject the null hypothesis. There is insufficient evidence that the mean personal income of loan applicants is not equal to $65,000.

Note that the 1.96 used in the confidence interval calculation is the critical value for a two-tailed z-test at a 5% level of significance. For a 10% level of significance, the corresponding value is 1.645.
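The critical values quoted above do not have to be memorised. As a minimal sketch, they can be recovered from the standard normal distribution with `stats.norm.ppf`, leaving half of the significance level in each tail. The dictionary name `z_crit_by_alpha` is only illustrative.

```python
import scipy.stats as stats

# Two-tailed test: alpha is split equally between the two tails,
# so the critical value is the (1 - alpha/2) quantile of the standard normal.
z_crit_by_alpha = {}
for alpha in (0.05, 0.10):
    z_crit_by_alpha[alpha] = stats.norm.ppf(1 - alpha / 2)
    print(f"alpha = {alpha:.2f}: critical z = {z_crit_by_alpha[alpha]:.3f}")
```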

If you get stuck, refer to "main_solved.py" to help with coding in "main.py".

In [1]:
import numpy as np 
import scipy.stats as stats 

# Scenario Information 
sample_mean = 64500 
hypothesised_pop_mean = 65000 
population_std = 2500  
sample_size = 50 
alpha = 0.05 

# Calculate the Z-score 
z_score = (sample_mean - hypothesised_pop_mean) / (population_std / np.sqrt(sample_size))
print('Z-Score :', z_score) 

# Calculate P-Value  
p_value = 2 * (1 - stats.norm.cdf(abs(z_score))) 
print('p-value :', p_value) 

# Calculate the Confidence Interval  
# The interval below assumes a *sample* std. dev. of 3000, which differs from population_std (2500) above
lb = sample_mean - 1.96 * (3000 / np.sqrt(sample_size)) # Lower Boundary 
ub = sample_mean + 1.96 * (3000 / np.sqrt(sample_size)) # Upper Boundary 
print(lb, ub) 

Z-Score : -1.414213562373095
p-value : 0.15729920705028522
63668.44242532462 65331.55757467538
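
The two approaches (p-value and confidence interval) should lead to the same decision when they are built from the same standard error. The sketch below wraps both in one hypothetical helper, `two_tailed_z_test`, which is not part of the exercise files. It assumes the known population standard deviation (2500) is used for the interval as well, so its bounds differ from those printed above, which were computed with 3000.

```python
import numpy as np
import scipy.stats as stats

def two_tailed_z_test(sample_mean, hypothesised_mean, population_std, sample_size, alpha=0.05):
    """Hypothetical helper: returns the z-score, two-tailed p-value and confidence interval."""
    standard_error = population_std / np.sqrt(sample_size)
    z_score = (sample_mean - hypothesised_mean) / standard_error
    # sf(x) is 1 - cdf(x); doubling it gives the two-tailed p-value
    p_value = 2 * stats.norm.sf(abs(z_score))
    z_crit = stats.norm.ppf(1 - alpha / 2)
    interval = (sample_mean - z_crit * standard_error,
                sample_mean + z_crit * standard_error)
    return z_score, p_value, interval

z_score, p_value, (lb, ub) = two_tailed_z_test(64500, 65000, 2500, 50)

# Approach one: compare the p-value with alpha
reject_by_p = p_value < 0.05
# Approach two: check whether the hypothesised mean lies outside the interval
reject_by_ci = not (lb <= 65000 <= ub)

print("Reject by p-value:", reject_by_p)
print("Reject by confidence interval:", reject_by_ci)
```

Both checks compare the same distance (sample mean to hypothesised mean) against the same standard error, which is why they agree.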


Conducting a t-test in Python
The p-value is 0.465, which is greater than the level of significance, α, of 0.05. You therefore fail to reject the null hypothesis. There is insufficient evidence that the personal incomes of loan applicants are not equal to $65,000. 

Approach two: The confidence interval ranges from $63,095.97 to $65,904.03. The hypothesised population mean of $65,000 falls within this interval, therefore you fail to reject the null hypothesis. There is insufficient evidence that the personal incomes of loan applicants are not equal to $65,000.


In [2]:
import numpy as np
import scipy.stats as stats
 
# Scenario Information
sample_mean = 64500
hypothesised_pop_mean = 65000
sample_std = 3000 # sample standard deviation, used to estimate the population value
sample_size = 20
alpha = 0.05
 
# Calculate the t-score
t_score = (sample_mean - hypothesised_pop_mean) / (sample_std / np.sqrt(sample_size))
print('T-Score :', t_score)
 
# Calculate P-Value 
p_value = 2 * (1 - stats.t.cdf(abs(t_score), df=sample_size-1))
print('p-value :', p_value)
 
# Calculate the Confidence Interval 
# 2.093 is the two-tailed critical t-value at the 5% level for 19 degrees of freedom
lb = sample_mean - 2.093 * (sample_std / np.sqrt(sample_size)) # Lower Boundary
ub = sample_mean + 2.093 * (sample_std / np.sqrt(sample_size)) # Upper Boundary
print(lb, ub)

T-Score : -0.74535599249993
p-value : 0.46517796008604195
63095.97291692788 65904.02708307211
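
The multiplier 2.093 in the cell above plays the same role as 1.96 in the z-test, but it comes from the t-distribution and therefore depends on the sample size. As a minimal sketch, it can be computed with `stats.t.ppf` using `sample_size - 1` degrees of freedom. The name `t_crit` is only illustrative.

```python
import scipy.stats as stats

sample_size = 20
alpha = 0.05

# Two-tailed critical value of the t-distribution with n - 1 degrees of freedom
t_crit = stats.t.ppf(1 - alpha / 2, df=sample_size - 1)
print("Critical t-value:", round(t_crit, 3))
```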


Conducting a one-tailed t-test in Python
The p-value is 0.946, which is higher than any typical level of significance, including the level of significance that you have been asked to use (0.05). You therefore fail to reject the null hypothesis: there is insufficient evidence to conclude that the average bank balance of customers with a positive balance of less than $5,000 is greater than $1,000.

Note that if you had been provided with the data, rather than the summary statistics, you could also have used this approach for the two-tailed test in the earlier exercise.
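
To illustrate the note above, here is a hedged sketch of a two-tailed test run directly on raw data with `stats.ttest_1samp`. The `incomes` array is made-up data generated purely for illustration (it is not the loan applicant data), and the manual calculation is included only to show that both routes give the same result.

```python
import numpy as np
import scipy.stats as stats

# Made-up sample of 20 incomes, for illustration only
rng = np.random.default_rng(0)
incomes = rng.normal(loc=64500, scale=3000, size=20)

# Two-tailed one-sample t-test directly on the raw data
result = stats.ttest_1samp(incomes, popmean=65000, alternative='two-sided')

# The same test from summary statistics, as in the earlier exercise
n = len(incomes)
t_manual = (incomes.mean() - 65000) / (incomes.std(ddof=1) / np.sqrt(n))
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

print("ttest_1samp:", result.statistic, result.pvalue)
print("manual     :", t_manual, p_manual)
```

Note that `ddof=1` is needed so that NumPy returns the sample standard deviation rather than the population one.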




In [3]:
import pandas as pd
import numpy as np
import scipy.stats as stats
 
bank = pd.read_csv("bank.csv")
 
# Filter the DataFrame 
filtered_bank = bank[(bank['balance'] > 0) & (bank['balance'] < 5000)] 
 
# Set a random seed and perform random sampling
np.random.seed(54321)
sample = filtered_bank.sample(n=500, replace=False)['balance'] 
 
# Perform the T-Test
test_p_value = stats.ttest_1samp(sample, popmean=1000, alternative='greater').pvalue
print("p-value: ", test_p_value)


p-value:  0.9455806343914672
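
A one-tailed p-value above 0.5 for the `'greater'` alternative means the t-statistic is negative, that is, the sample mean lies below the hypothesised value of 1000. The sketch below shows how the one-tailed and two-tailed p-values relate to each other. It uses made-up `balances` generated for illustration rather than the contents of `bank.csv`, so its numbers will not match the output above.

```python
import numpy as np
import scipy.stats as stats

# Made-up balances, for illustration only
rng = np.random.default_rng(1)
balances = rng.normal(loc=980, scale=400, size=500)

res_greater = stats.ttest_1samp(balances, popmean=1000, alternative='greater')
res_less = stats.ttest_1samp(balances, popmean=1000, alternative='less')
res_two = stats.ttest_1samp(balances, popmean=1000, alternative='two-sided')

# The 'greater' p-value is the upper-tail area beyond the t-statistic
df = len(balances) - 1
p_from_t = stats.t.sf(res_greater.statistic, df)

print("t-statistic      :", res_greater.statistic)
print("p (greater)      :", res_greater.pvalue, "from t:", p_from_t)
print("p (less)         :", res_less.pvalue)
print("p (two-sided)    :", res_two.pvalue)
```

The two one-tailed p-values sum to 1, and the two-sided p-value is twice the smaller of them.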
