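95% confidence interval = if we repeated the sampling many times, we would expect about 95 out of 100 of the resulting intervals to include the true population value

= mean ± ME = mean ± critical_score * SE (ME = margin of error, SE = standard error)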
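
The "95 out of 100" reading can be checked with a small simulation. This is a minimal sketch under assumed values: the normal population, its mean and standard deviation, the sample size and the number of repetitions are all made up for illustration.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical population and experiment settings (assumptions for illustration)
true_mean, true_sd = 10.0, 2.0
n, n_repeats = 50, 1000

z = norm.ppf(0.975)  # critical score for a two-sided 95% interval

covered = 0
for _ in range(n_repeats):
    sample = rng.normal(true_mean, true_sd, size=n)
    se = sample.std(ddof=1) / np.sqrt(n)      # standard error of the mean
    lower = sample.mean() - z * se            # mean - ME
    upper = sample.mean() + z * se            # mean + ME
    covered += int(lower <= true_mean <= upper)

# Share of intervals that contain the true mean; should land near 0.95
coverage = covered / n_repeats
print(coverage)
```

Any single interval either contains the true mean or it does not; the 95% describes how often the procedure succeeds over many repetitions.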


# Confidence interval by hand for sample mean
There are two common ways that interviewers will touch on confidence intervals: they will either ask you to explain them in simple terms, or ask you to elaborate on how they are calculated, possibly having you implement one. In this exercise, you'll practice the latter by producing a confidence interval by hand, using no packages other than those imported for you.

We have gone ahead and assigned the appropriate z-score for a 95% confidence interval and sample mean to the z_score and sample_mean variables to simplify things a bit.

In [None]:
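from scipy.stats import norm, sem
data = [1, 2, 3, 4, 5]
confidence = 0.95

# In the exercise these two variables are pre-assigned; they are defined here so the cell runs on its own
z_score = norm.ppf(1 - (1 - confidence) / 2)  # two-sided critical value, about 1.96 for 95%
sample_mean = sum(data) / len(data)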

# Compute the standard error and margin of error
std_err = sem(data)
margin_error = std_err * z_score

# Compute and print the lower threshold
lower = sample_mean - margin_error
print(lower)

# Compute and print the upper threshold
upper = sample_mean + margin_error
print(upper)
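
With only five observations, a t critical value is often preferred over a z-score because the standard deviation is itself estimated from the sample. The sketch below is not part of the original exercise; it reuses the same data and lets scipy's t.interval() build the interval from the mean and standard error.

```python
import numpy as np
from scipy.stats import sem, t

data = [1, 2, 3, 4, 5]
confidence = 0.95

sample_mean = np.mean(data)
std_err = sem(data)          # standard error of the mean (uses ddof=1)
df = len(data) - 1           # degrees of freedom for the t distribution

# t-based interval centered on the sample mean and scaled by the standard error
t_lower, t_upper = t.interval(confidence, df, loc=sample_mean, scale=std_err)
print(t_lower, t_upper)
```

The t-based interval is wider than the z-based one for the same data, and the gap shrinks as the sample grows.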

# Applying confidence intervals for proportion
In practice, you aren't going to hand-code confidence intervals. Let's utilize the statsmodels package to streamline this process and examine some more tendencies of interval estimates.

In this exercise, we've generated a binomial sample of the number of heads in 50 fair coin flips saved as the heads variable. You'll compute a few different confidence intervals for this sample, and then scale your work for 10 similar samples.

The proportion_confint() function has already been imported to help you compute confidence intervals.

In [4]:
# Compute and print a 90% confidence interval (alpha = .10) for 50 trials; does it contain the true proportion of a fair coin flip (0.5)?
# Repeat this process for 10 samples
from statsmodels.stats.proportion import proportion_confint
from scipy.stats import binom
heads = binom.rvs(50, 0.5, size=10)
for val in heads:
    confidence_interval = proportion_confint(val, 50, .10)
    print(confidence_interval)

(0.21148951611838032, 0.4285104838816197)
(0.1934013015689181, 0.4065986984310819)
(0.46518968814451866, 0.6948103118554813)
(0.26709065248750397, 0.49290934751249604)
(0.507090652487504, 0.732909347512496)
(0.3836912846323326, 0.6163087153676674)
(0.42406406993539053, 0.6559359300646095)
(0.48604119788424416, 0.7139588021157558)
(0.30518968814451874, 0.5348103118554812)
(0.42406406993539053, 0.6559359300646095)
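
One tendency worth seeing directly is how the interval width depends on the confidence level. The following sketch uses a hypothetical count of 27 heads in 50 flips (not one of the samples above) and compares 99%, 95% and 90% intervals.

```python
from statsmodels.stats.proportion import proportion_confint

# Hypothetical sample: 27 heads out of 50 flips (assumption for illustration)
heads_count, n_flips = 27, 50

widths = {}
for alpha in (0.01, 0.05, 0.10):
    lower, upper = proportion_confint(heads_count, n_flips, alpha=alpha)
    widths[alpha] = upper - lower
    print('{0:.0%} interval: ({1:.3f}, {2:.3f})'.format(1 - alpha, lower, upper))
```

A smaller alpha (higher confidence) gives a wider interval, which is why a 90% interval misses the true proportion more often than a 99% interval does.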


# Hypothesis testing

## Assumptions
1) The sample is random

2) The population is normally distributed

3) Observations are independent

4) The variance is constant

# One tailed z-test: proportion of conversion

We know now that hypothesis tests can come in several forms. In this exercise, you'll implement a one-tailed z-test on test data from tracking conversion on a mobile app. The data has been imported as results, and numpy and pandas have already been imported for you.

The treatment group represents some graphic alteration that we expect to improve the conversion rate of users. Run a test with alpha as .05 and find out if the change actually helped.

In [None]:
# Assign and print the conversion rate for each group
conv_rates = results.groupby(results.Group).mean()
print(conv_rates)

In [None]:
#Assign the number of control conversions to num_control and the total number of trials to the total_control variable by slicing the DataFrame.
num_control = results[results['Group']=='control']['Converted'].sum()
total_control = len(results[results['Group']=='control'])

In [None]:
# Assign the number of conversions and total trials
num_treat = results[results['Group']=='treatment']['Converted'].sum()
total_treat = len(results[results['Group']=='treatment'])

Run the z-test using the proportions_ztest() function and passing it the count and nobs variables.

alternative : string in [‘two-sided’, ‘smaller’, ‘larger’]

The alternative hypothesis can be either two-sided or one of the one-sided tests: smaller means that the alternative hypothesis is prop < value, and larger means prop > value, or the corresponding inequality for the two-sample test.

In [None]:
from statsmodels.stats.proportion import proportions_ztest
count = np.array([num_treat, num_control]) 
nobs = np.array([total_treat, total_control])

# Run the z-test and print the result 
stat, pval = proportions_ztest(count, nobs, alternative="larger")
print('{0:0.3f}'.format(pval))
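
To see what the z-test computes, the statistic can be reproduced by hand with a pooled proportion. This is a sketch with hypothetical counts (the numbers below are not the exercise's data); it compares the manual result with proportions_ztest().

```python
import numpy as np
from scipy.stats import norm
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical counts, NOT the exercise's results data
num_treat, total_treat = 120, 1000
num_control, total_control = 100, 1000

p_treat = num_treat / total_treat
p_control = num_control / total_control

# Pooled proportion under the null hypothesis of equal conversion rates
p_pooled = (num_treat + num_control) / (total_treat + total_control)
se_pooled = np.sqrt(p_pooled * (1 - p_pooled) * (1 / total_treat + 1 / total_control))

z_manual = (p_treat - p_control) / se_pooled
p_manual = norm.sf(z_manual)  # upper-tail probability for the "larger" alternative

count = np.array([num_treat, num_control])
nobs = np.array([total_treat, total_control])
stat, pval = proportions_ztest(count, nobs, alternative="larger")

print(z_manual, stat)
print(p_manual, pval)
```

The order of the arrays matters: with the treatment group first, "larger" tests whether the treatment rate exceeds the control rate.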

# Two tailed t-test
In this exercise, you'll tackle another type of hypothesis test with the two tailed t-test for means. More concretely, you'll run the test on our laptops dataset from before and try to identify a significant difference in price between Asus and Toshiba.

Once again, we've imported all of the standard packages. Once you get your result, don't forget to make an actionable conclusion.

In [None]:
# Assign and print the mean price for each group using the groupby() function on the Company feature.
prices = laptops.groupby('Company')['Price'].mean()
print(prices)

In [None]:
# Assign the prices of each group
asus = laptops[laptops['Company'] == 'Asus']['Price']
toshiba = laptops[laptops['Company'] == 'Toshiba']['Price']

In [None]:
# Run the t-test
from scipy.stats import ttest_ind
tstat, pval = ttest_ind(asus, toshiba)
print('{0:0.3f}'.format(pval))

With a p-value of .133, we cannot reject the null hypothesis at the .05 level. Because this is a two-tailed test, the conclusion is that there is not enough evidence of a significant difference in price between Asus and Toshiba laptops; the test says nothing about which brand is more expensive. That said, .133 is not far above the threshold, so it may be worth collecting more data or examining this further.
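
By default ttest_ind() assumes both groups share the same variance. When that assumption is doubtful, passing equal_var=False runs Welch's t-test instead. The sketch below uses randomly generated, hypothetical prices (not the laptops dataset) to show both calls side by side.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)

# Hypothetical price samples with different spreads and sizes (NOT the laptops data)
group_a = rng.normal(loc=1100, scale=400, size=40)
group_b = rng.normal(loc=1250, scale=700, size=25)

# Standard two-tailed t-test: assumes equal variances
t_pooled, p_pooled = ttest_ind(group_a, group_b)

# Welch's t-test: does not assume equal variances
t_welch, p_welch = ttest_ind(group_a, group_b, equal_var=False)

print('pooled: t={0:0.3f}, p={1:0.3f}'.format(t_pooled, p_pooled))
print('welch:  t={0:0.3f}, p={1:0.3f}'.format(t_welch, p_welch))
```

When the group sizes and variances are similar the two versions give nearly the same answer; they diverge as the groups become more unbalanced.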

# Power and sample size
1) Power: the probability of detecting an effect that really exists; higher power means a lower chance of a Type II error
2) confidence level
3) minimum effect size

***With the other quantities held fixed, a larger sample size lowers the chance of a Type II error, while a smaller minimum effect size or a higher confidence level (smaller alpha) raises it.***


-> The power functions in these packages require a standardized effect size: for example, an increase from 0.20 to 0.25 (a raw difference of 0.05) has to be standardized before it is passed in.
REMEMBER:
1. The smaller the effect that you want to detect, the higher the probability of a Type II error.
2. The larger the sample size, the less probable a Type II error is.
3. The higher the confidence level (the smaller alpha), the higher the chance of a Type II error, unless the sample size is increased to compensate.
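
These relationships can be checked numerically by solving for power instead of sample size. This is a rough sketch: the baseline of 500 observations per group and the alternative settings are assumptions chosen only to make the differences visible, and power_for() is a hypothetical helper.

```python
from statsmodels.stats.power import zt_ind_solve_power
from statsmodels.stats.proportion import proportion_effectsize

# Standardized effect for a conversion rate change between 20% and 25%
effect = proportion_effectsize(0.25, 0.20)

def power_for(effect_size, nobs1, alpha):
    """Hypothetical helper: power of a two-sided, two-sample z-test."""
    return zt_ind_solve_power(effect_size=effect_size, nobs1=nobs1,
                              alpha=alpha, power=None)

baseline = power_for(effect, 500, 0.05)
larger_sample = power_for(effect, 2000, 0.05)                            # more data
smaller_effect = power_for(proportion_effectsize(0.22, 0.20), 500, 0.05)  # subtler effect
stricter_alpha = power_for(effect, 500, 0.01)                            # 99% confidence level

print('baseline power:       {0:0.3f}'.format(baseline))
print('larger sample:        {0:0.3f}'.format(larger_sample))
print('smaller effect:       {0:0.3f}'.format(smaller_effect))
print('stricter alpha (.01): {0:0.3f}'.format(stricter_alpha))
```

Since the Type II error rate is 1 minus power, any setting that prints a lower power than the baseline has a higher chance of a Type II error.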

## Calculating sample size
Let's finish up our dive into statistical tests by performing power analysis to generate needed sample size. Power analysis involves four moving parts:

1. Sample size

2. Effect size

3. Significance level (alpha)

4. Power

In this exercise, you're working with a website and want to test for a difference in conversion rate. Before you begin the experiment, you must decide how many samples you'll need per variant using 5% significance and 95% power.

In [None]:
#Standardize the effect of a conversion rate increase from 20% to 25% success using the proportion_effectsize() function.
from statsmodels.stats.proportion import proportion_effectsize
std_effect = proportion_effectsize(0.2, 0.25)

# Assign and print the needed sample size
from statsmodels.stats.power import zt_ind_solve_power
sample_size = zt_ind_solve_power(effect_size=std_effect, nobs1=None, alpha=0.05, power=0.95)
print(sample_size)
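
The solver's answer can be sanity-checked against the usual normal-approximation formula for a two-sided, two-sample test, n per group ≈ 2 * (z_(1-alpha/2) + z_power)^2 / effect^2. The sketch below assumes that approximation and compares it with zt_ind_solve_power(); the two should agree closely but need not match to every decimal.

```python
from math import ceil

import numpy as np
from scipy.stats import norm
from statsmodels.stats.power import zt_ind_solve_power
from statsmodels.stats.proportion import proportion_effectsize

alpha, power = 0.05, 0.95
std_effect = proportion_effectsize(0.25, 0.20)  # standardized effect for 20% vs 25%

# Closed-form approximation for the sample size per variant
z_alpha = norm.ppf(1 - alpha / 2)   # two-sided critical value
z_power = norm.ppf(power)           # quantile matching the desired power
n_formula = 2 * (z_alpha + z_power) ** 2 / std_effect ** 2

# Numerical solution from statsmodels, converted to a plain float
n_solver = float(np.squeeze(
    zt_ind_solve_power(effect_size=std_effect, nobs1=None, alpha=alpha, power=power)
))

print(n_formula, n_solver)
print(ceil(n_solver))  # round up: you cannot collect a fraction of a sample
```

The formula makes the trade-offs explicit: halving the standardized effect roughly quadruples the required sample size.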

# Bonferroni correction
Let's implement multiple hypothesis tests using the Bonferroni correction approach that we discussed in the slides. You'll use the imported multipletests() function in order to achieve this.

Use a single-test significance level of .05 and observe how the Bonferroni correction affects our sample list of p-values already created.

In [None]:
from statsmodels.stats.multitest import multipletests
pvals = [.01, .05, .10, .50, .99]

# Run the Bonferroni correction; the result is a tuple whose first element holds the reject decisions and whose second holds the adjusted p-values
p_adjusted = multipletests(pvals, alpha=0.05, method='bonferroni')

# Print the resulting conclusions
print(p_adjusted[0])

# Print the adjusted p-values themselves 
print(p_adjusted[1])
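
The Bonferroni correction itself is simple enough to do by hand: multiply each p-value by the number of tests (capping at 1), or equivalently compare each raw p-value with alpha divided by the number of tests. This sketch reproduces that on the same list and compares it with multipletests().

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

pvals = np.array([.01, .05, .10, .50, .99])
alpha = 0.05
m = len(pvals)  # number of hypothesis tests

# Manual Bonferroni: scale p-values by the number of tests, capped at 1
manual_adjusted = np.minimum(pvals * m, 1.0)

# Equivalent decision rule: compare raw p-values with alpha / m
manual_reject = pvals <= alpha / m

reject, adjusted, _, _ = multipletests(pvals, alpha=alpha, method='bonferroni')

print(manual_adjusted, adjusted)
print(manual_reject, reject)
```

Bonferroni is conservative: with many tests the per-test threshold alpha / m becomes very small, which protects against false positives at the cost of power.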