# Confidence intervals

## Tasks

### Task 1

We called 1000 clients and want to find the probability that 100 or fewer of them agree to renew their subscription.  
When called, each client renews with probability 0.3.

In [1]:
from scipy.stats import binom


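n = 1000  # number of clients called
p = 0.3   # probability that a client renews
k = 100   # upper bound on the number of renewals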


probability = binom.cdf(k, n, p)

print(f'The probability of 100 or fewer renewals is: {probability:.4f}')

The probability of 100 or fewer renewals is: 0.0000
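Rounded to four decimals, the printed value hides how small the probability actually is. As a rough sketch (the name `log_probability` and the choice of scientific notation are illustrative additions, not part of the original task), the same quantity can be shown in scientific notation or on a log scale with `binom.logcdf`:

```python
from scipy.stats import binom

n = 1000  # clients called
p = 0.3   # renewal probability per call
k = 100   # threshold number of renewals

probability = binom.cdf(k, n, p)
log_probability = binom.logcdf(k, n, p)  # natural log of the same CDF value

# Scientific notation keeps the magnitude visible instead of rounding to 0.0000
print(f'P(X <= {k}) = {probability:.3e}')
print(f'log P(X <= {k}) = {log_probability:.2f}')
```

With an expected count of 300 renewals and a standard deviation of about 14.5, a result of 100 or fewer lies more than 13 standard deviations below the mean, which is why the probability is negligible.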


### Task 2

What confidence level corresponds to the 3-sigma rule?

In [2]:
from scipy.stats import norm


probability = 2 * norm.cdf(3) - 1
print(f'The probability within 3 standard deviations is: {probability:.4f}')

The probability within 3 standard deviations is: 0.9973
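The same calculation extends to any number of standard deviations, and it can be inverted to get the number of sigmas for a chosen confidence level. A small sketch, assuming a normal distribution (the names `coverage` and `z_for_95` are illustrative):

```python
from scipy.stats import norm

# Coverage of the interval mean ± k * sigma for a normal distribution
coverage = {k: 2 * norm.cdf(k) - 1 for k in (1, 2, 3)}
for k, value in coverage.items():
    print(f'{k} sigma: {value:.4f}')

# Inverse direction: number of sigmas needed for a given confidence level
z_for_95 = norm.ppf(1 - (1 - 0.95) / 2)
print(f'95% level corresponds to about {z_for_95:.2f} sigma')
```

The three coverage values reproduce the familiar 68-95-99.7 rule, and the inverse lookup gives the 1.96 multiplier used for 95% intervals.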


### Task 3

We are estimating the average daily temperature in a certain city.  
In our sample, there are 63 observations, with a mean value of 25 and a standard deviation of 7 (known in advance, not calculated from the sample).  
Calculate the 95% confidence interval.

In [3]:
mean_temperature = 25
std = 7
sample_size = 63
ci = 0.95


norm.interval(ci, loc=mean_temperature, scale=std/sample_size**.5)

(23.27147423942126, 26.72852576057874)
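For reference, `norm.interval` with these arguments is equivalent to the textbook formula: mean ± z · σ / √n, where z is the 0.975 quantile of the standard normal distribution. A minimal sketch of the manual calculation (the names `z_critical`, `margin_of_error`, and `manual_interval` are illustrative):

```python
from math import sqrt
from scipy.stats import norm

mean_temperature = 25
std = 7            # population standard deviation, assumed known
sample_size = 63
ci = 0.95

# Two-sided interval: put (1 - ci) / 2 of the probability in each tail
z_critical = norm.ppf(1 - (1 - ci) / 2)
margin_of_error = z_critical * std / sqrt(sample_size)
manual_interval = (mean_temperature - margin_of_error,
                   mean_temperature + margin_of_error)

print(manual_interval)
```

The result should agree with the `norm.interval` output up to floating-point precision.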

### Task 4

Suppose we have 25 stores.  
The average daily revenue is 170 thousand dollars, and the estimated standard deviation based on this sample is 12 thousand dollars.  
Estimate the 95% confidence interval for the average revenue of the stores.

In [4]:
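from scipy.stats import t


mean_revenue = 170000
std = 12000
stores_number = 25
ci = 0.95

# The standard deviation is estimated from a small sample (n = 25),
# so use Student's t-distribution with n - 1 degrees of freedom.
low, high = t.interval(ci, df=stores_number - 1, loc=mean_revenue, scale=std/stores_number**.5)
print(f'({low:.0f}, {high:.0f})')

(165047, 174953)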
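Since the standard deviation here is estimated from only 25 stores, it may help to see how much the choice between the normal distribution and Student's t-distribution matters. A minimal sketch comparing the two (the names `z_interval` and `t_interval` are illustrative):

```python
from scipy.stats import norm, t

mean_revenue = 170000
std = 12000          # estimated from the sample, not known in advance
stores_number = 25
ci = 0.95

standard_error = std / stores_number ** 0.5

# Normal-based interval (appropriate when sigma is known or n is large)
z_interval = norm.interval(ci, loc=mean_revenue, scale=standard_error)
# t-based interval with n - 1 degrees of freedom (sigma estimated, small n)
t_interval = t.interval(ci, df=stores_number - 1, loc=mean_revenue, scale=standard_error)

print(f'Normal-based: ({z_interval[0]:.0f}, {z_interval[1]:.0f})')
print(f't-based:      ({t_interval[0]:.0f}, {t_interval[1]:.0f})')
```

The t-based interval is somewhat wider, reflecting the extra uncertainty from estimating the standard deviation; the gap shrinks as the sample size grows.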

### Task 5

Let's practice building more confidence intervals.  
As a sample, we will use synthetic data: generate 1000 numbers from a Poisson distribution with lambda parameter 50.   
Calculate the 95% confidence interval for the mean based on the Central Limit Theorem.   
What is the width of the interval you obtained? 

In [5]:
import numpy as np
from scipy import stats


lambda_param = 50
n_samples = 1000
np.random.seed(42) 
data = np.random.poisson(lam=lambda_param, size=n_samples)

sample_mean = np.mean(data)
sample_std = np.std(data, ddof=1)

confidence_level = 0.95
alpha = 1 - confidence_level

z_critical = stats.norm.ppf(1 - alpha / 2)  # 0.975 quantile of the standard normal
standard_error = sample_std / np.sqrt(n_samples)
margin_of_error = z_critical * standard_error

lower_bound = sample_mean - margin_of_error
upper_bound = sample_mean + margin_of_error
interval_width = upper_bound - lower_bound


print(f'Sample Mean: {sample_mean:.2f}')
print(f'Sample Standard Deviation: {sample_std:.2f}')
print(f'Z-critical value for {confidence_level*100}% confidence: {z_critical:.2f}')
print(f'{confidence_level*100}% Confidence Interval: ({lower_bound:.2f}, {upper_bound:.2f})')
print(f'Width of the Confidence Interval: {interval_width:.2f}')

Sample Mean: 49.80
Sample Standard Deviation: 7.28
Z-critical value for 95.0% confidence: 1.96
95.0% Confidence Interval: (49.35, 50.25)
Width of the Confidence Interval: 0.90
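One way to sanity-check a CLT-based interval is to repeat the experiment many times and count how often the interval contains the true mean. The sketch below is an illustration under assumptions of its own: the seed, the number of repetitions `n_experiments`, and the use of `np.random.default_rng` are not part of the original task.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for reproducibility

lambda_param = 50
n_samples = 1000
n_experiments = 2000            # hypothetical number of repeated samples
confidence_level = 0.95

z_critical = stats.norm.ppf(1 - (1 - confidence_level) / 2)

# Each row is one independent sample of size n_samples
samples = rng.poisson(lam=lambda_param, size=(n_experiments, n_samples))
means = samples.mean(axis=1)
margins = z_critical * samples.std(axis=1, ddof=1) / np.sqrt(n_samples)

# Fraction of intervals that contain the true mean (lambda)
covered = (means - margins <= lambda_param) & (lambda_param <= means + margins)
coverage = covered.mean()

print(f'Empirical coverage: {coverage:.3f}')
```

If the normal approximation is adequate, the empirical coverage should land close to the nominal 95%.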


### Task 6

Let's use the same dataset (synthetic from the Poisson distribution).  
Build a 95% confidence interval for the standard deviation using bootstrap.  
We recommend using the corresponding function from the `scipy.stats` library.  
Note that the `bootstrap` function has a `method` parameter that determines how the confidence interval bounds are calculated. In our case, it should be `percentile`.

In [6]:
from scipy.stats import bootstrap


def std_deviation(x, axis=None):
    return np.std(x, ddof=1, axis=axis)


result = bootstrap((data,),
                   statistic=std_deviation,
                   confidence_level=confidence_level,
                   n_resamples=10000,
                   method='percentile',
                   random_state=42)  # np.random.seed() returns None, so pass the seed directly


print(f'Standard Deviation {confidence_level*100}% Confidence Interval: ({result.confidence_interval.low:.2f}, {result.confidence_interval.high:.2f})')

Standard Deviation 95.0% Confidence Interval: (6.96, 7.58)
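To make the `percentile` method less of a black box, the same idea can be written out by hand: resample the data with replacement, compute the statistic on each resample, and take the 2.5% and 97.5% quantiles of the results. This is a sketch of the general technique, not a reproduction of SciPy's internals; the seed, the resample count, and the names `indices` and `boot_stds` are assumptions.

```python
import numpy as np

# Recreate the dataset from Task 5
np.random.seed(42)
data = np.random.poisson(lam=50, size=1000)

rng = np.random.default_rng(0)  # arbitrary seed for the resampling step
n_resamples = 5000              # assumed value, kept moderate to limit memory use

# Draw index arrays with replacement: one row per bootstrap resample
indices = rng.integers(0, len(data), size=(n_resamples, len(data)))
boot_stds = np.std(data[indices], ddof=1, axis=1)

# Percentile method: take the empirical 2.5% and 97.5% quantiles
low, high = np.percentile(boot_stds, [2.5, 97.5])
print(f'Manual percentile bootstrap 95% CI: ({low:.2f}, {high:.2f})')
```

Because the resamples are drawn with a different random generator, the bounds will differ slightly from the `scipy.stats.bootstrap` result, but they should be of similar size.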


### Task 7

To be continued.