# Confidence Intervals - How to Calculate Confidence Intervals in Python

QTS Asia Limited
- YouTube:  https://youtu.be/x-qDnuomvsU


In frequentist statistics, a confidence interval is a range of estimates for an unknown parameter. A confidence interval is computed at a designated confidence level; the 95% CL is most common, but other levels, such as 90% or 99%, are sometimes used.

Reference: https://en.wikipedia.org/wiki/Confidence_interval



You can compute a confidence interval for a population mean with the following formula:

Confidence Interval = x  +/-  t*(s/√n)


where:

- x: sample mean
- t: t-value for the confidence level
- s: sample standard deviation
- n: sample size


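For large samples, the t-value for a 95% confidence level is approximately 1.96 (the corresponding value of the standard normal distribution), which gives a simple approximation of the 95% confidence interval: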

- x +/- 1.96*(s/√n)
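The formula above can be applied by hand as a quick check. The following is a minimal sketch, assuming a small made-up sample (the values in `sample` are hypothetical and are not taken from the data set used later). It computes the interval once with the exact t-value and once with the 1.96 shortcut:

```python
import numpy as np
import scipy.stats as st

# Hypothetical sample values, used only for illustration
sample = np.array([12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7])

n = sample.size
mean = sample.mean()
s = sample.std(ddof=1)     # sample standard deviation (n - 1 in the denominator)
se = s / np.sqrt(n)        # standard error of the mean, s/√n

# Exact two-sided 95% t-value for n - 1 degrees of freedom
t_crit = st.t.ppf(0.975, df=n - 1)

ci_t = (mean - t_crit * se, mean + t_crit * se)   # x +/- t*(s/√n)
ci_approx = (mean - 1.96 * se, mean + 1.96 * se)  # x +/- 1.96*(s/√n)

print("t-based interval:  ", ci_t)
print("1.96 approximation:", ci_approx)
```

With only eight observations, the t-value is noticeably larger than 1.96, so the shortcut produces an interval that is too narrow for a sample this small.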

## Confidence Intervals Using the t Distribution
For a small sample (n < 30), we can use the t.interval() function from the scipy.stats library to calculate a confidence interval for a population mean.

The following example shows how to calculate a confidence interval for the mean close price in our data set.

import numpy as np
import scipy.stats as st
import pandas as pd

# Load the sample data
# (the link to the data set is in the YouTube video description)

df = pd.read_csv('example_data.csv')

data1 = df.CLOSE

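# 95% confidence interval for the population mean close price
# The confidence level is passed positionally because newer SciPy releases
# renamed the `alpha` keyword to `confidence`
st.t.interval(0.95, df=len(data1)-1, loc=np.mean(data1), scale=st.sem(data1))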

(15940.0790240234, 15967.385764708995)


A higher confidence level results in a wider confidence interval.

# Create a 99% confidence interval for the same sample
st.t.interval(0.99, df=len(data1)-1, loc=np.mean(data1), scale=st.sem(data1))


(15935.60556440214, 15971.859224330256)
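The effect of the confidence level on the interval width can be reproduced without the CSV file. This is a rough sketch that assumes synthetic, normally distributed data in place of the close prices; the `prices` array and the `t_interval` helper are hypothetical names introduced only for this illustration:

```python
import numpy as np
import scipy.stats as st

# Synthetic stand-in for a small price sample (not the data set from the video)
rng = np.random.default_rng(0)
prices = rng.normal(loc=100.0, scale=5.0, size=25)

def t_interval(data, level):
    """Return the t-based confidence interval for the mean of `data`."""
    return st.t.interval(level, df=len(data) - 1,
                         loc=np.mean(data), scale=st.sem(data))

# Compare interval widths at three confidence levels
widths = {}
for level in (0.90, 0.95, 0.99):
    low, high = t_interval(prices, level)
    widths[level] = high - low
    print(f"{level:.0%}: ({low:.3f}, {high:.3f})  width = {high - low:.3f}")
```

The mean and standard error are the same in every iteration, so the widening comes entirely from the larger t-value required by the higher confidence level.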

## Confidence Intervals Using the Normal Distribution
For larger samples (n ≥ 30), you can use the norm.interval() function from the scipy.stats library.



# Define the sample data
data2 = df.OPEN

# Create a 95% confidence interval for the open price
st.norm.interval(0.95, loc=np.mean(data2), scale=st.sem(data2))


(15941.35403155692, 15967.012165626178)



As with the t distribution, larger confidence levels create wider confidence intervals. 

# Create a 99% confidence interval for the open price
st.norm.interval(0.99, loc=np.mean(data2), scale=st.sem(data2))


(15937.322846533018, 15971.043350650081)
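The n ≥ 30 rule of thumb used in this section can be checked numerically. The sketch below is an illustration only (the chosen sample sizes are arbitrary assumptions). It compares the two-sided 95% critical value of the t distribution with that of the normal distribution as the sample size grows:

```python
import scipy.stats as st

# Two-sided 95% critical value of the standard normal distribution
z_crit = st.norm.ppf(0.975)

# Ratio of the t critical value to the normal critical value for several sample sizes
ratios = []
for n in (5, 10, 30, 100, 1000):
    t_crit = st.t.ppf(0.975, df=n - 1)
    ratios.append(t_crit / z_crit)
    print(f"n = {n:4d}: t = {t_crit:.4f}, z = {z_crit:.4f}, ratio = {t_crit / z_crit:.4f}")
```

The ratio is always above 1 and shrinks toward 1 as n increases, which is why the t-based and normal-based intervals become nearly identical for large samples.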