# Confidence Intervals Code Appendix

## Student's t Distribution

### Coding Key Words:

- rvs: random variates
- pdf: probability density function
- cdf: cumulative distribution function
- ppf: percent point function or percentile (inverse of cdf)

### Required parameters:

- df: degrees of freedom, n - 1, where n is the sample size
- size: size of the simulated dataset
- x: location for probability calculation
- q: lower tail probability for percentile calculation

``` Python
# Import dependency
from scipy import stats

# Create the parameter values
df = 34
x = 1
q = 0.75

# Draw 1000 random values from a t distribution
t = stats.t.rvs(df=df, size=1000)

# Calculate the PDF at x
tx = stats.t.pdf(x=x, df=df)

# Calculate the probability of a value less than or equal to x (CDF)
txx = stats.t.cdf(x=x, df=df)

# Calculate the 75th percentile
tpct = stats.t.ppf(q=q, df=df)
```
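Since `ppf` is described above as the inverse of `cdf`, a quick round trip can confirm how the two relate. This is a minimal sketch that reuses the same `df` and `q` values as the block above.

```python
from scipy import stats

df = 34
q = 0.75

# Percentile that has lower-tail probability q
t_q = stats.t.ppf(q=q, df=df)

# Feeding that percentile back into the cdf should recover q
q_back = stats.t.cdf(x=t_q, df=df)

print(f"ppf({q}) = {t_q:.4f}")
print(f"cdf(ppf({q})) = {q_back:.4f}")
```

The second printed value should equal `q` up to floating-point rounding.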

In [1]:
# Import the stats module from scipy
from scipy import stats

In [2]:
# Set the degrees of freedom
df = 20

# Draw 1000 random values from a t distribution
t = stats.t.rvs(df=df, size=1000)

print(f"Student's t Random Variable: \n{t}\n")

Student's t Random Variable: 
[ 1.06256106e+00 -6.43962449e-01  1.16649072e+00 -3.40205325e-01
 -4.05021146e-01  7.49663154e-01  4.16789203e-01 -7.05234192e-01
 -7.16674012e-01 -3.87726133e-01  1.58726357e-01  8.48187326e-01
 -5.20538829e-01 -1.35599071e+00 -3.72868514e-01  5.23356395e-01
 -3.23748174e-01 -1.06898946e+00 -6.47390246e-01 -1.31745076e-01
  2.61493805e-01 -1.33890462e-01 -8.72793142e-02 -1.22596839e+00
  1.04010898e+00 -7.22352331e-01 -6.78968445e-02  4.16228117e-01
 -1.59993515e+00 -4.12986787e-01 -1.16935704e+00 -7.45855834e-01
  1.77090276e-01 -8.89877411e-01 -1.32967692e-01  4.73358439e-01
  1.70975468e+00  1.01219652e+00  4.22579838e-01 -5.63604266e-01
 -2.36182040e+00 -1.34678757e-01 -3.34736359e-01 -1.32563806e+00
 -9.13669544e-02  1.36429075e+00  1.14397903e+00 -2.21927610e-01
 -1.19321414e+00 -1.04981534e+00  7.49854939e-01 -1.67997589e+00
  1.81471946e-01  1.82777949e-01 -6.35171898e-01 -4.41804398e-01
 -6.07329236e-01  8.53469395e-01 -7.91618060e-02 -3.68140806

In [3]:
# Calculate the pdf at x = 1
stats.t.pdf(x=1, df=df)

0.23604564912670092

In [4]:
# Calculate the cdf at x = 1
stats.t.cdf(x=1, df=df)

0.8353717114141455

In [5]:
# Calculate the t value at the 50th percentile (the median)
stats.t.ppf(q=0.5, df=df)

6.72145054561635e-17
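The result above is zero up to floating-point rounding, because the t distribution is symmetric about 0. The sketch below illustrates that symmetry by comparing the median with a pair of opposite tail percentiles; the choice of 2.5% and 97.5% is only for illustration.

```python
from scipy import stats

df = 20

# The median of a t distribution is 0 (up to rounding)
median = stats.t.ppf(q=0.5, df=df)

# Opposite tail percentiles mirror each other around 0
lower_tail = stats.t.ppf(q=0.025, df=df)
upper_tail = stats.t.ppf(q=0.975, df=df)

print(f"median: {median:.3e}")
print(f"2.5th percentile:  {lower_tail:.4f}")
print(f"97.5th percentile: {upper_tail:.4f}")
```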

## Confidence Intervals for Student's t Distribution

``` Python
# Import stats module
from scipy import stats

# Calculate the confidence interval
stats.t.interval(
    confidence=0.9,       # Confidence level
    df=df,                # Degrees of freedom (n - 1)
    loc=sample_mean,      # Sample mean
    scale=standard_error  # Estimated standard error of the mean (s / sqrt(n))
)
```
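The template above assumes `df`, `sample_mean`, and `standard_error` already exist. One way to obtain them from raw data is sketched below; the `sample` values are made up for illustration, and the confidence level is passed positionally so the call does not depend on the keyword name used by a particular SciPy version.

```python
import numpy as np
from scipy import stats

# Hypothetical sample, invented for this example
sample = np.array([14.2, 15.1, 13.8, 16.0, 15.4, 14.9, 15.7, 14.4])

n = sample.size
df = n - 1                          # degrees of freedom
sample_mean = sample.mean()
standard_error = stats.sem(sample)  # s / sqrt(n), using ddof=1

# 90% confidence interval for the population mean
lower, upper = stats.t.interval(0.9, df=df, loc=sample_mean, scale=standard_error)

print(f"90% CI: ({lower:.3f}, {upper:.3f})")
```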

### Confidence Intervals for the Sampling Distribution of the Mean

When the population standard deviation is known, use the standard normal (z) distribution instead of the t distribution.

``` Python
# Import stats module
from scipy import stats

# Calculate the confidence interval 
stats.norm.interval(
    confidence=0.9,       # Confidence level
    loc=sample_mean,      # Sample mean
    scale=standard_error  # Standard error of the sample mean (sigma / sqrt(n))
)
```
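As a sketch of how the inputs to the z interval might be built, the standard error can be computed from the known population standard deviation and the sample size. The values of `sigma`, `n`, and `sample_mean` below are hypothetical.

```python
import math
from scipy import stats

# Hypothetical values, chosen for illustration only
sigma = 3.0        # known population standard deviation
n = 36             # sample size
sample_mean = 15.0

# Standard error of the sample mean when sigma is known
standard_error = sigma / math.sqrt(n)

# 90% confidence interval based on the normal distribution
lower, upper = stats.norm.interval(0.9, loc=sample_mean, scale=standard_error)

print(f"standard error: {standard_error}")
print(f"90% CI: ({lower:.4f}, {upper:.4f})")
```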

In [None]:
# Import the stats module from scipy
from scipy import stats

In [6]:
# Calculate the 95% confidence interval
sample_mean = 15
standard_error = 1.2

stats.t.interval(
    confidence=0.95,
    df=df,
    loc=sample_mean,
    scale=standard_error
)

(12.496843863280997, 17.503156136719003)
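The interval above can be reproduced by hand as the sample mean plus or minus a critical t value times the standard error. The sketch below uses the same numbers as the cell above (`df = 20`, `sample_mean = 15`, `standard_error = 1.2`); the variable names `t_crit` and `margin` are introduced here for illustration.

```python
from scipy import stats

df = 20
sample_mean = 15
standard_error = 1.2
confidence = 0.95

# Two-sided critical value: leaves (1 - confidence) / 2 in each tail
t_crit = stats.t.ppf(q=1 - (1 - confidence) / 2, df=df)

# Margin of error and interval endpoints
margin = t_crit * standard_error
lower = sample_mean - margin
upper = sample_mean + margin

print(f"critical t value: {t_crit:.4f}")
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```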

## Bootstrap

The bootstrap is a resampling method: it draws new samples, with replacement, from the observed data and uses them for statistical inference.

``` Python
from sklearn.utils import resample

data = [...]

boot = resample(data,              # original data set
                replace=True,      # resample with replacement
                n_samples=10,      # number of observations to draw
                random_state=123   # random seed for reproducible results
)

print(f'Bootstrap Sample: \n{boot}')
```

In [None]:
# Import resample from sklearn.utils
from sklearn.utils import resample

In [None]:
# Draw one bootstrap sample of 100 observations from the random variable t
boot = resample(t,
                replace=True,
                n_samples=100,
                random_state=777
)

print(f'Bootstrap Sample: \n{boot}')
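To connect resampling back to confidence intervals, the sketch below repeats the resampling step many times and takes percentiles of the resampled means (a percentile bootstrap interval). It is an illustration under stated assumptions, not the notebook's own method: it uses NumPy's `Generator.choice` with `replace=True` in place of `sklearn.utils.resample`, and `t_sample`, `n_boot`, and `boot_means` are names introduced here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(777)

# Stand-in for the simulated data: 1000 draws from a t distribution
t_sample = stats.t.rvs(df=20, size=1000, random_state=rng)

# Resample with replacement many times and record each resampled mean
n_boot = 2000
boot_means = np.empty(n_boot)
for i in range(n_boot):
    resampled = rng.choice(t_sample, size=t_sample.size, replace=True)
    boot_means[i] = resampled.mean()

# 95% percentile bootstrap interval for the mean
lower, upper = np.percentile(boot_means, [2.5, 97.5])

print(f"sample mean: {t_sample.mean():.4f}")
print(f"95% bootstrap CI: ({lower:.4f}, {upper:.4f})")
```

Each resample here has the same size as the original data, which is the usual choice when the goal is to approximate the sampling distribution of a statistic.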