<a href="https://colab.research.google.com/github/tomanizer/stats_in_10_minutes/blob/master/Students_T.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Student's T

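Student's t-distribution was published in 1908 by William Sealy Gosset under the pseudonym "Student". It describes the standardized mean of a sample drawn from a normal distribution when the population standard deviation is unknown and has to be estimated from the sample itself. Its tails are heavier than those of the normal distribution, and it approaches the normal distribution as the degrees of freedom grow.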

In [0]:
#@title
import numpy as np
from scipy.stats import norm, t
import plotly.graph_objects as go

In [0]:
fig = go.Figure()
dt = np.linspace(start=-10, stop=10, num=500)

In [121]:
#@title Gaussian vs Student T { run: "auto" }
#@markdown #####Normal
gaussian_std = 1  #@param {type: "slider", min: 1, max: 10}

#@markdown #####Student-t
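degrees_of_freedom = 1  #@param {type: "slider", min: 1, max: 10}

normal_pdf = norm.pdf(dt, loc=0, scale=gaussian_std)
student_t_pdf = t.pdf(x=dt, df=degrees_of_freedom, loc=0, scale=1)

# Create the two traces once; later runs of this cell only update them.
if len(fig.data) == 0:
  fig.add_scatter(x=dt, y=[])
  fig.add_scatter(x=dt, y=[])

fig.data[0].y = normal_pdf
fig.data[1].y = student_t_pdf
# Refresh the legend labels so they track the slider values.
fig.data[0].name = f'normal, std={gaussian_std}'
fig.data[1].name = f'student-t, df={degrees_of_freedom}'
fig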
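
The difference between the two curves is easiest to see in the tails. As a rough numerical companion to the plot, the sketch below compares the probability of landing more than 3 units from the center under a standard normal and under Student's t with a few degrees of freedom. The threshold of 3 and the chosen degrees of freedom are arbitrary illustrative values, not part of the original notebook.

```python
from scipy.stats import norm, t

threshold = 3.0  # illustrative cut-off, chosen arbitrarily

# Two-sided tail probability P(|X| > threshold); sf is the survival function 1 - cdf.
normal_tail = 2 * norm.sf(threshold)
t_tails = {df: 2 * t.sf(threshold, df=df) for df in (1, 5, 30)}

print(f"standard normal:  {normal_tail:.4f}")
for df, tail in t_tails.items():
    print(f"student-t, df={df:>2}: {tail:.4f}")
```

The fewer the degrees of freedom, the more probability mass sits in the tails; with 30 degrees of freedom the value is already close to the normal one.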

# Deriving Student's t

A Student's t random variable can be built from normal samples in four steps:

1. Draw a sample from a normal distribution

In [122]:
mu = 0
std = 1
samplesize = 20
sample_from_normal = np.random.normal(mu, std, samplesize)
sample_from_normal

array([ 0.27694146, -1.47655275,  2.62345118,  1.81709297,  1.24356739,
        1.44237429, -0.4274136 ,  0.20014058, -0.96994631, -0.14211196,
       -0.92716166, -1.30787689,  1.23075739,  0.60244228, -1.17452245,
        0.04862965, -1.22170955,  0.55981615, -1.02743764,  0.74759388])

2. Calculate the sample mean and the unbiased sample variance

In [123]:
sample_mean = sample_from_normal.mean()
print(f"Sample mean: {sample_mean:.4f}")
# ddof=1 divides by n - 1, giving the unbiased sample variance.
sample_var = sample_from_normal.var(ddof=1)
print(f"Sample variance: {sample_var:.4f}")

Sample mean: 0.1059
Sample variance: 1.3830
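
A detail worth checking at this step: NumPy's `var` divides by n by default (`ddof=0`), whereas the t statistic is defined with the unbiased estimator that divides by n - 1 (`ddof=1`). The sketch below illustrates the difference by averaging both estimators over many small samples. The seed, the sample size of 5 and the number of repetitions are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
true_var = 1.0
samplesize = 5

# 100,000 independent samples, one per row.
samples = rng.normal(0.0, np.sqrt(true_var), size=(100_000, samplesize))

biased_mean = samples.var(axis=1, ddof=0).mean()    # divides by n
unbiased_mean = samples.var(axis=1, ddof=1).mean()  # divides by n - 1

print(f"average variance estimate, ddof=0: {biased_mean:.3f}")
print(f"average variance estimate, ddof=1: {unbiased_mean:.3f}")
```

With `ddof=0` the average settles near (n - 1) / n times the true variance, while `ddof=1` averages out close to the true variance.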


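3. Define a random variable for the *Student's t statistic*: the deviation of the sample mean from `mu`, divided by the estimated standard error (the sample standard deviation over the square root of the sample size)

In [124]:
# The standard error uses the unbiased sample standard deviation (ddof=1).
sample_std = sample_from_normal.std(ddof=1)
random_variable = (sample_mean - mu) / (sample_std / np.sqrt(samplesize))
print(f"t statistic: {random_variable:.4f}")

t statistic: 0.4027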
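
As a cross-check on the definition of the statistic, the manual calculation can be compared with `scipy.stats.ttest_1samp`, which computes the one-sample t statistic for a given population mean. This is a minimal sketch on a fresh seeded sample (the seed is arbitrary), so its numbers differ from the output shown above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only
mu, std, samplesize = 0.0, 1.0, 20
sample = rng.normal(mu, std, samplesize)

# Manual t statistic: (sample mean - mu) / (s / sqrt(n)), with s using ddof=1.
sample_std = sample.std(ddof=1)
t_manual = (sample.mean() - mu) / (sample_std / np.sqrt(samplesize))

# SciPy's one-sample t-test returns the same statistic.
result = stats.ttest_1samp(sample, popmean=mu)

print(f"manual t: {t_manual:.6f}")
print(f"scipy t:  {result.statistic:.6f}")
```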

4. Create a function to let us draw many samples of this random variable

In [0]:
def studentt_random_variable(n=100, samplesize=20, mu=0, std=1):
  # Draw n independent samples, one per row.
  sample_from_normal = np.random.normal(mu, std, (n, samplesize))
  sample_mean = sample_from_normal.mean(axis=1)
  # Unbiased sample standard deviation (ddof=1) of each row.
  sample_std = sample_from_normal.std(axis=1, ddof=1)
  random_variable = (sample_mean - mu) / (sample_std / np.sqrt(samplesize))
  return sorted(random_variable)
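
When the statistic is computed with the unbiased sample standard deviation, the simulated values should follow a Student's t-distribution with `samplesize - 1` degrees of freedom. One way to check this is a Kolmogorov-Smirnov test against the t and the standard normal distributions. The helper `simulate_t_statistics` below is a hypothetical, seeded stand-in written for this sketch, and the seed and sample counts are arbitrary.

```python
import numpy as np
from scipy.stats import kstest, norm, t


def simulate_t_statistics(n, samplesize, mu=0.0, std=1.0, seed=0):
    """Hypothetical helper: n t statistics, each from one normal sample."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(mu, std, size=(n, samplesize))
    standard_error = samples.std(axis=1, ddof=1) / np.sqrt(samplesize)
    return (samples.mean(axis=1) - mu) / standard_error


samplesize = 5
t_values = simulate_t_statistics(n=20_000, samplesize=samplesize)

# The KS statistic is the largest gap between the empirical and the reference CDF.
ks_t = kstest(t_values, t(df=samplesize - 1).cdf)
ks_normal = kstest(t_values, norm.cdf)

print(f"KS statistic vs t(df={samplesize - 1}): {ks_t.statistic:.4f}")
print(f"KS statistic vs standard normal: {ks_normal.statistic:.4f}")
```

A small KS statistic against the t-distribution together with a clearly larger one against the normal indicates that the simulated values behave like Student's t rather than like a Gaussian.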


In [150]:
#@title Simulate Student's t { run: "auto" }

n = 444  #@param {type: "slider", min: 10, max: 500}
samplesize = 2  #@param {type: "slider", min: 2, max: 200}
xval = studentt_random_variable(n=n, samplesize=samplesize)

fig1 = go.Figure()
fig1.add_histogram(x=xval,
                   name='student t',
                   histnorm="probability density")

