# <span style="color:#54B1FF">Simulating Experiments:</span> &nbsp; <span style="color:#1B3EA9"><b>A Single Experiment</b></span>

<br>

This notebook simulates a single, simple one-sample experiment.

<br>

Let's reconsider the one-sample example from Lesson 08:

<img alt="one-sample" width=400 src="https://i1.wp.com/www.real-statistics.com/wp-content/uploads/2012/11/one-sample-t-test-1.png?w=552"/>

<br>

A one-sample t test tests the null hypothesis: <span style="color:blue">$\hspace{5mm} H_0: \mu = 0$</span>, where $\mu$ is the true (population) mean.

The sample mean is: <span style="color:blue">$\hspace{5mm} \overline{y} = 4.667$</span>

The sample standard deviation is: <span style="color:blue">$\hspace{5mm} s = 11.155$</span>

The t value is: <span style="color:blue">$\hspace{5mm} t = 1.449$</span>

The probability value is: <span style="color:blue">$\hspace{5mm} p = 0.088$</span>

Since $p > 0.05$, $H_0$ is **not** rejected at the conventional significance level of $\alpha = 0.05$.
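
As a quick check, the t and p values above can be reproduced from the summary statistics. This is a minimal sketch that assumes a sample size of $n = 12$ (the value used in the simulation code in this notebook) and a one-tailed p value, which is the choice that reproduces the reported 0.088.

```python
import numpy as np
from scipy import stats

n    = 12       # sample size (assumed; matches the simulation in this notebook)
ybar = 4.667    # sample mean
s    = 11.155   # sample standard deviation

t = ybar / (s / np.sqrt(n))   # one-sample t statistic for H0: mean = 0
p = stats.t.sf(t, n - 1)      # one-tailed p value with n-1 degrees of freedom

print(round(t, 3), round(p, 3))
```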

<br>
<br>

Let's simulate this experiment, <span style="color:red">assuming that $H_0$ is **true**</span>, and also <span style="color:red">assuming that the data are drawn randomly from the Normal distribution</span>.

First, let's simulate just one iteration:

In [1]:
import numpy as np
from scipy import stats

In [2]:
n     = 12      # sample size
mu    = 0       # when H0 is true, the mean is zero
sigma = 11.155  # assumed true standard deviation

np.random.seed(0)
y   = mu + sigma * np.random.randn(n)

np.set_printoptions(precision=3)  # print array values with 3 decimal places
print(y)

[ 19.678   4.464  10.918  24.997  20.833 -10.902  10.598  -1.688  -1.151
   4.58    1.607  16.222]


<br>
<br>

Congratulations!  You have just simulated an experiment.

What are the mean and standard deviation (SD) for this simulated dataset?

<br>

In [3]:
m  = y.mean()
s  = y.std(ddof=1)

print(  np.array( [ m , s ] )  )

[ 8.346 10.751]


<br>
<br>

Note that both the **sample mean** and the **sample SD** are different from their true values of 0 and 11.155, respectively.

<br>

Why is this?

Because randomly sampling a small number of values produces imperfect estimates of the true mean and true SD.
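
The same sampling variability carries over to the test statistic. As a minimal sketch (reusing the seed and parameters from the cells above), the t value for this simulated sample can be computed with `scipy.stats.ttest_1samp` and checked against the manual formula:

```python
import numpy as np
from scipy import stats

n, mu, sigma = 12, 0, 11.155
np.random.seed(0)
y = mu + sigma * np.random.randn(n)   # same simulated sample as above

res      = stats.ttest_1samp(y, 0)                   # tests H0: mean = 0 (two-tailed p value)
t_manual = y.mean() / (y.std(ddof=1) / np.sqrt(n))   # same statistic from its definition
p_one    = stats.t.sf(res.statistic, n - 1)          # one-tailed p value

print(res.statistic, t_manual, p_one)
```

The t value is well above zero even though $H_0$ is true by construction, which is the kind of chance result that a single experiment can produce.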

<br>

See Lesson 07, which shows that numerical estimates improve as the number of random values increases.
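
As a brief illustration of that point (a sketch, not the Lesson 07 code), the cell below redraws the sample with increasingly large $n$, using the same `mu` and `sigma` as above. The particular sample sizes are arbitrary choices.

```python
import numpy as np

mu, sigma = 0, 11.155   # true mean and SD, as in the simulation above
np.random.seed(0)

estimates = {}
for n in [12, 120, 1200, 120000]:
    y = mu + sigma * np.random.randn(n)        # one sample of size n
    estimates[n] = (y.mean(), y.std(ddof=1))   # sample mean and sample SD
    print(n, np.round(estimates[n], 3))
```

The estimates generally move closer to 0 and 11.155 as $n$ grows, although any single draw can still deviate by chance.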

<br>

To improve the estimates, we can either:
1. Increase sample size ($n$), or
1. Increase the number of simulated experiments ($N$)
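
The second option can be sketched as follows: keep $n = 12$ fixed, simulate many experiments, and average the results across experiments. The value `N = 10000` is an arbitrary choice for illustration, and the variable names are hypothetical. Sample variances are averaged (rather than sample SDs) because the sample variance is an unbiased estimator of $\sigma^2$.

```python
import numpy as np

n, N      = 12, 10000    # sample size and number of simulated experiments (N is arbitrary)
mu, sigma = 0, 11.155    # true mean and SD when H0 is true

np.random.seed(0)
Y = mu + sigma * np.random.randn(N, n)   # each row is one simulated experiment

means = Y.mean(axis=1)           # one sample mean per experiment
vars_ = Y.var(axis=1, ddof=1)    # one sample variance per experiment

mean_est = means.mean()          # average of the N sample means
sd_est   = np.sqrt(vars_.mean()) # SD estimate from the average sample variance

print(round(mean_est, 3), round(sd_est, 3))
```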

<br>

Hypothesis testing probabilities pertain to the second case.

In particular, hypothesis testing probabilities pertain to an infinite number of experiments ($N = \infty$).

The next notebook shows how we can simulate a large number of experiments, to **numerically approximate** the theoretical assumption of $N = \infty$.