In [2]:
## Import required Python modules
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import scipy, scipy.stats
import io
import base64
#from IPython.core.display import display
from IPython.display import display, HTML, Image
from urllib.request import urlopen

try:
    import astropy as apy
    import astropy.table
    _apy = True
    #print('Loaded astropy')
except ImportError:
    _apy = False
    #print('Could not load astropy')

## Customising the font size of figures
plt.rcParams.update({'font.size': 14})

## Customising the look of the notebook
display(HTML("<style>.container { width:95% !important; }</style>"))
## This custom file is adapted from https://github.com/lmarti/jupyter_custom/blob/master/custom.include
HTML('custom.css')
#HTML(urlopen('https://raw.githubusercontent.com/bretonr/intro_data_science/master/custom.css').read().decode('utf-8'))

In [3]:
## Custom imports
from matplotlib.cm import jet
from math import ceil, pi
from scipy.stats import poisson, norm, binom
from matplotlib.collections import PatchCollection
from matplotlib.patches import Circle, Rectangle

In [3]:
## Adding a button to hide the Python source code
HTML('''<script>
code_show=true;
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the Python code."></form>''')

<div class="container-fluid">
    <div class="row">
        <div class="col-md-8" align="center">
            <h1>PHYS 10791: Introduction to Data Science</h1>
            <!--<h3>2019-2020 Academic Year</h3><br>-->
        </div>
        <div class="col-md-3">
            <img align='center' style="border-width:0" src="images/UoM_logo.png"/>
        </div>
    </div>
</div>

<div class="container-fluid">
    <div class="row">
        <div class="col-md-2" align="right">
            <b>Course instructors:&nbsp;&nbsp;</b>
        </div>
        <div class="col-md-9" align="left">
            <a href="http://www.renebreton.org">Prof. Rene Breton</a> - Twitter <a href="https://twitter.com/BretonRene">@BretonRene</a><br>
            <a href="http://www.hep.manchester.ac.uk/u/gersabec">Dr. Marco Gersabeck</a> - Twitter <a href="https://twitter.com/MarcoGersabeck">@MarcoGersabeck</a>
        </div>
    </div>
</div>

# Chapter 8 - Summary Sheet

### 8.1 Decision making

Hypothesis testing is essentially decision making: we interpret the result of a statistical test in order to decide whether a hypothesis should be accepted or rejected.

#### 8.1.1 Introductory examples

In general, we need the following to conduct a hypothesis test:
- The assertion that some hypothesis is true,
- A numerical test that is to be applied to data, and
- A decision to accept or reject the hypothesis depending on the outcome of the test.

#### 8.1.2 Hypotheses

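Hypotheses are statements that are either true or false. **Simple hypotheses** define the probability distribution function completely, including the values of all its parameters. **Composite hypotheses** combine several probability distribution functions, for example when a parameter is only required to lie within a range.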

#### 8.1.3 Alternative hypotheses

In hypothesis tests we often compare a hypothesis to alternative hypotheses.

In general, it is crucial to distinguish between one-tailed (directional) and two-tailed (non-directional) tests.
A two-tailed test compares a test outcome to a value when we do not care whether the outcome is less than or greater than that value.
In a one-tailed test, the sign of the difference between the test outcome and the comparison value matters.
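
As a minimal sketch of the difference, assume (purely for illustration) that the test outcome is Gaussian distributed and lies a hypothetical `z = 1.8` standard deviations above the comparison value. The one-tailed probability counts only outcomes at least this far above the value, while the two-tailed probability counts deviations of at least this size in either direction:

```python
from scipy.stats import norm

# Hypothetical test outcome, in standard deviations from the comparison value
z = 1.8

# One-tailed (directional): only outcomes above the comparison value count
p_one_tailed = norm.sf(z)

# Two-tailed (non-directional): deviations in either direction count
p_two_tailed = 2 * norm.sf(abs(z))

print(f"one-tailed probability: {p_one_tailed:.4f}")
print(f"two-tailed probability: {p_two_tailed:.4f}")
```

For a symmetric distribution such as the Gaussian assumed here, the two-tailed probability is exactly twice the one-tailed one.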


#### 8.1.4 Type I/II errors

The two cases in which the decision taken on the basis of the test does not match the hypothesis actually being true or false are called Type I and Type II errors, according to the following pattern:

| Hypothesis \ Decision | accept | reject |
|:-------------------|:----------:|:----------:|
| **true** | :) | Type I error |
| **false** | Type II error | :) |
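
The table can be illustrated with a small simulation. The sketch below is not part of the original notes: it assumes a Gaussian measurement with unit width, a hypothesis with mean 0, a false-hypothesis scenario in which the true mean is 3, and a hypothetical decision rule "reject if `x > x_cut`". It then counts how often each kind of wrong decision occurs:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_trials = 100_000
x_cut = 1.645  # hypothetical decision boundary: reject the hypothesis if x > x_cut

# Case 1: the hypothesis is true (assumed: Gaussian with mean 0, width 1)
x_true = rng.normal(loc=0.0, scale=1.0, size=n_trials)
type1_rate = np.mean(x_true > x_cut)    # rejected although true

# Case 2: the hypothesis is false (assumed: data really have mean 3, width 1)
x_false = rng.normal(loc=3.0, scale=1.0, size=n_trials)
type2_rate = np.mean(x_false <= x_cut)  # accepted although false

print(f"Type I error rate:  {type1_rate:.3f}")
print(f"Type II error rate: {type2_rate:.3f}")
```

Moving `x_cut` trades one error rate against the other: a higher boundary gives fewer Type I errors but more Type II errors.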

#### 8.1.5 Significance and Power

**Significance**

Type I errors are inevitable, and the rate at which they occur is called the significance.
The significance, $\alpha$, is the integral of the probability distribution of the hypothesis over the rejection region:

$$\alpha=\int_{Reject}P_H(x)dx.$$
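
A minimal numerical sketch of this integral, assuming for illustration that $P_H$ is a standard Gaussian and that the rejection region is the hypothetical interval $x > 1.645$:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# Assumed hypothesis: Gaussian with mean 0 and width 1
P_H = norm(loc=0.0, scale=1.0).pdf

# Hypothetical rejection region: x > x_cut
x_cut = 1.645

# Significance as the integral of P_H over the rejection region
alpha_integral, _ = quad(P_H, x_cut, np.inf)

# Cross-check with the survival function, sf(x) = 1 - cdf(x)
alpha_sf = norm.sf(x_cut, loc=0.0, scale=1.0)

print(f"alpha from integration:       {alpha_integral:.4f}")
print(f"alpha from survival function: {alpha_sf:.4f}")
```

With these assumed values the significance comes out close to 5%.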

**Power**

Considering the alternative hypothesis, we can define the integral of the probability distribution of the alternative hypothesis over the acceptance region, in other words the rate of Type II errors, as

$$\beta=\int_{Accept}P_A(x)dx,$$

or, by integrating over the rejection region as above, we get

$$1-\beta=\int_{Reject}P_A(x)dx,$$

where $1-\beta$ is called the power of the test.
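
As a sketch under stated assumptions, take a Gaussian alternative hypothesis $P_A$ with mean 3 and unit width, and a hypothetical acceptance region $x \le 1.645$ (so that the rejection region is $x > 1.645$). Both integrals then follow directly from the cumulative distribution:

```python
from scipy.stats import norm

# Hypothetical boundary between acceptance (x <= x_cut) and rejection (x > x_cut)
x_cut = 1.645

# Assumed alternative hypothesis: Gaussian with mean 3 and width 1
P_A = norm(loc=3.0, scale=1.0)

# beta: integral of P_A over the acceptance region (Type II error rate)
beta = P_A.cdf(x_cut)

# Power: integral of P_A over the rejection region
power = P_A.sf(x_cut)

print(f"beta  = {beta:.4f}")
print(f"power = {power:.4f}")
```

The two numbers add up to one by construction, since the acceptance and rejection regions together cover all possible outcomes.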

### 8.2 Practical examples

#### 8.2.1 Hypothesis tests with a discrete distribution

In a Poisson test of the hypothesis that a counting experiment results in a count compatible with a certain mean $\lambda$ or smaller, the acceptance region has to satisfy
$$1-\alpha\lt\int_{Accept}Poisson(x;\lambda)dx=\sum_{x=0}^{n}Poisson(x;\lambda)$$
for significance $\alpha$, where $n$ is the upper limit of the acceptance region.
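
A minimal sketch of how $n$ can be found in practice, using `scipy.stats.poisson`. The values `lam = 3.0` and `alpha = 0.05` are hypothetical, and the choice of the smallest $n$ that satisfies the inequality is an assumption of this example:

```python
from scipy.stats import poisson

lam = 3.0     # hypothetical mean of the hypothesis
alpha = 0.05  # hypothetical target significance

# Increase n until the summed probability of 0..n exceeds 1 - alpha
n = 0
while poisson.cdf(n, lam) <= 1 - alpha:
    n += 1

# Because counts are discrete, the actual Type I error rate is
# usually smaller than the target alpha
actual_alpha = poisson.sf(n, lam)  # P(X > n)

print(f"accept counts 0..{n}, reject counts above {n}")
print(f"actual significance: {actual_alpha:.4f}")
```

Because the distribution is discrete, the target significance can generally not be met exactly; the acceptance region is the smallest one whose probability content exceeds $1-\alpha$.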

#### 8.2.3 Interpreting experiments: null hypothesis

We can only ever reject a hypothesis with great confidence, but not accept it. For any theory we want to test, we have to formulate the opposite hypothesis and aim to falsify this. This hypothesis is called the null hypothesis, $H_0$.
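
As a hedged illustration, consider a hypothetical counting experiment in which we would like to establish a signal. The null hypothesis $H_0$ is then the opposite statement: the counts come from background alone, assumed here to be Poisson distributed with mean `lam_background`. All numbers are invented for the example:

```python
from scipy.stats import poisson

lam_background = 3.0  # hypothetical background-only mean under H0
n_observed = 9        # hypothetical observed count
alpha = 0.05          # hypothetical significance of the test

# Probability under H0 of observing n_observed or more counts;
# sf(k) = P(X > k), hence the shift by one
p_tail = poisson.sf(n_observed - 1, lam_background)

reject_null = p_tail < alpha

print(f"P(X >= {n_observed} | H0) = {p_tail:.4f}")
print("reject H0" if reject_null else "cannot reject H0")
```

If $H_0$ is not rejected, this does not show that $H_0$ is true. It only means that the data are compatible with it at the chosen significance.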

<div class="well" align="center">
    <div class="container-fluid">
        <div class="row">
            <div class="col-md-3" align="center">
                <img align="center" alt="Creative Commons License" style="border-width:0" src="https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png" width="60%">
            </div>
            <div class="col-md-8">
            This work is licensed under a <a href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.
            </div>
        </div>
    </div>
    <br>
    <br>
    <i>Note: The content of this Jupyter Notebook is provided for educational purposes only.</i>
</div>