## Statistical hypothesis testing

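A statistical hypothesis test, sometimes called confirmatory data analysis, is a method for deciding whether sample data provide enough evidence to reject a hypothesis about a population. The hypothesis must be testable on the basis of observing a process that is modeled via a set of random variables.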

**Null hypothesis**: A null hypothesis is a precise statement about a population that we try to reject with sample data. It usually states that there is no real difference between the specified populations, so that any observed difference is due to sampling or experimental error. Example: the sample comes from a population whose mean equals the known population mean.

**Alternative hypothesis**: The competing statement that we accept when the null hypothesis is rejected. Example: the sample comes from a population whose mean differs from the known population mean.


- https://www.spss-tutorials.com/null-hypothesis/
- https://statistics.laerd.com/statistical-guides/hypothesis-testing-3.php
- http://blog.minitab.com/blog/statistics-and-quality-data-analysis/what-are-degrees-of-freedom-in-statistics

 

## T-Test and Z-Test

In a z-test, the sample is assumed to come from a normally distributed population. A z-score is calculated from population parameters, namely the population mean and the population standard deviation, and is used to test the hypothesis that the sample was drawn from that population.

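- **Z-Test** is used to test the hypothesis that a sample was drawn from a given population, when the population standard deviation is known
- **T-test** is used to compare the means of two samples, or one sample mean against a known mean, when the population standard deviation is unknown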
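
The z-test described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the population parameters `mu` and `sigma` and the five sample values are hypothetical, and the test statistic is the usual z for a sample mean, `(x̄ - μ) / (σ / √n)`.

```python
import numpy as np
from scipy import stats

# Hypothetical population parameters (a z-test assumes both are known)
mu = 100.0     # population mean
sigma = 15.0   # population standard deviation

# Hypothetical sample of n = 5 observations
sample = np.array([112.0, 108.0, 121.0, 95.0, 104.0])
n = sample.size

# z statistic for the sample mean: (sample mean - mu) / (sigma / sqrt(n))
z_stat = (sample.mean() - mu) / (sigma / np.sqrt(n))

# Two-sided p-value from the standard normal distribution
p_value = 2 * stats.norm.sf(abs(z_stat))

print(f"z = {z_stat:.3f}, p = {p_value:.3f}")
```

A large p-value here would mean the sample is compatible with having been drawn from the stated population.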

### ZScore

Simply put, a z-score is the number of standard deviations a data point lies from the mean. More technically, it is a measure of how many standard deviations below or above the population mean a raw score is. A z-score is also known as a standard score, and it can be placed on a normal distribution curve. Z-scores can take any value, but for normally distributed data about 99.7% of them fall between -3 standard deviations (the far left of the normal distribution curve) and +3 standard deviations (the far right of the curve). To use a z-score, you need to know the population mean μ and the population standard deviation σ.

Z-scores are a way to compare results from a test to a “normal” population. Results from tests or surveys have thousands of possible results and units. However, those results can often seem meaningless. For example, knowing that someone’s weight is 150 pounds might be good information, but if you want to compare it to the “average” person’s weight, looking at a vast table of data can be overwhelming (especially if some weights are recorded in kilograms). A z-score can tell you where that person’s weight is compared to the average population’s mean weight.

- You’ll use a z-score in testing more often than a t-score.

**Z-score**: `z = (X - μ) / σ`
- X is the raw score,
- σ is the population standard deviation and
- μ is the population mean.
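
As a small worked example of the formula, the sketch below computes the z-score for the 150-pound weight mentioned earlier. The population mean of 170 lb and standard deviation of 20 lb are made-up numbers chosen for illustration. It also repeats the calculation in kilograms to show that the z-score does not depend on the unit.

```python
from scipy import stats

# Hypothetical population parameters, in pounds (assumed for illustration)
mu_lb = 170.0
sigma_lb = 20.0
x_lb = 150.0  # the weight from the example in the text

# z-score: number of standard deviations between the raw score and the mean
z = (x_lb - mu_lb) / sigma_lb

# Share of a normal population that lies below this z-score
share_below = stats.norm.cdf(z)

# Same calculation in kilograms: the unit cancels out
LB_TO_KG = 0.45359237
z_kg = (x_lb * LB_TO_KG - mu_lb * LB_TO_KG) / (sigma_lb * LB_TO_KG)

print(f"z = {z:.2f}, share below = {share_below:.3f}, z (kg) = {z_kg:.2f}")
```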

The z-score formula doesn’t say anything about sample size; the rule of thumb is that your sample size should be above 30 before you use it in a test.

Like z-scores, t-scores are also a conversion of individual scores into a standard form. However, t-scores are used when you don’t know the population standard deviation; you estimate it from your sample instead.
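
One consequence of estimating the standard deviation from the sample is that the t-distribution has heavier tails than the normal distribution when the sample is small. The sketch below compares the two-sided 5% critical values of the two distributions for a few illustrative sample sizes (the sizes are arbitrary choices), which gives some intuition for the rule of thumb of 30.

```python
from scipy import stats

# Two-sided 5% critical value of the standard normal distribution
z_crit = stats.norm.ppf(0.975)

# Matching t critical values for several sample sizes (df = n - 1)
t_crits = {}
for n in (5, 10, 30, 100, 1000):
    t_crits[n] = stats.t.ppf(0.975, df=n - 1)
    print(f"n = {n:4d}: t critical = {t_crits[n]:.3f}, z critical = {z_crit:.3f}")
```

The t critical value shrinks toward the z critical value as the sample size grows.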

- https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/z-score/
- http://www.ttable.org/z-score-table.html

### TScore

**T-Test**: The t-test (also called Student’s t-test) compares two averages (means) and tells you whether they differ from each other. It also tells you how significant the difference is; in other words, it lets you know whether the difference could have happened by chance.

- A large t-score tells you that the groups are different.
- A small t-score tells you that the groups are similar.

There are three main types of t-test:

1. An Independent Samples t-test compares the means for two groups.
2. A Paired sample t-test compares means from the same group at different times (say, one year apart).
3. A One sample t-test tests the mean of a single group against a known mean.
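
The three variants map onto three functions in `scipy.stats`. The sketch below runs each of them on simulated data; all group names, means and sample sizes are hypothetical and only serve to show which function fits which design.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Simulated, hypothetical data
group_a = rng.normal(loc=5.0, scale=1.0, size=20)
group_b = rng.normal(loc=5.5, scale=1.0, size=20)
before = rng.normal(loc=70.0, scale=5.0, size=15)
after = before + rng.normal(loc=1.0, scale=2.0, size=15)

# 1. Independent samples: two separate groups
t_ind, p_ind = stats.ttest_ind(group_a, group_b)

# 2. Paired samples: the same subjects measured twice
t_rel, p_rel = stats.ttest_rel(before, after)

# 3. One sample: a single group against a known mean
t_one, p_one = stats.ttest_1samp(group_a, popmean=5.0)

print(f"independent: t = {t_ind:.3f}, p = {p_ind:.3f}")
print(f"paired:      t = {t_rel:.3f}, p = {p_rel:.3f}")
print(f"one sample:  t = {t_one:.3f}, p = {p_one:.3f}")
```

Note that a paired t-test is equivalent to a one-sample t-test on the per-subject differences against a mean of zero.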

When you run a hypothesis test, you use the t statistic together with a p-value. The p-value tells you how likely a result at least as extreme as yours would be if only chance were at work (that is, if the null hypothesis were true). Let’s say you and a group of friends score an average of 205 on a bowling game. You know the average bowler scores 79.7. Should you and your friends consider professional bowling? Or are those scores a fluke? Finding the t statistic and the probability value will give you a good idea. More technically, finding those values will give you evidence of a significant difference between your team’s mean and the population mean (i.e. everyone).

The greater the magnitude of T, the more evidence you have that your team’s scores are significantly different from average. A smaller T value is evidence that your team’s score is not significantly different from average. It’s pretty obvious that your team’s score (205) is very different from 79.7, so you’d want to take a look at the probability value. If the p-value is larger than 5%, scores like your team’s could plausibly be due to chance. If it is very small (under 5%), you’re onto something: think about going professional.

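**T-Test**: `T = (x̄ - μ) / [ s/√(n) ]`

Here x̄ is the sample mean, μ is the population mean, s is the sample standard deviation and n is the sample size.

This has the same form as the z-statistic for a sample mean, `z = (x̄ - μ) / [ σ/√(n) ]`. The differences are that the sample standard deviation s stands in for the population standard deviation σ, and that you look up the result in the t-table, not the z-table. For sample sizes over 30, you’ll get nearly the same result.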
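
To make the bowling example concrete, the sketch below computes a one-sample t statistic by hand, using the sample standard deviation as the estimate of spread, and cross-checks it against `scipy.stats.ttest_1samp`. The five individual scores are hypothetical; they were chosen only so that their mean is 205, as in the text.

```python
import numpy as np
from scipy import stats

# Hypothetical scores for a team of five, chosen so that the mean is 205
scores = np.array([198.0, 210.0, 205.0, 201.0, 211.0])
mu = 79.7  # average bowler's score, from the example in the text

n = scores.size
s = scores.std(ddof=1)  # sample standard deviation (divides by n - 1)

# One-sample t statistic: (sample mean - mu) / (s / sqrt(n))
t_manual = (scores.mean() - mu) / (s / np.sqrt(n))

# Two-sided p-value with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# Cross-check against scipy's one-sample t-test
t_scipy, p_scipy = stats.ttest_1samp(scores, popmean=mu)

print(f"manual: t = {t_manual:.2f}, p = {p_manual:.2e}")
print(f"scipy:  t = {t_scipy:.2f}, p = {p_scipy:.2e}")
```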

- https://www.statisticshowto.datasciencecentral.com/t-statistic/



## TScore vs ZScore

Technically, z-scores are a conversion of individual scores into a standard form. The conversion allows you to compare different data more easily; it relies on your knowledge of the population’s standard deviation and mean. A z-score tells you how many standard deviations from the mean your result is. You can use your knowledge of normal distributions (like the 68-95-99.7 rule) or the z-table to determine what percentage of the population will fall below or above your result.
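
The 68-95-99.7 rule, and the z-table lookup it approximates, can be reproduced with the standard normal distribution in `scipy.stats`. In the sketch below the z-score of 1.5 is an arbitrary example value.

```python
from scipy import stats

# Share of a normal population within 1, 2 and 3 standard deviations of the mean
within = {}
for k in (1, 2, 3):
    within[k] = stats.norm.cdf(k) - stats.norm.cdf(-k)
    print(f"within +/-{k} sd: {within[k]:.4f}")

# Equivalent of a z-table lookup for a hypothetical z-score
z = 1.5
below = stats.norm.cdf(z)  # share of the population below z
above = stats.norm.sf(z)   # share of the population above z
print(f"z = {z}: below = {below:.4f}, above = {above:.4f}")
```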

- https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/hypothesis-testing/t-score-vs-z-score/

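- **Confidence interval**: A range of values, computed from the sample, that is likely to enclose the population parameter being estimated (for example a population mean or a population correlation)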
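
As an illustration, the sketch below builds a 95% confidence interval for a population mean from a small sample, using the t-distribution because the population standard deviation is treated as unknown. The eight sample values are hypothetical, and a mean is used here only because it is the simplest case.

```python
import numpy as np
from scipy import stats

# Hypothetical sample of eight measurements
sample = np.array([4.8, 5.1, 5.6, 4.9, 5.3, 5.0, 5.4, 5.2])
n = sample.size

mean = sample.mean()
sem = sample.std(ddof=1) / np.sqrt(n)  # standard error of the mean

# Two-sided 95% critical value with n - 1 degrees of freedom
t_crit = stats.t.ppf(0.975, df=n - 1)

ci_low = mean - t_crit * sem
ci_high = mean + t_crit * sem

print(f"mean = {mean:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```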

## The P-Value

**p-value**: The probability of finding the observed sample outcome, or a more extreme one, if the null hypothesis is true. Every t-value has a p-value to go with it. A p-value is not the probability that the null hypothesis is true; it measures how surprising your data would be if chance alone were at work. P-values range from 0% to 100% and are usually written as a decimal. For example, a p-value of 5% is 0.05. Low p-values are good; they indicate that your data would be unlikely under the null hypothesis.

```python
import numpy as np
from scipy import stats

# Define two random samples
# Sample size
N = 10
# Gaussian distributed data with mean = 2 and var = 1
a = np.random.randn(N) + 2
# Gaussian distributed data with mean = 0 and var = 1
b = np.random.randn(N)

# Calculate the standard deviation
# For the unbiased estimate of the variance we divide by N-1, hence ddof=1
var_a = a.var(ddof=1)
var_b = b.var(ddof=1)

# Pooled standard deviation (both samples have the same size)
s = np.sqrt((var_a + var_b) / 2)

# Calculate the t-statistic
t = (a.mean() - b.mean()) / (s * np.sqrt(2 / N))

# Degrees of freedom
df = 2 * N - 2

# One-sided tail probability of |t| under the t-distribution
# (abs keeps the two-sided p-value correct when t is negative)
p = 1 - stats.t.cdf(abs(t), df=df)

# The two-sided p-value is twice the one-sided tail probability.
# In the run recorded below it is about 0.0009, well under 0.05, so we
# reject the null hypothesis that the two distributions have the same mean.
print("t = " + str(t))
print("p = " + str(2 * p))

# Cross-check with the built-in scipy function
t2, p2 = stats.ttest_ind(a, b)
print("t = " + str(t2))
print("p = " + str(p2))
```

Output of one run (the data are random, so the values change from run to run):

```
t = 3.952717183300462
p = 0.0009331119058779702
t = 3.952717183300462
p = 0.0009331119058780262
```
