https://towardsdatascience.com/hypothesis-testing-with-python-step-by-step-hands-on-tutorial-with-practical-examples-e805975ea96e

https://towardsdatascience.com/hypothesis-testing-in-machine-learning-using-python-a0dc89e169ce

https://machinelearningmastery.com/statistical-hypothesis-tests-in-python-cheat-sheet/

https://www.investopedia.com/terms/h/hypothesistesting.asp

https://www.pythonfordatascience.org/independent-samples-t-test-python/

https://www.pythonfordatascience.org/parametric-assumptions-python/

# Hypothesis testing

## What is Hypothesis testing?

Hypothesis testing is an act in statistics whereby an analyst tests an assumption regarding a population parameter. The methodology employed by the analyst depends on the nature of the data used and the reason for the analysis.

Hypothesis testing is used to assess the plausibility of a hypothesis by using sample data. Such data may come from a larger population or from a data-generating process. The word "population" is used for both cases in the descriptions that follow.

### Key Takeaways

-   Hypothesis testing is used to assess the plausibility of a hypothesis by using sample data.
-   The test provides evidence concerning the plausibility of the hypothesis, given the data.
-   Statistical analysts test a hypothesis by measuring and examining a random sample of the population being analyzed.

## How Hypothesis Testing Works

In hypothesis testing, an analyst tests a statistical sample, with the goal of providing evidence on the plausibility of the **null hypothesis**.

Statistical analysts test a hypothesis by measuring and examining a random sample of the population being analyzed. All analysts use a random population sample to test two different hypotheses: the **null hypothesis** and the **alternative hypothesis**.

The **null hypothesis** is usually a hypothesis of equality between population parameters; e.g., a **null hypothesis** may state that the population mean return is equal to zero. The **alternative hypothesis** is effectively the opposite of a **null hypothesis** (e.g., the population mean return is not equal to zero). Thus, they are mutually exclusive, and only one can be true. However, one of the two hypotheses will always be true.

### 4 Steps of Hypothesis Testing

All hypotheses are tested using a four-step process:

1.  The first step is for the analyst to state the two hypotheses so that only one can be right.
2.  The next step is to formulate an analysis plan, which outlines how the data will be evaluated.
3.  The third step is to carry out the plan and physically analyze the sample data.
4.  The fourth and final step is to analyze the results and either reject the null hypothesis, or state that the null hypothesis is plausible, given the data.
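The four steps can be sketched in code. The example below is a minimal illustration rather than a prescribed workflow: it assumes a one-sample t-test (`scipy.stats.ttest_1samp`), a 5% significance level, and simulated returns. The variable names and the data are hypothetical.

```python
import numpy as np
from scipy import stats

# Step 1: state the two hypotheses.
# H0: the population mean return equals 0; Ha: it does not.
null_mean = 0.0

# Step 2: formulate the analysis plan
# (assumed here: one-sample t-test at a 5% significance level).
alpha = 0.05

# Step 3: carry out the plan on the sample data (simulated, purely illustrative).
rng = np.random.default_rng(seed=42)
returns = rng.normal(loc=0.01, scale=0.05, size=250)
result = stats.ttest_1samp(returns, popmean=null_mean)

# Step 4: analyze the results and decide.
if result.pvalue < alpha:
    decision = "reject the null hypothesis"
else:
    decision = "fail to reject the null hypothesis"

print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}: {decision}")
```

Fixing the significance level in step 2, before looking at the data, keeps the decision in step 4 from being tuned to the result.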

## Real-World Example of Hypothesis Testing

If, for example, a person wants to test that a penny has exactly a 50% chance of landing on heads, the null hypothesis would be that 50% is correct, and the alternative hypothesis would be that 50% is not correct.

Mathematically, the null hypothesis would be represented as **Ho: P = 0.5**. The alternative hypothesis would be denoted as "Ha" and be identical to the null hypothesis, except with the equal sign struck through, **Ha: P ≠ 0.5**, meaning that it does not equal 50%.

A random sample of 100 coin flips is taken, and the null hypothesis is then tested. If it is found that the 100 coin flips were distributed as 40 heads and 60 tails, the analyst would take this as evidence that the penny does not have a 50% chance of landing on heads and would reject the null hypothesis in favor of the alternative hypothesis. A result this lopsided sits close to the conventional 5% threshold, so the decision depends on the test and the significance level chosen.

If, on the other hand, there were 48 heads and 52 tails, then it is plausible that the coin could be fair and still produce such a result. In cases such as this where the null hypothesis is "accepted," the analyst states that the difference between the expected results (50 heads and 50 tails) and the observed results (48 heads and 52 tails) is "explainable by chance alone."
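One way to quantify how surprising each outcome would be under the null hypothesis is an exact binomial test. The sketch below assumes `scipy.stats.binomtest` with its default two-sided alternative. Whether a given p-value leads to rejection depends on the significance level fixed in the analysis plan.

```python
from scipy.stats import binomtest

n_flips = 100
p_values = {}

# Compare the two outcomes from the example against H0: P = 0.5.
for heads in (40, 48):
    result = binomtest(k=heads, n=n_flips, p=0.5)  # two-sided by default
    p_values[heads] = result.pvalue
    print(f"{heads} heads out of {n_flips}: p-value = {result.pvalue:.4f}")
```

The smaller the p-value, the harder the observed split is to explain by chance alone.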

In [3]:
from numpy.random import binomial
from scipy.stats import binomtest

# Simulate 1,000 tosses per coin (1 = heads, 0 = tails):
# a fair coin (P = 0.5) and a biased coin (P = 0.2).
fair_coin_toss = list(binomial(1, 0.5, 1000))
unfair_coin_toss = list(binomial(1, 0.2, 1000))

# Count the heads ("successes") and the number of trials for each coin.
fair_coin_toss_data = {
    'success': fair_coin_toss.count(1),
    'trials': len(fair_coin_toss),
}

unfair_coin_toss_data = {
    'success': unfair_coin_toss.count(1),
    'trials': len(unfair_coin_toss),
}

# Test each sample against the null hypothesis Ho: P = 0.5.
fair_result = binomtest(k=fair_coin_toss_data['success'], n=fair_coin_toss_data['trials'], p=0.5)
unfair_result = binomtest(k=unfair_coin_toss_data['success'], n=unfair_coin_toss_data['trials'], p=0.5)

print(fair_result)
print(unfair_result)

BinomTestResult(k=493, n=1000, alternative='two-sided', proportion_estimate=0.493, pvalue=0.6810229832764916)
BinomTestResult(k=196, n=1000, alternative='two-sided', proportion_estimate=0.196, pvalue=6.107128666211852e-88)


For the fair coin, the null hypothesis cannot be rejected at the 5% level of significance because the returned p-value (about 0.68) is greater than the significance level of 0.05.

For the unfair coin, the null hypothesis can be rejected because the p-value is far lower than 5%, or 0.05.

For more information check binomial testing:

https://en.wikipedia.org/wiki/Binomial_test


Hypothesis testing builds on two basic concepts: [normalisation](https://en.wikipedia.org/wiki/Normalization_(statistics)) and [standardisation](https://stats.stackexchange.com/questions/10289/whats-the-difference-between-normalization-and-standardization) (sometimes called standard normalisation). The tests in this tutorial revolve around these two terms.
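As a rough illustration of the two terms, the sketch below takes "normalisation" to mean min-max rescaling and "standardisation" to mean z-scores. This is one common usage among several, and the sample values are made up.

```python
import numpy as np

values = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # illustrative data

# Min-max normalisation: rescale the values to the [0, 1] range.
normalised = (values - values.min()) / (values.max() - values.min())

# Standardisation (z-scores): subtract the mean, divide by the standard deviation.
# ddof=0 (population standard deviation) is an assumption; use ddof=1 for the sample version.
standardised = (values - values.mean()) / values.std(ddof=0)

print("normalised:  ", normalised)
print("standardised:", standardised)
```

Test statistics such as t are built the same way as a z-score: a difference divided by a measure of spread.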

For more about null hypothesis: https://www.investopedia.com/terms/n/null_hypothesis.asp.

## T-Test
The independent t-test is a parametric test used to test for a statistically significant difference in the means between 2 groups. As with all parametric tests, certain conditions need to be met for the test results to be considered reliable.

A t-test requires that the measured values are on a ratio or interval scale, that the sample is drawn by simple random sampling, and that the data have homogeneous variance, an appropriate sample size, and a normal distribution. The normality assumption means that the collected data follow a normal distribution, which is essential for a parametric test [Source](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6676026/#:~:text=The%20conditions%20required%20to%20conduct%20a%20t%2Dtest%20include%20the,and%20normal%20distribution%20of%20data.).



### Parametric assumptions

Parametric tests have the same assumptions, or conditions, that need to be met in order for the analysis to be considered reliable.

1.  Independence
2.  Population distributions are normal
3.  Samples have equal variances
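The normality and equal-variance assumptions can be checked with standard tests. The sketch below is one possible approach, assuming the Shapiro-Wilk test (`scipy.stats.shapiro`) for normality and Levene's test (`scipy.stats.levene`) for equal variances. The two groups are simulated and hypothetical. Independence is a property of the study design and cannot be checked this way.

```python
import numpy as np
from scipy import stats

# Simulated groups, purely illustrative.
rng = np.random.default_rng(seed=0)
group_a = rng.normal(loc=150, scale=12, size=60)
group_b = rng.normal(loc=155, scale=15, size=60)

# Normality: Shapiro-Wilk test per group
# (H0: the sample comes from a normal distribution).
normality_p = {}
for name, sample in (("group_a", group_a), ("group_b", group_b)):
    w_stat, p_normal = stats.shapiro(sample)
    normality_p[name] = p_normal
    print(f"{name}: Shapiro-Wilk W = {w_stat:.3f}, p = {p_normal:.3f}")

# Equal variances: Levene's test (H0: the groups have equal variances).
levene_stat, p_levene = stats.levene(group_a, group_b)
print(f"Levene W = {levene_stat:.3f}, p = {p_levene:.3f}")
```

In both tests a small p-value is evidence against the assumption, so a large p-value is the reassuring outcome here.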


Next, load the data set and take a high-level look at the variables.

In [5]:
import pandas as pd
import researchpy as rp
import scipy.stats as stats

In [7]:
df = pd.read_csv("https://raw.githubusercontent.com/researchpy/Data-sets/master/blood_pressure.csv")
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 120 entries, 0 to 119
Data columns (total 5 columns):
 #   Column     Non-Null Count  Dtype 
---  ------     --------------  ----- 
 0   patient    120 non-null    int64 
 1   sex        120 non-null    object
 2   agegrp     120 non-null    object
 3   bp_before  120 non-null    int64 
 4   bp_after   120 non-null    int64 
dtypes: int64(3), object(2)
memory usage: 4.8+ KB


### Independent T-test using researchpy

The method returns 2 data frames: one contains the summary statistics and the other contains the statistical test information. If the returned data frames are not unpacked into separate Python objects, the output is displayed as a tuple and is harder to read, so the example below stores them as `summary` and `results`.

In [5]:
summary, results = rp.ttest(group1= df['bp_after'][df['sex'] == 'Male'], group1_name= "Male",
                            group2= df['bp_after'][df['sex'] == 'Female'], group2_name= "Female")
print(summary)

   Variable      N        Mean         SD        SE   95% Conf.    Interval
0      Male   60.0  155.516667  15.243217  1.967891  151.578926  159.454407
1    Female   60.0  147.200000  11.742722  1.515979  144.166533  150.233467
2  combined  120.0  151.358333  14.177622  1.294234  148.795621  153.921046




In [6]:
print(results)

              Independent t-test   results
0  Difference (Male - Female) =     8.3167
1          Degrees of freedom =   118.0000
2                           t =     3.3480
3       Two side test p value =     0.0011
4      Difference < 0 p value =     0.9995
5      Difference > 0 p value =     0.0005
6                   Cohen's d =     0.6112
7                   Hedge's g =     0.6074
8              Glass's delta1 =     0.5456
9            Point-Biserial r =     0.2945


### Interpretation
The average blood pressure after the treatment for males, M= 155.5 (151.6, 159.5), was statistically significantly higher than for females, M= 147.2 (144.2, 150.2); t(118)= 3.3480, p= 0.001.
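The reported t statistic can be cross-checked from the summary table alone. The sketch below assumes `scipy.stats.ttest_ind_from_stats` with pooled (equal) variances and plugs in the means, standard deviations, and group sizes printed above. The variable names are illustrative.

```python
from scipy import stats

# Summary statistics copied from the researchpy summary table (bp_after by sex).
male_mean, male_sd, male_n = 155.516667, 15.243217, 60
female_mean, female_sd, female_n = 147.200000, 11.742722, 60

# Independent t-test computed from summary statistics, assuming equal variances.
result = stats.ttest_ind_from_stats(
    mean1=male_mean, std1=male_sd, nobs1=male_n,
    mean2=female_mean, std2=female_sd, nobs2=female_n,
    equal_var=True,
)

# With pooled variances the degrees of freedom are n1 + n2 - 2.
degrees_of_freedom = male_n + female_n - 2

print(f"t({degrees_of_freedom}) = {result.statistic:.4f}, p = {result.pvalue:.4f}")
```

Up to rounding of the inputs, this should agree with the t statistic and two-sided p-value in the results table.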

### Independent T-test using scipy.stats

This method conducts the independent-samples t-test and returns only the t statistic and its associated p-value. For more information about this method, please refer to the official documentation page.

In [8]:
stats.ttest_ind(df['bp_after'][df['sex'] == 'Male'],
                df['bp_after'][df['sex'] == 'Female'])

Ttest_indResult(statistic=3.3479506182111387, pvalue=0.0010930222986154283)

### Interpretation

There is a statistically significant difference in the average post-treatment blood pressure between males and females, t= 3.3480, p= 0.001.
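If the equal-variance assumption is in doubt, `stats.ttest_ind` accepts `equal_var=False`, which runs Welch's t-test instead of the pooled-variance version. The sketch below compares the two on simulated groups with different spreads and sizes. The data are hypothetical and are not the blood pressure data set.

```python
import numpy as np
from scipy import stats

# Simulated groups with unequal variances and unequal sizes (illustrative only).
rng = np.random.default_rng(seed=1)
group_a = rng.normal(loc=155, scale=15, size=60)
group_b = rng.normal(loc=147, scale=8, size=25)

# Pooled-variance (Student) t-test: assumes equal variances.
student = stats.ttest_ind(group_a, group_b)

# Welch's t-test: does not assume equal variances.
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

print(f"Student: t = {student.statistic:.3f}, p = {student.pvalue:.4f}")
print(f"Welch:   t = {welch.statistic:.3f}, p = {welch.pvalue:.4f}")
```

When the group sizes and variances are similar, the two versions give nearly the same answer. They diverge as the groups become more unbalanced.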