https://towardsdatascience.com/hypothesis-testing-with-python-step-by-step-hands-on-tutorial-with-practical-examples-e805975ea96e

https://towardsdatascience.com/hypothesis-testing-in-machine-learning-using-python-a0dc89e169ce

https://machinelearningmastery.com/statistical-hypothesis-tests-in-python-cheat-sheet/

https://www.investopedia.com/terms/h/hypothesistesting.asp

# Hypothesis testing

## What is Hypothesis testing?

Hypothesis testing is a statistical procedure in which an analyst tests an assumption about a population parameter. The methodology used depends on the nature of the data and the reason for the analysis.

Hypothesis testing is used to assess the plausibility of a hypothesis by using sample data. Such data may come from a larger population or from a data-generating process. The word "population" is used for both cases in the descriptions that follow.

### Key Takeaways

-   Hypothesis testing is used to assess the plausibility of a hypothesis by using sample data.
-   The test provides evidence concerning the plausibility of the hypothesis, given the data.
-   Statistical analysts test a hypothesis by measuring and examining a random sample of the population being analyzed.

## How Hypothesis Testing Works

In hypothesis testing, an analyst tests a statistical sample, with the goal of providing evidence on the plausibility of the **null hypothesis**.

Statistical analysts test a hypothesis by measuring and examining a random sample of the population being analyzed. The sample is used to weigh two competing hypotheses: the **null hypothesis** and the **alternative hypothesis**.

The **null hypothesis** is usually a hypothesis of equality between population parameters; e.g., a **null hypothesis** may state that the population mean return is equal to zero. The **alternative hypothesis** is effectively the opposite of a **null hypothesis** (e.g., the population mean return is not equal to zero). Thus, they are mutually exclusive, and only one can be true. However, one of the two hypotheses will always be true.

### 4 Steps of Hypothesis Testing

All hypotheses are tested using a four-step process:

1.  The first step is for the analyst to state the two hypotheses so that only one can be right.
2.  The next step is to formulate an analysis plan, which outlines how the data will be evaluated.
3.  The third step is to carry out the plan and physically analyze the sample data.
4.  The fourth and final step is to analyze the results and either reject the null hypothesis, or state that the null hypothesis is plausible, given the data.
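
The four steps can be sketched in code. This is a minimal illustration rather than a prescribed procedure: the `returns` values are made up, the 5% significance level is an assumed choice, and a one-sample t-test (`scipy.stats.ttest_1samp`) is only one of several tests that could serve as the analysis plan.

```python
from scipy.stats import ttest_1samp

# Hypothetical sample of periodic returns (illustrative values, not real data).
returns = [0.021, 0.018, 0.025, 0.019, 0.022, 0.020, 0.023, 0.017]

# Step 1: state the two hypotheses so that only one can be right.
#   H0: the population mean return equals 0
#   Ha: the population mean return does not equal 0
null_mean = 0.0

# Step 2: fix the analysis plan before looking at the result:
# a two-sided one-sample t-test at an assumed 5% significance level.
alpha = 0.05

# Step 3: carry out the plan on the sample data.
result = ttest_1samp(returns, popmean=null_mean)

# Step 4: interpret the result.
reject_null = result.pvalue < alpha
print(f"t = {result.statistic:.2f}, p-value = {result.pvalue:.3g}")
print("Reject H0" if reject_null else "Cannot reject H0: it stays plausible given the data")
```

Fixing `alpha` before running the test keeps the decision rule independent of the observed result.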

## Real-World Example of Hypothesis Testing

If, for example, a person wants to test that a penny has exactly a 50% chance of landing on heads, the null hypothesis would be that 50% is correct, and the alternative hypothesis would be that 50% is not correct.

Mathematically, the null hypothesis would be represented as **Ho: P = 0.5**. The alternative hypothesis would be denoted as "Ha" and be identical to the null hypothesis, except with the equal sign struck through, **Ha: P ≠ 0.5**, meaning that it does not equal 50%.

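A random sample of 100 coin flips is taken, and the null hypothesis is then tested. If the 100 flips come out as 40 heads and 60 tails, the result is far enough from the expected 50/50 split that the analyst would doubt the penny has a 50% chance of landing on heads. Strictly, the decision depends on the chosen significance level: an exact two-sided binomial test of this outcome gives a p-value of about 0.057, so the null hypothesis is rejected in favour of the alternative at the 10% level but narrowly not at the 5% level.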

If, on the other hand, there were 48 heads and 52 tails, then it is plausible that the coin could be fair and still produce such a result. In cases such as this where the null hypothesis is "accepted," the analyst states that the difference between the expected results (50 heads and 50 tails) and the observed results (48 heads and 52 tails) is "explainable by chance alone."
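
To make "explainable by chance alone" concrete, the sketch below computes an exact two-sided binomial p-value for both outcomes using only the standard library. The helper names `binomial_pmf` and `two_sided_p_value` are hypothetical, and the two-sided rule used here (sum the probabilities of all outcomes that are no more likely than the observed one) is one common convention, assumed for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided p-value: total probability, under H0, of every
    outcome that is no more likely than the observed count k."""
    observed = binomial_pmf(k, n, p)
    # Small relative tolerance so ties are not lost to floating-point rounding.
    threshold = observed * (1 + 1e-7)
    probabilities = (binomial_pmf(i, n, p) for i in range(n + 1))
    return sum(prob for prob in probabilities if prob <= threshold)


p_40_heads = two_sided_p_value(40, 100)
p_48_heads = two_sided_p_value(48, 100)
print(f"40 heads: p-value = {p_40_heads:.4f}")
print(f"48 heads: p-value = {p_48_heads:.4f}")
```

Under these assumptions the 48-heads outcome gives a large p-value, while the 40-heads outcome lands close to the conventional 5% threshold, so the conclusion drawn from it is sensitive to the significance level chosen.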

```python
from numpy.random import binomial
from scipy.stats import binomtest

# Simulate 100 tosses each (1 = heads, 0 = tails); unseeded, so counts vary per run.
fairCoinToss: list = list(binomial(1, 0.5, 100))    # P(heads) = 0.5
unfairCoinToss: list = list(binomial(1, 0.2, 100))  # P(heads) = 0.2

fairCoinTossData = {
    'success': fairCoinToss.count(1),
    'trials': len(fairCoinToss),
}

unfairCoinTossData = {
    'success': unfairCoinToss.count(1),
    'trials': len(unfairCoinToss),
}

# Test both samples against H0: P(heads) = 0.5 (two-sided exact binomial test).
fairResult = binomtest(k=fairCoinTossData['success'], n=fairCoinTossData['trials'], p=0.5)
unfairResult = binomtest(k=unfairCoinTossData['success'], n=unfairCoinTossData['trials'], p=0.5)

print(fairResult)
print(unfairResult)
```

```
BinomTestResult(k=46, n=100, alternative='two-sided', proportion_estimate=0.46, pvalue=0.48411841360729146)
BinomTestResult(k=26, n=100, alternative='two-sided', proportion_estimate=0.26, pvalue=1.6673626494501009e-06)
```


In the fair coin toss, the null hypothesis cannot be rejected at the 5% level of significance because the returned p-value is greater than 0.05.

Meanwhile, in the unfair coin toss, the null hypothesis can be rejected because the p-value is far lower than 0.05.
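
The decision rule used here can be written as a small helper. This is only a sketch: `decide` is a hypothetical name, and the default `alpha = 0.05` reflects the 5% level used in this example, not a universal rule.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value with the significance level alpha."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "cannot reject the null hypothesis"


# p-values copied from the printed results above.
fair_p_value = 0.48411841360729146
unfair_p_value = 1.6673626494501009e-06

print("fair coin:  ", decide(fair_p_value))
print("unfair coin:", decide(unfair_p_value))
```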

For more information check binomial testing:

https://en.wikipedia.org/wiki/Binomial_test


The basis of hypothesis testing is [normalisation](https://en.wikipedia.org/wiki/Normalization_(statistics)) and [standard normalisation](https://stats.stackexchange.com/questions/10289/whats-the-difference-between-normalization-and-standardization). All of our hypothesis tests revolve around these two terms, so let's look at them.
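
As a rough illustration of the two terms, the sketch below assumes that "normalisation" means min-max rescaling to the range [0, 1] and that "standard normalisation" means z-score standardisation (subtract the mean, divide by the standard deviation). Usage of these words varies between sources, and the function names and the `sample` values are made up for the example.

```python
from statistics import mean, pstdev


def min_max_normalise(values: list[float]) -> list[float]:
    """Rescale values linearly so the smallest becomes 0 and the largest 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardise(values: list[float]) -> list[float]:
    """Convert values to z-scores: mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


sample = [2.0, 4.0, 6.0, 8.0, 10.0]  # illustrative data only
normalised = min_max_normalise(sample)
standardised = standardise(sample)
print("normalised:  ", normalised)
print("standardised:", [round(z, 3) for z in standardised])
```

Both helpers assume the values are not all identical; otherwise the denominator would be zero.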

For more about null hypothesis: https://www.investopedia.com/terms/n/null_hypothesis.asp.