# Chapter 1

- Hypothesis testing:
    - Layman's terms: a method for checking whether an assumption is plausible given the data or is just nonsense.
    - Hypothesis: a statement about an unknown population parameter (for example, your guess of the population mean)
    - Hypothesis test: a test of two competing hypotheses.
        - Null hypothesis: the existing idea or default assumption (the guess being tested)
        - Alternative hypothesis: the challenging idea against the null hypothesis
    - 3 tests: 
        - direction of test = direction of alternative hypothesis (or challenging idea)
        - hypothesis for less or fewer = left tailed test
        - hypothesis for greater or more = right tailed test
        - hypothesis for different = two tailed test
    - Testing Method : Compute sample mean and see the difference between guessed mean and sample mean. Measure if the difference is significant or not.
    - Solution:
        - z-test
        - proportion z-test
        - t-test
        - ANOVA
        - Chi-square test of independence
        - Chi-square goodness of fit
        - Non-parametric Wilcoxon signed-rank test
        - Non-parametric Wilcoxon-Mann-Whitney test
        - Non-parametric Kruskal-Wallis test
- A/B Testing : Hypothesis testing on 2 different scenarios (control vs treatment group)
- p-value:
    - probability of obtaining a result at least as extreme as the observed one, assuming the null hypothesis is true
    - computed from a test statistic (such as a z-score) and compared against the significance level (alpha)
    - Large p-value = test statistic not in the tails = data consistent with the null hypothesis (p_value > alpha), so fail to reject it
    - Small p-value = test statistic in the tails = evidence against the null hypothesis (p_value <= alpha), so reject it
    - cut-off point = significance level (alpha); a common choice is 5%
    - confidence level = 1 - alpha = 95%
    - if the hypothesized population parameter is within the confidence interval, you should fail to reject the null hypothesis.
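- The null hypothesis plays the role of the "negative" outcome (no effect).
- False Positive = Type 1 error (rejecting a null hypothesis that is true)
- False Negative = Type 2 error (failing to reject a null hypothesis that is false)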
    <center><img src="images/01.01.png"  style="width: 400px; height: 300px;"/></center>
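
The link between test direction, p-value, and significance level can be sketched in a few lines. This is a minimal illustration, not a full test: it assumes the test statistic is a z-score that follows a standard normal distribution under the null hypothesis. The function name `p_value_from_z` and the example z-score are made up for this sketch; `scipy.stats.norm` would give the same numbers as the standard-library `statistics.NormalDist` used here.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def p_value_from_z(z_score, alternative="two-sided"):
    """Convert a z-score to a p-value for the given test direction."""
    if alternative == "less":        # left-tailed test
        return STANDARD_NORMAL.cdf(z_score)
    if alternative == "greater":     # right-tailed test
        return 1.0 - STANDARD_NORMAL.cdf(z_score)
    if alternative == "two-sided":   # both tails
        return 2.0 * (1.0 - STANDARD_NORMAL.cdf(abs(z_score)))
    raise ValueError(f"unknown alternative: {alternative!r}")


alpha = 0.05   # significance level
z = -2.1       # illustrative z-score, not taken from any dataset
for alt in ("less", "greater", "two-sided"):
    p = p_value_from_z(z, alt)
    decision = "reject H0" if p <= alpha else "fail to reject H0"
    print(f"{alt:>9}: p = {p:.4f} -> {decision}")
```

The same z-score can lead to different decisions depending on the direction of the alternative hypothesis, which is why the direction has to be fixed before looking at the data.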

# Chapter 2

- z-test
    - hypothesis testing on a one-sample problem
    - Used when the population standard deviation is known (or the standard error can be estimated well, e.g. from a bootstrap distribution)
    - Requires a z-score (a measure of how many standard deviations a value is from the mean; here, how many standard errors the sample mean is from the hypothesized mean)
    - We test the difference between the hypothesized population mean and the sample mean, relative to its distribution
    - Procedure:
        - Bootstrap the sample and compute the mean of each resample
        - measure the standard deviation of the bootstrap distribution (this estimates the standard error)
        - find z-score = (sample mean - hypothesized mean) / standard deviation of bootstrap distribution
        - convert the z-score to a p-value
        - Compare the p-value with the significance level
- Proportions z-test
    - tests whether the difference in proportions between two groups is statistically significant
    - Gives an idea of whether two categorical columns are associated
    - a small p-value (<= significance level) suggests the variables are associated
- t-test 
    - hypothesis testing on a two-sample problem; allows for the extra uncertainty of estimating the standard error from the samples
    - Used when a sample standard deviation is used to estimate the standard error instead of a population parameter.
    - Requires a t-statistic with degrees of freedom (number of observations - 1 for one sample or paired data; n1 + n2 - 2 for two independent samples)
    - We test the difference of sample means relative to its distribution
    - paired t-test: the two samples are not independent (e.g. the same subjects measured twice), so the test is run on the per-pair differences
    - Procedure:
        - get the sample mean of each of the two groups
        - measure the standard deviation of each group
        - find t-stat = (mean1 - mean2) / sqrt(std1^2 / n1 + std2^2 / n2)
        - convert the t-stat to a p-value
        - Compare the p-value with the significance level
- ANOVA test
    - hypothesis testing for differences across more than 2 groups
    - The chance of a false positive grows with the number of pairwise comparisons; use a correction method (e.g. Bonferroni) to adjust the p-values
- Chi-square test of independence
    - tests whether the differences in proportions of multiple categories across two variables are statistically significant
    - Indicates whether the two categorical variables are statistically independent of each other
    - Statistical independence: the proportion of successes in the response variable is the same across all categories of the explanatory variable
    - p-value < significance level means the variables are not independent; they are associated
    - No direction, tail, or alternative argument: the statistic is a sum of squares, so it is always a right-tailed test
- Chi-square goodness of fit
    - Checks whether the hypothesized proportions are a good fit for the sample proportions
    - Used to compare the distribution of a categorical variable in a sample against a hypothesized distribution
    - if p-value < significance level, the hypothesized proportions are not a good fit for the sample proportions
- Non-parametric tests
    - Used when the sample size is too small for parametric tests (< 30 for tests on means, < 10 per category for proportion tests, < 5 per category for chi-square tests)
    - Used when the distribution is not normal
- Non-parametric Wilcoxon signed-rank test:
    - Works well with paired data
    - Take the elementwise difference between the 2 columns
    - Take the absolute value of the differences
    - Rank the absolute values
    - Split the ranks into two groups: sum the ranks of the differences that were less than zero, and sum the ranks of the differences that were greater than zero
    - The test statistic is the minimum of the two sums
- Non-parametric Wilcoxon-Mann-Whitney test:
    - Works well with unpaired data from two groups
    - requires converting the data from long to wide format (one column per group)
- Non-parametric Kruskal-Wallis test:
    - Works well for data with multiple categories
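
The z-test procedure listed above (bootstrap, standard error, z-score, p-value, comparison with alpha) can be sketched with the standard library only. This is a rough illustration under assumptions: the function name `bootstrap_z_test`, the number of resamples, the seed, and the data are all made up for the example, and the z-score is assumed to be approximately standard normal under the null hypothesis.

```python
import random
from statistics import NormalDist, mean, stdev


def bootstrap_z_test(sample, hypothesized_mean, alternative="two-sided",
                     n_resamples=2000, seed=0):
    """Return (z_score, p_value) using a bootstrap estimate of the standard error."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)

    # Resample with replacement and record the mean of each resample.
    boot_means = [mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]

    # The spread of the bootstrap distribution estimates the standard error.
    std_error = stdev(boot_means)
    z_score = (mean(sample) - hypothesized_mean) / std_error

    # Convert the z-score to a p-value with the standard normal CDF.
    cdf = NormalDist().cdf(z_score)
    if alternative == "less":
        p_value = cdf
    elif alternative == "greater":
        p_value = 1.0 - cdf
    elif alternative == "two-sided":
        p_value = 2.0 * min(cdf, 1.0 - cdf)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z_score, p_value


# Hypothetical data: 100 values whose sample mean is exactly 50.
sample = [48, 52, 50, 49, 51] * 20
alpha = 0.05
z_score, p_value = bootstrap_z_test(sample, hypothesized_mean=49.5,
                                    alternative="greater")
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"z = {z_score:.2f}, p = {p_value:.4f} -> {decision}")
```

With real data, the bootstrap distribution should also be inspected (for example with a histogram) to confirm that it looks roughly normal before trusting the p-value.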

### Hypothesis tests

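```python
import pingouin
from statsmodels.stats.weightstats import ztest

# Z-test (one-directional, left-tailed). pingouin has no ztest function, so this uses
# statsmodels, where the left-tailed alternative is spelled "smaller".
ztest(x1=df['col1'], x2=df['col2'], alternative='smaller')
# Paired t-test (one-directional, left-tailed)
pingouin.ttest(x=df['col1'], y=df['col2'], paired=True, alternative="less")
# ANOVA test across all groups
pingouin.anova(data=df, dv="col", between="cat_col")
# Pairwise t-tests for every combination of groups, with Bonferroni-adjusted p-values
pingouin.pairwise_tests(data=df, dv="col", between="cat_col", padjust="bonf")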

# Proportion Z-test
from statsmodels.stats import proportion
z_score, p_val = proportion.proportions_ztest(count=[len(df[(df['cat_col1'] == 'A') & (df['cat_col2'] == 'X')]),
                                            len(df[(df['cat_col1'] == 'B') & (df['cat_col2'] == 'X')])],
                                      nobs=[len(df[df['cat_col1'] == 'A']),
                                            len(df[df['cat_col1'] == 'B'])])
# Chi-square test of independence
expected, observed, stats = pingouin.chi2_independence(data=df, x="catcol1", y="catcol2")
print(stats)


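# Chi-square test on the counts from two groupby results.
# chi2_contingency returns four values: statistic, p-value, degrees of freedom, expected counts.
# For a goodness-of-fit test against hypothesized counts, scipy.stats.chisquare(f_obs, f_exp) is the usual choice.
from scipy.stats import chi2_contingency
chi2, p, dof, expected = chi2_contingency([df1["col"], df2["col"]])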

# Non-parametric Wilcoxon signed-rank test (alternative to the paired t-test)
pingouin.wilcoxon(x=df['col1'], y=df['col2'], alternative="less")

# Non-parametric Wilcoxon-Mann-Whitney test (alternative to the unpaired two-group t-test)
# Convert from long to wide format first: one column per category
df_wide = df.pivot(columns='cat_col',values='num_col')
pingouin.mwu(x=df_wide['cat1'], y=df_wide['cat2'], alternative='greater')

# Non-parametric Kruskal-Wallis test (alternative to ANOVA for multiple categories)
pingouin.kruskal(data=df, dv='num_col', between='cat_col')

```
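
The `pingouin.ttest` call above is paired and hides the arithmetic. As a rough sketch of the unpaired t-statistic formula from the t-test notes, the calculation can be written out by hand. The function name `two_sample_t_test` and the data are hypothetical, and the degrees of freedom are assumed to be n1 + n2 - 2, which is a simplification; library routines may use a different value (for example the Welch approximation).

```python
import math
from statistics import mean, stdev

from scipy.stats import t


def two_sample_t_test(sample1, sample2, alternative="two-sided"):
    """Return (t_stat, p_value) for the difference in means of two unpaired samples."""
    n1, n2 = len(sample1), len(sample2)

    # Standard error of the difference in means, from the sample standard deviations.
    std_error = math.sqrt(stdev(sample1) ** 2 / n1 + stdev(sample2) ** 2 / n2)
    t_stat = (mean(sample1) - mean(sample2)) / std_error

    # Assumption for this sketch: degrees of freedom = n1 + n2 - 2.
    dof = n1 + n2 - 2

    # Convert the t-statistic to a p-value with the t-distribution.
    if alternative == "less":
        p_value = t.cdf(t_stat, df=dof)
    elif alternative == "greater":
        p_value = t.sf(t_stat, df=dof)
    elif alternative == "two-sided":
        p_value = 2.0 * t.sf(abs(t_stat), df=dof)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return t_stat, float(p_value)


# Hypothetical measurements for two independent groups.
group_a = [10, 11, 12, 13, 14]
group_b = [1, 2, 3, 4, 5]
alpha = 0.05
t_stat, p_value = two_sample_t_test(group_a, group_b, alternative="greater")
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"t = {t_stat:.2f}, p = {p_value:.6f} -> {decision}")
```

For a paired test, the same idea applies to the per-pair differences, treated as a single sample with n - 1 degrees of freedom.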

# Chapter 3

- The chi-square test of independence is used to determine the relationship between 2 categorical variables.
- It tests for association between the 2 variables and tells us how likely it is that the observed association is due to chance.
- The test assumes (null hypothesis) that the variables are independent.
- If the independence model does not fit the data, that is evidence that the variables are dependent.

calculation steps: 

- build a contingency table of observed values from the dataset.
- for each cell of the contingency table, compute expected = (row total * column total) / grand total
- this creates a table of expected values
- calculate the chi-square value: SUM((observed - expected)^2 / expected)
- determine the p-value for that chi-square value, using degrees of freedom = (rows - 1) * (columns - 1)
- if p < 0.05, then the variables are not independent (reject the null)
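
The calculation steps above can be followed by hand on a small table. This is a minimal sketch: the function name `chi_square_independence` and the counts are hypothetical, and no continuity correction is applied, so the result can differ from `scipy.stats.chi2_contingency`, which applies the Yates correction to 2x2 tables by default.

```python
from scipy.stats import chi2


def chi_square_independence(observed):
    """Return (statistic, p_value, dof, expected) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count per cell: (row total * column total) / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Right-tailed p-value with (rows - 1) * (columns - 1) degrees of freedom.
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    p_value = float(chi2.sf(statistic, df=dof))
    return statistic, p_value, dof, expected


# Hypothetical contingency table: rows are groups, columns are outcomes.
observed = [[30, 10],
            [20, 40]]
statistic, p_value, dof, expected = chi_square_independence(observed)
decision = "reject H0 (not independent)" if p_value < 0.05 else "fail to reject H0"
print(f"chi2 = {statistic:.3f}, dof = {dof}, p = {p_value:.6f} -> {decision}")
```

Every expected count should be checked against the minimum-count rule for chi-square tests before the p-value is trusted.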

# Chapter 4

- Sampled data must be randomly collected from the population
- Minimum sample size (our generic assumption for parametric hypothesis tests)
    - for tests on means: at least 30 observations in each sample
    - for proportion tests: at least 10 observations in each category
    - for chi-square tests: at least 5 observations in each category
    - sanity check: the bootstrap distribution should look approximately normal
- Otherwise, use non-parametric hypothesis tests
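
These rules of thumb are easy to encode as a quick pre-check. The sketch below is only an illustration: the function names and the example labels are invented here, and the thresholds are simply the ones listed above rather than hard statistical limits.

```python
from collections import Counter

# Rule-of-thumb thresholds taken from the list above.
MIN_FOR_MEAN_TEST = 30
MIN_FOR_PROPORTION_TEST = 10
MIN_FOR_CHI_SQUARE = 5


def enough_for_mean_test(values):
    """True if the sample has at least 30 observations."""
    return len(values) >= MIN_FOR_MEAN_TEST


def enough_per_category(labels, minimum):
    """True if every category that appears in `labels` has at least `minimum` rows.

    Note: a category that never appears at all cannot be detected here.
    """
    counts = Counter(labels)
    return bool(counts) and all(count >= minimum for count in counts.values())


# Hypothetical categorical column with two classes.
labels = ["late"] * 12 + ["on_time"] * 8
print("proportion test ok:", enough_per_category(labels, MIN_FOR_PROPORTION_TEST))
print("chi-square test ok:", enough_per_category(labels, MIN_FOR_CHI_SQUARE))
```

When a check fails, the notes above suggest falling back to the matching non-parametric test.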