# Non-Parametric Tests

## Wilcoxon Paired Test
It is used to compare two paired samples (also known as the Wilcoxon signed-rank test).

Example:


![image.png](attachment:image.png)

    H0 - There is no significant difference in the patient's calcium level between the initial value and the value after 2 weeks
    H1 - There is a significant difference in the patient's calcium level between the initial value and the value after 2 weeks

---

    from scipy.stats import wilcoxon
    stats, p = wilcoxon(dataset.TOTALCIN, dataset.TOTALCW2)


> **stats** - test statistic (*compare it against the critical value in the critical value approach*)

> **p** - p value (*can be used for the p value approach*)
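
As a minimal sketch of the p value approach, the snippet below runs the test on made-up calcium readings. The values, the names `initial` and `week2`, and the 0.05 significance level are assumptions for illustration, not taken from the dataset above.

```python
from scipy.stats import wilcoxon

# Hypothetical calcium readings for 8 patients (illustrative values only).
initial = [9.1, 8.7, 9.4, 8.9, 9.0, 9.3, 8.8, 9.2]
week2 = [9.3, 9.0, 9.8, 9.4, 9.6, 10.0, 9.6, 10.1]

# Each position in the two lists belongs to the same patient (paired data).
stat, p = wilcoxon(initial, week2)

alpha = 0.05  # assumed significance level
reject_h0 = p < alpha  # p value approach: reject H0 when p is below alpha
print(f"statistic={stat}, p={p:.4f}, reject H0: {reject_h0}")
```

Every patient's value rises in this toy data, so the smaller rank sum is 0 and H0 is rejected at the assumed level.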




## Friedman Test

It is used to compare more than two paired samples

Example:

![image.png](attachment:image.png)

    H0 - There is no significant difference in the patient's calcium level initially, after 2 weeks and after 4 weeks
    H1 - There is a significant difference in the patient's calcium level initially, after 2 weeks and after 4 weeks
---

    from scipy.stats import friedmanchisquare

    stats, p = friedmanchisquare(dataset.TOTALCIN, dataset.TOTALCW2, dataset.TOTALCW4)

> **stats** - test statistic (*compare it against the critical value in the critical value approach*)

> **p** - p value (*can be used for the p value approach*)
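
A small runnable sketch of the same call, assuming a DataFrame whose rows are patients and whose three columns hold the repeated measurements. The column names follow the example above, but the readings are invented.

```python
import pandas as pd
from scipy.stats import friedmanchisquare

# Hypothetical readings: one row per patient, one column per time point.
dataset = pd.DataFrame({
    "TOTALCIN": [9.1, 8.7, 9.4, 8.9, 9.0, 9.3],
    "TOTALCW2": [9.5, 9.0, 9.8, 9.4, 9.3, 9.9],
    "TOTALCW4": [9.9, 9.6, 10.1, 9.8, 9.7, 10.2],
})

# The test ranks the three values within each row (patient),
# then compares the rank sums of the columns.
stat, p = friedmanchisquare(dataset.TOTALCIN, dataset.TOTALCW2, dataset.TOTALCW4)
print(f"statistic={stat:.3f}, p={p:.4f}")
```

Because the ranking happens within each row, the three columns must be aligned so that a row always refers to the same patient.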

  

## Mann-Whitney Test

It is used to compare two independent samples

Example:

![image.png](attachment:image.png)

5 stores release Design 1 and record their sales, then release Design 2 and record their sales.

    H0 - There is no significant difference in the sales of Design 1 and Design 2
    
    H1 - There is a significant difference in the sales of Design 1 and Design 2
    
---

    from scipy.stats import mannwhitneyu
    stats , p = mannwhitneyu(dataset.Design1, dataset.Design2)

> **stats** - test statistic (*compare it against the critical value in the critical value approach*)

> **p** - p value (*can be used for the p value approach*)
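
The sketch below runs the test on invented sales figures and states the alternative hypothesis explicitly. The numbers and the names `design1` and `design2` are assumptions for illustration.

```python
from scipy.stats import mannwhitneyu

# Hypothetical sales figures for the two designs (illustrative values only).
design1 = [12, 15, 11, 14, 13]
design2 = [21, 19, 24, 20, 22]

# alternative="two-sided" matches H1: the sales differ in either direction.
stat, p = mannwhitneyu(design1, design2, alternative="two-sided")
print(f"U={stat}, p={p:.4f}")
```

Passing `alternative` explicitly avoids relying on the default, which has differed between SciPy releases.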



## Kruskal-Wallis Test

It is used to compare more than two independent samples

Example:
![image.png](attachment:image.png)

5 stores release Design 1 and record their sales, then release Design 2 and record their sales, and do the same with Design 3.

    H0 - There is no significant difference in the sales of Design 1, Design 2 and Design 3
    
    H1 - There is a significant difference in the sales of Design 1, Design 2 and Design 3
---

    from scipy.stats import kruskal
    stats, p = kruskal(dataset.Design1, dataset.Design2, dataset.Design3)

> **stats** - test statistic (*compare it against the critical value in the critical value approach*)

> **p** - p value (*can be used for the p value approach*)
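
For a quick check of the call, here is a sketch with three invented groups of sales figures. The values and variable names are assumptions, chosen so the groups do not overlap.

```python
from scipy.stats import kruskal

# Hypothetical sales figures for three designs (illustrative values only).
design1 = [12, 15, 11, 14, 13]
design2 = [21, 19, 24, 20, 22]
design3 = [31, 29, 34, 30, 32]

# kruskal accepts any number of independent samples (two or more).
stat, p = kruskal(design1, design2, design3)
print(f"H={stat:.3f}, p={p:.4f}")
```

A small p value only says that at least one group differs; it does not say which one.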


## Chi Square Test

It is used to check whether two variables are dependent; both variables must be categorical.

Example:

![image.png](attachment:image.png)

**Drop the null values**   -    [use the `dropna()` function]

    H0 - There is no dependency between Gender and Smoking

    H1 - There is a dependency between Gender and Smoking
    
---
    
    import pandas as pd
    from scipy.stats import chi2_contingency
    chitable = pd.crosstab(dataset.Gender, dataset.Smoking)
    chitable
    
![image.png](attachment:image.png)
    
    stats, p, dof, expected = chi2_contingency(chitable)
    stats, p

> **stats** - test statistic (*compare it against the critical value in the critical value approach*)

> **p** - p value (*can be used for the p value approach*)
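
The full sequence (drop nulls, build the contingency table, run the test) might look like the sketch below. The data is invented and the cleaned frame name `clean` is an assumption; only the column names come from the example above.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical survey data with two missing Smoking values.
dataset = pd.DataFrame({
    "Gender": ["M", "M", "M", "M", "M", "M", "F", "F", "F", "F", "F", "F", "M", "F"],
    "Smoking": ["Yes", "Yes", "Yes", "Yes", "No", "No",
                "No", "No", "No", "No", "Yes", "Yes", None, None],
})

# Drop rows where either variable is missing before building the table.
clean = dataset.dropna(subset=["Gender", "Smoking"])

# Rows are Gender, columns are Smoking, cells are observed counts.
chitable = pd.crosstab(clean.Gender, clean.Smoking)

# dof is the degrees of freedom, expected holds the counts expected under H0.
stat, p, dof, expected = chi2_contingency(chitable)
print(chitable)
print(f"chi2={stat:.3f}, p={p:.4f}, dof={dof}")
```

The expected counts in this toy table are small, so it only illustrates the mechanics; the chi-square approximation is usually trusted when expected counts are around 5 or more.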

# Parametric Tests

## One Sample T Test
It is used to compare a sample mean with the population mean

Example:

Student's detail dataset - 
![image.png](attachment:image.png)


Only the Height column is tested here; the ID column identifies each student and is not a sample to compare.

The population mean is an assumed (hypothesized) value;
here, assume the population mean is 65

    H0 - There is no significant difference between the mean student height and the population mean of 65
    
    H1 - There is a significant difference between the mean student height and the population mean of 65
---

    from scipy.stats import ttest_1samp
    stats, p = ttest_1samp(dataset.Height, 65)
    stats, p
    
> **stats** - test statistic (*compare it against the critical value in the critical value approach*)

> **p** - p value (*can be used for the p value approach*)
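
To show where the statistic comes from, the sketch below recomputes it by hand as (sample mean - population mean) / standard error and compares it with SciPy's result. The heights are invented, and `heights`, `pop_mean`, and `t_manual` are illustrative names.

```python
import numpy as np
from scipy.stats import ttest_1samp

# Hypothetical student heights (illustrative values only).
heights = np.array([66, 64, 67, 65, 68, 63, 66, 67, 65, 69])
pop_mean = 65  # assumed population mean, as in the example above

stat, p = ttest_1samp(heights, pop_mean)

# Standard error uses the sample standard deviation (ddof=1).
std_err = heights.std(ddof=1) / np.sqrt(len(heights))
t_manual = (heights.mean() - pop_mean) / std_err
print(f"scipy t={stat:.4f}, manual t={t_manual:.4f}, p={p:.4f}")
```

With these values the p value is above 0.05, so H0 would not be rejected at that level.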

## Two-Sample Paired T Test
It is used to compare the means of two paired samples

Example:

![image.png](attachment:image.png)

- English and Maths marks are recorded against each student's ID
- Each student has marks in 2 different subjects, so we are comparing two different situations for the same student.

    H0 - There is no significant difference in the mean of students' English and Maths marks
    H1 - There is a significant difference in the mean of students' English and Maths marks
---
    
    from scipy.stats import ttest_rel
    stats, p = ttest_rel(dataset.English, dataset.Math)
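
One way to see what "paired" means here: the paired test gives the same result as a one-sample t test on the per-student differences against 0. The sketch below checks this on invented marks; the values and variable names are assumptions.

```python
import numpy as np
from scipy.stats import ttest_1samp, ttest_rel

# Hypothetical marks; position i in both arrays is the same student.
english = np.array([72, 65, 80, 58, 90, 77, 68, 84])
maths = np.array([70, 60, 85, 55, 88, 70, 66, 80])

stat_rel, p_rel = ttest_rel(english, maths)

# Equivalent view: test whether the mean per-student difference is 0.
stat_diff, p_diff = ttest_1samp(english - maths, 0)
print(f"paired: t={stat_rel:.4f}, p={p_rel:.4f}")
print(f"diffs:  t={stat_diff:.4f}, p={p_diff:.4f}")
```

This is why the two columns must be in the same student order: shuffling one of them changes the differences and therefore the result.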

## Two-Sample Independent T Test

It is used to compare the mean of two independent samples

Example:
- Assume there is a marathon with a set of participants
- In that competition, some participants are athletes and some are non-athletes
- The time taken to complete the entire distance is recorded against each participant's ID
- 0 - Non Athlete
- 1 - Athlete


![image.png](attachment:image.png)

- Build a new table with the durations of athletes and non-athletes using the [pandas.get_dummies()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.get_dummies.html) function from pandas


![image.png](attachment:image.png)


    H0 - There is no significant difference in the mean duration between Athletes and Non-Athletes
    
    H1 - There is a significant difference in the mean duration between Athletes and Non-Athletes
---

    from scipy.stats import ttest_ind
    stats, p = ttest_ind(dataset.Nonathlete, dataset.Athlete)
    stats, p
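
As an alternative to reshaping the table, the two groups can also be selected directly with a boolean mask on the 0/1 flag. The sketch below uses this swapped-in approach on invented data; the column names `Athlete` and `Duration` and all values are assumptions.

```python
import pandas as pd
from scipy.stats import ttest_ind

# Hypothetical marathon data: Athlete is 0 (non-athlete) or 1 (athlete),
# Duration is the finishing time in minutes.
dataset = pd.DataFrame({
    "Athlete": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    "Duration": [310, 295, 330, 305, 320, 315, 240, 255, 230, 250, 245, 235],
})

# Select each group's durations with a boolean mask on the flag column.
nonathlete = dataset.loc[dataset.Athlete == 0, "Duration"]
athlete = dataset.loc[dataset.Athlete == 1, "Duration"]

stat, p = ttest_ind(nonathlete, athlete)
print(f"t={stat:.3f}, p={p:.6f}")
```

By default `ttest_ind` assumes the two groups have equal variances; passing `equal_var=False` runs Welch's t test instead.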