# Hypothesis Testing
## Parametric Tests for Continuous Data (Investigating Means)
### One Sample t-test against Hypothesised Population Mean
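$H_0: Population\:Mean = Hypothesised\:Population\:Mean$<br>
$H_1: Population\:Mean\neq Hypothesised\:Population\:Mean$<br>
Assumptions: Observations are independent and approximately normally distributed, or the sample is large enough (roughly n > 30) for the sample mean to be approximately normal; the population SD is unknown and is estimated from the sample<br>
Produces sample mean, SE, t-statistic, one-tailed and two-tailed p-values from the t-distribution with n - 1 degrees of freedom (e.g. Pr(T > t), the right-tailed p-value), 95% CI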

In [1]:
import os
os.chdir("/Applications/Stata/utilities")  # folder containing the pystata package
from pystata import config
config.init("se")  # start Stata SE

import pandas as pd

# Read the Stata datasets into pandas DataFrames
IBM_path = '/Users/mujiechen/Jupyter-Notebook/STATA/Datasets/IBM.dta'
LSS_path = '/Users/mujiechen/Jupyter-Notebook/STATA/Datasets/LSS.dta'
IBM = pd.read_stata(IBM_path)
LSS = pd.read_stata(LSS_path)
print(IBM.head())


   Patient_ID  Gender  Age  Ageofonset  Dysphagia_BL  Walkingaid_BL  \
0           1       1   54          46             0              0   
1           2       1   83          74

In [2]:
%%stata -d IBM

ttest IBMFRS_baseline == 25


. 
. ttest IBMFRS_baseline == 25

One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
IBMFRS~e |      30        24.8    1.295793    7.097353     22.1498     27.4502
------------------------------------------------------------------------------
    mean = mean(IBMFRS_baseline)                                  t =  -0.1543
H0: mean = 25                                    Degrees of freedom =       29

    Ha: mean < 25               Ha: mean != 25                 Ha: mean > 25
 Pr(T < t) = 0.4392         Pr(|T| > |t|) = 0.8784          Pr(T > t) = 0.5608

. 
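
As a quick check, the standard error, t-statistic and 95% CI in the output above can be reproduced from the summary statistics alone. This is a minimal Python sketch, not part of the original Stata workflow; the function name is illustrative, and the critical value 2.045 (the 97.5th percentile of the t-distribution with 29 degrees of freedom) is an assumption taken from standard tables.

```python
import math

def one_sample_t(mean, sd, n, mu0, t_crit):
    """Return (SE, t-statistic, CI lower, CI upper) from summary statistics."""
    se = sd / math.sqrt(n)     # standard error of the mean
    t = (mean - mu0) / se      # distance from the hypothesised mean in SE units
    return se, t, mean - t_crit * se, mean + t_crit * se

# Summary statistics copied from the ttest output; 2.045 is t(0.975, df=29) from tables
se, t, lo, hi = one_sample_t(mean=24.8, sd=7.097353, n=30, mu0=25, t_crit=2.045)
print(f"SE = {se:.4f}, t = {t:.4f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

The p-values themselves need the t-distribution CDF, which the Python standard library does not provide, so they are left to Stata here.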


Pr(T > t) is the right-tailed p-value: the probability of obtaining a test statistic at least as large as t, assuming the null hypothesis is true;<br>
A p-value is not the probability of incorrectly rejecting a true null hypothesis; that is the significance level α (e.g. 0.05), which is chosen before the test

Pr(T < t) is the left-tailed p-value: the probability of obtaining a test statistic at most as large as t (the AUC to the left of the t-statistic), assuming the null hypothesis is true;<br>
Which is equivalent to 1 - Pr(T > t)

Pr(|T| > |t|) is the two-tailed p-value: the probability of obtaining a test statistic at least as extreme as t in either direction (2 × the smaller of the two one-tailed p-values; here 2 × 0.4392 = 0.8784)
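
The relationship between the three p-values can be sketched in a few lines of Python. The function name is illustrative; the only input is the left-tailed value Pr(T < t) = 0.4392 from the one-sample output above, and the doubling step assumes a symmetric reference distribution such as t or z.

```python
def tail_p_values(left_p):
    """Derive the right-tailed and two-tailed p-values from the left-tailed one."""
    right_p = 1.0 - left_p                # Pr(T > t) = 1 - Pr(T < t)
    two_p = 2.0 * min(left_p, right_p)    # double the smaller tail (symmetric distribution)
    return right_p, two_p

right_p, two_p = tail_p_values(0.4392)    # Pr(T < t) from the ttest output
print(round(right_p, 4), round(two_p, 4))  # 0.5608 0.8784
```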

> N.B. The one-sample z-test is a special case of the one-sample t-test, used when the population SD is known (or the sample is very large): the test statistic then follows the standard normal distribution (SND), a normal distribution with mean = 0 and SD = 1
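
For comparison, a one-sample z-test can be sketched with the Python standard library, since `statistics.NormalDist` provides the SND. The population SD of 7.0 below is a hypothetical "known" value chosen purely for illustration; the sample mean, n and hypothesised mean are those of the t-test example.

```python
import math
from statistics import NormalDist

def one_sample_z(mean, sigma, n, mu0):
    """Two-tailed one-sample z-test with a known population SD."""
    z = (mean - mu0) / (sigma / math.sqrt(n))
    p_two = 2.0 * NormalDist().cdf(-abs(z))   # area in both tails of the SND
    return z, p_two

# sigma=7.0 is a hypothetical population SD, not a value from the dataset
z, p_two = one_sample_z(mean=24.8, sigma=7.0, n=30, mu0=25)
print(f"z = {z:.4f}, two-tailed p = {p_two:.4f}")
```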

### Two Sample T-test
$H_0: Mean\:1 = Mean\:2$<br>
$H_1: Mean\:1\neq Mean\:2$<br>
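Assumptions: The two groups are independent, data in each group follow a normal distribution, and the groups have equal variances (Stata's default; the "unequal" option relaxes this)<br>
Produces means of each group, Δmeans, SE, t-statistic, one-tailed and two-tailed p-values from the t-distribution with n1 + n2 - 2 degrees of freedom (e.g. Pr(T > t), the right-tailed p-value), 95% CI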

In [3]:
%%stata

ttest IBMFRS_1year, by(Gender)


. 
. ttest IBMFRS_1year, by(Gender)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
       0 |       6    18.16667    2.315407    5.671567    12.21472    24.11861
       1 |      24    23.95833    1.570192     7.69234    20.71014    27.20652
---------+--------------------------------------------------------------------
Combined |      30        22.8    1.390774    7.617584    19.95555    25.64445
---------+--------------------------------------------------------------------
    diff |           -5.791667    3.364945               -12.68444    1.101111
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =  -1.7212
H0: diff = 0                                     Degre
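
The difference, its standard error and the t-statistic in the table above can be reproduced from the two group summaries. This is a minimal Python sketch assuming the pooled (equal-variance) formula named in the output header; the function name is illustrative.

```python
import math

def pooled_t(mean0, sd0, n0, mean1, sd1, n1):
    """Equal-variance two-sample t-test from group summaries: (diff, SE, t, df)."""
    df = n0 + n1 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom
    pooled_var = ((n0 - 1) * sd0**2 + (n1 - 1) * sd1**2) / df
    se = math.sqrt(pooled_var * (1 / n0 + 1 / n1))
    diff = mean0 - mean1
    return diff, se, diff / se, df

# Means, SDs and group sizes copied from the ttest output (Gender 0, then Gender 1)
diff, se, t, df = pooled_t(18.16667, 5.671567, 6, 23.95833, 7.69234, 24)
print(f"diff = {diff:.4f}, SE = {se:.4f}, t = {t:.4f}, df = {df}")
```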

### Paired Samples T-test
$H_0: Mean_{t1} = Mean_{t2}$<br>
$H_1: Mean_{t1}\neq Mean_{t2}$<br>

or

$H_0: \Delta Mean = 0$<br>
$H_1: \Delta Mean\neq 0$

In [4]:
%%stata

ttest IBMFRS_baseline == IBMFRS_1year

gen IBMFRS_diff = IBMFRS_1year - IBMFRS_baseline
ttest IBMFRS_diff == 0


. 
. ttest IBMFRS_baseline == IBMFRS_1year

Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
IBMFRS~e |      30        24.8    1.295793    7.097353     22.1498     27.4502
IBMFRS~r |      30        22.8    1.390774    7.617584    19.95555    25.64445
---------+--------------------------------------------------------------------
    diff |      30           2    .6643638    3.638871    .6412234    3.358777
------------------------------------------------------------------------------
     mean(diff) = mean(IBMFRS_baseline - IBMFRS_1year)            t =   3.0104
 H0: mean(diff) = 0                              Degrees of freedom =       29

 Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
 Pr(T < t) = 0.9973         Pr(|T| > |t|) = 0.0054          Pr(T > t) =
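
The two Stata commands in this cell test the same hypothesis, because a paired t-test is a one-sample t-test of the differences against 0. The sketch below illustrates this with three hypothetical pairs of scores (not taken from the IBM dataset). It also shows that reversing the subtraction flips the sign of t, which is what happens above: the first command uses baseline - 1 year, while the generated variable uses 1 year - baseline.

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Paired t-statistic: a one-sample t-test of the differences against 0."""
    diffs = [a - b for a, b in zip(first, second)]   # first - second, as in Stata's paired ttest
    se = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / se

# Hypothetical scores for three patients, for illustration only
before = [10, 12, 14]
after = [9, 10, 13]
t_forward = paired_t(before, after)
t_reverse = paired_t(after, before)   # same magnitude, opposite sign
print(t_forward, t_reverse)
```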

## Non-Parametric Tests for Continuous Data
Non-parametric tests are performed on the ranks of the data rather than the data themselves, e.g. 43 68 112 452 will be coded as 1 2 3 4
> Instead of the t-distribution, the tests below use the standard normal distribution (or z-distribution): for reasonably large samples a sum of ranks is approximately normally distributed, by an argument similar to the Central Limit Theorem (CLT)
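
The ranking step can be sketched as follows. Tied values are given the mean of the ranks they occupy (midranks), which is the usual convention and the reason for the "adjustment for ties" in the outputs below; the function name is illustrative.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with the value at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1          # mean of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    return ranks

print(average_ranks([43, 68, 112, 452]))   # [1.0, 2.0, 3.0, 4.0]
print(average_ranks([10, 20, 20, 30]))     # [1.0, 2.5, 2.5, 4.0]
```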

### Wilcoxon Rank Sum or Mann Whitney U Test

$H_0: Median\:1 = Median\:2$<br>
$H_1: Median\:1\neq Median\:2$<br>

Produces rank sum values, expected rank sum values, variance adjustment (to account for tied ranks), z-statistic, and the two-tailed p-value (Prob > |z|), i.e. the probability of observing a z-value at least this extreme in either direction under $H_0$

In [5]:
print(LSS.head())

  PatientID  Operation  Gender  EQ5DPre  EQ5D24mts
0      H001          1       1    0.088      0.516
1      H002          1       0   -0.536        NaN
2      H005          1       0    0.159      0.691
3      H009          1       0    0.587      0.691
4       H12          0       0   -0.016      0.516


In [7]:
%%stata -d LSS

ranksum EQ5D24mts, by(Operation)


. 
. ranksum EQ5D24mts, by(Operation)

Two-sample Wilcoxon rank-sum (Mann–Whitney) test

   Operation |      Obs    Rank sum    Expected
-------------+---------------------------------
           0 |       24         517         516
           1 |       18         386         387
-------------+---------------------------------
    Combined |       42         903         903

Unadjusted variance     1548.00
Adjustment for ties       -8.78
                     ----------
Adjusted variance       1539.22

H0: EQ5D24~s(Operat~n==0) = EQ5D24~s(Operat~n==1)
         z =  0.025
Prob > |z| = 0.9797
Exact prob = 0.9849

. 
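
The expected rank sum, unadjusted variance, z and Prob > |z| above can be reproduced with the standard normal approximation. This is a minimal sketch with an illustrative function name; the tie adjustment is copied from the output rather than recomputed, and no continuity correction is applied.

```python
import math
from statistics import NormalDist

def ranksum_z(rank_sum, n1, n2, tie_adjustment=0.0):
    """Normal approximation for the rank sum of the first group."""
    n = n1 + n2
    expected = n1 * (n + 1) / 2           # expected rank sum under H0
    variance = n1 * n2 * (n + 1) / 12     # unadjusted variance
    z = (rank_sum - expected) / math.sqrt(variance + tie_adjustment)
    p_two = 2.0 * NormalDist().cdf(-abs(z))
    return expected, variance, z, p_two

# Rank sum, group sizes and tie adjustment copied from the ranksum output
expected, variance, z, p_two = ranksum_z(517, n1=24, n2=18, tie_adjustment=-8.78)
print(expected, variance, round(z, 3), round(p_two, 4))
```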


### Wilcoxon Signed-Rank Test (Paired Samples)

$H_0: Median_{t1}= Median_{t2}$<br>
$H_1: Median_{t1}\neq Median_{t2}$

or

$H_0: Median\:of\:\Delta= 0$<br>
$H_1: Median\:of\:\Delta \neq 0$

In [10]:
%%stata

signrank EQ5DPre = EQ5D24mts

gen EQ5D_diff = EQ5D24mts - EQ5DPre
signrank EQ5D_diff = 0


. 
. signrank EQ5DPre = EQ5D24mts

Wilcoxon signed-rank test

        Sign |      Obs   Sum ranks    Expected
-------------+---------------------------------
    Positive |       10         157         451
    Negative |       31         745         451
        Zero |        1           1           1
-------------+---------------------------------
         All |       42         903         903

Unadjusted variance     6396.25
Adjustment for ties       -1.88
Adjustment for zeros      -0.25
                     ----------
Adjusted variance       6394.12

H0: EQ5DPre = EQ5D24mts
         z = -3.677
Prob > |z| = 0.0002
Exact prob = 0.0001

. gen EQ5D_diff = EQ5D24mts - EQ5DPre
(5 missing values generated)

. signrank EQ5D_diff = 0

Wilcoxon signed-rank test

        Sign |      Obs   Sum ranks    Expected
-------------+---------------------------------
    Positive |       31         746         451
    Negative |       10         156         451
        Zero |        1           1      
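
The first signrank table can be checked in the same way. The sketch below assumes the usual large-sample formulas, with zero differences ranked first and removed from the expected sum; the tie adjustment is copied from the output, and the function name is illustrative.

```python
import math
from statistics import NormalDist

def signrank_z(pos_rank_sum, n, n_zero=0, tie_adjustment=0.0):
    """Normal approximation for the sum of the positive signed ranks."""
    zero_ranks = n_zero * (n_zero + 1) / 2          # ranks taken up by zero differences
    expected = (n * (n + 1) / 2 - zero_ranks) / 2   # expected sum for each sign under H0
    variance = n * (n + 1) * (2 * n + 1) / 24       # unadjusted variance
    zero_adjustment = -n_zero * (n_zero + 1) * (2 * n_zero + 1) / 24
    adjusted = variance + tie_adjustment + zero_adjustment
    z = (pos_rank_sum - expected) / math.sqrt(adjusted)
    p_two = 2.0 * NormalDist().cdf(-abs(z))
    return expected, variance, z, p_two

# Values copied from the first signrank output (EQ5DPre = EQ5D24mts)
expected, variance, z, p_two = signrank_z(157, n=42, n_zero=1, tie_adjustment=-1.88)
print(expected, variance, round(z, 3), p_two)
```

Passing the positive rank sum from the second table (746) gives the same z with the opposite sign, because the difference was taken in the other direction.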

### 95% Confidence Interval
Obtained with "centile" for the median (50th centile) of the differences, as opposed to "ci" for the mean difference

In [12]:
%%stata

centile EQ5D_diff, centile(50)


. 
. centile EQ5D_diff, centile(50)

                                                          Binom. interp.   
    Variable |       Obs  Percentile    Centile        [95% conf. interval]
-------------+-------------------------------------------------------------
   EQ5D_diff |        42         50       .2745            .104    .3526933

. 


If the 95% CI excludes 0, $H_0$ can be rejected at the 5% significance level; the width of the interval reflects precision, not significance
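
A distribution-free CI for a median can be built from order statistics and the Binomial(n, 0.5) distribution. The sketch below returns the ranks of the sorted observations that bound a conservative interval. It is a simplified illustration and not Stata's exact method: the output header above indicates that "centile" interpolates between order statistics, so its endpoints may differ slightly. The function name is an assumption of this sketch.

```python
from math import comb

def median_ci_ranks(n, conf=0.95):
    """Ranks (1-based) of the order statistics bounding a conservative CI for the median."""
    alpha = 1.0 - conf
    cum = 0.0   # running value of P(X <= k) for X ~ Binomial(n, 0.5)
    k = -1      # largest k with P(X <= k) <= alpha / 2
    for j in range(n + 1):
        step = cum + comb(n, j) / 2**n
        if step > alpha / 2:
            break
        cum, k = step, j
    if k < 0:
        raise ValueError("sample too small for the requested confidence level")
    coverage = 1.0 - 2.0 * cum   # actual coverage, at least conf
    return k + 1, n - k, coverage

lower_rank, upper_rank, coverage = median_ci_ranks(42)
print(lower_rank, upper_rank, round(coverage, 4))
```

The interval runs from the `lower_rank`-th smallest to the `upper_rank`-th smallest difference.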

____________________________________________________________________________________________________
## Parametric Tests for Categorical Data (Investigating Proportions)
### Two Sample Z-test for Proportions
$H_0: Proportion\:1 = Proportion\:2$<br>
$H_1: Proportion\:1\neq Proportion\:2$<br>
Produces proportions of each group, Δproportions, SE, z-statistic, one-tailed and two-tailed p-values from the SND (e.g. Pr(Z > z), the right-tailed p-value), 95% CI

> One Group Z-test for proportions (binary outcomes)

In [None]:
%%stata

prtest Dysphagia_BL, by(Gender)
prtesti 20 12 33 22, count // Immediate form: enter the summary counts directly (12 of 20 vs 22 of 33) when the raw data are not available

Pr(Z > z) is the right-tailed p-value: the probability of obtaining a test statistic at least as large as z, assuming the null hypothesis is true;<br>
It is not the probability of incorrectly rejecting a true null hypothesis, which is the significance level α
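
The z-statistic for two proportions can be sketched from counts alone. The example reuses the counts passed to prtesti (12 of 20 vs 22 of 33) and assumes the pooled standard error under $H_0$, which is the usual textbook form; the function name is illustrative and no output from Stata is shown here for comparison.

```python
import math
from statistics import NormalDist

def two_proportion_z(successes1, n1, successes2, n2):
    """Two-sample z-test for proportions using the pooled SE under H0."""
    p1, p2 = successes1 / n1, successes2 / n2
    pooled = (successes1 + successes2) / (n1 + n2)   # common proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two = 2.0 * NormalDist().cdf(-abs(z))          # two-tailed p-value from the SND
    return p1 - p2, z, p_two

diff, z, p_two = two_proportion_z(12, 20, 22, 33)
print(f"diff = {diff:.4f}, z = {z:.4f}, two-tailed p = {p_two:.4f}")
```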

### Chi-Square Test (expected cell counts of 5 and up), Fisher's Exact Test (expected count <5 in any cell of the contingency table)
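
The choice between the two tests depends on the expected counts, which come from the row and column totals. The sketch below computes them for a 2x2 table, along with the Pearson chi-square statistic. The table reuses the counts from the prtesti example purely for illustration (12 of 20 vs 22 of 33 with the outcome), and the function name is an assumption of this sketch. With 1 degree of freedom the p-value can be taken from the SND, because the chi-square statistic is then the square of a z-statistic.

```python
import math
from statistics import NormalDist

def chi_square_2x2(table):
    """Expected counts, Pearson chi-square and p-value for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count = row total x column total / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    chi2 = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
               for i in range(2) for j in range(2))
    # With 1 degree of freedom, chi2 is the square of a standard normal variable
    p_value = 2.0 * NormalDist().cdf(-math.sqrt(chi2))
    return expected, chi2, p_value

# Rows: groups; columns: outcome present / absent (illustrative counts)
table = [[12, 8], [22, 11]]
expected, chi2, p_value = chi_square_2x2(table)
use_fisher = min(min(row) for row in expected) < 5   # any small expected count
print(expected, round(chi2, 4), round(p_value, 4), use_fisher)
```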

In [None]:
%%stata

