# Guide: Choosing the Right Hypothesis Test


## LESSON OBJECTIVES:
By the end of this lesson, students will be able to:

- Use this guide to select the correct hypothesis test based on the number of samples/groups and the type of data being compared.
- Identify and test the assumptions of the selected test.
- Perform the selected test or its nonparametric equivalent.

We have explored several different hypothesis tests. This guide is a resource you can use to determine the correct test for your situation.

## Choosing the Correct Hypothesis Test
### STEP 1: Stating our Hypotheses
- Before selecting the correct hypothesis test, you must first formally state your null hypothesis and alternative hypothesis. You should also define your significance level (alpha).

- Before stating your hypotheses, ask yourself:

    1) What question am I attempting to answer?
    2) What metric/value do I want to measure to answer this question?
    3) Do I expect the groups to be different in a specific way (e.g. one group greater than the other)?
        - Or do I just think they'll be different, but don't know how?

- Now formally declare your hypotheses after asking yourself the questions above:
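
As an illustration only (this example is ours, not part of the original lesson), take the Drug A vs. placebo scenario used in Step 2 and write $\mu$ for the mean of whatever outcome is measured in each group. A two-sided pair of hypotheses could then be stated as:

$$
H_0: \mu_{\text{Drug A}} = \mu_{\text{Placebo}} \qquad\qquad H_A: \mu_{\text{Drug A}} \neq \mu_{\text{Placebo}} \qquad\qquad \alpha = 0.05
$$

If you expected Drug A to produce the larger value, the alternative would instead be one-sided, $H_A: \mu_{\text{Drug A}} > \mu_{\text{Placebo}}$. The value 0.05 is a common convention for alpha, not a requirement.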



### STEP 2: Determine the category/type of test based on your data.
- Q1: What type of data do I have?
    - Numeric?
    - Categorical?
- Q2: How many samples/groups am I comparing?
    - 1 sample vs. a known quantity?
        - Example: comparing our 1 potential alien's height against the population.
    - 2 samples/groups?
        - Example: comparing the effect of Drug A vs. the effect of a placebo.
    - More than 2 samples/groups?
        - Example: comparing the results of 3 different diets.

Use the table below to look up the correct test according to your answers.


![Screen%20Shot%202023-01-12%20at%202.49.05%20PM.png](attachment:Screen%20Shot%202023-01-12%20at%202.49.05%20PM.png)
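
The lookup can also be sketched in code. The table above is an image, so the mapping below is an assumption inferred from the tests named in the Assumptions Summary of this lesson, and `suggest_test` is a hypothetical helper rather than a library function.

```python
def suggest_test(data_type: str, n_groups: int) -> str:
    """Hypothetical helper: map the Q1/Q2 answers to a candidate parametric test.

    The mapping is an assumption based on the tests named in this lesson.
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")

    if data_type == "numeric":
        if n_groups == 1:
            return "One-sample t-test"
        if n_groups == 2:
            return "Independent t-test (2-sample)"
        return "One-way ANOVA"

    if data_type == "categorical":
        if n_groups == 1:
            return "Binomial test"
        return "Chi-square test"

    raise ValueError("data_type must be 'numeric' or 'categorical'")


# Example: comparing the results of 3 different diets (numeric outcome)
print(suggest_test("numeric", 3))
```

Treat the returned name as a starting point only; the assumptions of the suggested test still need to be checked before it is run.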


### STEP 3: Does the data meet the assumptions of the selected test?
ASSUMPTIONS SUMMARY
- One-Sample T-Test
    - No significant outliers
    - Normality
- Independent t-test (2-sample)
    - No significant outliers
    - Normality
    - Equal Variance
- One Way ANOVA
    - No significant outliers
    - Equal variance
    - Normality
- Binomial Test
    - There are 2 possible outcomes: success and failure.
    - The probability of success is constant
    - The trials are independent.
- Chi-Square test
    - There are two categorical variables (ordinal or nominal)
    - The outcomes are independent
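
For the two categorical tests, a minimal sketch with SciPy might look like the following. The counts are made up for illustration, and the choice of `scipy.stats.binomtest` and `scipy.stats.chi2_contingency` is our assumption about which functions the lesson intends.

```python
from scipy import stats

# Binomial test on made-up data: 14 successes in 20 independent trials,
# tested against a null success probability of 0.5 (two-sided by default).
binom_result = stats.binomtest(k=14, n=20, p=0.5)
print(f"Binomial test p-value: {binom_result.pvalue:.4f}")

# Chi-square test of independence on a made-up 2x2 contingency table.
# Rows are the categories of one variable, columns the categories of the other.
observed = [[30, 10],
            [20, 20]]
chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"Chi-square statistic: {chi2:.2f}, p-value: {p_value:.4f}, dof: {dof}")
```

Note that `chi2_contingency` applies Yates' continuity correction by default when the table is 2x2.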

### HOW TO: TEST ASSUMPTIONS AND SELECT CORRECT TEST
0. Check for & Remove Outliers
    - Required for the one-sample t-test, independent t-test, and ANOVA.
    - Use one of the two methods below to identify outliers:
        - Tukey's interquartile range rule.
        - Z-scores: flag values whose absolute Z-score is greater than 3.
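
Both outlier rules can be sketched as follows. The helper names, the 1.5 multiplier for the IQR rule (the conventional choice), and the sample data are illustrative assumptions.

```python
import numpy as np
from scipy import stats


def tukey_outlier_mask(values, k=1.5):
    """Return True for points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


def zscore_outlier_mask(values, threshold=3.0):
    """Return True for points whose absolute Z-score exceeds the threshold."""
    values = np.asarray(values, dtype=float)
    return np.abs(stats.zscore(values)) > threshold


# Made-up sample: 100 roughly normal values plus one extreme value at the end
rng = np.random.default_rng(42)
sample = np.append(rng.normal(loc=50, scale=5, size=100), 120.0)

iqr_mask = tukey_outlier_mask(sample)
z_mask = zscore_outlier_mask(sample)

# Keep only the points that the Z-score rule does not flag
cleaned = sample[~z_mask]
print(f"IQR rule flagged {iqr_mask.sum()} point(s); "
      f"Z-score rule flagged {z_mask.sum()} point(s)")
```

When comparing several groups, apply the chosen rule to each group separately rather than to the pooled data.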
1. Test Assumption of Normality
    - Use either of the following tests to determine whether your data is normally distributed:
        - D'Agostino-Pearson's normality test: `scipy.stats.normaltest`
        - Shapiro-Wilk test (if n < 20): `scipy.stats.shapiro`
    - Outcome A: If your data IS normally distributed:
        - Move on to assumption #2: testing the assumption of equal variance.
    - Outcome B: If your data is NOT normally distributed:
        - If your group sizes (n) are large enough, you can safely ignore the normality assumption.




![Screen%20Shot%202023-01-12%20at%202.54.45%20PM.png](attachment:Screen%20Shot%202023-01-12%20at%202.54.45%20PM.png)

- Outcome B1: If your n is large enough to ignore the assumption of normality:
    - Move on to assumption #2: testing the assumption of equal variance.
- Outcome B2: If your n is NOT large enough:
    - You should not run the selected test.
    - Move on to 3. Select a non-parametric equivalent of your test.
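
A minimal sketch of the normality check, assuming the n < 20 cutoff given above for choosing Shapiro-Wilk. `check_normality` is a hypothetical helper and the data are simulated.

```python
import numpy as np
from scipy import stats


def check_normality(values, alpha=0.05, small_n=20):
    """Pick a normality test by sample size and report the outcome."""
    values = np.asarray(values, dtype=float)
    if len(values) < small_n:
        test_name = "shapiro"
        _, p_value = stats.shapiro(values)
    else:
        test_name = "normaltest"
        _, p_value = stats.normaltest(values)
    # Null hypothesis of both tests: the data come from a normal distribution.
    # A p-value below alpha therefore means normality is rejected.
    looks_normal = p_value > alpha
    return test_name, float(p_value), bool(looks_normal)


rng = np.random.default_rng(0)
small_group = rng.normal(loc=100, scale=15, size=12)    # small sample
skewed_group = rng.exponential(scale=2.0, size=500)     # clearly non-normal

for name, group in [("small_group", small_group), ("skewed_group", skewed_group)]:
    test_name, p_value, looks_normal = check_normality(group)
    print(f"{name}: {test_name} p={p_value:.4g}, looks normal: {looks_normal}")
```

If a group fails this check, compare its size against the guideline above before deciding whether the normality assumption can be ignored.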
            
## 2. Test Assumption of Equal Variance
- Use Levene's test: `scipy.stats.levene`
- If you pass the assumption of equal variance:
    - Use the regular 2-sample t-test (or ANOVA).
    - See the Summary Table below for the function to use.
- If you fail the assumption of equal variance:
    - If you wanted to run a 2-sample t-test:
        - Use Welch's t-test.
        - In SciPy, pass `equal_var=False` to `scipy.stats.ttest_ind`.
    - If you wanted to run a different test:
        - See 3. Select a non-parametric equivalent of your test.
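
The decision above can be sketched for the 2-sample case as follows. The data are simulated with deliberately different spreads, and alpha = 0.05 is an assumed significance level.

```python
import numpy as np
from scipy import stats

alpha = 0.05

# Simulated groups with similar means but very different variances
rng = np.random.default_rng(1)
group_a = rng.normal(loc=10.0, scale=1.0, size=50)
group_b = rng.normal(loc=10.5, scale=4.0, size=50)

# Levene's test: the null hypothesis is that the groups have equal variances
levene_stat, levene_p = stats.levene(group_a, group_b)
equal_var = bool(levene_p > alpha)

# equal_var=True runs the regular 2-sample t-test; equal_var=False runs Welch's t-test
result = stats.ttest_ind(group_a, group_b, equal_var=equal_var)
test_used = "2-sample t-test" if equal_var else "Welch's t-test"

print(f"Levene p-value: {levene_p:.4g} -> using {test_used}")
print(f"t-test p-value: {result.pvalue:.4f}")
```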
            
## 3. Select a non-parametric equivalent of your test.
- Select the test from the right-hand Nonparametric column that matches your original parametric test.

- The nonparametric test functions are generally called the same way as their parametric counterparts, so little or no other change is needed.

- For more information, see: Choosing Between Parametric and Non-Parametric Tests
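
To show how similar the calls are, here is a short sketch with made-up data. Kruskal-Wallis is named later in this lesson; treating Mann-Whitney U as the counterpart of the 2-sample t-test is our assumption, since the lookup table is an image.

```python
from scipy import stats

# Made-up samples; group_a contains one extreme value (39)
group_a = [12, 15, 14, 10, 39, 13, 11]
group_b = [22, 25, 19, 30, 28, 24, 27]
group_c = [40, 42, 38, 45, 41, 39, 44]

# Parametric vs. nonparametric call for 2 groups
t_stat, p_ttest = stats.ttest_ind(group_a, group_b)
u_stat, p_mwu = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# Parametric vs. nonparametric call for more than 2 groups
f_stat, p_anova = stats.f_oneway(group_a, group_b, group_c)
h_stat, p_kruskal = stats.kruskal(group_a, group_b, group_c)

print(f"t-test p={p_ttest:.4f} | Mann-Whitney U p={p_mwu:.4f}")
print(f"ANOVA p={p_anova:.4g} | Kruskal-Wallis p={p_kruskal:.4g}")
```

In each pair the groups are passed in the same way; only the function name (and, for Mann-Whitney, the optional `alternative` argument) differs.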

# Summary Table - Hypothesis Testing Functions

![Screen%20Shot%202023-01-12%20at%203.02.00%20PM.png](attachment:Screen%20Shot%202023-01-12%20at%203.02.00%20PM.png)



## STEP 4: Perform Test & Interpret Result
- Perform a hypothesis test from the summary table above to get your p-value.

- If the p-value is > alpha:
    - We fail to reject the null hypothesis. We did not find a significant difference between groups.

- If the p-value is < alpha:
    - We reject the null hypothesis. There is a significant difference between groups, which supports the alternative hypothesis.
    - If you have more than 2 groups (e.g. ANOVA, Kruskal-Wallis), see STEP 5: Post-hoc multiple comparison tests to determine which groups were different.
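
As a sketch of this step, the snippet below runs a 2-sample t-test on made-up Drug A vs. placebo measurements and turns the p-value into a decision. `interpret` is a hypothetical helper and alpha = 0.05 is an assumed significance level.

```python
from scipy import stats


def interpret(p_value, alpha=0.05):
    """Translate a p-value into the decision described above."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Made-up measurements for illustration only
drug_a = [5.1, 6.3, 5.8, 6.9, 6.1, 5.7, 6.5, 6.0]
placebo = [4.2, 4.9, 5.0, 4.4, 4.8, 5.3, 4.6, 4.1]

result = stats.ttest_ind(drug_a, placebo)
decision = interpret(result.pvalue, alpha=0.05)
print(f"p-value = {result.pvalue:.4g}: we {decision}.")
```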


## STEP 5: Post-hoc multiple comparison tests (if needed)
- Our p-value indicated there WAS a significant difference between groups, but we don't know WHICH groups yet.
- We must run a pairwise Tukey's test to know which groups were significantly different.
- Tukey pairwise comparison test
    - `statsmodels.stats.multicomp.pairwise_tukeyhsd`
- Tukey's test runs a separate comparison on each pair of groups to get a p-value for each pair, while adjusting for the number of comparisons so that the overall (family-wise) false positive rate stays at alpha.
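
A minimal sketch of the post-hoc step using `pairwise_tukeyhsd`. The three diet groups and their values are made up, and the snippet assumes statsmodels is installed.

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Made-up results for 3 diets; diet_c is clearly different from the other two
diet_a = [2.0, 2.5, 1.8, 2.2, 2.4]
diet_b = [2.1, 2.6, 1.9, 2.3, 2.2]
diet_c = [5.0, 5.4, 4.8, 5.2, 5.1]

# pairwise_tukeyhsd expects one flat array of values and a matching array of labels
values = np.array(diet_a + diet_b + diet_c)
labels = np.array(["diet_a"] * 5 + ["diet_b"] * 5 + ["diet_c"] * 5)

tukey = pairwise_tukeyhsd(endog=values, groups=labels, alpha=0.05)

# The summary has one row per pair; the "reject" column marks significant pairs
print(tukey.summary())
```

The `tukey.reject` attribute holds the same True/False decisions as the summary's "reject" column, one per pair of groups.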


# Summary
This lesson provides an overview of hypothesis testing and walks through the process required to select the appropriate test. You may wish to use this page as a reference when you plan your own hypothesis tests.




