# Week 1

## Comparing two groups - drawing inferences 

### 1.01 Null Hypothesis Testing
* Example of cat diets
    * Why can't we just take $\mu_{\text{raw}} - \mu_{\text{canned}}$ and determine whether there's an impact?
        * This doesn't take into account random fluctuations due to sampling
        * It also ignores precision, which depends on sample size
            * Could divide by a measure of spread (the standard error) to normalize
        * Could use Bayesian approach to use distributions and compute probabilities
    * Instead of arbitrarily guessing, formulate a null hypothesis and then show that the observed data would be unlikely if it were true, i.e. the probability of a result at least this extreme (the p-value) falls below a chosen threshold

### 1.02 P-Values
* Continuation of previous example using cat food
* Our null hypothesis is that there's no difference between the two diets
    * $H_0: \mu_1 - \mu_2 = 0$
* Also need to specify an alternative hypothesis
    * We want to look at the area under the curve (the sampling distribution under $H_0$) that represents results at least as extreme as the one we observed
    * Possible alternative hypotheses
        * $H_a: \mu_r - \mu_c \gt 0$ (unidirectional)
            * take area to the right
        * $H_a: \mu_r - \mu_c \neq 0$ (bidirectional)
            * take the tail area beyond the observed statistic (to the right or left) and then double it (one-tailed p-value $\cdot$ 2)
        * $H_a: \mu_r - \mu_c \lt 0$ (unidirectional)
            * take area to the left
    * Use $\alpha$ to denote significance level (typical choice is $\alpha = 0.05$)
* Note: we never accept the null hypothesis, we just fail to reject it
* Before statistical software it was really painful to compute p-values. People used to use tables (e.g. z-tables) to do the calculations. It's much easier now with statistical software.
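
As a rough illustration of the three alternatives above, here is a minimal sketch that turns an observed z statistic into a p-value. It assumes the test statistic follows a standard normal distribution under $H_0$; the function name `p_value` and the observed value `1.8` are hypothetical and not from the lecture.

```python
from statistics import NormalDist


def p_value(z, alternative="two-sided"):
    """P-value for an observed z statistic, assuming a standard normal null distribution."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":    # H_a: mu_r - mu_c > 0, area to the right
        return 1 - std_normal.cdf(z)
    if alternative == "less":       # H_a: mu_r - mu_c < 0, area to the left
        return std_normal.cdf(z)
    if alternative == "two-sided":  # H_a: mu_r - mu_c != 0, double the tail area
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


z_observed = 1.8  # hypothetical observed statistic
for alt in ("greater", "two-sided", "less"):
    print(alt, round(p_value(z_observed, alt), 4))
```

Each p-value would then be compared with the chosen $\alpha$: reject $H_0$ if it is smaller, otherwise fail to reject.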

### 1.03 Confidence intervals and two-sided tests
* *Confidence interval* (definition): a range of values around the sample statistic, constructed so that if you were to repeatedly draw samples and compute the interval each time, the population parameter would lie within the interval in 95% of the samples (for a 95% interval)
* Two-sided tests and confidence intervals are closely related
* The test decision depends on whether the boundary (critical) values are exceeded
* With a test the interval is centered around the expected test statistic value under the null
* With a confidence interval, the interval is centered around the sample statistic value
* The margins around these intervals are the same
    * In a test, the margin equals the critical test statistic value ($T_{\alpha / 2}$) expressed in standard errors
    * In an interval, the margin equals the critical test statistic value multiplied by the standard error ($T_{\alpha / 2} \cdot \text{se}$) in original units of sample statistic
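
The link between the two can be sketched in a few lines. This is only an illustration: it uses a standard normal critical value in place of a general $T_{\alpha / 2}$, and the function name and the numbers passed to it are hypothetical.

```python
from statistics import NormalDist


def z_test_and_interval(estimate, null_value, se, alpha=0.05):
    """Two-sided z-test decision and the matching confidence interval."""
    # Critical value in standard-error units (normal approximation assumed).
    crit = NormalDist().inv_cdf(1 - alpha / 2)

    # Test: centered on the null value, margin expressed in standard errors.
    z = (estimate - null_value) / se
    reject = abs(z) > crit

    # Interval: centered on the sample statistic, margin in original units.
    margin = crit * se
    interval = (estimate - margin, estimate + margin)
    return z, reject, interval


# Hypothetical numbers: observed difference 2.5, standard error 1.0.
z, reject, interval = z_test_and_interval(estimate=2.5, null_value=0.0, se=1.0)
print(z, reject, interval)
```

The test rejects exactly when the null value falls outside the interval, which is why the two procedures lead to the same conclusion.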

### 1.04 Power
* Power is the probability of correctly rejecting the null hypothesis, i.e. the probability of detecting a hypothesized effect if it truly exists in the population
* Types of correct decisions/errors
    * $H_0$ true, $H_0$ not rejected = true negative
        * The probability of this is $1 - \alpha$
    * $H_0$ false, $H_0$ not rejected = false negative
        * The probability of this is $\beta$
        * Type II Error
    * $H_0$ true, $H_0$ rejected = false positive
        * The probability of this is equal to the significance level $\alpha$
        * Type I Error
    * $H_0$ false, $H_0$ rejected = true positive
        * The probability of this is called the *power* of the test and is $1 - \beta$
        * There are ways of increasing power that don't require raising $\alpha$
            * More observations (a larger sample) increase power
                * Standard error is $\frac{s}{\sqrt{n}} \rightarrow$ a smaller standard deviation or a larger sample decreases the standard error
            * How do we lower the standard deviation?
                * Better instrument
                * Control for variables that can increase variance
            * Larger sample statistic $\rightarrow$ the treatment in question had more of an effect and is easier to detect
            * Use one-sided tests rather than two-sided tests: they make a stronger assumption (the direction of the effect), which puts all of $\alpha$ in one tail
            * Parametric tests are generally more powerful than non-parametric tests when their assumptions hold
* Estimate power
    * Post-hoc (after we've collected samples)
    * A priori (before we run the experiment)
    * Standard effect size
        * $T = \frac{E - P_0}{\text{se}}$
    * Example
        * Expected effect: medium
        * $\alpha = 0.05$
        * $1 - \beta = 0.80$
        * Required n = ?
            * 1544 (might be too expensive)
            * Can increase alpha or use 
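
To make the relationship between $\alpha$, power, effect size and $n$ concrete, here is a minimal a priori sketch. It assumes a one-sided, one-sample z-test with a known standard deviation, which is a simpler setup than the lecture example, so it is not expected to reproduce the 1544 above. The function names and the effect size of 0.5 standard deviations are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_one_sided_z(effect, sd, n, alpha=0.05):
    """Power of a one-sided z-test (H_a: effect > 0) with known sd."""
    se = sd / math.sqrt(n)                # standard error shrinks as n grows
    crit = STD_NORMAL.inv_cdf(1 - alpha)  # critical value in standard errors
    # Probability that the statistic exceeds the critical value when the
    # true effect equals `effect`.
    return 1 - STD_NORMAL.cdf(crit - effect / se)


def required_n(effect, sd, alpha=0.05, power=0.80):
    """Smallest n that reaches the requested power under the same assumptions."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) * sd / effect) ** 2)


# Hypothetical effect of half a standard deviation.
n = required_n(effect=0.5, sd=1.0)
print(n, round(power_one_sided_z(effect=0.5, sd=1.0, n=n), 3))
```

Under these assumptions, halving the effect size roughly quadruples the required $n$, which is why small expected effects can make a study expensive.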

## Comparing two groups - Independent groups

### 1.05 Two Independent Proportions
* Z-test with two proportions 
    * Use if the following conditions are met
        * Binary response variable
        * Binary independent variable
    * Example of research questions
        * Are men more frequently smokers than women?
        * Is the proportion of people with epilepsy larger in Europe than in South America?
    * Assumptions
        * Independence (random assignment/selection)
        * Sufficient observations
            * 1-sided: 10 negative, 10 positive
            * 2-sided: 5 negative, 5 positive
        * If assumptions aren't met, use Fisher's exact test
    * Hypotheses
        * $H_0: p_1 - p_2 = 0$
        * $H_a: p_1 - p_2 \neq 0$
    * Test statistic
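        * $z = \frac{(\hat{p}_1 - \hat{p}_2) - (\pi_1 - \pi_2)}{\sqrt{\hat{p} (1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
            * $\pi_1 - \pi_2$ is the difference under $H_0$ (0 here)
            * $\hat{p}$ is the pooled sample proportion (both groups combined)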
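
The test statistic can be computed directly from the counts. The sketch below assumes $H_0: p_1 - p_2 = 0$, a pooled proportion in the standard error, and a two-sided alternative; the function name and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for H_0: p1 - p2 = 0, using the pooled proportion."""
    p1_hat = x1 / n1                   # sample proportion in group 1
    p2_hat = x2 / n2                   # sample proportion in group 2
    p_pooled = (x1 + x2) / (n1 + n2)   # pooled proportion under H_0
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se         # hypothesized difference is 0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 60 of 200 in group 1, 45 of 200 in group 2.
z, p = two_proportion_z_test(60, 200, 45, 200)
print(round(z, 3), round(p, 3))
```

If the observation counts are too small for the normal approximation, Fisher's exact test is the fallback mentioned above.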

### 1.06 Two Independent Means
* T-test with two independent means
    * Use if the following conditions are met
        * Quantitative response variable
        * Binary independent variable
    * Example of research questions
        * Does the avg number of shows watched vary between people w/ and w/o jobs?
        * Is the mean score on a happiness scale lower for people who have children than for people who don't?
    * Assumptions
        * Independence (random assignment/selection)
        * Normally distributed
            * A violation is not a problem with:
                * large samples
                * a small sample and a 2-sided test
            * For a 1-sided test, should have at least 30 observations (CLT)
    * Hypotheses
        * $H_0: \mu_1 - \mu_2 = 0$
        * $H_a$: one of
            * $\mu_1 - \mu_2 > 0$
            * $\mu_1 - \mu_2 \neq 0$
            * $\mu_1 - \mu_2 < 0$
* Equal variances vs. unequal variances version?
    * Unequal variances is generally preferred since it is a weaker assumption and the tests of equality of variance are not very robust
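
The unequal variances version is commonly known as Welch's t-test. As a minimal sketch, the statistic and its approximate degrees of freedom (Welch-Satterthwaite) can be computed with the standard library alone; obtaining a p-value would additionally need a t-distribution, which is left out here. The function name and the two samples are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(sample1, sample2):
    """t statistic and approximate degrees of freedom without assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    # Squared standard error contribution of each group (sample variance / n).
    q1 = variance(sample1) / n1
    q2 = variance(sample2) / n2
    t = (mean(sample1) - mean(sample2)) / math.sqrt(q1 + q2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical data: shows watched per week in two groups.
with_job = [1, 2, 3, 4, 5]
without_job = [3, 4, 5, 6, 7]
print(welch_t(with_job, without_job))
```

When both groups have the same size and the same sample variance, the degrees of freedom reduce to $n_1 + n_2 - 2$, the value used by the equal variances version.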

## Comparing two groups - dependent groups
* Z-test for two dependent proportions (McNemar's Test)
    * Use if the following conditions are met
        * Binary response variable
        * Binary independent variable
            * Distinguishes between two related/paired samples
                * Paired either because of same cases or matched pairs
                * Example
                    * Twins
                    * Couples
    * Research questions
        * Do people smoke less after they've been exposed to an aggressive anti-smoking campaign?
        * Does the proportion of cats with urinary problems differ between a sample of cats on a raw meat diet and a sample on canned food?
            * Instead of independent samples, use matched pairs based on certain qualities (e.g. whether they were neutered, whether they're indoor cats, etc.)
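    * Need to look at all possible combinations of the two binary responses within a pair (a 2x2 table)
        * $n_{01}$ and $n_{10}$ count the discordant pairs, i.e. pairs whose two responses differ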
    * Assumptions
        * Sufficient observations
            * 1-sided: $n_{01} + n_{10} \geq 30$
            * 2-sided: no lower limit
    * Hypotheses
        * $H_0: p_1 = p_2$
        * $H_a: p_1 > p_2$
        * $H_a: p_1 \neq p_2$
        * $H_a: p_1 < p_2$
    * Test statistic
        * $z = \frac{n_{01} - n_{10}}{\sqrt{n_{01} + n_{10}}}$
        * Test statistic has a standard normal distribution
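
A minimal sketch of the statistic above, using only the discordant pair counts and a two-sided alternative. The function name and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def mcnemar_z(n01, n10):
    """z statistic and two-sided p-value based on the discordant pair counts."""
    z = (n01 - n10) / math.sqrt(n01 + n10)
    # The statistic is compared against a standard normal distribution.
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Hypothetical counts: 25 pairs changed in one direction, 11 in the other.
z, p = mcnemar_z(n01=25, n10=11)
print(round(z, 3), round(p, 4))
```

Pairs in which both members give the same response do not enter the statistic; only the pairs that disagree carry information about a difference between the two proportions.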
        