# Classical Inferential Framework - Part 2

This is the second part of the classical inferential framework. It covers two or three topics: one-sided hypotheses, updated thoughts on the NHST framework, and a few other tidbits.

## One-sided Hypotheses

One-sided hypotheses are also possible. In this framework, the null hypothesis mostly stays the same, but the alternative hypothesis changes.

Suppose we again want to test the slope.

$$ 
H_{0}: \beta_{1} = 0  \\
H_{A}: \beta_{1} < 0
$$

To be more specific, we could update the null hypothesis as follows:

$$ 
H_{0}: \beta_{1} \geq 0  \\
H_{A}: \beta_{1} < 0
$$

Most of the mechanics stay the same, but there is one notable difference. Let's draw this out. 

In [2]:
library(tidyverse)
library(ggformula)
library(mosaic)

theme_set(theme_bw(base_size = 18))

# Read the 2021 Iowa air quality and daily wind data
airquality <- readr::read_csv("https://raw.githubusercontent.com/lebebr01/psqf_6243/main/data/iowa_air_quality_2021.csv")
wind <- readr::read_csv("https://raw.githubusercontent.com/lebebr01/psqf_6243/main/data/daily_WIND_2021-iowa.csv")

# Add the wind data by location and date, then drop rows with missing values
airquality <- airquality %>%
   left_join(wind, by = c('cbsa_name', 'date')) %>% 
   drop_na()

# Regress daily AQI on average wind (avg_wind)
air_lm <- lm(daily_aqi ~ avg_wind, data = airquality)
summary(air_lm)

Call:
lm(formula = daily_aqi ~ avg_wind, data = airquality)

Residuals:
   Min     1Q Median     3Q    Max 
-41.71 -14.38  -0.73  12.43  86.84 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  48.2229     0.5155   93.54   <2e-16 ***
avg_wind     -2.2118     0.1043  -21.20   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 18.05 on 4819 degrees of freedom
Multiple R-squared:  0.08528,	Adjusted R-squared:  0.08509 
F-statistic: 449.3 on 1 and 4819 DF,  p-value: < 2.2e-16
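
The `Pr(>|t|)` column above is a two-sided p-value. For a one-sided alternative such as $H_{A}: \beta_{1} < 0$, the usual difference is that only one tail of the t-distribution is used. The following is a minimal sketch of that calculation, assuming the t value (-21.20) and residual degrees of freedom (4819) are copied by hand from the output above; the variable names are illustrative and not part of the original analysis.

```r
# Values copied from the summary(air_lm) output for avg_wind
t_stat <- -21.20    # reported t value for the slope
df_resid <- 4819    # residual degrees of freedom

# Two-sided p-value: area in both tails beyond |t|
p_two_sided <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

# One-sided p-value for H_A: beta_1 < 0 uses the lower tail only
p_lower <- pt(t_stat, df = df_resid, lower.tail = TRUE)

# One-sided p-value for the opposite alternative, H_A: beta_1 > 0
p_upper <- pt(t_stat, df = df_resid, lower.tail = FALSE)

c(two_sided = p_two_sided, lower = p_lower, upper = p_upper)
```

Because the estimated slope is negative, the lower-tail p-value is half of the two-sided one, while the upper-tail p-value is close to 1. The direction of the alternative therefore matters as much as the size of the test statistic.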


## Updated NHST Thoughts

In general, p-values are fine, but dichotomizing them into a reject versus fail-to-reject decision is problematic. See the statement by the [American Statistical Association (ASA)](https://doi.org/10.1080/00031305.2016.1154108) on p-values for more detail. 

The gist is that p-values can become uninterpretable in many different settings. In addition, using the p-value, a continuous measure, as evidence that either rejects or does not reject the null hypothesis (i.e., dichotomizing the p-value) is a bad idea. Furthermore, treating the p-value in complete isolation is problematic because it ignores the context of the problem at hand. 
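
To make the dichotomization concern concrete, here is a small sketch with two hypothetical test statistics that are nearly identical. The t values (1.98 and 1.99), the degrees of freedom (100), and the 0.05 cutoff are all assumptions chosen for illustration, not results from the air quality data.

```r
# Hypothetical, nearly identical test statistics (assumed values)
t_values <- c(1.98, 1.99)
df_resid <- 100     # assumed residual degrees of freedom

# Two-sided p-values for each statistic
p_values <- 2 * pt(abs(t_values), df = df_resid, lower.tail = FALSE)

# Dichotomized decision at an assumed alpha of 0.05
alpha <- 0.05
decision <- ifelse(p_values < alpha, "reject", "fail to reject")

data.frame(t = t_values, p = round(p_values, 4), decision = decision)
```

The two p-values fall on opposite sides of 0.05 even though the evidence behind them is practically the same, which is the sense in which a hard cutoff discards information from a continuous measure.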

## p-value and Sample Size

Why is it problematic to use a p-value in complete isolation? Let's show a simulated example. 

In [3]:
library(simglm)
library(future)

plan(multicore)

# Simulation settings: one continuous predictor, n = 1000,
# an intercept of 5, and a slope of 0.01
sim_arguments <- list(
    formula = y ~ x,
    fixed = list(x = list(var_type = 'continuous', mean = 0, sd = 1)),
    error = list(variance = 1),
    sample_size = 1000,
    reg_weights = c(5, .01)
)

# Generate the predictor, then the error term, then the response y
sim_data <- simulate_fixed(data = NULL, sim_arguments) %>%
  simulate_error(sim_arguments) %>%
  generate_response(sim_arguments)

head(sim_data)

| | X.Intercept. | x | level1_id | error | fixed_outcome | random_effects | y |
|---|---|---|---|---|---|---|---|
| | `<dbl>` | `<dbl>` | `<int>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 1 | 1 | 0.6688096 | 1 | -0.07871876 | 5.006688 | 0 | 4.927969 |
| 2 | 1 | 1.1158619 | 2 | 0.24049855 | 5.011159 | 0 | 5.251657 |
| 3 | 1 | -0.3526516 | 3 | 1.1149578 | 4.996473 | 0 | 6.111431 |
| 4 | 1 | 1.3598234 | 4 | 0.6647744 | 5.013598 | 0 | 5.678373 |
| 5 | 1 | -0.1631678 | 5 | -0.49157542 | 4.998368 | 0 | 4.506793 |
| 6 | 1 | 0.5593892 | 6 | -0.02340767 | 5.005594 | 0 | 4.982186 |
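
One way to see how sample size enters the p-value is a back-of-the-envelope calculation rather than a full simulation. The sketch below assumes the same setup as the simulation arguments (slope of 0.01, predictor standard deviation of 1, error variance of 1) and uses the large-sample approximation $SE(\hat{\beta}_{1}) \approx \sigma / (s_{x}\sqrt{n})$. It also assumes the estimate lands exactly on the true slope, which a single simulated data set will not do, so treat the numbers as rough expectations only. The sample sizes are illustrative choices.

```r
# Assumed setup, mirroring the simulation arguments
slope <- 0.01   # true slope
sd_x <- 1       # standard deviation of the predictor
sigma <- 1      # residual standard deviation (error variance of 1)

# Illustrative sample sizes
n <- c(1e3, 1e4, 1e5, 1e6)

# Approximate standard error of the slope in simple regression
se_slope <- sigma / (sd_x * sqrt(n))

# t statistic if the estimate were exactly equal to the true slope
t_expected <- slope / se_slope

# Corresponding two-sided p-value
p_approx <- 2 * pt(abs(t_expected), df = n - 2, lower.tail = FALSE)

data.frame(n = n, se = se_slope, t = t_expected, p = p_approx)
```

The effect is held fixed at 0.01 throughout, yet the approximate p-value shrinks steadily as `n` grows, because the standard error falls with $\sqrt{n}$. A small p-value on its own therefore says little about whether the effect is large enough to matter.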
