-
Notifications
You must be signed in to change notification settings - Fork 0
/
2pm.Rmd
91 lines (74 loc) · 3.35 KB
/
2pm.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
---
title: "Two part"
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
### Two part models {.tabset .tabset-pills}
The input arguments for the `resid_2pm()` function differ from those of other functions in `assessor` package. Specifically, users can utilize this function with either models or Probability Integral Transform (PIT) as input.
For instance, in evaluating the distribution assumptions of a two-part model that combines a logistic and a gamma regression, you should provide the logistic regression model object as the argument for `model0` and the gamma regression model object for `model1`. We recommend utilizing the model input when assessing a gamma + logistic two-part model. Alternatively, users can directly use the PIT as input if their two-part model is not a gamma+logistic combination. In such cases, users should first calculate the PIT and then input into `part0` and `part1`, respectively.
This function accommodates two combinations: either `model0` in conjunction with `model1` or `part0` in conjunction with `part1`. Note that it is essential to specify the `y` (outcome) values in the function arguments.
The underlying model is a two-part model. The probability of zero is $$p_0(\mathbf{X})=\text{logit}^{-1}\left(\beta_0+X_{1}\beta_{1}+X_{2}\beta_{2} \right),$$
where $X_1$ is a standard normal variable, $X_2$ is binary with probability of one as 0.4, and $(\beta_0,\beta_{1},\beta_{2})=(1,-2,-1)$.
A gamma distribution is employed to generate positive data. The mean function of the positive part is described as $$\lambda_S=\exp\left(\beta_{0S}+\beta_{1S}X_1+\beta_{2S}X_2\right).$$
We let $(\beta_{0S},\beta_{1S},\beta_{2S})=(-1,-1,-2)$. The dispersion parameter is set to be 0.5.
#### Models as input
```{r model as input}
library(assessor)
library(MASS)
n <- 500
beta10 <- 1
beta11 <- -2
beta12 <- -1
beta13 <- -1
beta14 <- -1
beta15 <- -2
x11 <- rnorm(n)
x12 <- rbinom(n, size = 1, prob = 0.4)
p1 <- 1 / (1 + exp(-(beta10 + x11 * beta11 + x12 * beta12)))
lambda1 <- exp(beta13 + beta14 * x11 + beta15 * x12)
y2 <- rgamma(n, scale = lambda1 / 2, shape = 2)
y <- rep(0, n)
u <- runif(n, 0, 1)
ind1 <- which(u >= p1)
y[ind1] <- y2[ind1]
```
```{r 2pm 2, fig.align='center'}
# models as input
mgamma <- glm(y[ind1] ~ x11[ind1] + x12[ind1], family = Gamma(link = "log")) # Gamma regression
m10 <- glm(y == 0 ~ x12 + x11, family = binomial(link = "logit")) # logistic regression
resid.models <- resid_2pm(model0 = m10, model1 = mgamma, y = y)
```
#### PIT as input
```{r pit as input,fig.align='center'}
library(assessor)
library(MASS)
n <- 500
beta10 <- 1
beta11 <- -2
beta12 <- -1
beta13 <- -1
beta14 <- -1
beta15 <- -2
x11 <- rnorm(n)
x12 <- rbinom(n, size = 1, prob = 0.4)
p1 <- 1 / (1 + exp(-(beta10 + x11 * beta11 + x12 * beta12)))
lambda1 <- exp(beta13 + beta14 * x11 + beta15 * x12)
y2 <- rgamma(n, scale = lambda1 / 2, shape = 2)
y <- rep(0, n)
u <- runif(n, 0, 1)
ind1 <- which(u >= p1)
y[ind1] <- y2[ind1]
# PIT as input
mgamma <- glm(y[ind1] ~ x11[ind1] + x12[ind1], family = Gamma(link = "log")) # gamma regression
m10 <- glm(y == 0 ~ x12 + x11, family = binomial(link = "logit")) # logistic regression
cdfgamma <- pgamma(y[ind1],
scale = mgamma$fitted.values * gamma.dispersion(mgamma),
shape = 1 / gamma.dispersion(mgamma)
)
p1f <- m10$fitted.values
resid.pit <- resid_2pm(part0= p1f, part1 = cdfgamma, y = y)
```