-
Notifications
You must be signed in to change notification settings - Fork 1
/
binomial.html
313 lines (262 loc) · 10.5 KB
/
binomial.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head><title>R: Family Objects for Models</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link rel="stylesheet" type="text/css" href="R.css">
</head><body>
<table width="100%" summary="page for family"><tr><td>family</td><td align="right">R Documentation</td></tr></table>
<h2>Family Objects for Models</h2>
<h3>Description</h3>
<p>Family objects provide a convenient way to specify the details of the
models used by functions such as <code>glm</code>. See the
documentation for <code>glm</code> for the details on how such model
fitting takes place.
</p>
<h3>Usage</h3>
<pre>
family(object, ...)
binomial(link = "logit")
gaussian(link = "identity")
Gamma(link = "inverse")
inverse.gaussian(link = "1/mu^2")
poisson(link = "log")
quasi(link = "identity", variance = "constant")
quasibinomial(link = "logit")
quasipoisson(link = "log")
</pre>
<h3>Arguments</h3>
<table summary="R argblock">
<tr valign="top"><td><code>link</code></td>
<td>
<p>a specification for the model link function. This can be
a name/expression, a literal character string, a length-one character
vector or an object of class
<code>"link-glm"</code> (such as generated by
<code>make.link</code>) provided it is not specified
<EM>via</EM> one of the standard names given next.
</p>
<p>The <code>gaussian</code> family accepts the links (as names)
<code>identity</code>, <code>log</code> and <code>inverse</code>;
the <code>binomial</code> family the links <code>logit</code>,
<code>probit</code>, <code>cauchit</code>, (corresponding to logistic,
normal and Cauchy CDFs respectively) <code>log</code> and
<code>cloglog</code> (complementary log-log);
the <code>Gamma</code> family the links <code>inverse</code>, <code>identity</code>
and <code>log</code>;
the <code>poisson</code> family the links <code>log</code>, <code>identity</code>,
and <code>sqrt</code> and the <code>inverse.gaussian</code> family the links
<code>1/mu^2</code>, <code>inverse</code>, <code>identity</code>
and <code>log</code>.
</p>
<p>The <code>quasi</code> family accepts the links <code>logit</code>, <code>probit</code>,
<code>cloglog</code>, <code>identity</code>, <code>inverse</code>,
<code>log</code>, <code>1/mu^2</code> and <code>sqrt</code>, and
the function <code>power</code> can be used to create a
power link function.
</p>
</td></tr>
<tr valign="top"><td><code>variance</code></td>
<td>
<p>for all families other than <code>quasi</code>, the variance
function is determined by the family. The <code>quasi</code> family will
accept the literal character string (or unquoted as a name/expression)
specifications <code>"constant"</code>, <code>"mu(1-mu)"</code>, <code>"mu"</code>,
<code>"mu^2"</code> and <code>"mu^3"</code>, a length-one character vector
taking one of those values, or a list containing components
<code>varfun</code>, <code>validmu</code>, <code>dev.resids</code>, <code>initialize</code>
and <code>name</code>.
</p>
</td></tr>
<tr valign="top"><td><code>object</code></td>
<td>
<p>the function <code>family</code> accesses the <code>family</code>
objects which are stored within objects created by modelling
functions (e.g., <code>glm</code>).</p>
</td></tr>
<tr valign="top"><td><code>...</code></td>
<td>
<p>further arguments passed to methods.</p>
</td></tr>
</table>
<h3>Details</h3>
<p><code>family</code> is a generic function with methods for classes
<code>"glm"</code> and <code>"lm"</code> (the latter returning <code>gaussian()</code>).
</p>
<p>The <code>quasibinomial</code> and <code>quasipoisson</code> families differ from
the <code>binomial</code> and <code>poisson</code> families only in that the
dispersion parameter is not fixed at one, so they can model
over-dispersion. For the binomial case see McCullagh and Nelder
(1989, pp. 124–8). Although they show that there is (under some
restrictions) a model with
variance proportional to mean as in the quasi-binomial model, note
that <code>glm</code> does not compute maximum-likelihood estimates in that
model. The behaviour of S is closer to the quasi- variants.
</p>
<h3>Value</h3>
<p>An object of class <code>"family"</code> (which has a concise print method).
This is a list with elements
</p>
<table summary="R valueblock">
<tr valign="top"><td><code>family</code></td>
<td>
<p>character: the family name.</p>
</td></tr>
<tr valign="top"><td><code>link</code></td>
<td>
<p>character: the link name.</p>
</td></tr>
<tr valign="top"><td><code>linkfun</code></td>
<td>
<p>function: the link.</p>
</td></tr>
<tr valign="top"><td><code>linkinv</code></td>
<td>
<p>function: the inverse of the link function.</p>
</td></tr>
<tr valign="top"><td><code>variance</code></td>
<td>
<p>function: the variance as a function of the mean.</p>
</td></tr>
<tr valign="top"><td><code>dev.resids</code></td>
<td>
<p>function giving the deviance residuals as a function
of <code>(y, mu, wt)</code>.</p>
</td></tr>
<tr valign="top"><td><code>aic</code></td>
<td>
<p>function giving the AIC value if appropriate (but <code>NA</code>
for the quasi- families). See <code>logLik</code> for the assumptions
made about the dispersion parameter.</p>
</td></tr>
<tr valign="top"><td><code>mu.eta</code></td>
<td>
<p>function: derivative <code>function(eta)</code>
<i>dμ/dη</i>.</p>
</td></tr>
<tr valign="top"><td><code>initialize</code></td>
<td>
<p>expression. This needs to set up whatever data
objects are needed for the family as well as <code>n</code> (needed for
AIC in the binomial family) and <code>mustart</code> (see <code>glm</code>.</p>
</td></tr>
<tr valign="top"><td><code>valid.mu</code></td>
<td>
<p>logical function. Returns <code>TRUE</code> if a mean
vector <code>mu</code> is within the domain of <code>variance</code>.</p>
</td></tr>
<tr valign="top"><td><code>valid.eta</code></td>
<td>
<p>logical function. Returns <code>TRUE</code> if a linear
predictor <code>eta</code> is within the domain of <code>linkinv</code>.</p>
</td></tr>
<tr valign="top"><td><code>simulate</code></td>
<td>
<p>(optional) function <code>simulate(object, nsim)</code> to be
called by the <code>"lm"</code> method of <code>simulate</code>. It will
normally return a matrix with <code>nsim</code> columns and one row for
each fitted value, but it can also return a list of length
<code>nsim</code>. Clearly this will be missing for ‘quasi-’ families.</p>
</td></tr>
</table>
<h3>Note</h3>
<p>The <code>link</code> and <code>variance</code> arguments have rather awkward
semantics for back-compatibility. The recommended way is to supply
them is as quoted character strings, but they can also be supplied
unquoted (as names or expressions). In addition, they can also be
supplied as a length-one character vector giving the name of one of
the options, or as a list (for <code>link</code>, of class
<code>"link-glm"</code>). The restrictions apply only to links given as
names: when given as a character string all the links known to
<code>make.link</code> are accepted.
</p>
<p>This is potentially ambiguous: supplying <code>link=logit</code> could mean
the unquoted name of a link or the value of object <code>logit</code>. It
is interpreted if possible as the name of an allowed link, then
as an object. (You can force the interpretation to always be the value of
an object via <code>logit[1]</code>.)
</p>
<h3>Author(s)</h3>
<p>The design was inspired by S functions of the same names described
in Hastie & Pregibon (1992) (except <code>quasibinomial</code> and
<code>quasipoisson</code>).
</p>
<h3>References</h3>
<p>McCullagh P. and Nelder, J. A. (1989)
<EM>Generalized Linear Models.</EM>
London: Chapman and Hall.
</p>
<p>Dobson, A. J. (1983)
<EM>An Introduction to Statistical Modelling.</EM>
London: Chapman and Hall.
</p>
<p>Cox, D. R. and Snell, E. J. (1981).
<EM>Applied Statistics; Principles and Examples.</EM>
London: Chapman and Hall.
</p>
<p>Hastie, T. J. and Pregibon, D. (1992)
<EM>Generalized linear models.</EM>
Chapter 6 of <EM>Statistical Models in S</EM>
eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
</p>
<h3>See Also</h3>
<p><code>glm</code>, <code>power</code>, <code>make.link</code>.
</p>
<p>For binomial <EM>coefficients</EM>, <code>choose</code>;
the binomial and negative binomial <EM>distributions</EM>,
<code>Binomial</code>, and <code>NegBinomial</code>.
</p>
<h3>Examples</h3>
<pre>
require(utils) # for str
nf <- gaussian()# Normal family
nf
str(nf)# internal STRucture
gf <- Gamma()
gf
str(gf)
gf$linkinv
gf$variance(-3:4) #- == (.)^2
## quasipoisson. compare with example(glm)
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
d.AD <- data.frame(treatment, outcome, counts)
glm.qD93 <- glm(counts ~ outcome + treatment, family=quasipoisson())
glm.qD93
anova(glm.qD93, test="F")
summary(glm.qD93)
## for Poisson results use
anova(glm.qD93, dispersion = 1, test="Chisq")
summary(glm.qD93, dispersion = 1)
## Example of user-specified link, a logit model for p^days
## See Shaffer, T. 2004. Auk 121(2): 526-540.
logexp <- function(days = 1)
{
linkfun <- function(mu) qlogis(mu^(1/days))
linkinv <- function(eta) plogis(eta)^days
mu.eta <- function(eta) days * plogis(eta)^(days-1) *
.Call("logit_mu_eta", eta, PACKAGE = "stats")
valideta <- function(eta) TRUE
link <- paste("logexp(", days, ")", sep="")
structure(list(linkfun = linkfun, linkinv = linkinv,
mu.eta = mu.eta, valideta = valideta, name = link),
class = "link-glm")
}
binomial(logexp(3))
## in practice this would be used with a vector of 'days', in
## which case use an offset of 0 in the corresponding formula
## to get the null deviance right.
## Binomial with identity link: often not a good idea.
## Not run: binomial(link=make.link("identity"))
## tests of quasi
x <- rnorm(100)
y <- rpois(100, exp(1+x))
glm(y ~x, family=quasi(variance="mu", link="log"))
# which is the same as
glm(y ~x, family=poisson)
glm(y ~x, family=quasi(variance="mu^2", link="log"))
## Not run: glm(y ~x, family=quasi(variance="mu^3", link="log")) # fails
y <- rbinom(100, 1, plogis(x))
# needs to set a starting value for the next fit
glm(y ~x, family=quasi(variance="mu(1-mu)", link="logit"), start=c(0,1))
</pre>
</body></html>