add 'loss' to the list of brms data sets
paul-buerkner committed Oct 7, 2020
1 parent d755360 commit 44249d7
Showing 14 changed files with 426 additions and 205 deletions.
23 changes: 14 additions & 9 deletions R/brm.R
@@ -326,16 +326,21 @@
#' summary(fit4)
#'
#'
#' # Simple non-linear gaussian model
#' x <- rnorm(100)
#' y <- rnorm(100, mean = 2 - 1.5^x, sd = 1)
#' data5 <- data.frame(x, y)
#' prior5 <- prior(normal(0, 2), nlpar = a1) +
#' prior(normal(0, 2), nlpar = a2)
#' fit5 <- brm(bf(y ~ a1 - a2^x, a1 + a2 ~ 1, nl = TRUE),
#' data = data5, prior = prior5)
#' # Non-linear Gaussian model
#' fit5 <- brm(
#' bf(cum ~ ult * (1 - exp(-(dev/theta)^omega)),
#' ult ~ 1 + (1|AY), omega ~ 1, theta ~ 1,
#' nl = TRUE),
#' data = loss, family = gaussian(),
#' prior = c(
#' prior(normal(5000, 1000), nlpar = "ult"),
#' prior(normal(1, 2), nlpar = "omega"),
#' prior(normal(45, 10), nlpar = "theta")
#' ),
#' control = list(adapt_delta = 0.9)
#' )
#' summary(fit5)
#' plot(conditional_effects(fit5), ask = FALSE)
#' conditional_effects(fit5)
#'
#'
#' # Normal model with heterogeneous variances
84 changes: 68 additions & 16 deletions R/datasets.R
@@ -24,6 +24,10 @@
#' and \code{PKD} specifying the type of disease}
#' }
#'
#' @source McGilchrist, C. A., & Aisbett, C. W. (1991).
#' Regression with frailty in survival analysis.
#' \emph{Biometrics}, 47(2), 461-466.
#'
#' @examples
#' \dontrun{
#' ## performing survival analysis using the "weibull" family
@@ -40,9 +44,6 @@
#' plot(fit2)
#' }
#'
#' @source McGilchrist, C. A., & Aisbett, C. W. (1991).
#' Regression with frailty in survival analysis.
#' \emph{Biometrics, 47(2)}, 461-466.
"kidney"


@@ -66,6 +67,10 @@
#' \item{carry}{A contrast to indicate possible carry over effects}
#' }
#'
#' @source Ezzet, F., & Whitehead, J. (1991).
#' A random effects model for ordinal responses from a crossover trial.
#' \emph{Statistics in Medicine}, 10(6), 901-907.
#'
#' @examples
#' \dontrun{
#' ## ordinal regression with family "sratio"
@@ -84,9 +89,6 @@
#' plot(fit2)
#' }
#'
#' @source Ezzet, F., & Whitehead, J. (1991).
#' A random effects model for ordinal responses from a crossover trial.
#' \emph{Statistics in Medicine, 10(6)}, 901-907.
"inhaler"


@@ -114,7 +116,15 @@
#' \item{zAge}{Standardized \code{Age}}
#' \item{zBase}{Standardized \code{Base}}
#' }
#'
#'
#' @source Thall, P. F., & Vail, S. C. (1990).
#' Some covariance models for longitudinal count data with overdispersion.
#' \emph{Biometrics, 46(2)}, 657-671. \cr
#'
#' Breslow, N. E., & Clayton, D. G. (1993).
#' Approximate inference in generalized linear mixed models.
#' \emph{Journal of the American Statistical Association}, 88(421), 9-25.
#'
#' @examples
#' \dontrun{
#' ## poisson regression without random effects.
@@ -131,12 +141,54 @@
#' summary(fit2)
#' plot(fit2)
#' }
#'
#' @source Thall, P. F., & Vail, S. C. (1990).
#' Some covariance models for longitudinal count data with overdispersion.
#' \emph{Biometrics, 46(2)}, 657-671. \cr
#'
#' Breslow, N. E., & Clayton, D. G. (1993).
#' Approximate inference in generalized linear mixed models.
#' \emph{Journal of the American Statistical Association, 88(421)}, 9-25.
"epilepsy"
#'
"epilepsy"

#' Cumulative Insurance Loss Payments
#'
#' @description This dataset, discussed in Gesmann & Morris (2020), contains
#' cumulative insurance loss payments over the course of ten years.
#'
#' @format A data frame of 55 observations containing information
#' on the following 4 variables.
#' \describe{
#' \item{AY}{Origin year of the insurance (1991 to 2000)}
#' \item{dev}{Deviation from the origin year in months}
#' \item{cum}{Cumulative loss payments}
#' \item{premium}{Achieved premiums for the given origin year}
#' }
#'
#' @source Gesmann M. & Morris J. (2020). Hierarchical Compartmental Reserving
#' Models. \emph{CAS Research Papers}.
#'
#' @examples
#' \dontrun{
#' # non-linear model to predict cumulative loss payments
#' fit_loss <- brm(
#' bf(cum ~ ult * (1 - exp(-(dev/theta)^omega)),
#' ult ~ 1 + (1|AY), omega ~ 1, theta ~ 1,
#' nl = TRUE),
#' data = loss, family = gaussian(),
#' prior = c(
#' prior(normal(5000, 1000), nlpar = "ult"),
#' prior(normal(1, 2), nlpar = "omega"),
#' prior(normal(45, 10), nlpar = "theta")
#' ),
#' control = list(adapt_delta = 0.9)
#' )
#'
#' # basic summaries
#' summary(fit_loss)
#' conditional_effects(fit_loss)
#'
#' # plot predictions per origin year
#' conditions <- data.frame(AY = unique(loss$AY))
#' rownames(conditions) <- unique(loss$AY)
#' me_loss <- conditional_effects(
#' fit_loss, conditions = conditions,
#' re_formula = NULL, method = "predict"
#' )
#' plot(me_loss, ncol = 5, points = TRUE)
#' }
#'
"loss"
Binary file added data/loss.rda
Binary file not shown.
2 changes: 1 addition & 1 deletion doc/brms_nonlinear.R
@@ -50,7 +50,7 @@ pp_check(fit2)
loo(fit1, fit2)

## ---------------------------------------------------------------------------------------
loss <- read.csv("https://paul-buerkner.github.io/data/loss.csv")
data(loss)
head(loss)

## ---- results='hide'--------------------------------------------------------------------
20 changes: 13 additions & 7 deletions doc/brms_nonlinear.Rmd
@@ -100,15 +100,15 @@ formula should be treated as non-linear.

In contrast to generalized linear models, priors on population-level parameters
(i.e., 'fixed effects') are often mandatory to identify a non-linear model.
Thus, **brms** requires the user to explicitely specify these priors. In the
Thus, **brms** requires the user to explicitly specify these priors. In the
present example, we used a `normal(1, 2)` prior on (the population-level
intercept of) `b1`, while we used a `normal(0, 2)` prior on (the
population-level intercept of) `b2`. Setting priors is a non-trivial task in all
kinds of models, especially in non-linear models, so you should always invest
some time to think of appropriate priors. Quite often, you may be forced to
change your priors after fitting a non-linear model for the first time, when you
observe different MCMC chains converging to different posterior regions. This is
a clear sign of an idenfication problem and one solution is to set stronger
a clear sign of an identification problem and one solution is to set stronger
(i.e., more narrow) priors.

To obtain summaries of the fitted model, we apply
@@ -151,13 +151,13 @@ loo(fit1, fit2)

Since smaller `LOOIC` values indicate better model fit, it is immediately
evident that the non-linear model fits the data better, which is of course not
too surpirsing since we simulated the data from exactly that model.
too surprising since we simulated the data from exactly that model.
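
As a reminder of why smaller is better here: `LOOIC` is the estimated expected log pointwise predictive density from leave-one-out cross-validation, reported on the deviance scale. The relation below follows the convention of the **loo** package and is stated here for orientation only; it is not part of the vignette text changed in this commit.

$$
\mathrm{LOOIC} = -2 \, \widehat{\mathrm{elpd}}_{\mathrm{loo}}
$$

A larger estimated predictive density therefore corresponds to a smaller `LOOIC`.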

## A Real-World Non-Linear model

On his blog, Markus Gesmann predicts the growth of cumulative insurance loss
payments over time, originated from different origin years (see
http://www.magesblog.com/post/2015/11/loss-developments-via-growth-curves-and.html).
https://www.magesblog.com/post/2015-11-03-loss-developments-via-growth-curves-and/).
We will use a slightly simplified version of his model for demonstration
purposes here. It looks as follows:

@@ -169,10 +169,10 @@ dependency using the variable $dev$. Further, $ult_{AY}$ is the (to be
estimated) ultimate loss of each accident year. It constitutes a non-linear
parameter in our framework along with the parameters $\theta$ and $\omega$,
which are responsible for the growth of the cumulative loss and are assumed to
be the same across years. We load the data
be the same across years. The data is already shipped with brms.

```{r}
loss <- read.csv("https://paul-buerkner.github.io/data/loss.csv")
data(loss)
head(loss)
```
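
To get a feeling for the shape of the growth curve before fitting anything, the mean function from the model formula can be evaluated directly in base R. This is only a minimal sketch: the helper name `growth_curve`, the grid of `dev` values, and the parameter values (set to the prior means used in the examples) are illustrative assumptions, not estimates from the data.

```r
# Mean function of the non-linear model, taken from the brms formula:
#   cum ~ ult * (1 - exp(-(dev / theta)^omega))
# `growth_curve` is a hypothetical helper name used for illustration only.
growth_curve <- function(dev, ult, theta, omega) {
  ult * (1 - exp(-(dev / theta)^omega))
}

# Hypothetical grid of development times in months (not the actual data)
dev <- seq(6, 120, by = 6)

# Parameter values set to the prior means from the example (assumption:
# these are placeholders, not posterior estimates)
expected_cum <- growth_curve(dev, ult = 5000, theta = 45, omega = 1)

# The curve increases with dev and levels off below `ult`
print(data.frame(dev = dev, expected_cum = round(expected_cum, 1)))
```

Under these assumed values the curve rises monotonically and flattens out as it approaches `ult`, which is why `ult` is interpreted as the ultimate loss.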

@@ -221,7 +221,8 @@ plot(me_loss, ncol = 5, points = TRUE)
It is evident that there is some variation in cumulative loss across accident
years, for instance due to natural disasters happening only in certain years.
Further, we see that the uncertainty in the predicted cumulative loss is larger
for later years with fewer available data points.
for later years with fewer available data points. For a more detailed discussion
of this data set, see Section 4.5 in Gesmann & Morris (2020).

## Advanced Item-Response Models

@@ -323,3 +324,8 @@ that accounting for item and person variability (e.g., using a multilevel model
with varying intercepts) becomes necessary as we have multiple observations per
item and person. Luckily, this can all be done within the non-linear framework
of **brms** and I hope that this vignette serves as a good starting point.

## References

Gesmann M. & Morris J. (2020). Hierarchical Compartmental Reserving Models.
*CAS Research Papers*.