translate point estimates to quantile #4

vpnagraj · 2020-12-15T14:30:53Z

the covid 19 forecast hub requires point estimates and pre-determined quantile values for each target.

the quantile values required are enumerated in the technical readme:

https://github.com/reichlab/covid19-forecast-hub/blob/master/data-processed/README.md#quantile

once we decide on an initial model / targets to pursue, we will need establish how to go about translating a point estimate to quantiles.

vpnagraj · 2020-12-15T15:30:31Z

a little more detail on one possible approach here ...

may be a naive way of thinking about it, but what if we used the point estimate from a model prediction along with its standard error to seed a simulated distribution of predicted values? to do so, we would have to assume the predicted values follow a certain distribution ... but from there we could cut the simulated values at relevant quantiles

below is some code to do that with a linear model of faithful data under the assumption that the predicted value follows a normal distribution:

## fit the model
eruptfit <- lm(eruptions ~ waiting, data = faithful)

## get the prediction interval
## from prediction interval calculate standard error
pred <- 
  predict(eruptfit, data.frame(waiting = 80), level = 0.95, interval = "predict") %>%
  as.data.frame() %>%
  dplyr::mutate(se = (upr - fit) / qt(0.975, nrow(faithful)))

## sidebar ... is this standard error formula correct?
## see https://stackoverflow.com/questions/38109501/how-does-predict-lm-compute-confidence-interval-and-prediction-interval

## what is standard error for CI?
tmp <- predict(eruptfit, data.frame(waiting = 80), level = 0.95, interval = "predict", se.fit = TRUE)
tmp$se.fit

[1] 0.03625177

## use standard error for CI to calculate standard error for PI
sqrt(tmp$se.fit ^ 2 + tmp$residual.scale ^ 2)

[1] 0.4978346

## what whas the se for the PI per our calculation?
pred$se

[1] 0.4978511

## assume distribution of forecast values follows normal distribution ...
## mean = prediction
## sd = standard error of prediction
## "simulate" values by generate random normal distribution
sims <- rnorm(1000, mean = pred$fit, sd = pred$se)

## cut the simulated values into the quantiles of interest
quantile(sims, probs = c(0.01, 0.025, seq(0.05, 0.95, by = 0.05), 0.975, 0.99))

      1%     2.5%       5%      10%      15%      20%      25%      30%      35%      40%      45%      50%      55%      60%      65% 
3.001737 3.149908 3.346475 3.560254 3.672461 3.768598 3.843439 3.925468 3.986631 4.069986 4.132211 4.175931 4.251500 4.288355 4.344637 
     70%      75%      80%      85%      90%      95%    97.5%      99% 
4.407399 4.499772 4.603745 4.719678 4.833324 5.030727 5.146453 5.357499

stephenturner · 2020-12-15T18:14:56Z

So the 50th percentile here, the median -- how does that relate to the point estimate?

stephenturner · 2020-12-15T18:16:40Z

Few other approaches in the fable framework:

See fable bookdown section on simulation: https://tidyverts.github.io/tidy-forecasting-principles/methods.html#simulation
See about halfway down, the hilo() function.

The resulting forecasts are contained in a “fable” (forecast table), and both point forecasts and forecast distributions are available in the table for the next five years. Confidence intervals can be extracted from the distribution using the hilo() function.

The hilo() function can also be used on fable objects, which allows you to extract multiple intervals at once.

vpnagraj · 2020-12-15T18:21:13Z

So the 50th percentile here, the median -- how does that relate to the point estimate?

good point. but in this case the 50th percentile from quantile is generated from a random normal distribution that is centered on the point estimate. rnorm(size=1000, ...) above so i would think that the median would be pretty close to the mean

looks like it is (using objects created from example code above):

pred$fit

[1] 4.17622

stephenturner · 2020-12-21T12:03:59Z

This section of the FPP3 book is a good primer on forecast distributions, and later on shows how to use bootstrapping to get prediction intervals. https://otexts.com/fpp3/prediction-intervals.html

Also, from: https://otexts.com/fpp3/forecasting-using-transformations.html

If a transformation has been used, then the prediction interval is first computed on the transformed scale, then the end points are back-transformed to give a prediction interval on the original scale.

stephenturner · 2020-12-21T16:24:48Z

This is demonstrated around f4a58da#diff-da4c2dfe4ffa338da3407b8db34eac89ce01bcfc58032a6688085c6452350d0fR108-R115

vpnagraj · 2021-01-06T13:49:53Z

we have this working (see https://github.com/signaturescience/focustools/blob/master/scratch/fable-submission-mockup-allmetrics.R#L28-L38)

the quantiles meet the valid entry file format once bound, formatted with target names, etc.

closing for now, though we may need to revisit if/when we implement another kind of model outside of the fable framework

stephenturner added a commit that referenced this issue Dec 21, 2020

lots of updates (#4)

f4a58da

stephenturner added a commit that referenced this issue Dec 29, 2020

correct forecast dates #6 #4

b20dff5

vpnagraj closed this as completed Jan 6, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

translate point estimates to quantile #4

translate point estimates to quantile #4

vpnagraj commented Dec 15, 2020

vpnagraj commented Dec 15, 2020

stephenturner commented Dec 15, 2020

stephenturner commented Dec 15, 2020

vpnagraj commented Dec 15, 2020

stephenturner commented Dec 21, 2020 •

edited

Loading

stephenturner commented Dec 21, 2020

vpnagraj commented Jan 6, 2021

translate point estimates to quantile #4

translate point estimates to quantile #4

Comments

vpnagraj commented Dec 15, 2020

vpnagraj commented Dec 15, 2020

stephenturner commented Dec 15, 2020

stephenturner commented Dec 15, 2020

vpnagraj commented Dec 15, 2020

stephenturner commented Dec 21, 2020 • edited Loading

stephenturner commented Dec 21, 2020

vpnagraj commented Jan 6, 2021

stephenturner commented Dec 21, 2020 •

edited

Loading