Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forecasting with ZIM / forecast method? #1

Open
andybega opened this issue Dec 3, 2018 · 9 comments
Open

Forecasting with ZIM / forecast method? #1

andybega opened this issue Dec 3, 2018 · 9 comments

Comments

@andybega
Copy link

andybega commented Dec 3, 2018

Hi, thanks for writing and making your package publicly available!

If I understand correct there is no straightforward way to forecast with ZIM right now. I'd like to make a request for a forecast (or predict) method.

Since the forecast package already includes a broad range of univariate forecasting models with a consistent API, it would be nice to have something that is easy to use in a similar fashion, e.g.:

library("forecast")
library("ZIM")

data(syph)
count <- syph$a33
count.lag1 <- bshift(count > 0)
trend <- 1:length(count) / 1000

mdl1 <- auto.arima(count)
forecast(mdl1, h = 3)
# Point Forecast     Lo 80    Hi 80     Lo 95    Hi 95
# 210       2.836695 -1.014251 6.687642 -3.052820 8.726211
# 211       2.636530 -1.243722 6.516783 -3.297804 8.570865
# 212       2.618010 -1.265695 6.501715 -3.321605 8.557624

# with covariates
mdl2 <- auto.arima(head(count, -3), xreg = head(trend, -3))
forecast(mdl2, h = 3, xreg = tail(trend, 3))
# Point Forecast     Lo 80    Hi 80     Lo 95    Hi 95
# 207       2.148899 -1.655740 5.953538 -3.669794 7.967592
# 208       2.135985 -1.668653 5.940624 -3.682707 7.954678
# 209       2.123072 -1.681567 5.927711 -3.695621 7.941765

# ZIM
mdl3 <- zim(count ~ count.lag1 + trend | trend)
# e.g.:
forecast(mdl1, h = 6, xreg = ..., zreg = ...)
# something something here
@mingstat
Copy link
Owner

mingstat commented Dec 9, 2018

Thank you very much for your suggestion!

ZIM currently has no built-in function that can be directly used to forecast/predict. To do so, one will need to extract results from fitted models. For example,

# regression coefficients
mdl3$para
[1]  1.4894234  0.2211137 -1.0100405 -1.9332129  8.6051691
# variance-covariance
solve(mdl3$info)
             [,1]          [,2]         [,3]          [,4]         [,5]
[1,]  0.014386842 -0.0097871129 -0.054419108  0.0019238845 -0.010199695
[2,] -0.009787113  0.0101438422  0.016866160 -0.0009788275  0.003066099
[3,] -0.054419108  0.0168661602  0.444716763 -0.0087614983  0.078876967
[4,]  0.001923885 -0.0009788275 -0.008761498  0.1383537758 -0.939161355
[5,] -0.010199695  0.0030660993  0.078876967 -0.9391613549  7.886376532

Adding point predictions should be straightforward, but figuring out the prediction intervals may require some thinking and extra work. I'll take a look at the API of the {forecast} package and consider adding a forecast/predict S3 function to ZIM.

@andybega
Copy link
Author

Thanks!

@ladm2110
Copy link

ladm2110 commented Aug 1, 2019

Is there any work in progress towards this objective?

@mingstat
Copy link
Owner

mingstat commented Aug 2, 2019

Thanks for checking! Computing prediction intervals requires a significant amount of work to derive the correct mathematical formulas. Unfortunately, I currently have limited time to add this new feature. Any contributions are welcome and much appreciated!

@ladm2110
Copy link

ladm2110 commented Aug 5, 2019

I have been reading your the paper "Markov regression models for count time series with excess zeros: A partial likelihood approach" and I would like to replicate your forecast results. Do you have any docs related to the one-step ahead predictions procedure you used? I am also working on a PoC for a time series with excess zeros.

@mingstat
Copy link
Owner

mingstat commented Aug 5, 2019

Yes, I just uploaded the data example to https://biostatstudio.github.io/ZIM/example.pdf. With this document, you should be able to replicate all the results in the paper. As I mentioned earlier, point predictions are relatively easier to implement, the difficult part is to calculate prediction intervals.

@ladm2110
Copy link

ladm2110 commented Aug 6, 2019

Hi, I have two things I would like to clarify:

  1. It seems to me that the order of the parameters is wrong in this expression, when I print the fit.zip$para vector the order I see is (intercept, ar1, trend, ...).

zip.lambda <- exp(sum(c(1, trend[t], ar1[t]) * fit.zip$para[1:3]))
Can you comment on this?

  1. In the example the point forecast are fixed, they are stored in the vector "count.new". How where these values obtained? Is it safe to say that using the rzip method I can sample new point forecasts?

Thanks in advanced!

@ladm2110
Copy link

ladm2110 commented Aug 6, 2019

Hi again, I don't need an answer on the first question, I see that it depends on the order of the params in the "formula" you pass to the zim method. The mistake was mine.

@mingstat
Copy link
Owner

mingstat commented Aug 7, 2019

The 'count.old' and 'count.new' variables contain observed time series that are used as training and validation data, respectively. In the example, the focus is to predict Pr(Y = 0) and Pr(Y > c), where 'c' is a prespecified cutoff. With 'zip.lambda' and 'zip.omega' from the fitted model, one can easily calculate the predicted mean or use the 'rzip' function to sample observations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants