predict_indicator() #2

Open
sebastian-fox opened this issue Jun 19, 2017 · 4 comments

@sebastian-fox
Member

Proposal: a function, predict_indicator(), that predicts the next year's value of an indicator.

  • Takes an IndicatorID as input
  • R asks for the area type for the prediction (eg, UTLA), then:
    • Extracts all indicator data for indicators in the same profile(s) at the same geography
    • Identifies the latest year for the target indicator
    • Subsets the dataframe of latest-year information for all indicators
    • Creates a flat, wide table of the remaining indicators, with a variable for each previous year of data available for each indicator (eg, indicator_x_1yr_previous, indicator_x_2yr_previous, … , indicator_x_nyr_previous)
    • Trains and tests a model on the second latest year for the target indicator (maybe using multiple machine learning methods):
      • Lasso
      • GLM
      • SVM
      • Random forest
    • Uses the best model to predict the next year of data for the indicator
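
The wide-table step above could be sketched roughly as follows. This is only an illustration in plain Python rather than code from the package: the `records` data, the `widen` helper and its arguments are all hypothetical, and including the target's own previous years as predictors is an assumption the proposal does not state.

```python
# Hypothetical long-format records: (area, indicator, year, value).
# Every name and number here is invented for illustration.
records = [
    ("A", "target", 2014, 10.0), ("A", "target", 2015, 11.0), ("A", "target", 2016, 12.0),
    ("A", "indicator_x", 2014, 1.0), ("A", "indicator_x", 2015, 2.0), ("A", "indicator_x", 2016, 3.0),
    ("B", "target", 2014, 20.0), ("B", "target", 2015, 21.0), ("B", "target", 2016, 22.0),
    ("B", "indicator_x", 2014, 4.0), ("B", "indicator_x", 2015, 5.0), ("B", "indicator_x", 2016, 6.0),
]


def widen(records, target, year, n_lags):
    """Build one wide row per area.

    Each row holds the target value for `year` (None if not yet published)
    plus the value of every indicator in each of the `n_lags` previous years.
    """
    lookup = {(area, ind, yr): value for area, ind, yr, value in records}
    areas = sorted({area for area, _, _, _ in records})
    indicators = sorted({ind for _, ind, _, _ in records})
    table = {}
    for area in areas:
        row = {"y": lookup.get((area, target, year))}
        for ind in indicators:
            for lag in range(1, n_lags + 1):
                # Column names follow the pattern used in the proposal.
                row[f"{ind}_{lag}yr_previous"] = lookup.get((area, ind, year - lag))
        table[area] = row
    return table


# Latest year with data for the target indicator.
latest = max(yr for _, ind, yr, _ in records if ind == "target")

# Rows with a known outcome, usable for training and testing.
train_rows = widen(records, "target", latest, n_lags=2)

# Rows for the year to be predicted: same columns, outcome unknown.
next_year_rows = widen(records, "target", latest + 1, n_lags=2)
```

Because every predictor is a previous-year value, the row for the year to be predicted can be filled in from data that already exists.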
@julianflowers

julianflowers commented Aug 24, 2017

This is not so much forecasting as prediction (subtle, I know): prediction seems to be about fitting values to unseen data, whereas forecasting is about the future. To forecast next year we would also need to be able to estimate all the model inputs...
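
A small sketch of that point, in plain Python with invented numbers: a model fitted on same-year inputs can fit years whose inputs are known, but it has nothing to work with for next year until those inputs are estimated. The `predictor` and `target` series and both helper functions are hypothetical.

```python
# Hypothetical same-year data: a predictor and the target indicator, by year.
predictor = {2014: 1.0, 2015: 2.0, 2016: 3.0}
target = {2014: 10.0, 2015: 12.0, 2016: 14.0}


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


years = sorted(target)
slope, intercept = fit_line([predictor[y] for y in years], [target[y] for y in years])


def predict(year):
    """Predict the target for `year` from the same-year predictor value."""
    x = predictor.get(year)
    if x is None:
        # Forecasting case: the model input for this year does not exist yet.
        raise ValueError(f"no predictor value for {year}; it must be estimated first")
    return intercept + slope * x


fitted_2016 = predict(2016)  # works: the 2016 input is known

try:
    predict(2017)
    needs_input_estimate = False
except ValueError:
    needs_input_estimate = True  # 2017 cannot be forecast without its inputs
```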

@julianflowers

I have been trying a few other models: xgboost, gbm, brnn...

xgboost seems to be very popular, but it is a bit fiddly.
brnn is a Bayesian regularised neural network, which seems quite accurate.
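
One way to compare several candidate models is to score each on the same held-out data and keep the one with the lowest error. The sketch below is plain Python and uses two trivial stand-in models rather than xgboost, gbm or brnn; the data, the `fit_*` functions and `pick_best` are all hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Stand-in "models": each takes training data and returns a predict function.
def fit_mean(X, y):
    """Always predict the mean of the training outcomes."""
    mean_y = sum(y) / len(y)
    return lambda row: mean_y


def fit_carry_forward(X, y):
    """Predict the previous-year value, assumed to be the first feature."""
    return lambda row: row[0]


candidates = {"mean": fit_mean, "carry_forward": fit_carry_forward}


def pick_best(candidates, X_train, y_train, X_test, y_test):
    """Fit every candidate, score it on the test set, return the best name and all scores."""
    scores = {}
    for name, fit in candidates.items():
        model = fit(X_train, y_train)
        scores[name] = rmse(y_test, [model(row) for row in X_test])
    best = min(scores, key=scores.get)
    return best, scores


# Invented data: a single feature holding the previous year's value.
X_train, y_train = [[10.0], [20.0], [30.0]], [11.0, 21.0, 31.0]
X_test, y_test = [[40.0], [50.0]], [41.0, 51.0]

best, scores = pick_best(candidates, X_train, y_train, X_test, y_test)
```

Any real model could be dropped into `candidates` as long as it is wrapped to return a predict function, which keeps the comparison independent of any one library.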

@sebastian-fox
Member Author

This looks really good. I'm starting to think this belongs in a different package. This package has been reviewed by some rOpenSci reviewers, and one of their comments was to reduce dependencies on other packages. That is a good suggestion and helps draw the boundaries around the limits of this package. I think we need to start developing the insights package internally...
