
Refactor prediction #62

Closed
wants to merge 12 commits

Conversation

alexpghayes

This is an aggressive refactor to move all prediction code to safepredict. safepredict doesn't have methods for several of the existing models, so definitely don't merge until I've got that done. Plan is to push commits to this PR as an open project.

@alexpghayes
Author

I'm currently pulling most of the prediction tests into safepredict. That's where correctness checks should happen.

Once those tests are passing, I plan to go back through and add simple tests that check not correctness, but just that parsnip successfully hands prediction off to safepredict and gets a reasonable-looking result back. These sanity check tests still need to be written, and should probably live in modeltests. I'm imagining something along the lines of

model <- fit(model_constructor(...), ...)
output <- predict(model, ...)
modeltests::check_predict_output(model, data, predict_options)
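A minimal, self-contained sketch of what such a check might verify, using lm as a stand-in model (the function name, the `.pred` column, and the checks themselves are all assumptions, not the actual modeltests API):

```r
# Hypothetical sketch of a check_predict_output()-style sanity check:
# the output should be a data frame with a .pred column and exactly
# one row per row of the new data.
check_predict_output <- function(output, new_data) {
  stopifnot(
    is.data.frame(output),
    ".pred" %in% names(output),
    nrow(output) == nrow(new_data)
  )
  invisible(output)
}

fit <- lm(mpg ~ wt, data = mtcars)
new_data <- mtcars[1:4, ]
output <- data.frame(.pred = unname(predict(fit, newdata = new_data)))
check_predict_output(output, new_data)
```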

(also, this PR should be linked with #41)

@alexpghayes
Author

Current status: close to an MVP.

What I've done so far:

  • Stripped out all the prediction code from parsnip
  • Stripped out all the prediction tests from parsnip
  • Moved both to safepredict

There's a fair amount of translation to do on the safepredict side, and it's rather time consuming. What I've got right now is minimal functionality -- most of the safepredict methods are thin wrappers around predict(). They definitely need to be fleshed out and tested on their own. Some (glm, glmnet, ranger) are much more fully featured, but nothing is well tested yet.
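To give a sense of what "thin wrapper" means here, a minimal sketch using lm as a stand-in: the method just delegates to predict() and standardizes the result into a one-column data frame. The function name and `.pred` column are assumptions for illustration, not the actual safepredict API:

```r
# Hypothetical thin-wrapper safe_predict method for lm objects:
# call predict() on the new data, then coerce the result into a
# predictable one-column data frame shape.
safe_predict_lm <- function(object, new_data, ...) {
  preds <- stats::predict(object, newdata = new_data, ...)
  data.frame(.pred = unname(preds))
}

fit <- lm(mpg ~ wt, data = mtcars)
out <- safe_predict_lm(fit, mtcars[1:3, ])
```

A fleshed-out method would additionally validate new_data, handle different prediction types, and return consistent output for edge cases like zero-row input.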

Things I couldn't implement safe_predict methods for because of installation hell with sparklyr and keras:

  • spark boosted trees
  • spark linear model
  • spark logistic reg
  • spark random forest (class and reg)
  • keras mlp regression
  • keras mlp classification

These models will still need safe_predict methods (skeletons should be in place at the moment) and a sanity check that parsnip's predict() gets something back.

I added some tiny tests that check that, at the very least, safe_predict() returns something with the right number of rows. So I think this PR won't cause any failures, but I could be wrong. There may be some temporary feature loss while certain type arguments available in parsnip take a moment to become available in safepredict.

This PR has taught me an important lesson: namely, that it is sometimes better to make small incremental changes than to burn everything down and start from scratch.

@alexpghayes
Author

alexpghayes commented Oct 5, 2018

Oh, and I also need to add a kknn method to safe_predict() since Davis added kknn models.

Update: done.

@alexpghayes
Author

OK, this is so out of date that I'm going to close it; once safe_predict() is ready, I'll re-open.

@github-actions

github-actions bot commented Mar 8, 2021

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 8, 2021