Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement ways to immediately warn when a forecast object becomes invalid #816

Open
nikosbosse opened this issue May 19, 2024 · 6 comments

Comments

@nikosbosse
Copy link
Contributor

nikosbosse commented May 19, 2024

I see it has been discussed a bit in #507 but I am uncomfortable with the fact we re-validate the class in the print() method. Not just because it’s inefficient and non-standard, but mostly because it may be the symptom of a deeper issue. Once it has been created, we should ensure (to the best of our ability) that it remains valid no matter what. And the moment it stopps being valid, we should throw an error immediately, not wait for the print() method to be called.

This is an inherent weakness of S3 classes, which are loosely defined, but we can add safeguards. This could take the form of a custom [ method which ensures crucial columns are never dropped.

Originally posted by @Bisaloo in #791 (review)

@nikosbosse
Copy link
Contributor Author

Some ways in which a forecast object can become invalid:

  • you delete one of the protected columns (model, observed, predicted, quantile_level...)
  • you delete a column that is part of the forecast unit in a way that now a single forecast is not uniquely identified anymore (meaning that get_duplicate_forecasts() would fail
  • you add additional columns (e.g. you end up with both a quantile_level and a sample_id column)
  • you rbind additional rows, leading to the same error
  • you delete rows such that none remain or that only rows remain that have an NA value somewhere (meaning they get filtered out)

I'm not sure we'd be able to catch all of these immediately.
What do you think @Bisaloo @seabbs @sbfnk

Also is this something we should do before the CRAN release, or possibly afterwards?

@seabbs
Copy link
Contributor

seabbs commented May 20, 2024

I am also not sure how we should do this. Suggestions @Bisaloo?

@Bisaloo
Copy link
Member

Bisaloo commented May 21, 2024

As far as I can tell, all the changes mentioned here are done via [, or a function that internally uses it (to be confirmed for rbind()).

So with a custom [.forecast() method, you should be able to immediately error when the object becomes invalid:

  1. Run NextMethod() ([.data.frame() in most/all cases)
  2. Validate the object
  3. Error if invalid OR return is valid

Also is this something we should do before the CRAN release, or possibly afterwards?

I'd say it's strictly speaking a breaking change but it doesn't really change anything in the API so it can possibly wait.

@nikosbosse
Copy link
Contributor Author

@Bisaloo would you possibly be willing to take on this PR? I imagine you'd by far be best placed to tackle it.

@Bisaloo
Copy link
Member

Bisaloo commented Jun 3, 2024

I'm fully booked until mid-July but could have a go after. Would this work with your timeline?

@nikosbosse
Copy link
Contributor Author

Yes, that would be perfect, merci beaucoup!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants