Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changing Default Values of plotting and prediction percentiles #250

Closed
wants to merge 3 commits into from

Conversation

edwinnglabs
Copy link
Collaborator

@edwinnglabs edwinnglabs commented Oct 24, 2020

Description

Having prediction percentiles=None is quite annoying since I find more often the reason to have LGT/DTLFull is to get reliable inference. Each time if I want to create a new DLTFULL(after testing DLTMAP), i need to figure out the right arg.

Fixes # (issue)

Change prediction_percentiles=None = prediction_percentiles=[5,95] and some default plotting value changed to make it less input required if we always use default prediction outcomes.

Please delete options that are not relevant.

  • Change related tutorial/docs update for cosmetic purpose (should not trigger any error)
  • restore prediction outcomes label by using input prediction percentiles directly
  • set prediction percentiles default as [5, 95] internally

How Has This Been Tested?

Since it is plotting, no test is related for this.

@edwinnglabs
Copy link
Collaborator Author

change of notebook will be done after approval. it is more for cosmetic than any syntax/runtime error.

@wangzhishi
Copy link
Contributor

this also requires a docs update? the update is done or not yet?

@@ -792,7 +792,7 @@ class LGTFull(BaseLGT):
"""
_supported_estimator_types = [StanEstimatorMCMC, StanEstimatorVI, PyroEstimatorVI]

def __init__(self, n_bootstrap_draws=None, prediction_percentiles=None, **kwargs):
def __init__(self, n_bootstrap_draws=None, prediction_percentiles=[5, 95], **kwargs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd very strong urge against this. Default args being mutable is generally a bad idea. (I think this is a good description of why: https://docs.python-guide.org/writing/gotchas/)

I'd suggest setting following the design of the otther args and set the default in _set_default_base_args()

tests/orbit/models/test_dlt.py Outdated Show resolved Hide resolved
@steveyang90
Copy link
Collaborator

steveyang90 commented Oct 24, 2020

@edwinnglabs @wangzhishi Also worth noting that changing the default value here changes the output for LGTFull and isn't consistent with the other classes. That's ok, but we should make sure that that's what we really want. I feel the end user setting the interval arg is an ok trade off for consistent api behavior. Also because [5, 95] is sort of arbitrary lower and upper bounds?

@edwinnglabs
Copy link
Collaborator Author

edwinnglabs commented Oct 24, 2020

this also requires a docs update? the update is done or not yet?

As mentioned, they are not material since i am adding an argument on top of a None default. So the old notebooks should work perfectly. it's just cosmetic since we are putting an input that is already default (in this refined version). But I can do that ONLY after we agree this is the change. @wangzhishi

@edwinnglabs
Copy link
Collaborator Author

@edwinnglabs @wangzhishi Also worth noting that changing the default value here changes the output for LGTFull and isn't consistent with the other classes. That's ok, but we should make sure that that's what we really want. I feel the end user setting the interval arg is an ok trade off for consistent api behavior. Also because [5, 95] is sort of arbitrary lower and upper bounds?

are you just against the idea of renaming the output to x_upper and x_lower or just against the idea having a list in default argument? @steveyang90

@steveyang90
Copy link
Collaborator

Hey sorry let me clarify a bit, because I think there are multiple points!

We should definitely not set the default explicitly in the method signature to prediction_percentiles=[5,95]. However, if we feel strongly about [5,95] being the default, we should keep the default as None, but instead set the behavior of prediction_percentiles=None to mean [5,95] by setting the default values within _set_default_base_args(). For example, this is the same way we set a private var called self._regressor_sign when the arg for regression_sign=None.

The second part is the unit test that you updated. Where is "prediction_lower" and "prediction_upper" coming from? Was this a recent change? Last I recall, we were using the actual ints for the intervals since users can put in an arbitrary list of ints, not just upper and lower?

@edwinnglabs
Copy link
Collaborator Author

edwinnglabs commented Oct 25, 2020

Hey sorry let me clarify a bit, because I think there are multiple points!

We should definitely not set the default explicitly in the method signature to prediction_percentiles=[5,95]. However, if we feel strongly about [5,95] being the default, we should keep the default as None, but instead set the behavior of prediction_percentiles=None to mean [5,95] by setting the default values within _set_default_base_args(). For example, this is the same way we set a private var called self._regressor_sign when the arg for regression_sign=None.

The second part is the unit test that you updated. Where is "prediction_lower" and "prediction_upper" coming from? Was this a recent change? Last I recall, we were using the actual ints for the intervals since users can put in an arbitrary list of ints, not just upper and lower?

So for the first point, I would prefer to set the percentiles default as [5, 95] using the _set_default_base_args method you mention. The advantage of using MCMC is to generate inference. I think it makes sense to make it as default. It also helps a cleaner demo where we don't need to put [5, 95] as an argument everywhere

As regard to the second point, we changed to upper and lower some version ago. I find it is annoying to keep tracking what we input originally in the model to pipe the same into plotting and we will only plot one band with two end points anyway. So I feel converting them into ['upper' and 'lower'] may actually make more sense. However, I don't completely oppose to use [5, 95]. I just feel it may take some effort to make sure user has a consistent set of upper and lower bound. For example do we allow user put prediction_perfcentile=[10,20,30]? And if so, it needs some manual input for the plotting. For the upper and lower approach, we can always be lazy in plotting since the labels are always upper and lower.

@steveyang90
Copy link
Collaborator

Cool, I think we're agreed on the first point then!

Regarding the second point, I see what you mean with plotting. So maybe the solution is to be more restrictive with prediction_percentiles to always just have two elements which corresponds to upper and lower bounds, rather than allowing an arbitrary list?

- relabel output to {x}_{p} where `x` is the prediction component and `p` is the input prediction percentile
- set  prediction percentile default to be [5, 95] with an internal function to dodge mutable input
- set prediction percentile deault to [5, 95] in plot functions in the similar way of above
- within plot functions, it searches for {x}_{p} format again for columns to plot
@edwinnglabs edwinnglabs added refactor Issues to remove tech debt or improve design review needed need someone to review WIP Someone is aware of this and working in progress and removed review needed need someone to review labels Oct 25, 2020
@edwinnglabs edwinnglabs deleted the small-default-change branch November 25, 2020 00:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor Issues to remove tech debt or improve design WIP Someone is aware of this and working in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants