Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design: flexible number of returns #2367

Open
josef-pkt opened this issue Apr 21, 2015 · 5 comments
Open

Design: flexible number of returns #2367

josef-pkt opened this issue Apr 21, 2015 · 5 comments

Comments

@josef-pkt
Copy link
Member

What are the functions that have a variable number of returns based on the keyword arguments?

acf return see #2366

design options

  • keep variable number of returns
  • always return Bunch or Results if we need variable number of returns (current trend in pattern for many functions/methods that need more flexibility and can have lazily evaluated extra results)
  • fill returns with None, so the number of returns remains unchanged (proposed in Uniform return value from {,p}acf #2366 )

Any changes in number and types of returns will break backwards compatibility, and we should settle on a pattern so we have to break it only once

@kshedden
Copy link
Contributor

For more complex generic functions like predict I prefer the bunch option, with unfilled slots as appropriate.

I did this for predict in PHReg without thinking of the implications. I wanted to return the predicted log hazard, cumulative hazard, standard errors, etc.

Then I tried to use this with new generic procedures like MICE and Mediation and ran into trouble because predict usually just returns the fitted values. So I had to (temporarily) special case PHReg, or leave it broken.

@josef-pkt
Copy link
Member Author

For the special case of predict:
I'm now reasonably convinced that we should use two methods predict and get_prediction. The latter name is not yet fixed.
predict should just return the array, that can be used in cross-validation or other functions without calculating extras, and get_prediction for all the additional results. get_prediction returns a Results instance, and has confidence intervals, standard errors in my PR and will get all other optional extras in it.
The basic predict also has the same or similar pattern as the one in scikit-learn.
I think predict is important enough that we can survive some code duplication if necessary.

Can you split it up this way in PHReg, so the plain predict is compatible with MICE and Mediation?

@anntzer
Copy link
Contributor

anntzer commented Apr 22, 2015

I'm not such a heavy user of statsmodels so perhaps my opinion isn't worth much but this may be a better idea: assuming that most of the time, there is one (or two, but a fixed number) of "main" results and a number of additional results that should be computed only on request, provide two functions, func(*args, **kwargs) and func.full(*args, request_additional=..., **kwargs). func returns a single value (or a pair, etc. as appropriate), whereas func.full returns a pair or 1. the main result and 2. a namedtuple of additional results with non-computed values set to None (or a Bunch).

This way, the generic user can just call val = func(...) without having to fetch the value out of a tuple or a Bunch, whereas if you want more stuff, you can call val, extras = func.full(...) and still know the type of your return.

A properly designed decorator should be able to make this not so hard on the library side, e.g.

@decorator_left_as_exercise(requests_extras=["foo"], extras_return_names=["bar"])
def func(arg1, foo=None):
    retval = <compute>
    if foo is not None:
        additional_info = <compute>
   else:
        additional_info = None
    return retval, additional_info

The decorator would generate a first wrapper function that would call the wrappee while disallowing requests for extra args, and drop the additional info (which should be set to None). It would also generate a second wrapper function (the .full function), which allows requests for extra args, wraps the additional info in the proper type (namedtuple/bunch).

@josef-pkt
Copy link
Member Author

@anntzer Changing to this doesn't really change the "feature" that we have different returns based on the keyword arguments, does it?

In general I don't like decorators, although we make heavy use of our cache decorators. Also we already have one level of wrapping in the results for pandas objects.

@anntzer
Copy link
Contributor

anntzer commented Apr 22, 2015

No, now the return value would always be the same: a single value for func, a pair value, extra_namedtuple_info (the latter may have some fields set to None, depending on user input) for func.full.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants