RFC Expose a public method to compute the objective function #28169

Open
jeremiedbb opened this issue Jan 18, 2024 · 5 comments
Labels
Hard · Meta-issue · RFC

Comments

@jeremiedbb
Member

jeremiedbb commented Jan 18, 2024

I think it would be valuable for all estimators that optimize an objective function to expose a public method to compute it.
My main motivation is callbacks, for early stopping or monitoring, but I'm sure it would be useful in other contexts.

For this to be really practical, the signature should be the same across all estimators. What I have in mind is something like:

def objective_function(self, X, y=None, *, sample_weight=None, normalize=False):
    y_pred = self.predict(X)
    # or Xt = self.transform(X), because some transformers also optimize an objective function

    data_fit = <computation of the data fit term>
    penalization = <computation of the penalization term>

    if normalize:                     # allow returning a per-sample
        data_fit /= X.shape[0]        # objective function
        penalization /= X.shape[0]    # for convenience
        # X.shape[0] probably needs to be replaced by sample_weight.sum()

    return data_fit + penalization, data_fit, penalization
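To make the shape concrete, here is a hypothetical instantiation for a ridge-style penalized least squares model (coef_, intercept_ and alpha follow the usual scikit-learn naming, but this is only a sketch, not a proposed implementation):

import numpy as np

def objective_function(self, X, y=None, *, sample_weight=None, normalize=False):
    # Hypothetical sketch for a ridge-style model minimizing
    # ||y - Xw||^2 + alpha * ||w||^2; coef_, intercept_ and alpha follow
    # scikit-learn naming conventions, but this is illustrative only.
    y_pred = X @ self.coef_ + self.intercept_
    residuals = y - y_pred
    if sample_weight is None:
        data_fit = np.sum(residuals ** 2)
    else:
        data_fit = np.sum(sample_weight * residuals ** 2)
    penalization = self.alpha * np.dot(self.coef_, self.coef_)
    if normalize:
        n = X.shape[0] if sample_weight is None else sample_weight.sum()
        data_fit /= n
        penalization /= n
    return data_fit + penalization, data_fit, penalization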

If we want to compute the objective function on the training set during fitting, we could allow providing the current variables required to compute it at a given iteration, encapsulated in a single dict argument:

def objective_function(self, X, y=None, *, sample_weight=None, fit_state=None, normalize=False):
    if fit_state is None:
        y_pred = self.predict(X)
    else:
        y_pred = <compute y_pred using the information in fit_state>

    ...

where the content of fit_state would be estimator-specific, and detailed in the docstring of each estimator's objective_function.
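For illustration (the keys here are hypothetical, not an actual API), an NMF-like estimator might expose its current factor matrices in fit_state so the objective can be computed without re-running transform:

# Illustrative only: for NMF, the fit state at a given iteration might
# hold the current factor matrices W and H.
fit_state = {"W": W_current, "H": H_current}
total, data_fit, penalization = nmf.objective_function(X, fit_state=fit_state)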

jeremiedbb added the Hard, RFC and Meta-issue labels on Jan 18, 2024
@ogrisel
Member

ogrisel commented Jan 18, 2024

I think the normalize option should be True by default, to make the train and test set values comparable on the same scale even if the test set is 10x smaller than the training set.

We might not want to expose an estimator-specific fit_state argument as part of the public API of scikit-learn estimators. To achieve the same result, estimators could optionally implement a private _objective_function() with the freedom to accept estimator-specific kwargs with meaningful names.

When this private method is present, the callbacks would attempt to use it first, looking up the kwarg names from its signature in the fit state (or assuming they are present as attributes on self, updated at each iteration of fit). Alternatively, we keep the fit_state argument but only expose it in the private version of the method.

About the return value(s) of this method:

I think it should be more than a scalar or a fixed number of scalars: rather a dataclass (or namedtuple) that:

  • maybe communicates its name ("penalized_squared_error", "explained_variance", "inertia", ...),
  • communicates whether it is something to minimize or maximize,
  • provides access to the aggregate scalar value,
  • but also to the individual terms (the data fit, and each regularizer).

If we return a dataclass, we can have an HTML repr for notebooks.
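A minimal sketch of what such a container could look like (all names here are illustrative, not a proposal):

from dataclasses import dataclass, field

@dataclass
class ObjectiveResult:
    # Illustrative container for the value of an objective function.
    name: str                 # e.g. "penalized_squared_error", "inertia"
    greater_is_better: bool   # whether the objective is maximized
    value: float              # aggregate scalar value
    terms: dict = field(default_factory=dict)  # data fit and each regularizer

    def _repr_html_(self):
        # Hook used by Jupyter notebooks for rich display.
        rows = "".join(
            f"<tr><td>{k}</td><td>{v:g}</td></tr>" for k, v in self.terms.items()
        )
        return f"<b>{self.name}</b> = {self.value:g}<table>{rows}</table>"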

@jeremiedbb
Member Author

jeremiedbb commented Feb 22, 2024

I opened #28505, which implements it for NMF, to help move this discussion forward.

> I think it should be more than a scalar or a fixed number of scalars: rather a dataclass

I agree, and made it a dataclass.

> When this private method is present, the callbacks would attempt to use it first, looking up the kwarg names from its signature in the fit state

I think it's better to have it in the public method: callbacks are public and users can write custom callbacks, so they should not have to rely on a private method.

> We might not want to expose an estimator-specific fit_state argument as part of the public API of scikit-learn estimators. To achieve the same result, estimators could optionally implement a private _objective_function() with the freedom to accept estimator-specific kwargs with meaningful names.

Thinking more about it, I think it's better to have one keyword param per fitted parameter of the model instead of a single dict param. The callback documentation will mention that if a callback needs to compute the objective function on the train set, the fit_state dict that the callback receives should be used: model.objective_function(X, y, **fit_state).
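For example, a custom callback could then do something like this (the hook name on_fit_iter_end and the exact callback signature are hypothetical, since the callback API is still being designed):

class ObjectiveMonitor:
    # Hypothetical custom callback that records the training objective.
    def __init__(self):
        self.history = []

    def on_fit_iter_end(self, estimator, X, y, fit_state):
        # Unpack the fit state as keyword arguments to avoid
        # recomputing predictions from scratch at each iteration.
        result = estimator.objective_function(X, y, **fit_state)
        self.history.append(result)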

@ogrisel
Member

ogrisel commented Feb 22, 2024

> The callback documentation will mention that if a callback needs to compute the objective function on the train set, the fit_state dict that the callback receives should be used: model.objective_function(X, y, **fit_state).

I suspect there might be cases where fit_state contains things that are not useful to pass as kwargs to objective_function. Should we filter those out based on the signature of the objective function? Or make all objective_function signatures include an extra **ignored_kwargs (that sounds ugly)...
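The signature-based filtering could look roughly like this (a sketch, assuming fit_state is a plain dict; _filter_fit_state is a hypothetical helper):

import inspect

def _filter_fit_state(objective_function, fit_state):
    # Keep only the fit_state entries that the objective function accepts.
    params = inspect.signature(objective_function).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return fit_state  # the function accepts **kwargs: pass everything
    return {k: v for k, v in fit_state.items() if k in params}

# usage: model.objective_function(X, y, **_filter_fit_state(model.objective_function, fit_state))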

@jeremiedbb
Member Author

That's a good point. I was planning to add a **kwargs param, but I understand that it's not a very good pattern.

For context, my motivation for being able to provide fit-state parameters is the case where a callback needs to compute the objective function on the training set. It doesn't apply to any other metric, and it doesn't apply to the validation set. And it's just to speed up the computation, avoiding a call to predict or transform.

So I'm wondering if we really need this param (or these params) at all. Maybe we can just accept that computing the objective function on the train set is only as fast as computing it on the validation set, even though we know it could be done faster. What do you think?

@ogrisel
Member

ogrisel commented Feb 23, 2024

> Maybe we can just accept that computing the objective function on the train set is only as fast as computing it on the validation set, even though we know it could be done faster. What do you think?

Indeed, let's start simple and consider adding API complexity only when it is really justified.
