Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor tests for sample weights #11316

Open
jnothman opened this issue Jun 19, 2018 · 3 comments
Open

Refactor tests for sample weights #11316

jnothman opened this issue Jun 19, 2018 · 3 comments
Labels
help wanted Moderate Anything that requires some knowledge of conventions and best practices module:test-suite everything related to our tests

Comments

@jnothman
Copy link
Member

jnothman commented Jun 19, 2018

In various parts of the code, we have tests for sample_weight support, including in metrics, and for individual estimators. we have some common estimator checks for class_weight, but not really for sample_weight functionality (only for weight type invariance).

Recent implementations of sample_weight include #10933 (KMeans) and #10803 (density estimation). But as well as estimators we have things like common tests for evaluation metrics.

Invariance testing for sample weights should include:

  • sample_weight=np.ones(len(X)) makes the same model as sample_weight=None
  • sample_weight=random can make a different model to sample_weight=None
  • sample_weight=s for integer array s makes the same model as X=np.repeat(X, s, axis=0), y=np.repeat(y, s, axis=0) (although there may be exceptions to this depending on how the estimator defines iteration, convergence, etc., as in Test test_weighted_vs_repeated is somehow flaky #11236)
  • sample_weight=s * k for array s and positive constant k makes the same model as sample_weight=s

I wonder if it is possible to establish a generic test for this, e.g. something like:

def check_sample_weight_invariance(data_args, fit, is_equal):
    """
    Parameters
    ----------
    data_args : dict
        Keyword arguments to pass to fit, and which would need to be repeated
        to test equivalence to integer sample weights.
    fit : callable
        Passed data args, returns a model that can be compared with is_equal
    is_equal : callable
        Passed two models returned from fit, returns a bool to indicate equality
        between models
    """
@jnothman jnothman added Moderate Anything that requires some knowledge of conventions and best practices help wanted labels Jun 19, 2018
@sergulaydore
Copy link
Contributor

I want to clarify I understand this correct. Those 4 tests you mentioned are already in check_sample_weight_invariance. But you are suggesting to change the input parameters to the method, right? Basically, instead of having metric, y1 and y2; we will feed fit and is_equal, right?

@jnothman
Copy link
Member Author

What I am suggesting is that we should have the same testing code used for weighted metrics as for weighted model fitting, insofar as this is possible.

@Higgs32584
Copy link
Contributor

Is this resolved?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Moderate Anything that requires some knowledge of conventions and best practices module:test-suite everything related to our tests
Projects
None yet
Development

No branches or pull requests

5 participants