Add estimator check for NaNs and Infs in input X matrix #4

WojciechMigda · 2020-05-27T21:33:49Z

Newer scikit-learn estimator checks assert that the estimator raises ValueError if the input X matrix contains NaNs or Infs.
Once this check is added, the scikit-learn version constraint in requirements.txt should be relaxed.


WojciechMigda · 2020-05-27T21:40:26Z

Error message on Travis:

=================================== FAILURES ===================================
____________________ test_classifier_passes_check_estimator ____________________

    def test_classifier_passes_check_estimator():
        from sklearn.utils.estimator_checks import check_estimator
    
>       check_estimator(XTsetlinMachineClassifier)

tests/test_check_estimator.py:88: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../virtualenv/python3.8.0/lib/python3.8/site-packages/sklearn/utils/estimator_checks.py:502: in check_estimator
    check(estimator)
../../../virtualenv/python3.8.0/lib/python3.8/site-packages/sklearn/utils/_testing.py:317: in wrapper
    return fn(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

name = 'XTsetlinMachineClassifier', estimator_orig = XTsetlinMachineClassifier()

    @ignore_warnings(category=FutureWarning)
    def check_estimators_nan_inf(name, estimator_orig):
        # Checks that Estimator X's do not contain NaN or inf.
        rnd = np.random.RandomState(0)
        X_train_finite = _pairwise_estimator_convert_X(rnd.uniform(size=(10, 3)),
                                                      estimator_orig)
        X_train_nan = rnd.uniform(size=(10, 3))
        X_train_nan[0, 0] = np.nan
        X_train_inf = rnd.uniform(size=(10, 3))
        X_train_inf[0, 0] = np.inf
        y = np.ones(10)
        y[:5] = 0
        y = _enforce_estimator_tags_y(estimator_orig, y)
        error_string_fit = "Estimator doesn't check for NaN and inf in fit."
        error_string_predict = ("Estimator doesn't check for NaN and inf in"
                                " predict.")
        error_string_transform = ("Estimator doesn't check for NaN and inf in"
                                  " transform.")
        for X_train in [X_train_nan, X_train_inf]:
            # catch deprecation warnings
            with ignore_warnings(category=FutureWarning):
                estimator = clone(estimator_orig)
                set_random_state(estimator, 1)
                # try to fit
                try:
                    estimator.fit(X_train, y)
                except ValueError as e:
                    if 'inf' not in repr(e) and 'NaN' not in repr(e):
                        print(error_string_fit, estimator, e)
                        traceback.print_exc(file=sys.stdout)
                        raise e
                except Exception as exc:
                    print(error_string_fit, estimator, exc)
                    traceback.print_exc(file=sys.stdout)
                    raise exc
                else:
>                   raise AssertionError(error_string_fit, estimator)
E                   AssertionError: ("Estimator doesn't check for NaN and inf in fit.", XTsetlinMachineClassifier(random_state=1))

../../../virtualenv/python3.8.0/lib/python3.8/site-packages/sklearn/utils/estimator_checks.py:1494: AssertionError
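
The check in the traceback passes only when `fit` raises a ValueError whose message mentions 'inf' or 'NaN'. Below is a minimal sketch of that contract using NumPy only. The helper `ensure_finite` and the toy class `FiniteInputClassifier` are hypothetical names introduced for illustration and are not code from this repository. scikit-learn's own `check_array` and `check_X_y` utilities reject non-finite input by default and may be the more natural choice in a real estimator.

```python
import numpy as np


def ensure_finite(values, name="X"):
    """Hypothetical helper: raise ValueError if values holds NaN or infinity."""
    arr = np.asarray(values, dtype=np.float64)
    if np.isnan(arr).any():
        # The message contains 'NaN', which check_estimators_nan_inf looks for.
        raise ValueError(f"Input {name} contains NaN.")
    if np.isinf(arr).any():
        # The message contains 'inf', which check_estimators_nan_inf looks for.
        raise ValueError(f"Input {name} contains inf.")
    return arr


class FiniteInputClassifier:
    """Toy stand-in for an estimator (assumption, not XTsetlinMachineClassifier)."""

    def fit(self, X, y):
        # Validate both inputs before any training work starts.
        X = ensure_finite(X, "X")
        y = ensure_finite(y, "y")
        self.classes_ = np.unique(y)
        return self


if __name__ == "__main__":
    rnd = np.random.RandomState(0)
    X_bad = rnd.uniform(size=(10, 3))
    X_bad[0, 0] = np.nan
    y = np.ones(10)
    y[:5] = 0
    try:
        FiniteInputClassifier().fit(X_bad, y)
    except ValueError as err:
        print("rejected:", err)
```

Validating at the top of `fit` (and likewise in `predict`) keeps the failure early and explicit, so non-finite values never reach the training code.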


WojciechMigda added a commit that referenced this issue May 30, 2020

Enforce finite input values in X and y

3c62335

Resolves #4

WojciechMigda mentioned this issue May 30, 2020

Enforce finite input values in X and y #5

Merged

WojciechMigda closed this as completed in #5 May 30, 2020
