MLkNN breaks when sklearn >= 1.0 #250

alexeyev · 2022-10-31T01:38:18Z

Hi, dear colleagues, thank you for your work.

This is a bug report.

With scikit-learn==0.24.0, this code (X_train is a dense 2D numpy array, y_train is a sparse scipy matrix with the same number of rows) works:

from skmultilearn.adapt import MLkNN

classifier = MLkNN(k=2)
classifier.fit(X=X_train, y=y_train)

With scikit-learn==1.0 and scikit-learn==1.1.3, however, I get:

    classifier.fit(X=X_train, y=y_train)
  File "C:\Users\<me>\AppData\Local\Programs\Python\Python38\lib\site-packages\skmultilearn\adapt\mlknn.py", line 218, in fit
    self._cond_prob_true, self._cond_prob_false = self._compute_cond(X, self._label_cache)
  File "C:\Users\<me>\AppData\Local\Programs\Python\Python38\lib\site-packages\skmultilearn\adapt\mlknn.py", line 165, in _compute_cond
    self.knn_ = NearestNeighbors(self.k).fit(X)
TypeError: __init__() takes 1 positional argument but 2 were given

Thanks.


jamesee · 2023-02-20T10:43:00Z

I faced the same problem.

The following is the example given.
my sklearn version :

from skmultilearn.adapt import MLkNN
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import GridSearchCV

x, y = make_multilabel_classification(n_samples=10000, n_features=20,
                                      n_classes=5, random_state=88)

parameters = {'k': range(1,3), 's': [0.5, 0.7, 1.0]}

clf = GridSearchCV(MLkNN(), parameters, scoring='f1_macro')
clf.fit(x, y)

print(clf.best_params_, clf.best_score_)

The stack trace is as follows:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[56], line 12
      9 parameters = {'k': range(1,3), 's': [0.5, 0.7, 1.0]}
     11 clf = GridSearchCV(MLkNN(), parameters, scoring='f1_macro')
---> 12 clf.fit(x, y)
     14 print (clf.best_params_, clf.best_score_)

File ~/miniconda3/envs/torch/lib/python3.9/site-packages/sklearn/model_selection/_search.py:875, in BaseSearchCV.fit(self, X, y, groups, **fit_params)
    869     results = self._format_results(
    870         all_candidate_params, n_splits, all_out, all_more_results
    871     )
    873     return results
--> 875 self._run_search(evaluate_candidates)
    877 # multimetric is determined here because in the case of a callable
    878 # self.scoring the return type is only known after calling
    879 first_test_score = all_out[0]["test_scores"]

File ~/miniconda3/envs/torch/lib/python3.9/site-packages/sklearn/model_selection/_search.py:1389, in GridSearchCV._run_search(self, evaluate_candidates)
   1387 def _run_search(self, evaluate_candidates):
   1388     """Search all candidates in param_grid"""
-> 1389     evaluate_candidates(ParameterGrid(self.param_grid))

File ~/miniconda3/envs/torch/lib/python3.9/site-packages/sklearn/model_selection/_search.py:852, in BaseSearchCV.fit.<locals>.evaluate_candidates(candidate_params, cv, more_results)
    845 elif len(out) != n_candidates * n_splits:
    846     raise ValueError(
    847         "cv.split and cv.get_n_splits returned "
    848         "inconsistent results. Expected {} "
    849         "splits, got {}".format(n_splits, len(out) // n_candidates)
    850     )
--> 852 _warn_or_raise_about_fit_failures(out, self.error_score)
    854 # For callable self.scoring, the return type is only know after
    855 # calling. If the return type is a dictionary, the error scores
    856 # can now be inserted with the correct key. The type checking
    857 # of out will be done in `_insert_error_scores`.
    858 if callable(self.scoring):

File ~/miniconda3/envs/torch/lib/python3.9/site-packages/sklearn/model_selection/_validation.py:367, in _warn_or_raise_about_fit_failures(results, error_score)
    360 if num_failed_fits == num_fits:
    361     all_fits_failed_message = (
    362         f"\nAll the {num_fits} fits failed.\n"
    363         "It is very likely that your model is misconfigured.\n"
    364         "You can try to debug the error by setting error_score='raise'.\n\n"
    365         f"Below are more details about the failures:\n{fit_errors_summary}"
    366     )
--> 367     raise ValueError(all_fits_failed_message)
    369 else:
    370     some_fits_failed_message = (
    371         f"\n{num_failed_fits} fits failed out of a total of {num_fits}.\n"
    372         "The score on these train-test partitions for these parameters"
   (...)
    376         f"Below are more details about the failures:\n{fit_errors_summary}"
    377     )

ValueError: 
All the 30 fits failed.
It is very likely that your model is misconfigured.
You can try to debug the error by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
30 fits failed with the following error:
Traceback (most recent call last):
  File "/Users/james/miniconda3/envs/torch/lib/python3.9/site-packages/sklearn/model_selection/_validation.py", line 686, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/Users/james/miniconda3/envs/torch/lib/python3.9/site-packages/skmultilearn/adapt/mlknn.py", line 218, in fit
    self._cond_prob_true, self._cond_prob_false = self._compute_cond(X, self._label_cache)
  File "/Users/james/miniconda3/envs/torch/lib/python3.9/site-packages/skmultilearn/adapt/mlknn.py", line 165, in _compute_cond
    self.knn_ = NearestNeighbors(self.k).fit(X)
TypeError: __init__() takes 1 positional argument but 2 were given

ChristianSch · 2023-03-14T16:36:19Z

pull requests are welcome 🎉

northern-64bit · 2024-03-17T09:47:46Z

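The issue is the following line, which passes self.k as a positional argument. Since scikit-learn 1.0, NearestNeighbors accepts keyword arguments only: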

self.knn_ = NearestNeighbors(self.k).fit(X)

It has to be changed to:

self.knn_ = NearestNeighbors(n_neighbors=self.k, n_jobs=self.n_jobs).fit(X)

Note that this has been fixed in https://github.com/scikit-multilearn-ng/scikit-multilearn-ng/.
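
For anyone wondering why the positional call fails, here is a minimal sketch of the mechanism. It does not use scikit-learn itself: StandInNeighbors is a hypothetical stand-in class whose constructor is keyword-only, which is how NearestNeighbors behaves from scikit-learn 1.0 on. The helper names build_positional and build_keyword are also made up for illustration.

```python
class StandInNeighbors:
    """Hypothetical stand-in with a keyword-only constructor.

    This is NOT sklearn.neighbors.NearestNeighbors; it only mimics the
    keyword-only signature that scikit-learn >= 1.0 enforces.
    """

    def __init__(self, *, n_neighbors=5, n_jobs=None):
        self.n_neighbors = n_neighbors
        self.n_jobs = n_jobs

    def fit(self, X):
        # Record the number of samples and return self, like an estimator.
        self.n_samples_fit_ = len(X)
        return self


def build_positional(k, X):
    # Mirrors the failing line: k is passed positionally.
    return StandInNeighbors(k).fit(X)


def build_keyword(k, X):
    # Mirrors the fixed line: k is passed by keyword.
    return StandInNeighbors(n_neighbors=k).fit(X)


X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

try:
    build_positional(2, X)
    positional_error = None
except TypeError as exc:
    # Same kind of TypeError as in the tracebacks above.
    positional_error = str(exc)

knn = build_keyword(2, X)

print(positional_error)
print(knn.n_neighbors, knn.n_samples_fit_)
```

The positional call raises the same kind of TypeError as in the reports above, while the keyword call constructs and fits the object normally. The same reasoning should apply to the real NearestNeighbors, but check it against your installed scikit-learn version.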
