
compare_model() does not print anything after finished #872

Closed
yixin0711 opened this issue Nov 21, 2020 · 13 comments
@yixin0711

yixin0711 commented Nov 21, 2020

I'm using compare_models() for a classification problem. It runs fine during the process, but at the end it prints nothing and returns an empty list.

print(best_specific)
>>> []

I used code like this:

clf = setup(data=df_full, target='blabel', remove_perfect_collinearity = True)
best_specific = compare_models(include = ['lr','knn','nb','lda','svm','rf'], fold = 5, n_select = 5)

Is there anything I did wrong?

@Yard1
Member

Yard1 commented Nov 22, 2020

Could you send us your log.logs file? It will be present in the directory from which you ran the notebook or script.

@yixin0711
Author

yixin0711 commented Nov 22, 2020

> Could you send us your log.logs file? It will be present in the directory from which you ran the notebook or script.

Thank you for directing me! I was able to identify the error in the log file. Now it works as expected!


2020-11-20 15:02:39,473:INFO:Cross validating with StratifiedKFold(n_splits=10, random_state=5888, shuffle=False), n_jobs=-1
2020-11-20 15:04:31,889:WARNING:create_model() for lr raised an exception or returned all 0.0, trying without fit_kwargs:
2020-11-20 15:04:31,911:WARNING:joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 418, in _process_worker
    r = call_item()
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 272, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 567, in __call__
    return self.func(*args, **kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/model_selection/_validation.py", line 560, in _fit_and_score
    test_scores = _score(estimator, X_test, y_test, scorer)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/model_selection/_validation.py", line 607, in _score
    scores = scorer(estimator, X_test, y_test)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/metrics/_scorer.py", line 88, in __call__
    *args, **kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/metrics/_scorer.py", line 213, in _score
    **self._kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/pycaret/internal/metrics.py", line 10, in wrapper
    return score_func(y_true, y_pred, **kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 1741, in recall_score
    zero_division=zero_division)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 1434, in precision_recall_fscore_support
    pos_label)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 1257, in _check_set_wise_labels
    "%r" % (pos_label, present_labels))
ValueError: pos_label=1 is not a valid label: array(['0.0', '1.0'], dtype='<U3')
"""

It seems the label data type was not recognized: the target was read in as strings ('0.0', '1.0') rather than integers.
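The traceback above can be reproduced with plain scikit-learn: when the labels are strings such as '0.0' and '1.0', the default pos_label=1 is not among them, and recall_score raises a ValueError. A minimal sketch with toy data (not the reporter's dataset):

```python
import numpy as np
from sklearn.metrics import recall_score

# String-typed labels: the default pos_label=1 (an int) is not
# a member of ['0.0', '1.0'], so scoring fails.
y_true = np.array(['0.0', '1.0', '1.0', '0.0'])
y_pred = np.array(['0.0', '1.0', '0.0', '0.0'])

try:
    recall_score(y_true, y_pred)  # default pos_label=1
    raised = False
except ValueError:
    raised = True  # pos_label=1 is not a valid label

# Casting the labels to integers fixes the mismatch
score = recall_score(y_true.astype(float).astype(int),
                     y_pred.astype(float).astype(int))
```

Here `raised` ends up True, and after the cast the recall is computed normally (0.5 on this toy data: one of the two true positives is recovered).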

@pycaret
Collaborator

pycaret commented Nov 23, 2020

@yixin0711 Can you try passing n_jobs=1 during the setup call? Your code would then become:

clf = setup(data=df_full, target='blabel', remove_perfect_collinearity = True, n_jobs=1)
best_specific = compare_models(include = ['lr','knn','nb','lda','svm','rf'], fold = 5, n_select = 5)

@Yard1
Member

Yard1 commented Nov 23, 2020

Also, would it be possible for you to share the dataset and the part of the script where you load the data? There seems to be some sort of a type issue.

@pycaret
Collaborator

pycaret commented Dec 2, 2020

@yixin0711 Are you still facing the issue? If not, can you please close this thread?

@kilotwo

kilotwo commented Dec 4, 2020

I have the same problem. It runs fine during the process, but at the end it prints nothing but an empty list. Could you tell me how to solve this?

@Yard1
Member

Yard1 commented Dec 4, 2020

Please run it with errors="raise" to see what could have caused it.

@kilotwo

kilotwo commented Dec 4, 2020

Thanks for your help, the problem has been solved!
It seems the data type was not identifiable: the problem can be solved by changing the type of 'label' to integer.
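For reference, the cast kilotwo describes can be done in pandas. The frame and the 'label' column name below are hypothetical; the two-step cast goes through float first because int('1.0') would fail on string labels like '1.0':

```python
import pandas as pd

# Hypothetical frame whose target was read in as strings like '1.0'
df = pd.DataFrame({'feat':  [0.2, 0.5, 0.1, 0.9],
                   'label': ['0.0', '1.0', '1.0', '0.0']})

# Cast via float first, then to int, so '1.0' -> 1.0 -> 1
df['label'] = df['label'].astype(float).astype(int)
```

After this cast, the target dtype is integer and the pos_label=1 scoring error above no longer occurs.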

@pycaret pycaret closed this as completed Dec 5, 2020
@Daniel-SanchezG

I have the same issue and can't figure out where the problem is; it's a ValueError. I think it is caused by the severe imbalance of the dataset, but I thought the setup() function handled this:

ValueError: Expected n_neighbors <= n_samples, but n_samples = 4, n_neighbors = 6

@ngupta23
Collaborator

@DASA39 It is hard to say without looking at the code you are using. It seems that you have very few data points somewhere in the flow (only 4) but are doing KNN with n_neighbors = 6, which is not possible. Try reducing the number of folds to see if you can increase the number of points in each fold.
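The fold arithmetic here can be checked directly with scikit-learn: a stratified split needs every class to appear at least n_splits times, so the rarest class count caps the usable number of folds. The toy target below is an assumption, not the reporter's data:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy target mirroring the issue: the rarest class has only 4 members
y = np.array([0] * 40 + [1] * 4)
X = np.zeros((len(y), 1))

# The rarest class count is the largest n_splits that can still put
# at least one member of every class into each fold.
max_folds = np.bincount(y).min()          # 4 here
skf = StratifiedKFold(n_splits=max_folds)
n_folds = sum(1 for _ in skf.split(X, y))
```

With fold = 10 on such data, the rare class cannot populate every fold; dropping to 4 folds (or fewer) makes the split feasible, which is the reduction being suggested above.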

@Daniel-SanchezG

Thanks, @ngupta23. Yes, the problem is caused by the severe imbalance of my dataset; I solved it by oversampling before running the setup() function. But I still can't figure out why, if setup() already has the fix_imbalance option, the imbalance wasn't handled when I ran it.

@ngupta23
Collaborator

Imbalance is one thing, but having only 4 data points in a fold is another. How many observations do you have in your dataset and how many folds are you using?

@Daniel-SanchezG

Daniel-SanchezG commented Oct 15, 2021

The original shape of my dataset is (1087, 22). The value_counts of the target is:

Aluminosilicates of K                          323
Phosphates of Al alone                         313
Aluminosilicates of Fe and Mg                  108
Silicates of Mg not containing Al              104
Carbonates of Ca                                45
Aluminosilicates of Mg                          35
Halides of the alkaline earths and Mg           28
Phosphates of Zn                                27
Phosphates of Cu                                27
Phosphates of Fe alone                          26
Silicates of Aluminum                           14
Oxides of Si                                    14
Silicates of Fe, Mg, Ca not containing Al       13
Phosphates of Al and other metals               10

After oversampling the shape is (4522, 22) and setup() works properly. I always use the default number of folds (10).
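A naive random-oversampling step of the kind described above can be sketched with plain pandas. The oversample helper and the toy frame are hypothetical illustrations, not PyCaret's fix_imbalance implementation (which by default uses SMOTE and therefore needs enough rare-class samples to find neighbors):

```python
import pandas as pd

def oversample(df, target, seed=0):
    """Naively upsample every class to the majority class count
    by sampling with replacement within each class."""
    n_max = df[target].value_counts().max()
    parts = [grp.sample(n_max, replace=True, random_state=seed)
             for _, grp in df.groupby(target)]
    return pd.concat(parts, ignore_index=True)

# Toy imbalanced frame: 7 majority rows, 3 minority rows
toy = pd.DataFrame({"feat":  range(10),
                    "label": [0] * 7 + [1] * 3})
balanced = oversample(toy, "label")
```

Unlike SMOTE, sampling with replacement works even when a class has only a handful of rows, which is why it sidesteps the "Expected n_neighbors <= n_samples" error reported above.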

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 4, 2022