
compare_model() does not print anything after finished #872

Closed
yixin0711 opened this issue Nov 21, 2020 · 13 comments
@yixin0711

yixin0711 commented Nov 21, 2020

I'm using compare_models() for a classification problem. It runs fine during the process, but at the end it prints nothing and returns an empty list.

print(best_specific)
>>> []

I used code like this:

clf = setup(data=df_full, target='blabel', remove_perfect_collinearity = True)
best_specific = compare_models(include = ['lr','knn','nb','lda','svm','rf'], fold = 5, n_select = 5)

Is there anything I did wrong?

@Yard1
Member

Yard1 commented Nov 22, 2020

Could you send us your log.logs file? It will be present in the directory from which you ran the notebook or script.

@yixin0711
Author

yixin0711 commented Nov 22, 2020

> Could you send us your log.logs file? It will be present in the directory from which you ran the notebook or script.

Thank you for directing me! I was able to identify the error in the log file. Now it works as expected!


2020-11-20 15:02:39,473:INFO:Cross validating with StratifiedKFold(n_splits=10, random_state=5888, shuffle=False), n_jobs=-1
2020-11-20 15:04:31,889:WARNING:create_model() for lr raised an exception or returned all 0.0, trying without fit_kwargs:
2020-11-20 15:04:31,911:WARNING:joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 418, in _process_worker
    r = call_item()
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 272, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 567, in __call__
    return self.func(*args, **kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/model_selection/_validation.py", line 560, in _fit_and_score
    test_scores = _score(estimator, X_test, y_test, scorer)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/model_selection/_validation.py", line 607, in _score
    scores = scorer(estimator, X_test, y_test)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/metrics/_scorer.py", line 88, in __call__
    *args, **kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/metrics/_scorer.py", line 213, in _score
    **self._kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/pycaret/internal/metrics.py", line 10, in wrapper
    return score_func(y_true, y_pred, **kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 1741, in recall_score
    zero_division=zero_division)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py", line 72, in inner_f
    return f(**kwargs)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 1434, in precision_recall_fscore_support
    pos_label)
  File "/Users/Sharonvy/opt/anaconda3/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 1257, in _check_set_wise_labels
    "%r" % (pos_label, present_labels))
ValueError: pos_label=1 is not a valid label: array(['0.0', '1.0'], dtype='<U3')
"""

It seems the label data type was not recognized: the target was read in as strings ('0.0', '1.0') rather than integers.
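The traceback above can be reproduced with plain scikit-learn: when the labels are strings such as '0.0' and '1.0', the default pos_label=1 is not among them, and recall_score raises a ValueError. A minimal sketch with toy data (not the reporter's dataset):

```python
import numpy as np
from sklearn.metrics import recall_score

# String-typed labels: the default pos_label=1 (an int) is not
# a member of ['0.0', '1.0'], so scoring fails.
y_true = np.array(['0.0', '1.0', '1.0', '0.0'])
y_pred = np.array(['0.0', '1.0', '0.0', '0.0'])

try:
    recall_score(y_true, y_pred)  # default pos_label=1
    raised = False
except ValueError:
    raised = True  # pos_label=1 is not a valid label

# Casting the labels to integers fixes the mismatch
score = recall_score(y_true.astype(float).astype(int),
                     y_pred.astype(float).astype(int))
```

Here `raised` ends up True, and after the cast the recall is computed normally (0.5 on this toy data: one of the two true positives is recovered).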

@pycaret
Collaborator

pycaret commented Nov 23, 2020

@yixin0711 Can you try passing n_jobs=1 during the setup call? Your code would then become:

clf = setup(data=df_full, target='blabel', remove_perfect_collinearity = True, n_jobs=1)
best_specific = compare_models(include = ['lr','knn','nb','lda','svm','rf'], fold = 5, n_select = 5)

@Yard1
Member

Yard1 commented Nov 23, 2020

Also, would it be possible for you to share the dataset and the part of the script where you load the data? There seems to be some sort of a type issue.

@pycaret
Collaborator

pycaret commented Dec 2, 2020

@yixin0711 Are you still facing the issue? If not, can you please close this thread?

@kilotwo

kilotwo commented Dec 4, 2020

I have the same problem. It runs fine during the process, but at the end it prints nothing but an empty list. Could you tell me how to solve this?

@Yard1
Member

Yard1 commented Dec 4, 2020

Please run it with errors="raise" to see what could have caused it.

@kilotwo

kilotwo commented Dec 4, 2020

Thanks for your help, the problem has been solved!
It seems the data type was not identifiable: the problem can be solved by changing the type of 'label' to integer.
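For reference, the cast kilotwo describes can be done in pandas. The frame and the 'label' column name below are hypothetical; the two-step cast goes through float first because int('1.0') would fail on string labels like '1.0':

```python
import pandas as pd

# Hypothetical frame whose target was read in as strings like '1.0'
df = pd.DataFrame({'feat':  [0.2, 0.5, 0.1, 0.9],
                   'label': ['0.0', '1.0', '1.0', '0.0']})

# Cast via float first, then to int, so '1.0' -> 1.0 -> 1
df['label'] = df['label'].astype(float).astype(int)
```

After this cast, the target dtype is integer and the pos_label=1 scoring error above no longer occurs.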

@pycaret pycaret closed this as completed Dec 5, 2020
@Daniel-SanchezG

I have the same issue and can't figure out where the problem is; it's a ValueError. I think it is caused by the severe imbalance of the dataset, but I thought the setup() function handled this:

ValueError: Expected n_neighbors <= n_samples, but n_samples = 4, n_neighbors = 6

@ngupta23
Collaborator

@DASA39 It is hard to say without looking at the code you are using. It seems that you have very few data points somewhere in the flow (only 4) but are doing KNN with n_neighbors = 6, which is not possible. Try reducing the number of folds to see if you can increase the number of points in each fold.
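The fold arithmetic here can be checked directly with scikit-learn: a stratified split needs every class to appear at least n_splits times, so the rarest class count caps the usable number of folds. The toy target below is an assumption, not the reporter's data:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy target mirroring the issue: the rarest class has only 4 members
y = np.array([0] * 40 + [1] * 4)
X = np.zeros((len(y), 1))

# The rarest class count is the largest n_splits that can still put
# at least one member of every class into each fold.
max_folds = np.bincount(y).min()          # 4 here
skf = StratifiedKFold(n_splits=max_folds)
n_folds = sum(1 for _ in skf.split(X, y))
```

With fold = 10 on such data, the rare class cannot populate every fold; dropping to 4 folds (or fewer) makes the split feasible, which is the reduction being suggested above.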

@Daniel-SanchezG

Thanks, @ngupta23. Yes, the problem is caused by the severe imbalance of my dataset; I solved it by oversampling before running the setup() function. But I still can't figure out why, if setup() already has the fix_imbalance option, the imbalance wasn't handled when I ran it.

@ngupta23
Collaborator

Imbalance is one thing, but having only 4 data points in a fold is another. How many observations do you have in your dataset and how many folds are you using?

@Daniel-SanchezG

Daniel-SanchezG commented Oct 15, 2021

The original shape of my dataset is (1087, 22). The value_counts of the target is:

Aluminosilicates of K                          323
Phosphates of Al alone                         313
Aluminosilicates of Fe and Mg                  108
Silicates of Mg not containing Al              104
Carbonates of Ca                                45
Aluminosilicates of Mg                          35
Halides of the alkaline earths and Mg           28
Phosphates of Zn                                27
Phosphates of Cu                                27
Phosphates of Fe alone                          26
Silicates of Aluminum                           14
Oxides of Si                                    14
Silicates of Fe, Mg, Ca not containing Al       13
Phosphates of Al and other metals               10

After oversampling the shape is (4522, 22) and setup() works properly. I always use the default number of folds (10).
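A naive random-oversampling step of the kind described above can be sketched with plain pandas. The oversample helper and the toy frame are hypothetical illustrations, not PyCaret's fix_imbalance implementation (which by default uses SMOTE and therefore needs enough rare-class samples to find neighbors):

```python
import pandas as pd

def oversample(df, target, seed=0):
    """Naively upsample every class to the majority class count
    by sampling with replacement within each class."""
    n_max = df[target].value_counts().max()
    parts = [grp.sample(n_max, replace=True, random_state=seed)
             for _, grp in df.groupby(target)]
    return pd.concat(parts, ignore_index=True)

# Toy imbalanced frame: 7 majority rows, 3 minority rows
toy = pd.DataFrame({"feat":  range(10),
                    "label": [0] * 7 + [1] * 3})
balanced = oversample(toy, "label")
```

Unlike SMOTE, sampling with replacement works even when a class has only a handful of rows, which is why it sidesteps the "Expected n_neighbors <= n_samples" error reported above.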

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 4, 2022