No Shap outputs #588

Open
dbrami opened this issue Dec 6, 2022 · 9 comments
Assignees
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@dbrami

dbrami commented Dec 6, 2022

Hi,
I'm not seeing any shap outputs when using the following:

from supervised.automl import AutoML

# Initialize AutoML in Explain mode
automl = AutoML(mode="Explain",
                explain_level=2,
                ml_task="multiclass_classification")
automl.fit(X, y)

This is in spite of shap being properly installed.
What I get out of the above code is the following:

AutoML directory: AutoML_7
The task is multiclass_classification with evaluation metric logloss
AutoML will use algorithms: ['Baseline', 'Linear', 'Decision Tree', 'Random Forest', 'Neural Network']
AutoML will ensemble available models
AutoML steps: ['simple_algorithms', 'default_algorithms', 'ensemble']
* Step simple_algorithms will try to check up to 3 models
1_Baseline logloss 3.229533 trained in 25.56 seconds
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
2_DecisionTree logloss 2.15877 trained in 59.34 seconds
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
3_Linear logloss 1.707406 trained in 47.68 seconds
* Step default_algorithms will try to check up to 2 models
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
4_Default_NeuralNetwork logloss 4.045366 trained in 7.02 seconds
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either `df[df.columns[i]] = newvals` or, if columns are non-unique, `df.isetitem(i, newvals)`
5_Default_RandomForest logloss 1.858415 trained in 75.39 seconds
* Step ensemble will try to check up to 1 model
Ensemble logloss 1.288517 trained in 0.56 seconds
AutoML fit time: 226.47 seconds
AutoML best model: Ensemble
AutoML(explain_level=2, ml_task='multiclass_classification')
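
One way to confirm that no SHAP artifacts were written is to search the results directory named in the log (AutoML_7 here). The sketch below is a minimal check that assumes SHAP outputs, if any, are saved as files whose names contain "shap". That naming rule and the helper `find_shap_outputs` are assumptions for illustration, not something confirmed in this thread.

```python
from pathlib import Path


def find_shap_outputs(results_dir):
    """Return files under results_dir whose names contain 'shap' (case-insensitive).

    Assumption: SHAP-related artifacts can be recognised by file name alone.
    """
    root = Path(results_dir)
    if not root.is_dir():
        # Nothing to search if the directory does not exist.
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and "shap" in path.name.lower()
    )


# "AutoML_7" is the directory reported in the log above.
matches = find_shap_outputs("AutoML_7")
if matches:
    for path in matches:
        print(path)
else:
    print("No files with 'shap' in their name were found.")
```

An empty result only shows that no matching files exist; it does not explain why they were skipped.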
@pplonski
Contributor

pplonski commented Dec 6, 2022

Thanks @dbrami for reporting. Is it possible to include data to reproduce the issue?

pplonski self-assigned this Dec 6, 2022
pplonski added the bug and help wanted labels Dec 6, 2022
@dbrami
Author

dbrami commented Dec 6, 2022

Sure.
Uploading my ipynb and data
Archive.zip

@dbrami
Author

dbrami commented Dec 12, 2022

Hi Pavel,
Any luck?

@jasperan

It happened to me too, @dbrami, but with tree visualizations and the same explain_level value. Let me know if you find something.

@williamty

It happened to me too, also with tree visualizations. I think it may be related to the task: perhaps tree visualizations are not suitable for binary classification.

@pplonski
Contributor

Hi @williamty, please make sure that you have the latest version of the package (pip install -U mljar-supervised); decision trees should then be produced. Regarding the missing SHAP plots, it might be a bug.

@csetzkorn

I have the same issue with the latest version: no SHAP values are produced. Is there a previous stable version w.r.t. this feature?

@pplonski
Contributor

Maybe there were some changes in the shap API?
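
When a library API change is suspected, it helps to record exactly which versions are installed so that reports can be compared. The sketch below uses only the standard library. The helper name `installed_version` is hypothetical, and the distribution names are assumed to be `mljar-supervised` and `shap` as published on PyPI.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Print the versions relevant to this issue; None means "not installed".
for name in ("mljar-supervised", "shap"):
    print(name, installed_version(name))
```

Including this output in a bug report makes it easier to tell whether a particular shap release correlates with the missing plots.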

@csetzkorn

I think the issue is that the current implementation does not accept object/category/string types, and everything needs to be numeric for SHAP to work. That rather defeats the objective of AutoML if one wants to use SHAP to guide feature selection ...
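
If non-numeric columns are indeed the cause, one possible workaround is to encode them as integers before calling fit. The sketch below is a minimal illustration with pandas. The helper `encode_non_numeric` and the example frame are hypothetical, and whether this makes the SHAP outputs appear in mljar-supervised is not confirmed in this thread.

```python
import pandas as pd


def encode_non_numeric(df):
    """Return a copy of df with every non-numeric column replaced by integer codes."""
    encoded = df.copy()
    for column in encoded.columns:
        if not pd.api.types.is_numeric_dtype(encoded[column]):
            # Categories are sorted, then numbered from 0; missing values become -1.
            encoded[column] = encoded[column].astype("category").cat.codes
    return encoded


# Hypothetical example data with one string column and one numeric column.
X_example = pd.DataFrame(
    {"color": ["red", "blue", "red"], "size": [1.0, 2.5, 3.0]}
)
X_numeric = encode_non_numeric(X_example)
print(X_numeric.dtypes)
```

Integer codes impose an arbitrary order on the categories, so one-hot encoding may be a better fit for linear models; the choice here is only meant to keep the example short.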
