Yellowbrick extends the Scikit-Learn API to make model selection and hyperparameter tuning easier.
import model_evaluation_reports as rpts
http://www.scikit-yb.org/en/latest/tutorial.html
Yellowbrick's visual report returns a matrix of precision, recall, and F1 scores for each model.
It is indeed very neat, but in my opinion not very practical, since the goal of the visualization is to enable picking "the best" model...
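For reference, that same per-class precision/recall/F1 matrix can be computed directly with scikit-learn, independent of the rpts helpers. A minimal sketch on a toy dataset (the model and split here are illustrative, not the ones used in this notebook):

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# One row per class, one column per metric
p, r, f1, _ = precision_recall_fscore_support(y_te, model.predict(X_te))
scores = pd.DataFrame({'precision': p, 'recall': r, 'f1': f1},
                      index=load_iris().target_names)
print(scores)
```

This is the raw matrix that Yellowbrick renders as a heatmap, one per model.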
# Mushroom dataset & models list:
X, y, labels = rpts.get_mushroom_data()
models = rpts.get_models()
rpts.yellowbrick_model_evaluation_report(X, y, models)
rpts.model_evaluation_report_tbl(models, X, y, labels, 'Model selection report') # green: max; pink: min
rpts.model_evaluation_report_bar(models, X, y, labels, xlim_to_1=False, encode=True)
# Note: xlim_to_1=False, encode=True :: default values
rpts.model_evaluation_report_bar(models, X, y, labels, xlim_to_1=True)
# Iris dataset from sklearn & same models:
X, y, labels = rpts.get_iris_data()
models = rpts.get_models()
# The encoding is already done on this dataset, so encode=False.
rpts.model_evaluation_report_bar(models, X, y, labels, encode=False)
I've attempted to reproduce the radar plots in a single row (whenever possible), but that implementation needs more work: the plots end up squished too close together.
I'm glad I went through adapting the DeepMind/bsuite radar charts, but I am not quite satisfied with the outcome, at least with the Iris dataset: they only make it easy to identify the least performant model, here SGDClassifier.
Additionally, until (and if) I find a way to line up the plots more compactly, they also suffer from the same 'scrolling objection' I initially made...with only 3 classes!
dfm_iris = rpts.get_scores_df(models, X, y, labels, encode=False)
for lbl in labels:
    rpts.scores_radar_plot_example(dfm_iris, cat=lbl)
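The core of such a radar chart is just a closed polygon on a polar axis. A minimal matplotlib sketch for a single model and class (metric names and scores are illustrative):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # headless backend; drop this inside a notebook
import matplotlib.pyplot as plt

metrics = ['precision', 'recall', 'f1']
scores = [0.90, 0.85, 0.87]  # illustrative scores for one model/class

# Evenly spaced spokes; repeat the first point to close the polygon
angles = np.linspace(0, 2 * np.pi, len(metrics), endpoint=False).tolist()
angles += angles[:1]
values = scores + scores[:1]

fig, ax = plt.subplots(subplot_kw={'projection': 'polar'})
ax.plot(angles, values)
ax.fill(angles, values, alpha=0.25)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(metrics)
ax.set_ylim(0, 1)
```

Laying several of these out in one row comes down to `plt.subplots(1, n, subplot_kw={'projection': 'polar'})` plus careful figure sizing, which is exactly the squishing problem mentioned above.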
This function has a parameter, class_col, that acts as a switch to output either the scores or the classes as columns. Only the reporting function that uses the Pandas Styler, model_evaluation_report_tbl(), requires class_col=True; the others do not (default = False).
dfm_iris_tbl = rpts.get_scores_df(models, X, y, labels, encode=False, class_col=True)
dfm_iris_tbl.head()
dfm_iris = rpts.get_scores_df(models, X, y, labels, encode=False)
dfm_iris.head()
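To illustrate the difference between the two layouts, here is a sketch of pivoting from a metrics-as-columns frame to a classes-as-columns frame with plain pandas (the column names and values are made up for illustration; the actual shapes produced by get_scores_df may differ):

```python
import pandas as pd

# Long form: one row per (model, class), metrics as columns
dfm = pd.DataFrame({
    'model': ['LogReg', 'LogReg', 'KNN', 'KNN'],
    'class': ['setosa', 'versicolor', 'setosa', 'versicolor'],
    'f1':    [0.97, 0.90, 0.95, 0.88],
})

# class_col=True-style layout: classes become the columns
by_class = dfm.pivot(index='model', columns='class', values='f1')
print(by_class)
```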
Either df can be passed to the styler function, so this 'DataFrame approach' may be the most straightforward for cases with a large number of classes:
rpts.model_evaluation_report_from_df(dfm_iris_tbl, 'Model selection report (from df)')
rpts.model_evaluation_report_from_df(dfm_iris, 'Model selection report (from df)')