Yellowbrick: Machine Learning Visualization
Yellowbrick is a suite of visual diagnostic tools called "Visualizers" that extend the Scikit-Learn API to allow human steering of the model selection process. In a nutshell, Yellowbrick combines scikit-learn with matplotlib in the best tradition of the scikit-learn documentation, but to produce visualizations for your models! For more on Yellowbrick, please see the :doc:`about`.
If you're new to Yellowbrick, check out the :doc:`quickstart` or skip ahead to the :doc:`tutorial`. Yellowbrick is a rich library with many Visualizers being added on a regular basis. For details on specific Visualizers and extended usage, head over to the :doc:`api/index`. Interested in contributing to Yellowbrick? Check out the :ref:`contributing guide <contributing>`. If you've signed up to do user testing, head over to the :doc:`evaluation` (and thank you!).
Visualizers are estimators (objects that learn from data) whose primary objective is to create visualizations that allow insight into the model selection process. In Scikit-Learn terms, they can act like transformers when visualizing the data space, or they can wrap a model estimator much as the "ModelCV" (e.g. RidgeCV, LassoCV) methods do. The primary goal of Yellowbrick is to create a sensible API similar to Scikit-Learn. Some of our most popular visualizers include:
- :doc:`api/features/rankd`: pairwise ranking of features to detect relationships
- :doc:`api/features/pcoords`: horizontal visualization of instances
- :doc:`Radial Visualization <api/features/radviz>`: separation of instances around a circular plot
- :doc:`api/features/pca`: projection of instances based on principal components
- :doc:`api/features/manifold`: high dimensional visualization with manifold learning
- :doc:`api/features/importances`: rank features by importance or linear coefficients for a specific model
- :doc:`api/features/rfecv`: find the best subset of features based on importance
- :doc:`Joint Plots <api/features/jointplot>`: direct data visualization with feature selection
- :doc:`api/classifier/class_balance`: see how the distribution of classes affects the model
- :doc:`api/classifier/class_prediction_error`: show error and support in classification
- :doc:`api/classifier/classification_report`: visual representation of precision, recall, and F1
- :doc:`ROC/AUC Curves <api/classifier/rocauc>`: receiver operating characteristic and area under the curve
- :doc:`Confusion Matrices <api/classifier/confusion_matrix>`: visual description of class decision making
- :doc:`Discrimination Threshold <api/classifier/threshold>`: find a threshold that best separates binary classes
- :doc:`api/regressor/peplot`: find model breakdowns along the domain of the target
- :doc:`api/regressor/residuals`: show the difference in residuals of training and test data
- :doc:`api/regressor/alphas`: show how the choice of alpha influences regularization
- :doc:`K-Elbow Plot <api/cluster/elbow>`: select k using the elbow method and various metrics
- :doc:`Silhouette Plot <api/cluster/silhouette>`: select k by visualizing silhouette coefficient values
Model Selection Visualization
- :doc:`api/model_selection/validation_curve`: tune a model with respect to a single hyperparameter
- :doc:`api/model_selection/learning_curve`: show if a model might benefit from more data or less complexity
- :doc:`Term Frequency <api/text/freqdist>`: visualize the frequency distribution of terms in the corpus
- :doc:`api/text/tsne`: use stochastic neighbor embedding to project documents
- :doc:`api/text/dispersion`: visualize how key terms are dispersed throughout a corpus
... and more! Visualizers are being added all the time; be sure to check the examples (or even the develop branch) and feel free to contribute your ideas for new Visualizers!
Yellowbrick is a welcoming, inclusive project in the tradition of matplotlib and scikit-learn. Similar to those projects, we follow the Python Software Foundation Code of Conduct. Please don't hesitate to reach out to us for help or if you have any contributions or bugs to report!
The primary way to ask for help with Yellowbrick is to post on our Google Groups Listserv. This is an email list/forum where members of the community can join and respond to one another, and it is usually where you will get the quickest response. Please also consider joining the group so you can respond to questions! You can also ask questions on Stack Overflow and tag them with "yellowbrick", open issues on GitHub, or tweet or direct message us on Twitter @scikit_yb.
Table of Contents
The following is a complete listing of the Yellowbrick documentation for this version of the library:
.. toctree::
   :maxdepth: 2

   quickstart
   tutorial
   api/index
   evaluation
   contributing
   matplotlib
   gallery
   about
   code_of_conduct
   changelog