Commit

Merge branch 'develop' into explained-variance
bbengfort committed Apr 16, 2020
2 parents 6ec08c8 + 4737f0f commit 188aff1
Showing 90 changed files with 141 additions and 30 deletions.
4 changes: 2 additions & 2 deletions docs/api/classifier/confusion_matrix.rst
@@ -68,8 +68,9 @@ Class names can be added to a ``ConfusionMatrix`` plot using the ``label_encoder
:alt: ConfusionMatrix plot with class names

from sklearn.datasets import load_iris
-from sklearn.model_selection import train_test_split as tts
from sklearn.linear_model import LogisticRegression
+from sklearn.model_selection import train_test_split as tts

from yellowbrick.classifier import ConfusionMatrix

iris = load_iris()
@@ -88,7 +89,6 @@ Class names can be added to a ``ConfusionMatrix`` plot using the ``label_encoder

iris_cm.fit(X_train, y_train)
iris_cm.score(X_test, y_test)

iris_cm.show()

Quick Method
13 changes: 13 additions & 0 deletions docs/api/regressor/residuals.rst
@@ -52,6 +52,19 @@ Note that if the histogram is not desired, it can be turned off with the ``hist=

.. warning:: The histogram on the residuals plot requires matplotlib 2.0.2 or greater. If you are using an earlier version of matplotlib, simply set the ``hist=False`` flag so that the histogram is not drawn.

The histogram can be replaced with a Q-Q plot, which is a common way to check that residuals are normally distributed. If the residuals are normally distributed, then their quantiles, when plotted against the quantiles of a normal distribution, should form a straight line. The example below shows how a Q-Q plot can be drawn with the ``qqplot=True`` flag. Note that ``hist`` has to be set to ``False`` in this case.

.. plot::
:context: close-figs
:alt: Residuals Plot on the Concrete dataset with a Q-Q plot

visualizer = ResidualsPlot(model, hist=False, qqplot=True)
visualizer.fit(X_train, y_train)
visualizer.score(X_test, y_test)
visualizer.show()
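
As an illustration of the quantile comparison that the added documentation describes, here is a minimal sketch using only the Python standard library. It is not Yellowbrick's implementation: `qq_points` and `pearson` are hypothetical helper names, the residuals are simulated, and the `(i - 0.5) / n` plotting positions are one common convention assumed here.

```python
import math
import random
from statistics import NormalDist, mean


def qq_points(residuals):
    """Pair each theoretical standard-normal quantile with a sorted residual."""
    ordered = sorted(residuals)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1),
    # which inv_cdf requires.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, ordered


def pearson(xs, ys):
    """Pearson correlation; values near 1 mean the points lie close to a line."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)
# Simulated residuals drawn from a normal distribution (made-up data).
residuals = [rng.gauss(0.0, 2.0) for _ in range(200)]
theoretical, ordered = qq_points(residuals)
r = pearson(theoretical, ordered)
print(f"Q-Q correlation: {r:.3f}")
```

Plotting `ordered` against `theoretical` gives the Q-Q scatter. The correlation is only a rough numeric stand-in for judging by eye whether the points fall on a straight line.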



Quick Method
------------

1 change: 0 additions & 1 deletion docs/api/text/postag.rst
@@ -188,7 +188,6 @@ The same functionality above can be achieved with the associated quick method ``

# Create the visualizer, fit, score, and show it
postag(machado)
plt.tight_layout()


Part of Speech Tags
3 changes: 1 addition & 2 deletions tests/base.py
@@ -287,9 +287,8 @@ def _image_path(self, root):
imgdir = os.path.join(root, self.test_module_path)

# Create directory if it doesn't exist
-# TODO: remove dependency on mpl.cbook
if not os.path.exists(imgdir):
-mpl.cbook.mkdirs(imgdir)
+os.makedirs(imgdir, mode=0o777, exist_ok=True)

# Create the image path from the test name
return os.path.join(imgdir, self.test_func_name + self.ext)
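
The replacement call in this hunk uses the standard library's `os.makedirs` with `exist_ok=True`. The sketch below shows that behaviour in isolation. The temporary root and the `baseline_images/test_base` path are illustrative assumptions, not the paths the test suite computes.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical nested image directory under a throwaway root.
    imgdir = os.path.join(root, "baseline_images", "test_base")

    # Creates every missing intermediate directory in one call.
    os.makedirs(imgdir, mode=0o777, exist_ok=True)

    # With exist_ok=True a second call on an existing directory does not
    # raise FileExistsError, so an explicit existence check is optional.
    os.makedirs(imgdir, mode=0o777, exist_ok=True)

    created = os.path.isdir(imgdir)

print(created)
```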
Binary file modified tests/baseline_images/test_base/test_draw_visualizer_grid.png
Binary file modified tests/baseline_images/test_base/test_draw_with_cols.png
Binary file modified tests/baseline_images/test_base/test_draw_with_rows.png
Binary file modified tests/baseline_images/test_cluster/test_icdm/test_kmeans_mds.png
Binary file modified tests/baseline_images/test_cluster/test_icdm/test_quick_method.png
Binary file modified tests/baseline_images/test_draw/test_manual_legend.png
Binary file modified tests/baseline_images/test_features/test_pca/test_biplot_2d.png
Binary file modified tests/baseline_images/test_features/test_pca/test_colorbar.png
Binary file modified tests/baseline_images/test_features/test_pca/test_continuous.png
Binary file modified tests/baseline_images/test_features/test_pca/test_discrete.png
Binary file modified tests/baseline_images/test_features/test_pca/test_heatmap.png
Binary file modified tests/baseline_images/test_features/test_pca/test_single.png
Binary file modified tests/baseline_images/test_meta/test_random_visualizer.png
Binary file modified tests/baseline_images/test_text/test_tsne/test_no_target_tsne.png
1 change: 0 additions & 1 deletion tests/conftest.py
@@ -46,7 +46,6 @@ def pytest_configure(config):
# TODO: this is currently being reset before each test; needs fixing.
mpl.rcParams["font.family"] = "DejaVu Sans"


##########################################################################
## PyTest Hooks
##########################################################################
2 changes: 1 addition & 1 deletion tests/requirements.txt
@@ -1,5 +1,5 @@
# Library Dependencies
-matplotlib>=2.0.2,!=3.0.0,!=3.1.1
+matplotlib>=3.2.1
scipy>=1.0.0
scikit-learn>=0.20
numpy>=1.13.0
3 changes: 2 additions & 1 deletion tests/test_classifier/test_prcurve.py
@@ -384,7 +384,8 @@ def test_pandas_integration(self):

oz.finalize()

-self.assert_images_similar(oz, tol=5.0)
+# Miniconda & Appveyor: images not close (RMS 5.089)
+self.assert_images_similar(oz, tol=5.5)

def test_no_scoring_function(self):
"""
4 changes: 2 additions & 2 deletions tests/test_contrib/test_classifier/test_boundaries.py
@@ -373,7 +373,7 @@ def test_real_data_set_viz(self):
data = datasets.load_iris()
feature_names = [name.replace(" ", "_") for name in data.feature_names]
df = pd.DataFrame(data.data, columns=feature_names)
-X = df[["sepal_length_(cm)", "sepal_width_(cm)"]].as_matrix()
+X = df[["sepal_length_(cm)", "sepal_width_(cm)"]].values
y = data.target

visualizer = DecisionBoundariesVisualizer(model)
@@ -390,7 +390,7 @@ def test_quick_method(self):
data = datasets.load_iris()
feature_names = [name.replace(" ", "_") for name in data.feature_names]
df = pd.DataFrame(data.data, columns=feature_names)
-X = df[["sepal_length_(cm)", "sepal_width_(cm)"]].as_matrix()
+X = df[["sepal_length_(cm)", "sepal_width_(cm)"]].values
y = data.target

decisionviz(model, X, y)
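
The two hunks above swap `DataFrame.as_matrix()` for the `.values` attribute. A minimal sketch of the replacement on a small hand-made frame follows. The sample numbers are invented for illustration, and `to_numpy()` is shown only as an alternative spelling available in newer pandas releases, not something this commit uses.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the iris feature frame (values are made up).
df = pd.DataFrame(
    {
        "sepal_length_(cm)": [5.1, 4.9, 6.2],
        "sepal_width_(cm)": [3.5, 3.0, 2.9],
        "petal_length_(cm)": [1.4, 1.4, 4.3],
    }
)

# Select two columns and extract the underlying NumPy array.
X = df[["sepal_length_(cm)", "sepal_width_(cm)"]].values

# Equivalent extraction with the to_numpy() method.
X_alt = df[["sepal_length_(cm)", "sepal_width_(cm)"]].to_numpy()

print(type(X).__name__, X.shape)
```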
26 changes: 26 additions & 0 deletions tests/test_regressor/test_residuals.py
@@ -280,6 +280,32 @@ def test_residuals_plot(self):

self.assert_images_similar(visualizer)

@pytest.mark.xfail(
IS_WINDOWS_OR_CONDA,
reason="font rendering different in OS and/or Python; see #892",
)
def test_residuals_plot_QQ_plot(self):
"""
Image similarity of residuals and Q-Q plot on random data with OLS
"""
_, ax = plt.subplots()

visualizer = ResidualsPlot(LinearRegression(), hist=False,
qqplot=True, ax=ax)

visualizer.fit(self.data.X.train, self.data.y.train)
visualizer.score(self.data.X.test, self.data.y.test)

self.assert_images_similar(visualizer)

def test_either_hist_or_QQ_plot(self):
"""
Setting both hist=True and qqplot=True raises exception.
"""
with pytest.raises(YellowbrickValueError,
match="Set either hist or qqplot to False"):
ResidualsPlot(LinearRegression(), hist=True, qqplot=True)

@pytest.mark.xfail(
sys.platform == "win32", reason="images not close on windows (RMSE=32)"
)
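
The `test_either_hist_or_QQ_plot` test in this hunk expects the constructor to reject `hist=True` together with `qqplot=True`. The visualizer change itself is not shown in this excerpt, so the following is only a sketch of that kind of guard: `PlotOptionsError` and `check_residual_options` are hypothetical stand-ins for `YellowbrickValueError` and the real constructor logic, and only the message text is taken from the test.

```python
class PlotOptionsError(ValueError):
    """Hypothetical stand-in for YellowbrickValueError."""


def check_residual_options(hist=True, qqplot=False):
    """Reject option combinations that would compete for the same side axes."""
    if hist and qqplot:
        # Message copied from the `match=` argument in the test above.
        raise PlotOptionsError("Set either hist or qqplot to False")
    return hist, qqplot


# Valid combination: Q-Q plot instead of the histogram.
print(check_residual_options(hist=False, qqplot=True))

# Invalid combination: both requested at once.
try:
    check_residual_options(hist=True, qqplot=True)
except PlotOptionsError as exc:
    print(f"rejected: {exc}")
```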
16 changes: 11 additions & 5 deletions tests/test_text/test_postag.py
@@ -18,11 +18,13 @@
##########################################################################

import pytest
+import matplotlib.pyplot as plt

-from yellowbrick.exceptions import YellowbrickValueError
-from yellowbrick.text.postag import *
from tests.base import VisualTestCase
-import matplotlib.pyplot as plt
+from tests.base import IS_WINDOWS_OR_CONDA

+from yellowbrick.text.postag import *
+from yellowbrick.exceptions import YellowbrickValueError

try:
import nltk
@@ -176,7 +178,9 @@ def test_quick_method(self):
viz = postag(tagged_docs, ax=ax, show=False)
viz.ax.grid(False)

-self.assert_images_similar(viz)
+# Fails on Miniconda/Appveyor with images not close (RMS 5.157)
+tol = 5.5 if IS_WINDOWS_OR_CONDA else 0.25
+self.assert_images_similar(viz, tol=tol)

def test_unknown_tagset(self):
"""
@@ -229,7 +233,9 @@ def test_frequency_mode(self):
# Assert that ticks are set properly
assert ticks_ax == sorted_tags

-self.assert_images_similar(ax=ax, tol=0.5)
+# Fails on Miniconda/Appveyor with images not close (RMS 5.302)
+tol = 5.5 if IS_WINDOWS_OR_CONDA else 0.5
+self.assert_images_similar(ax=ax, tol=tol)

@pytest.mark.skipif(nltk is None, reason="test requires nltk")
def test_word_tagged(self):
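
The hunks above choose a looser tolerance when `IS_WINDOWS_OR_CONDA` is set. The comparison code behind `assert_images_similar` is not part of this excerpt, so the sketch below only illustrates the general idea of an RMS-based check with a platform-dependent threshold. `rms_difference`, `images_similar`, `pick_tolerance`, and the flat pixel lists are all hypothetical.

```python
import math


def rms_difference(expected, actual):
    """Root-mean-square difference between two equally sized pixel sequences."""
    if len(expected) != len(actual):
        raise ValueError("images must have the same number of pixels")
    squared = sum((e - a) ** 2 for e, a in zip(expected, actual))
    return math.sqrt(squared / len(expected))


def images_similar(expected, actual, tol):
    """True when the RMS difference does not exceed the tolerance."""
    return rms_difference(expected, actual) <= tol


def pick_tolerance(is_windows_or_conda, strict=0.25, loose=5.5):
    """Mirror the `5.5 if IS_WINDOWS_OR_CONDA else 0.25` pattern from the diff."""
    return loose if is_windows_or_conda else strict


# Made-up grayscale pixels: every pixel differs by 3 intensity levels.
baseline = [10, 20, 30, 40]
rendered = [13, 23, 33, 43]

for flag in (False, True):
    tol = pick_tolerance(flag)
    print(flag, tol, images_similar(baseline, rendered, tol))
```

Keeping the strict tolerance as the default means that only the platforms known to render fonts differently get the relaxed check.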
4 changes: 1 addition & 3 deletions yellowbrick/classifier/class_prediction_error.py
@@ -19,7 +19,6 @@
##########################################################################

import numpy as np
-import matplotlib.pyplot as plt

from sklearn.utils.multiclass import unique_labels
from sklearn.metrics.classification import _check_targets
@@ -229,8 +228,7 @@ def finalize(self, **kwargs):
self.ax.set_ylim(0, cmax + cmax * 0.1)

# Ensure the legend fits on the figure
-plt.tight_layout(rect=[0, 0, 0.90, 1]) # TODO: Could use self.fig now

+self.fig.tight_layout(rect=[0, 0, 0.90, 1])

##########################################################################
## Quick Method
14 changes: 7 additions & 7 deletions yellowbrick/classifier/classification_report.py
@@ -283,7 +283,7 @@ def finalize(self, **kwargs):
self.ax.set_xticklabels(self._displayed_scores, rotation=45)
self.ax.set_yticklabels(self.classes_)

-plt.tight_layout() # TODO: Could use self.fig now
+self.fig.tight_layout()


def classification_report(
@@ -314,24 +314,24 @@ def classification_report(
not a classifier, an exception is raised. If the internal model is not
fitted, it is fit when the visualizer is fitted, unless otherwise specified
by ``is_fitted``.
X_train : ndarray or DataFrame of shape n x m
A feature array of n instances with m features the model is trained on.
Used to fit the visualizer and also to score the visualizer if test splits are
not directly specified.
y_train : ndarray or Series of length n
An array or series of target or class values. Used to fit the visualizer and
also to score the visualizer if test splits are not specified.
X_test : ndarray or DataFrame of shape n x m, default: None
An optional feature array of n instances with m features that the model
is scored on if specified, using X_train as the training data.
y_test : ndarray or Series of length n, default: None
An optional array or series of target or class values that serve as actual
labels for X_test for scoring purposes.
ax : matplotlib Axes, default: None
The axes to plot the figure on. If not specified the current axes will be
used (or generated if required).
@@ -407,7 +407,7 @@ def classification_report(
)
else:
visualizer.score(X_train, y_train)

# Draw the final visualization
if show:
visualizer.show()
3 changes: 3 additions & 0 deletions yellowbrick/classifier/confusion_matrix.py
@@ -329,6 +329,9 @@ def finalize(self, **kwargs):
self.ax.set_ylabel("True Class")
self.ax.set_xlabel("Predicted Class")

# Call tight layout to maximize readability
self.fig.tight_layout()
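
Several hunks in this commit call `tight_layout()` on the visualizer's own figure rather than through `pyplot`. The sketch below shows why that distinction can matter when more than one figure is open. It uses plain matplotlib with the headless `Agg` backend, and the names `fig_a` and `fig_b` are illustrative, not Yellowbrick attributes.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig_a, ax_a = plt.subplots()
fig_b, ax_b = plt.subplots()  # fig_b is now pyplot's "current" figure

# plt.tight_layout() acts on the current figure, which here is fig_b,
# regardless of which figure a given axes object belongs to.
current = plt.gcf()

# Going through the axes' own figure targets the intended figure explicitly.
ax_a.figure.tight_layout(rect=[0, 0, 0.90, 1])

owner_is_fig_a = ax_a.figure is fig_a
current_is_fig_b = current is fig_b

plt.close(fig_a)
plt.close(fig_b)
```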


##########################################################################
## Quick Method
2 changes: 1 addition & 1 deletion yellowbrick/contrib/classifier/boundaries.py
@@ -323,7 +323,7 @@ def _select_feature_columns(self, X):

# Handle the feature names if they're None.
elif self.features_ is not None and is_dataframe(X):
-X_two_cols = X[self.features_].as_matrix()
+X_two_cols = X[self.features_].values

# handle numpy named/ structured array
elif self.features_ is not None and is_structured_array(X):
2 changes: 1 addition & 1 deletion yellowbrick/contrib/scatter.py
@@ -249,7 +249,7 @@ def fit(self, X, y=None, **kwargs):

# Handle the feature names if they're None.
elif self.features_ is not None and is_dataframe(X):
-X_two_cols = X[self.features_].as_matrix()
+X_two_cols = X[self.features_].values

# handle numpy named/ structured array
elif self.features_ is not None and is_structured_array(X):
2 changes: 1 addition & 1 deletion yellowbrick/features/jointplot.py
@@ -417,7 +417,7 @@ def finalize(self, **kwargs):
plt.sca(self.ax)

# Call tight layout to maximize readability
-plt.tight_layout()
+self.fig.tight_layout()

def _index_into(self, idx, data):
"""
3 changes: 1 addition & 2 deletions yellowbrick/model_selection/importances.py
@@ -22,7 +22,6 @@

import warnings
import numpy as np
-import matplotlib.pyplot as plt

from yellowbrick.draw import bar_stack
from yellowbrick.base import ModelVisualizer
@@ -288,7 +287,7 @@ def finalize(self, **kwargs):
self.ax.grid(False, axis="y")

# Ensure we have a tight fit
-plt.tight_layout()
+self.fig.tight_layout()

def _find_classes_param(self):
"""
