Utility function to plot decision regions #5070

rasbt · 2015-07-31T17:24:10Z

This PR adds a simple utility function for plotting decision regions, so that the plotting code does not have to be reimplemented over and over (e.g., in the scikit-learn documentation examples). I feel like this could make those examples leaner and easier to read.

I am wondering, though, how to implement unit tests for this. Any ideas?

Other "to dos" may be:

In addition to True and False, let the user provide a custom list of markers that is cycled through, e.g., as a string such as 'sxo^v' (squares, crosses, circles, upward triangles, downward triangles)
maybe create and return a figure object?
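
As a rough illustration of the custom-marker idea above, here is a minimal sketch that cycles through a marker string so that markers are reused when there are more classes than markers. The helper name `assign_markers` and its signature are hypothetical and not part of the PR:

```python
from itertools import cycle


def assign_markers(class_labels, markers="sxo^v"):
    """Map each class label to a marker, reusing markers once they run out.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    marker_cycle = cycle(markers)
    return {label: next(marker_cycle) for label in class_labels}


# Three iris classes get the first three markers of the string.
marker_map = assign_markers([0, 1, 2])
```

Each value in `marker_map` could then be passed as the `marker` argument of a per-class `plt.scatter` call.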

Here are some examples of how it currently looks:

Simple 2D Plot

from sklearn.utils import plot_decision_regions
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler

# Loading some example data
iris = datasets.load_iris()
X = iris.data[:, [0,2]]
sc = StandardScaler()
X = sc.fit_transform(X)

y = iris.target

# Training a classifier
svm = SVC(C=0.5, kernel='linear')
svm.fit(X, y)

##########################################################
# Plotting decision regions

plot_decision_regions(X, y, clf=svm, res=0.02, legend=2)
##########################################################

# Adding axes annotations
plt.xlabel('sepal length [standardized]')
plt.ylabel('petal length [standardized]')
plt.title('SVM on Iris')
plt.show()

Highlighting test data points

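# train_test_split lives in sklearn.model_selection (sklearn.cross_validation was removed in scikit-learn 0.20)
from sklearn.model_selection import train_test_split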

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0) 

# Training a classifier
svm = SVC(C=0.5, kernel='linear')
svm.fit(X_train, y_train)

##########################################################
# Plotting decision regions

plot_decision_regions(X, y, clf=svm, 
                      X_highlight=X_test, 
                      res=0.02, legend=2)
##########################################################

# Adding axes annotations
plt.xlabel('sepal length [standardized]')
plt.ylabel('petal length [standardized]')
plt.title('SVM on Iris')
plt.show()

1D example

from sklearn.utils import plot_decision_regions
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.svm import SVC

# Loading some example data
iris = datasets.load_iris()
X = iris.data[:, 2]
X = X[:, None]
y = iris.target

# Training a classifier
svm = SVC(C=0.5, kernel='linear')
svm.fit(X,y)

# Plotting decision regions
plot_decision_regions(X, y, clf=svm, res=0.02, legend=2)

# Adding axes annotations (X holds only petal length, i.e. column 2 of the iris data)
plt.xlabel('petal length [cm]')
plt.title('SVM on Iris')
plt.show()

Let me know what you think @amueller

amueller · 2015-07-31T18:18:09Z

Thanks for the PR.
It would be cool to optionally plot the output of decision_function / predict_proba instead of the predicted class.
For reference, my implementation is here: https://github.com/amueller/scipy_2015_sklearn_tutorial/blob/master/notebooks/figures/plot_2d_separator.py

It would also be nice to use this function in the docs instead of reimplementing it all the time.
This is also a good test of whether it is useful.
git grep contourf reveals some candidates.
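
To make the decision_function / predict_proba suggestion concrete, here is a minimal sketch that evaluates a binary classifier's continuous output on a mesh grid and draws it with `contourf`. The helper `grid_scores` and its parameters are hypothetical; this is neither the PR's code nor the linked `plot_2d_separator.py`:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC


def grid_scores(clf, X, res=0.05, method="decision_function"):
    """Evaluate a binary classifier's continuous output on a 2D mesh grid.

    Hypothetical helper for illustration only.
    """
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, res),
                         np.arange(y_min, y_max, res))
    scores = getattr(clf, method)(np.c_[xx.ravel(), yy.ravel()])
    if scores.ndim == 2:
        # predict_proba returns one column per class; keep the positive class
        scores = scores[:, 1]
    return xx, yy, scores.reshape(xx.shape)


# Restrict iris to two classes so both methods return one score per point.
iris = load_iris()
binary = iris.target < 2
X, y = iris.data[binary][:, [0, 2]], iris.target[binary]

svm = SVC(C=0.5, kernel="linear").fit(X, y)
logreg = LogisticRegression().fit(X, y)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, clf, method in [(axes[0], svm, "decision_function"),
                        (axes[1], logreg, "predict_proba")]:
    xx, yy, zz = grid_scores(clf, X, method=method)
    contour = ax.contourf(xx, yy, zz, alpha=0.3)
    fig.colorbar(contour, ax=ax)
    ax.scatter(X[:, 0], X[:, 1], c=y, edgecolor="k")
    ax.set_title(method)
```

The sketch is limited to the binary case; a multiclass version would need to decide which class's score to draw.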

rasbt · 2015-07-31T20:59:23Z

Sure, once we have agreed on a nice interface and implementation, I think it would be worthwhile to go through the documentation and replace the lengthy code for plotting decision regions.

About the tests: How are we going to handle those? I am not sure if it makes sense to write unit tests for a plotting function, but maybe we could at least test that it gets imported correctly. I could also add an IPython notebook somewhere so that someone can just hit "run all" and check that the plots still make sense after modifying the function.
About Travis: shall we exclude the function from the test suite, or add matplotlib to the Travis setup?

So, the current to do list is

resolve travis issue
add example scripts
add argument for custom markers
support decision_function
support predict_proba
replace "plotting code" in documentation examples if applicable

amueller · 2015-07-31T21:33:29Z

I'm not sure about the testing. We have a decorator that skips a test if matplotlib is not available. I guess just running the function and seeing that it runs without errors should be ok?
You can also check sklearn/ensemble/tests/test_partial_dependence.py. We could check that the right number of data points is in the plot. Not sure how meaningful that is.
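
A smoke test along these lines could look roughly like the sketch below. It uses only the standard library's `unittest.skipUnless` instead of scikit-learn's own skip decorator, and `scatter_by_class` is a hypothetical stand-in for the plotting function under test:

```python
import importlib.util
import unittest

# Skip the plotting test entirely when matplotlib is not installed.
HAS_MPL = importlib.util.find_spec("matplotlib") is not None


def scatter_by_class(X, y, ax):
    """Hypothetical stand-in for the plotting function under test."""
    for label in sorted(set(y)):
        pts = [x for x, t in zip(X, y) if t == label]
        ax.scatter([p[0] for p in pts], [p[1] for p in pts], label=str(label))
    return ax


@unittest.skipUnless(HAS_MPL, "matplotlib is not installed")
class TestPlotSmoke(unittest.TestCase):
    def test_runs_and_plots_every_point(self):
        import matplotlib
        matplotlib.use("Agg")  # headless backend, no window is opened
        import matplotlib.pyplot as plt

        X = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5), (3.0, 2.0)]
        y = [0, 0, 1, 1]
        fig, ax = plt.subplots()
        scatter_by_class(X, y, ax)  # must not raise
        # Each scatter call adds one collection; count the plotted points.
        n_points = sum(len(c.get_offsets()) for c in ax.collections)
        self.assertEqual(n_points, len(X))
        plt.close(fig)


def run_smoke_tests():
    """Run the smoke test and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPlotSmoke)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Counting the plotted points only confirms that nothing was dropped; it says nothing about whether the regions themselves are drawn correctly.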

rasbt · 2015-08-20T18:10:26Z

I was just going through my to-do list and wanted to leave a little sign of life here, since it has already been ~3 weeks since the last commit. Currently, I am very, very busy, but this will change in ~1-2 weeks, so that I can finish what I have started here ;)

amueller · 2015-08-24T21:43:17Z

no worries, thanks for the heads-up :)

sklearn-lgtm · 2018-07-14T15:50:18Z

This pull request introduces 4 alerts when merging 32267ea into 3e29334 - view on LGTM.com

new alerts:

4 for Unused local variable

Comment posted by LGTM.com

thomasjpfan · 2022-04-14T03:59:54Z

I am closing this PR because #16061 has been merged with a new API to plot decision boundaries. Examples in the gallery have been updated using the new API.

decision region plot 1st commit

32267ea

rasbt mentioned this pull request Oct 8, 2015

RFECV with SVC & kernel != 'linear' == ValueError #5168

Closed

amueller mentioned this pull request Jul 14, 2018

[MRG+1] EXA Adding cv indices example #11475

Merged

amueller added the Easy, help wanted, and Sprint labels Jul 14, 2018

amueller mentioned this pull request Jul 14, 2018

Discussion of useful plotting #7116

Open

rth removed the Sprint label Jun 27, 2019

thomasjpfan mentioned this pull request Jul 14, 2019

[MRG] Plotting API starting with ROC curve #14357

Merged

amueller mentioned this pull request Jan 9, 2020

FEA Add DecisionBoundaryDisplay #16061

Merged

github-actions bot added the module:utils label Mar 2, 2020

cmarmo added the Superseded label and removed the help wanted label Jul 18, 2020

Base automatically changed from master to main January 22, 2021 10:48

thomasjpfan closed this Apr 14, 2022
