Utility function to plot decision regions #5070

Closed
rasbt wants to merge 1 commit

@rasbt
Contributor

rasbt commented Jul 31, 2015

A simple utility function to plot decision regions to avoid implementing the code over and over (e.g., in the scikit-learn documentation examples; I feel like this could make the code leaner and easier to read).

I am wondering, though, how to implement unit tests for this. Any ideas?

Other "to dos" may be:

  • In addition to True and False, let the user provide a custom marker list via cycle marker, e.g., as a string such as 'sxo^v' (squares, crosses, circles, upward triangles, downward triangles)
  • maybe create and return a figure object?
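
The custom-marker idea could be sketched roughly as follows. This is only an illustration: `assign_markers` is a hypothetical helper, not part of the PR, and it assumes the marker string is consumed cyclically so that more classes than markers still works.

```python
from itertools import cycle


def assign_markers(class_labels, markers="sxo^v"):
    """Map each distinct class label to a matplotlib marker character.

    Hypothetical helper: markers are reused cyclically when there are
    more classes than characters in the marker string.
    """
    marker_gen = cycle(markers)
    return {label: next(marker_gen) for label in sorted(set(class_labels))}


mapping = assign_markers([0, 1, 2, 1, 0])
print(mapping)  # {0: 's', 1: 'x', 2: 'o'}
```

Each class would then get its own scatter call using the marker looked up in this mapping.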

Here are some examples of how it currently looks:

Simple 2D Plot

from sklearn.utils import plot_decision_regions
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler

# Loading some example data
iris = datasets.load_iris()
X = iris.data[:, [0,2]]
sc = StandardScaler()
X = sc.fit_transform(X)

y = iris.target

# Training a classifier
svm = SVC(C=0.5, kernel='linear')
svm.fit(X, y)

##########################################################
# Plotting decision regions

plot_decision_regions(X, y, clf=svm, res=0.02, legend=2)
##########################################################

# Adding axes annotations
plt.xlabel('sepal length [standardized]')
plt.ylabel('petal length [standardized]')
plt.title('SVM on Iris')
plt.show()


Highlighting test data points

from sklearn.cross_validation import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0) 

# Training a classifier
svm = SVC(C=0.5, kernel='linear')
svm.fit(X_train, y_train)

##########################################################
# Plotting decision regions

plot_decision_regions(X, y, clf=svm, 
                      X_highlight=X_test, 
                      res=0.02, legend=2)
##########################################################

# Adding axes annotations
plt.xlabel('sepal length [standardized]')
plt.ylabel('petal length [standardized]')
plt.title('SVM on Iris')
plt.show()


1D example

from sklearn.utils import plot_decision_regions
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.svm import SVC

# Loading some example data
iris = datasets.load_iris()
X = iris.data[:, 2]
X = X[:, None]
y = iris.target

# Training a classifier
svm = SVC(C=0.5, kernel='linear')
svm.fit(X,y)

# Plotting decision regions
plot_decision_regions(X, y, clf=svm, res=0.02, legend=2)

# Adding axes annotations (feature index 2 is petal length;
# a 1D plot has no second feature axis to label)
plt.xlabel('petal length [cm]')
plt.title('SVM on Iris')
plt.show()


Let me know what you think @amueller

@amueller
Member

Thanks for the PR.
It would be cool to optionally plot the output of decision_function / predict_proba instead of the predicted labels.
For reference, my implementation is here: https://github.com/amueller/scipy_2015_sklearn_tutorial/blob/master/notebooks/figures/plot_2d_separator.py

It would also be nice to use this function in the docs instead of reimplementing it all the time.
This is also a good test of whether it is useful.
git grep contourf reveals some candidates.
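
As a rough sketch of what optional decision_function / predict_proba support could look like: the scoring callable can be passed in and evaluated on a mesh grid, so the same code path serves predict and decision_function. Everything here is an assumption for illustration: `evaluate_on_grid` and `ToyLinearClassifier` are hypothetical names, and the sketch only covers the case of one value per sample (e.g. a binary decision_function), not multi-class score matrices.

```python
import numpy as np


def evaluate_on_grid(score_fn, X, res=0.02, margin=1.0):
    """Evaluate score_fn on a regular grid spanning the two columns of X.

    score_fn must return one value per grid point (labels or scores).
    """
    x_min, x_max = X[:, 0].min() - margin, X[:, 0].max() + margin
    y_min, y_max = X[:, 1].min() - margin, X[:, 1].max() + margin
    xx, yy = np.meshgrid(np.arange(x_min, x_max, res),
                         np.arange(y_min, y_max, res))
    grid = np.c_[xx.ravel(), yy.ravel()]
    zz = np.asarray(score_fn(grid)).reshape(xx.shape)
    return xx, yy, zz


class ToyLinearClassifier:
    """Hypothetical stand-in for a fitted binary estimator."""

    def __init__(self, w, b):
        self.w = np.asarray(w, dtype=float)
        self.b = float(b)

    def decision_function(self, X):
        return X @ self.w + self.b

    def predict(self, X):
        return (self.decision_function(X) > 0).astype(int)


X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]])
clf = ToyLinearClassifier(w=[1.0, -1.0], b=0.0)

# Same grid code, two different response types.
xx, yy, labels = evaluate_on_grid(clf.predict, X, res=0.5)
_, _, scores = evaluate_on_grid(clf.decision_function, X, res=0.5)
```

The returned arrays have the shape that contourf expects, so `plt.contourf(xx, yy, labels)` or `plt.contourf(xx, yy, scores)` would draw hard regions or a continuous score surface respectively.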

@rasbt
Contributor Author

rasbt commented Jul 31, 2015

Sure, once we have agreed on a nice interface and implementation, I think it would be worthwhile to go through the documentation and replace the lengthy code for plotting decision regions.

About the tests: How are we going to handle those? I am not sure if it makes sense to write unit tests for a plotting function, but maybe we could at least test that it gets imported correctly. Also, I could maybe add an IPython notebook somewhere so that someone can just hit "run all" and see if the plots make sense after making some modifications to the function.
About Travis: Here too, shall we exclude the function from the test suite, or add matplotlib to the Travis setup?

So, the current to do list is

  • resolve travis issue
  • add example scripts
  • add argument for custom markers
  • support decision_function
  • support predict_proba
  • replace "plotting code" in documentation examples if applicable

@amueller
Member

I'm not sure about the testing. We have a decorator that skips if no matplotlib is available. I guess just running the function and seeing that it runs without errors should be ok?
You can also check sklearn/ensemble/tests/test_partial_dependence.py. We could check that the right number of data points is in the plot. Not sure how meaningful that is.
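
A smoke test along these lines might look like the sketch below. It is not scikit-learn's actual test code: the skip condition is built from the standard library rather than the project's own decorator, and `scatter_by_class` is a hypothetical stand-in for the plotting utility. The test only checks that plotting runs without errors and that every data point ends up in the axes.

```python
import importlib.util
import unittest

# Skip gracefully on machines (or CI setups) without matplotlib.
HAS_MPL = importlib.util.find_spec("matplotlib") is not None


def scatter_by_class(ax, X, y):
    """Hypothetical stand-in for a plotting utility: one scatter per class."""
    for label in sorted(set(y)):
        pts = [x for x, t in zip(X, y) if t == label]
        ax.scatter([p[0] for p in pts], [p[1] for p in pts], label=str(label))
    return ax


@unittest.skipUnless(HAS_MPL, "matplotlib not available")
class PlotSmokeTest(unittest.TestCase):
    def test_runs_and_plots_every_point(self):
        import matplotlib
        matplotlib.use("Agg")  # headless backend, no display needed
        import matplotlib.pyplot as plt

        X = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5), (0.5, 2.0)]
        y = [0, 1, 1, 0]
        fig, ax = plt.subplots()
        scatter_by_class(ax, X, y)

        # Each scatter call adds one collection holding its points.
        n_plotted = sum(len(c.get_offsets()) for c in ax.collections)
        self.assertEqual(n_plotted, len(X))
        plt.close(fig)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(PlotSmokeTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Counting plotted points is a weak check, as noted above, but it does catch silently dropped classes or samples.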

@rasbt
Contributor Author

rasbt commented Aug 20, 2015

I was just going through my to-do list and wanted to leave a little sign of life here, since it has already been ~3 weeks since the last commit. Currently, I am very, very busy, but this will change in ~1-2 weeks so that I can finish what I have started here ;)

@amueller
Member

no worries, thanks for the heads-up :)

@sklearn-lgtm

This pull request introduces 4 alerts when merging 32267ea into 3e29334 - view on LGTM.com

new alerts:

  • 4 for Unused local variable

Comment posted by LGTM.com

@rth removed the Sprint label Jun 27, 2019
@cmarmo added the Superseded label and removed the help wanted label Jul 18, 2020
Base automatically changed from master to main January 22, 2021 10:48
@thomasjpfan
Member

thomasjpfan commented Apr 14, 2022

I am closing this PR because #16061 has been merged with a new API to plot decision boundaries. Examples in the gallery have been updated using the new API.

Labels: Easy, module:utils, Superseded
6 participants