
Ready: Feature 139 confusion matrix #144

Conversation

@NealHumphrey (Contributor) commented Mar 4, 2017:

Work-in-progress branch; initiating a pull request for discussion with @rebeccabilbro

So far:

  • Made the ConfusionMatrix class
  • Added init, fit, score, and draw methods. These mostly mimic the ClassificationReport
  • Created an example demo using the music data set from my user test. I can clean this up and include it in the commit, or remove it after development of this feature is complete (i.e. if we don't want it in the examples folder)

To discuss:

  • Should we use imshow like ClassificationReport, or convert over to pcolormesh like seaborn uses for its heatmap?
  • Hoisting duplicated code between ConfusionMatrix and ClassificationReport - what's worth moving up?
  • Handling of font size based on image size (so the font gets smaller as the number of categories increases)
  • Handling of font size when the predicted quantities have too many digits to fit. Do we force the user over to the percent-based heatmap?
  • Color defaults - force a light gray background with medium gray font when the estimate is zero? Use a different categorical label for 100%-correct prediction squares (e.g. green, but with flexibility)?
  • Handling class imbalance - should we default to a percent-of-true confusion matrix?
  • Handling class imbalance - should we create an option for treemap-style resizing of row width, to visually show the class sizes of the 'true' values?

@bbengfort added the 'review' (PR is open) label Mar 4, 2017
@coveralls commented Mar 4, 2017: Coverage decreased (-1.4%) to 67.917% when pulling c113fbc on NealHumphrey:feature-139-confusion-matrix into 41ab384 on DistrictDataLabs:master.

@coveralls commented Mar 6, 2017: Coverage decreased (-2.4%) to 66.926% when pulling 57e4f1b on NealHumphrey:feature-139-confusion-matrix into 41ab384 on DistrictDataLabs:master.

@bbengfort (Member) left a comment:

It would also be great if you could show a sample of the visualizer in the comments - I think you can just drag and drop the image into the comment box!


self.cmap = color_sequence(kwargs.pop('cmap', 'YlOrRd'))

# If possible, assign the classes. If this can't happen now, it will happen in the .fit() method
Member:

This is absolutely one of the things that needs to be done; e.g. if a fitted model is passed into the visualizer, fit() won't be called. The question then becomes: what happens if estimator.classes_ is not None, but fit() is called anyway? Right now, if the user specifies the classes, we don't want fit() to override the user-specified classes; but if we got them off the estimator, then we do want fit() to change them.

So here's my suggestion (one that I think we'll use in almost all the classifier score models): let's make Visualizer.classes a property rather than a simple attribute, so it performs the lookup on demand:

def __init__(self, model, ax=None, classes=None, **kwargs):
    self._classes = classes 

@property
def classes(self):
    if self._classes is None: 
        try:
            return self.estimator.classes_ 
        except AttributeError:
            return None 
    return self._classes 

@classes.setter
def classes(self, value):
    self._classes = value 

Then in fit() we can check the value of self._classes but in draw() we can simply use self.classes. Thoughts?
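For illustration, a minimal sketch of how fit() and draw() interact with the property under this proposal (a sketch only - the merged implementation may differ):

def fit(self, X, y=None, **kwargs):
    self.estimator.fit(X, y)  # fit the wrapped estimator as usual
    # fit() never overwrites user-supplied classes; when self._classes is
    # None, the classes property returns estimator.classes_ on demand, so
    # draw() can simply iterate over self.classes.
    return self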

Contributor (author):

Good point - I had forgotten about the need not to override user-defined classes in fit(). We could keep the existing code and simply wrap the .classes_ assignment in an if statement in the fit() method; but making it a property is a good way to clue future developers in to the fact that they need to handle the assignment properly if classes is used anywhere else. I'll use the property approach.

Another thing @rebeccabilbro and I discussed is a utility for checking whether an estimator is already fit. It does not look like there is a tool in sklearn that does that already. Each estimator raises a NotFittedError if you call predict without fit, but each estimator type has a different way of deciding whether it should raise NotFittedError (e.g. KNeighbors checks for the existence of self._fit_method, forests check for self.estimators_, linear classifiers check for self.coef_). We might just use .classes_ for the classifiers for now and write the utility when we need it.

Is the classes vs. classes the naming convention for properties vs. attributes, respectively?

Contributor (author):

oops, markdown misinterpreted my underscores. I meant to say _classes vs. classes_

Member:

Yes, we do need an is_fitted() utility - I think I did the same search you did and found the same things. I was working on the following:

import numpy as np
from sklearn.exceptions import NotFittedError

def is_fitted(estimator):
    # m must match the number of features the estimator expects (see below)
    X = np.random.rand(1, m)
    try:
        estimator.predict(X)
        return True
    except NotFittedError:
        return False

The problem I had was that I didn't know what m should be … it has to be the same number of columns as the expected features matrix. This is, however, the "right" way of doing it. Though frankly, it's even more "right" to just expect the model to be fitted or not fitted and pass the exceptions to the user.
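As an aside, later versions of scikit-learn (roughly 0.22 onward) let you sidestep the unknown m entirely: check_is_fitted can be called without naming attributes and simply looks for fitted attributes (names ending in an underscore). A minimal sketch:

from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted

def is_fitted(estimator):
    # check_is_fitted raises NotFittedError when the estimator has no
    # fitted attributes (i.e. no attributes whose names end in '_')
    try:
        check_is_fitted(estimator)
        return True
    except NotFittedError:
        return False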

Member:

So _classes is a Python thing, and signals to developers that this is an internal property and shouldn't be messed with, whereas classes is the external property (there is no public, private, or protected in Python). Note that __classes causes name mangling.

http://radek.io/2011/07/21/private-protected-and-public-in-python/

The name classes_ is actually a Scikit-Learn thing; from the docs:

Estimated parameters: When data is fitted with an estimator, parameters are estimated from the data at hand. All the estimated parameters are attributes of the estimator object ending by an underscore:
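A quick illustration of the three conventions (class name hypothetical):

class Visualizer:
    def __init__(self):
        self._classes = None    # internal by convention; still accessible
        self.__classes = None   # name-mangled to _Visualizer__classes
        self.classes_ = None    # scikit-learn style: estimated during fit

v = Visualizer()
v._classes                # works
v._Visualizer__classes    # the mangled name; v.__classes raises AttributeError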

else:
    self.classes_ = classes

# TODO this is the same as ClassificationReport, should hoist it up a level
Member:

Agreed - this should probably be a part of ClassificationScoreVisualizer, as should the property above and the refactored __init__. Would you be willing to make this change and run the tests?

Contributor (author):

I suggest we do this as part of #151

Contributor (author):

And yes I could do that next sprint

@bbengfort (Member) commented Mar 6, 2017:

Ok, will you put this property into this PR (for just this class) or do you just want to do it all in the next sprint?

Contributor (author):

I'll put this property directly in this class for this PR, and then we can hoist it up a level in #151

Member:

Sounds good.

y_pred = self.predict(X)
self.confusion_matrix = confusion_matrix(y_true=y, y_pred=y_pred, labels=self.classes_, sample_weight=sample_weight)
# Convert the confusion matrix to percent-of-true (row-normalized)
self.confusion_matrix = np.round(self.confusion_matrix / np.sum(self.confusion_matrix, axis=1, keepdims=True) * 100, decimals=0)
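(The keepdims above matters - a sketch of the broadcasting pitfall with a made-up matrix:)

import numpy as np

cm = np.array([[8,  2],
               [1, 19]])
cm / cm.sum(axis=1)                 # wrong: broadcasts over columns, dividing by the wrong totals
cm / cm.sum(axis=1, keepdims=True)  # right: each row is divided by its own total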
Member:

Potentially I'd like to make this an option to show raw numbers or percents - one the user passes in. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue we discussed is what to do when the raw numbers have too many digits to display properly for the size of the matrix. Options include:

  • Display it anyway; formatting will break and the user will manually change to % mode. Kind of ugly.
  • Use logic to override the user's choice and display percents when the numbers are too large (maybe raise a warning? Potentially confusing behavior that doesn't match user expectations).
  • When whole-number mode is selected, display only the cells with too many digits as percents (potentially confusing when the numbers have mixed meanings; smart formatting could clarify).
  • Resize the font of all cells based on the max number of digits and the available space. I'd like to do this anyway to handle situations with lots of classes (like my example), but I need to find out how to get the display size of the graphic. This might mean the text-adding step has to move into poof()? Do you know how to get the pixel dimensions of the resulting graphic? (See the sketch after this list.)
  • Resize the font of each cell individually based on its digit count. Same implementation issue as the bullet above.

Due to these complications, we thought we'd start with only the percent-based display format. What do you think about the choice between these options for future implementation?
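On the pixel-dimensions question, a minimal matplotlib sketch - the figure size in pixels is just its size in inches times its dpi, so nothing has to move into poof() to compute it:

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
width_px, height_px = fig.get_size_inches() * fig.dpi  # e.g. 640.0 x 480.0 at the defaults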

Member:

Percent seems like a convenient choice so as not to have to worry about fitting big numbers into a tight space; I guess scientific notation is another option to make the length predictable... maybe slightly less readable though?

Member:

Actually, I like option 1 - default to percent, let the user manually change it to raw numbers, and if it breaks, it breaks.

Contributor (author):

I hate making things that come out ugly, but I also hate tools that try to do too much formatting for me without letting me override it (looking at you, Excel). I suppose you're right @bbengfort - start out with a default that will always be pretty, and if the user breaks the formatting by choosing raw numbers, so be it.


def finalize(self, **kwargs):
    pass
    # TODO add this stuff
Member:

At the very least set the title - I think that's good form. Some of the other visualizers let the user specify the title, so we should probably review that and add it here.

Contributor (author):

Yep, planning to.

Note, I used the WIP convention ("work in progress") in the title of my pull request. The only way to do line-by-line comments is with a pull request, so this was a way to discuss progress so far (Rebecca and I had a video chat last night) - I wasn't expecting this to be ready to merge yet.

Member:

I get it; I mostly wanted to mention the set_title thing. I meant that having a title on figures is good form - and also that you don't necessarily have to do too much work in finalize().

Y = np.linspace(start=0, stop=len(self.classes_), num=len(self.classes_))

# Draw the heatmap
mesh = self.ax.pcolormesh(X, Y, self.confusion_matrix, vmin=0, vmax=100,
Member:

Does pcolormesh handle light text on dark background and dark text on light background? If so, we should definitely move to it. But also in my reading, I'm seeing that it can handle larger matrices; so I think it's a good idea to use it over imshow.

Contributor (author):

pcolormesh does not inherently deal with text. I agree we want to handle dark/light text vs. background (my example and the existing ClassificationReport both have this issue), and we will need to handle it separately (probably manually). This sounds like something that should have a higher-level solution - i.e. for any given color, you can check whether the text should be light or dark (see the sketch below). I don't understand how the palettes in yellowbrick work yet, so I'm not quite ready to add that logic in a clean and reusable way. I'll make a ticket.

One benefit of pcolormesh is that it allows you to manually specify the X and Y grid corner locations, so you can have adaptively sized blocks. It wasn't much harder to use than imshow, though you do have to generate the X and Y in most cases. I agree that pcolormesh is a good idea for any future heatmapping.
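A minimal sketch of that light-or-dark check (the helper name and the 0.5 cutoff are assumptions, not yellowbrick API):

import matplotlib.colors as mcolors

def text_color_for(background):
    # Rec. 709 relative luminance of the background color
    r, g, b = mcolors.to_rgb(background)
    luminance = 0.2126 * r + 0.7152 * g + 0.0722 * b
    return 'black' if luminance > 0.5 else 'white'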

@bbengfort (Member):
@rebeccabilbro one of the things we wanted to discuss was the code review process; so let's take a look at the code review I did here as an example. Thanks @NealHumphrey for being a good sport!

@coveralls commented Mar 8, 2017: Coverage decreased (-3.9%) to 65.439% when pulling 9cc35b9 on NealHumphrey:feature-139-confusion-matrix into 41ab384 on DistrictDataLabs:master.

@NealHumphrey changed the title from "WIP - Feature 139 confusion matrix" to "Complete: Feature 139 confusion matrix" Mar 8, 2017
@NealHumphrey (Contributor) commented Mar 8, 2017:

@rebeccabilbro and @bbengfort - this should be ready enough to merge, or at least to show the group tonight. I made an example; a few of the images are copied below.

In my commit message I noted a few more things I'd like to do.

Here are the screenshots:
[Image: Handwritten Digits 1]
[Image: Handwritten Digits 2]
[Image: Music Origins 1]
[Image: Sliced Music Origins - only some classes]

@coveralls commented Mar 8, 2017: Coverage decreased (-3.9%) to 65.439% when pulling b322d82 on NealHumphrey:feature-139-confusion-matrix into 41ab384 on DistrictDataLabs:master.

@rebeccabilbro (Member):

@NealHumphrey can you please add tests for the new class here to make sure our Travis coverage stays good? Note that once you move the classifiers into their own module (#151 - I think you said you wanted to take this one in the next sprint), you'll also need to refactor the tests accordingly.

@NealHumphrey changed the title from "Complete: Feature 139 confusion matrix" to "WIP: Feature 139 confusion matrix" Mar 9, 2017
@NealHumphrey (Contributor, author):

Ok, I'm going to do one more thing before this is ready to merge - fix that bug with the shortened classes list in percent mode. I've got it started and will push a new commit in the next day or two.

@bbengfort (Member):

@NealHumphrey let me know when you want me to review.

…olormesh.

- Allows for percent or raw count representation of the predictions
- Implements heatmap with white=0, green=100%, and yellow-orange-red heatmap for everything else
- Allows zooming in on confusion matrix using passed list of classes, with accurate %-of-all-true calculations
- Tested for moderately large class numbers (30+)
- Diagonal line indicates accurate predictions
- Documentation added to docs/examples/methods.rst for one example matrix

Suggested future improvements:
- Resize font based on image size + class count
- Allow custom color coding, including custom colors for _over and _under values (e.g. zero and 100%)
- Vary text font color based on background color
- While this branch currently adds an example to methods.rst, examples/confusionMatrix.ipynb has additional examples using different combinations of the passed parameters. This should probably also be exported as rst and added to the docs, but there was no obvious place to put it, so I am excluding it for now.

Note this commit squashes all previous commits on this branch
@NealHumphrey changed the base branch from master to develop, Mar 16, 2017 17:08
@NealHumphrey (Contributor, author):

@bbengfort ready for you to review! I added tests, fixed that bug I mentioned, and made a few other improvements. I also exported the rst of the ipynb and added it to the documentation.

Note, I decided to go ahead and squash and rebase this on the current develop branch to keep a clean version history - thoughts on doing this in the future vs. not?

@NealHumphrey changed the title from "WIP: Feature 139 confusion matrix" to "Ready: Feature 139 confusion matrix" Mar 16, 2017
@coveralls commented: Coverage increased (+1.9%) to 70.374% when pulling ceee7f8 on NealHumphrey:feature-139-confusion-matrix into 942a070 on DistrictDataLabs:develop.


@bbengfort (Member) left a comment:

@NealHumphrey just did a review and it all looks good - very nice implementation of the Visualizer.

The only thing I'd request is that you move the examples/confusionMatrix.ipynb to examples/NealHumphrey -- this is kind of where we landed for both development and galleries, that the examples/examples.ipynb would be the "official" gallery of examples and that other notebooks for testing and development would be in our user folders.

You can certainly add the confusion matrix example to examples.ipynb if you think it belongs there as well.

@@ -155,6 +156,17 @@ def is_dataframe(obj):
isdataframe = is_dataframe


# From here: http://stackoverflow.com/questions/26248654/numpy-return-0-with-divide-by-zero
def numpy_div0(a, b):
Member:

So I have a couple of questions about this function: first, can a be a list, scalar, ndarray, etc.? And second, can b be a list, scalar, ndarray, etc.? E.g. what happens when we do:

numpy_div0(10, [0,2,5])

My question relates to the signature of the function. My preference (and this isn't blocking the pull request) is to call this div_safe or something like that, and to name the arguments something a bit more specific, since potentially many other developers will use this. I do like including the Stack Overflow link above the function; it's not part of the docstring but it does give developers more information.

Contributor (author):

I converted this to div_safe, added some documentation to the function, and wrote tests demonstrating what the function can and can't do.
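For reference, a minimal sketch of a div_safe in the spirit of that Stack Overflow pattern (not necessarily the exact code that was merged):

import numpy as np

def div_safe(numerator, denominator):
    # Element-wise division that returns 0 wherever the denominator is 0
    with np.errstate(divide='ignore', invalid='ignore'):
        result = np.true_divide(numerator, denominator)
    # Replace the inf (x/0) and nan (0/0) values those zeros produced
    return np.where(np.isfinite(result), result, 0.0)

div_safe(10, [0, 2, 5])  # -> array([0., 5., 2.])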

@bbengfort (Member):

@NealHumphrey as for the rebasing - that's fine with me. I want to make sure we merge into develop with --no-ff --no-edit so that we can keep track of who's doing what with the feature branches looping off, but keeping them compact and rebasing internally is fine. In fact, it really did help me with the code review, so I think we should make this a best practice.
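Concretely, the merge described above would look something like this (branch name taken from this PR):

git checkout develop
git merge --no-ff --no-edit feature-139-confusion-matrix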

@bbengfort (Member):

Actually, I see that we can select "squash and merge" from the dropdown, so potentially you don't have to rebase, we can do that automatically through github.

@bbengfort added the 'type: feature' (a new visualizer or utility for yb) label Mar 19, 2017
@NealHumphrey (Contributor, author):

@bbengfort I made both those changes, should be good to go now.

@coveralls commented Mar 20, 2017: Coverage increased (+2.0%) to 70.499% when pulling 23ac982 on NealHumphrey:feature-139-confusion-matrix into 942a070 on DistrictDataLabs:develop.

@coveralls commented Mar 20, 2017: Coverage increased (+2.0%) to 70.499% when pulling 4b900ef on NealHumphrey:feature-139-confusion-matrix into abb0550 on DistrictDataLabs:develop.

@bbengfort (Member):

@NealHumphrey are we ready to resolve the conflicts and merge this PR?

# Conflicts:
#	examples/nealhumphrey/confusionMatrix.ipynb
#	yellowbrick/classifier.py
#	yellowbrick/utils.py
@coveralls commented: Coverage increased (+0.1%) to 70.989% when pulling ce71fc6 on NealHumphrey:feature-139-confusion-matrix into 45268fc on DistrictDataLabs:develop.

@NealHumphrey merged commit 1320a9a into DistrictDataLabs:develop Mar 23, 2017
@NealHumphrey removed the 'review' (PR is open) label Mar 23, 2017
@NealHumphrey (Contributor, author):

@bbengfort in this branch I used the 'Squash and merge' button. This does indeed do what you want - it squashes all the commits into one commit AND rebases it on top of the current yellowbrick/develop branch. The existing branch (i.e. the one in my fork) remains unchanged, so there's a slightly weird disconnect in the version history when I run git log locally, as seen in this screenshot from my GUI:
[Image: screenshot of the git GUI history]

If we'd done a regular merge (without squash or rebase), those two would be connected. People will just need to be aware of this, but it makes for a cleaner develop version history.

Note, this doesn't do what you said you wanted about fast-forwards - because the squash+rebase is applied only on the ddl/yellowbrick side, it's effectively doing a fast-forward merge.

Personally, going forward I'll probably squash and rebase my forked branch before completing the pull request, which makes the squash button on GitHub redundant.

@bbengfort (Member):

Alright, thanks for looking into this - that all sounds good!

@bbengfort (Member):

@NealHumphrey also, we can add the ConfusionMatrix to the list in the README and in the front page of the documentation if you want.
