
Class Balance Heat Map Visualizer #321

Merged
merged 18 commits into develop Mar 12, 2018

Conversation

5 participants
@lwgray
Contributor

lwgray commented Mar 5, 2018

In this PR I've implemented a ClassBalanceHeatMap visualizer based on @bbengfort's prototype code. I was asked by @rebeccabilbro to take a crack at this implementation.

This visualizer is a heatmap implementation of the class balance visualizer. It is being added because the heatmap provides a way to quickly understand how good your classifier is at predicting the right classes.

Take, for instance, the sklearn digits data set (see below) with 10 classes. The heatmap displays data as a stacked bar chart. On the X axis are the different classes. The Y axis is a count of how many times your model predicted a certain class. For example, the second bar shows the total number of instances where a Class 1 prediction was assigned. Additionally, each bar is color coded to show the actual class of each predicted instance. In the case of the second bar, it shows three colors with the following counts: Class 1 (purple, 50), Class 8 (red, 3), Class 9 (grey, 1). This means that of the total number of "Class 1" predictions, about 92.6% were actually "Class 1", about 5.6% were "Class 8", and about 1.9% were "Class 9".
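The per-bar breakdown described above can be sketched as a small cross-tabulation (a toy numpy illustration with a hypothetical `prediction_breakdown` helper, not the visualizer's actual code):

```python
import numpy as np

def prediction_breakdown(y_true, y_pred, classes):
    """Count, for each predicted class (column), how many instances of
    each actual class (row) received that prediction."""
    index = {c: i for i, c in enumerate(classes)}
    counts = np.zeros((len(classes), len(classes)), dtype=int)
    for actual, predicted in zip(y_true, y_pred):
        counts[index[actual], index[predicted]] += 1
    return counts

# Toy data mirroring the second bar described above: 54 "Class 1"
# predictions, of which 50 are truly class 1, 3 are class 8, 1 is class 9.
y_pred = [1] * 54
y_true = [1] * 50 + [8] * 3 + [9] * 1
counts = prediction_breakdown(y_true, y_pred, classes=[1, 8, 9])
shares = counts[:, 0] / counts[:, 0].sum()  # approx [0.926, 0.056, 0.019]
```

Each column of `counts` corresponds to one stacked bar; the column segments are the per-actual-class counts that give the bar its colors.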

[image: heatmap]

HELP

Major Issue

If I build the visualizer in a python notebook, as seen here, it works.

However, if I write the visualizer into yellowbrick/classifier/class_balance_heat_map.py (in this pull request), I receive RuntimeError: Invalid DISPLAY variable. I can provide the traceback if required.

To Do

  • Graphics -
    Struck-through items are being left as user-defined parameters

    • Fix axis labels
    • Fix axis ticks
    • Draw box around legend
    • Get rid of grid lines in Figure
    • Set ylim of chart so that there exists space between bars and chart border
  • Tests

    • Test 1: Test that indices and classes are the same length. Here are two instances to test:
      • the case where more labels are supplied in our test data, via score, than in our training data;
      • if there are fewer labels in y and y_pred, that's fine - we just need to ensure there are zeros for the fit classes that do not have any associated values
  • Documentation

    • Create Example Notebook
    • Review docstrings - fix bad wording
    • class_prediction_error.py/rst doc files
  • Issues

    • Open issue regarding self.score_ and self.classes_ nomenclature
      throughout yellowbrick. see @bbengfort @ndanielsen comments below
    • Open up an issue to create the target module as per @rebeccabilbro
  • Misc

    • Correct RuntimeError
    • Perform _check_target on y and y_pred - raise YellowbrickValueError
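The last two to-do items hint at input validation. A minimal sketch of the kind of check intended (the names and exact behavior here are assumptions, not yellowbrick's actual _check_target):

```python
# Hypothetical sketch of the kind of validation the _check_target item
# refers to; names and behavior are assumptions, not yellowbrick's code.
import numpy as np

class YellowbrickValueError(ValueError):
    """Stand-in for yellowbrick.exceptions.YellowbrickValueError."""

def check_target(y_true, y_pred):
    """Raise if y_pred contains labels that never appear in y_true."""
    extra = set(np.unique(y_pred)) - set(np.unique(y_true))
    if extra:
        raise YellowbrickValueError(
            "y_pred contains labels not in y_true: {}".format(sorted(extra)))

check_target([0, 1, 2], [0, 1, 1])   # fewer predicted labels: fine
try:
    check_target([0, 1], [0, 1, 2])  # label 2 never appeared in y_true
    raised = False
except YellowbrickValueError:
    raised = True
```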

lwgray added some commits Mar 5, 2018

feat: If applied, this commit will add a heat map visualizer
This visualizer is a heatmap implementation of the class balance visualizer.
It is being added because the heatmap provides a way to quickly
understand how well your classifier predicts the right classes.

Take, for instance, a data set with three classes (I assigned colors):
("Good": green, "Better": blue, and "Best": black). The heatmap displays
data as a stacked bar chart. The X axis is the different classes. The
Y axis is a count of how many times your model predicted a class. For
example, bar 1 would be "Good". Bar 1 shows the total number of instances
where a "Good" prediction was assigned. Additionally, bar 1 also
shows how many were incorrectly labeled. Let's say that my model is
only OK at assigning labels; the visualizer could show
a multicolored bar: 70% green, 20% blue, and 10% black. This means that
of the total number of "Good" predictions, 70% were actually "Good", 20% were
"Better", and 10% were "Best".
Merge branch 'feature-ClassBalanceHeatMap' into develop
This merge is necessary because it creates a new class balance heatmap
visualizer. The visualizer allows a quick view of how well your
model is performing (properly labeling test cases). It presents
multicolored bars with colors depicting how much of each label has
been assigned per class. The hope is that each class consists
mainly of the correct label.

@bbengfort bbengfort added the review label Mar 5, 2018

@lwgray


Contributor

lwgray commented Mar 6, 2018

Side Note / RuntimeError Correction

I can correct the RuntimeError if I include %matplotlib inline in the Jupyter notebook:

import matplotlib
from yellowbrick.classifier import ClassBalanceHeatMap
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split as tts
from sklearn.ensemble import RandomForestClassifier
import matplotlib.pyplot as plt
%matplotlib inline

digits = load_digits()
X_train, X_test, y_train, y_test = tts(digits.data, digits.target, test_size=0.33, random_state=42)
visualizer = ClassBalanceHeatMap(RandomForestClassifier())
visualizer.fit(X_train, y_train)
visualizer.score(X_test, y_test)
visualizer.poof(outpath='heatmap.png')

[image: heatmap2]

@bbengfort

Member

bbengfort commented Mar 6, 2018

@lwgray in terms of the error you mentioned, this has to do with your operating system and the environment you're running matplotlib in. See RuntimeError: Invalid DISPLAY variable for more details. Are you running on Ubuntu or Windows? Your solution is the correct one in a Jupyter notebook; using %matplotlib inline or, better yet, %matplotlib notebook sets the matplotlib backend to the correct display mode. If you're running in a Python script and still having trouble, you can simply call savefig instead of show (or poof(outpath=path)), or you can use matplotlib.use('agg') to specify the backend and avoid the error.

I'm going to go ahead and do a brief code review of the current pull request -- you should see that momentarily!
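For a script run outside Jupyter, the backend fix described above can be sketched like this (a minimal example assuming matplotlib is installed; the figure contents and filename are arbitrary):

```python
# Minimal headless-script version of the advice above: select the
# non-interactive Agg backend *before* importing pyplot, then save the
# figure to disk instead of showing it.
import matplotlib
matplotlib.use("agg")  # must run before pyplot is imported

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar([0, 1, 2], [5, 3, 7])
fig.savefig("balance.png")  # analogous to poof(outpath="balance.png")
```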

@bbengfort

Member

bbengfort commented Mar 6, 2018

By the way - very good PR intro text and thanks for including an image!

@bbengfort

@lwgray this is a great start, thank you for contributing! I made a few comments to hopefully help direct you as you move forward with this visualizer. I am open to discussion on all of them and look forward to that discussion.

I think the biggest thing we need to determine is whether this is an independent visualizer or whether it should be part of the ClassBalance visualizer. I think names are important, and that will go a long way toward helping me form my opinion.

Right now I'm leaning toward making ClassBalance a new kind of visualizer - a target visualizer, a subclass of data visualizer, that can be fit to display information - and having this one become the ModelVisualizer version of the same. However, I could also be convinced that these should be in the same class.

@rebeccabilbro @ndanielsen @NealHumphrey @jkeung do you have any thoughts on this?

@rebeccabilbro

Collaborator

rebeccabilbro commented Mar 7, 2018

@bbengfort @ndanielsen @NealHumphrey @jkeung @lwgray - my sense is that both ClassBalance and ClassBalanceHeatMap should be DataVisualizers (subclass of Visualizer), housed in a module separate from ModelVisualizers and FeatureVisualizers, focused purely on target visualization.

@ndanielsen

Looks great! Just needs some tests, and it would be great if you also pushed your notebook with an example of its use -- it would help with writing the docs for this.

@lwgray

Contributor

lwgray commented Mar 7, 2018

@bbengfort Thanks for reviewing so quickly; there's no need to thank me - you did all the legwork. I will try my best to address your comments promptly.

@lwgray

Contributor

lwgray commented Mar 7, 2018

I finished all the corrections except one... I was wondering whether a consensus was reached on how to treat this visualizer on the whole: as an independent visualizer? as part of ClassBalance? etc.

@bbengfort

I think you're doing a lot of thoughtful work that goes well beyond the prototype, and I definitely appreciate that!

@bbengfort

Member

bbengfort commented Mar 8, 2018

@lwgray let's go with a new visualizer called ClassPredictionError but in the class_balance.py module. Then let's open up an issue to create the target module as per @rebeccabilbro -- I think that's beyond the scope of this PR.

@NealHumphrey

Contributor

NealHumphrey commented Mar 8, 2018

@bbengfort and @rebeccabilbro @lwgray @ndanielsen @jkeung

A brief but related digression - one thing that has always bothered me about the plain ClassBalance visualizer is that it uses precision_recall_fscore_support and then only uses the .support portion of that to get the class balance. Technically, to just draw the class balance all you need to do is bin the y_actual values into their classes - no fitted model needed - and in fact the same class balance would apply across models. My vote would be to rewrite the regular ClassBalance visualizer to be based purely on the data itself, which also opens it up to being used on training data, not just test data.

In any case, the ClassPredictionError visualizer (i.e. this new one by @lwgray) clearly needs a fitted model, so I agree with your suggestion to keep it as a new visualizer in class_balance.py.
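The binning point above can be shown in a few lines (a sketch of the idea, not the proposed ClassBalance rewrite):

```python
# Sketch of the binning idea: class balance from the target vector alone,
# no fitted model required (illustration, not the proposed rewrite).
from collections import Counter

def class_balance(y):
    """Return a {class: count} mapping for a target vector."""
    return dict(Counter(y))

y_train = ["good", "better", "good", "best", "good", "better"]
balance = class_balance(y_train)  # works on training or test targets alike
```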

@bbengfort

Member

bbengfort commented Mar 8, 2018

@NealHumphrey yeah, I agree and so does @rebeccabilbro -- I think we'll create a target module to go along with the features module to do this.

For the current class balance, there is a score version that shows the balance of y_true vs the balance of y_pred, though it would need to be a CV scoring method, not a single split method:

https://gist.github.com/bbengfort/bd524672aff751f4340be58833f256ec

Which produces:

[image: cb_preds]

@lwgray check out the above gist, it has some of the _check_target and validation stuff I mentioned in earlier comments (though it is not tested prototype code and I've noted a few places where some assumptions are made).

lwgray added a commit to lwgray/yellowbrick that referenced this pull request Mar 9, 2018

The purpose of this commit is to address several review statements from PR DistrictDataLabs#321

    1. This commit changes the visualizer class name from ClassBalanceHeatMap to ClassPredictionError
    2. ClassPredictionError now resides as its own visualizer in the class_balance.py module
    3. I've suffixed estimator attributes, such as self.classes_, with an underscore
    4. I've used the _check_target func to check that y_true and y_pred belong to the same classification task

lwgray added a commit to lwgray/yellowbrick that referenced this pull request Mar 9, 2018

Merge branch 'feature-ClassBalanceHeatMap' into develop
The purpose of this commit is to address several review statements from PR DistrictDataLabs#321
    1. This commit changes the visualizer class name from ClassBalanceHeatMap to ClassPredictionError
    2. ClassPredictionError now resides as its own visualizer in the class_balance.py module
    3. I've suffixed estimator attributes, such as self.classes_, with an underscore
    4. I've used the _check_target func to check that y_true and y_pred belong to the same classification task
@lwgray


Contributor

lwgray commented Mar 9, 2018

@bbengfort Thank you for all the attention you have given this PR.
When you review this most recent push, could you pay particular attention to the lines of code in class_balance.py where I compare the lengths of indices and classes and raise an error if they don't match (lines 239-241).

@bbengfort

This is really coming together and looks great, thank you so much for the hard work! If you'd like, you could just put together a test file with some test stubs describing what tests you think are necessary and decorating them with @pytest.mark.skip(reason='not implemented yet') -- and that would be sufficient to complete this PR. I can then go over testing with you at the next meeting and we can do another PR to actually write the tests.
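A stub file along those lines might look like the following (a sketch; the real file would subclass VisualTestCase and live under tests/, and the stub names here are illustrative):

```python
# Sketch of a stub-only test file as suggested above: each planned test
# is declared but skipped, so the suite documents intent without failing.
import pytest

class ClassPredictionErrorTests(object):

    @pytest.mark.skip(reason="not implemented yet")
    def test_classes_and_indices_same_length(self):
        """Score raises when classes and indices differ in length."""

    @pytest.mark.skip(reason="not implemented yet")
    def test_integration_class_prediction_error(self):
        """Produced image is similar to the baseline image."""
```

Running pytest on such a file reports the stubs as skipped rather than failed.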

@bbengfort

Member

bbengfort commented Mar 10, 2018

@lwgray tests have been fixed - if you hit the "update branch" button and then do a pull your local branch should have the tests fixed as well.

bbengfort and others added some commits Mar 4, 2018

FeatureImportances Visualizer (#317)
* Implements FeatureImportances visualizer, tests, and documentation 
* Plots a bar graph of `feature_importances_` or `coef_` attributes 
* Supports relative and absolute feature importances
The purpose of this commit is to address several review statements from PR #321

    1. This commit changes the visualizer class name from ClassBalanceHeatMap to ClassPredictionError
    2. ClassPredictionError now resides as its own visualizer in the class_balance.py module
    3. I've suffixed estimator attributes, such as self.classes_, with an underscore
    4. I've used the _check_target func to check that y_true and y_pred belong to the same classification task
Change ClassPredictionError Chart aesthetics
There was a lack of space between the tallest bar and the upper
edge of the chart.  I increased the y-lim relative to the tallest
bar.

Additionally, hard-coded plot attributes were removed so that
they will be user-defined instead.  For instance, the grid parameter
was removed and is now at its default setting of True.
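The y-limit padding the commit message describes can be sketched as follows (illustrative only; the fraction and helper name are assumptions, and the committed code may compute this differently):

```python
# Sketch of padding the y-limit relative to the tallest bar so there is
# space between the bars and the chart border.
def padded_ylim(bar_heights, pad=0.1):
    """Upper y-limit leaving `pad` fractional headroom above the tallest bar."""
    return max(bar_heights) * (1.0 + pad)

counts = [12, 30, 21]        # heights of the stacked bars
upper = padded_ylim(counts)  # then e.g. ax.set_ylim(0, upper)
```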
Squashed commit of the following:
commit 537fb452734c357d5a4e4587f80dc2649b73bd9f
Author: lwgray <lwgray@gmail.com>
Date:   Sat Mar 10 16:47:41 2018 -0500

    The purpose of this commit is to create a testing template with test stubs.
    The second purpose is to fix a misspelling in class_balance.py.

    The testing template was created in the file test_class_prediction_error.py.
    The file contains suggested tests but is set up to be skipped by pytest.

    "NotImplementedError" was misspelled, which prevented that error from being raised.

lwgray added some commits Mar 11, 2018

Squashed commit of the following:
commit 9900ea65ccdf16d8829e6105f485e73f3a39e2a3
Author: lwgray <lwgray@gmail.com>
Date:   Sat Mar 10 19:16:42 2018 -0500

    Alter parameter names for new colors: resolve_color function

commit bdf8f9e64b8204f61211780038e08db47d97d023
Merge: 537fb45 a6e5045
Author: lwgray <lwgray@gmail.com>
Date:   Sat Mar 10 19:10:26 2018 -0500

    Merge branch 'develop' into feature-ClassBalanceHeatMap

    Bring branch up to date with the latest ddl/develop changes

commit 537fb452734c357d5a4e4587f80dc2649b73bd9f
Author: lwgray <lwgray@gmail.com>
Date:   Sat Mar 10 16:47:41 2018 -0500

    The purpose of this commit is to create a testing template with test stubs.
    The second purpose is to fix a misspelling in class_balance.py.

    The testing template was created in the file test_class_prediction_error.py.
    The file contains suggested tests but is set up to be skipped by pytest.

    "NotImplementedError" was misspelled, which prevented that error from being raised.
@lwgray

Contributor

lwgray commented Mar 11, 2018

@bbengfort

  1. I created the test stubs that you requested and placed them in tests/test_classifiers/test_class_prediction_error.py
  2. I updated the code to comply with your changes to the resolve_colors function.
  3. Created an example notebook under /examples/lwgray
  4. Travis is still not working... I updated the branch, then pulled to my local branch
@bbengfort

Looks good; one quick question - why did you delete the rocauc baseline images? That might also be contributing to tests failing.

class ClassPredictionErrorTests(VisualTestCase):

    @pytest.mark.skip(reason="not implemented yet")
    def test_class_report(self):


@bbengfort

bbengfort Mar 11, 2018

Member

test_class_prediction_error?


@lwgray

lwgray Mar 12, 2018

Contributor

@bbengfort I am not quite sure what you are asking here... 😕


@lwgray


Contributor

lwgray commented Mar 12, 2018

@bbengfort
I must have deleted the rocauc baseline images by accident. In all honesty, I am not quite sure what you are referring to

lwgray added some commits Mar 12, 2018

1. Correct docstring in ClassPredictionError test function. 2. Remove extraneous imports in ClassPredictionError test file. 3. Change initial docstring in test_class_prediction_error to properly reflect the class that the tests reference.
@lwgray

Contributor

lwgray commented Mar 12, 2018

@bbengfort All the tests have passed now. The fix was simple; I only had to correct a misspelling in a pytest decorator.

@bbengfort

The deleted changes must have been when you pulled from what @ndanielsen was doing. Don't worry about it. I've approved the PR and pulled it into develop, thanks again for all your hard work on this!

@bbengfort bbengfort merged commit 1804ce7 into DistrictDataLabs:develop Mar 12, 2018

3 checks passed

continuous-integration/travis-ci/pr - The Travis CI build passed
coverage/coveralls - Coverage decreased (-1.09%) to 80.946%
lgtm analysis: Python - No alert changes

@bbengfort bbengfort removed the review label Mar 12, 2018

@lwgray

Contributor

lwgray commented Mar 12, 2018

@bbengfort Thanks for letting me help out.
Also, I don't want you all to forget about this issue you wanted created. I would open it myself, but I don't know enough to write it up.

ISSUE:

"Then let's open up an issue to create the target module as per @rebeccabilbro "

@bbengfort

Member

bbengfort commented Mar 13, 2018

@lwgray thanks! see #334

lwgray added a commit to lwgray/yellowbrick that referenced this pull request Mar 16, 2018

If applied, this commit adds potential tests for ClassPredictionError DistrictDataLabs#321

The following tests primarily check for image similarity and raised errors.

Test name: test_class_type
Not all data types are appropriate for this classifier.
Our classifier handles both multiclass and binary-classed data.
We test this fact by supplying our classifier with multilabel data,
which should raise a YellowbrickValueError.

Test names: test_classes_less_than_indices & test_classes_greater_than_indices
There can be a mismatch between the number of classes and the number of indices in the data.
There are two sides to the mismatch. The first is when there are more
classes than indices; the second is the opposite, when there are fewer classes than indices.

    * Let's look at when there are more classes than indices
        * This can occur in two different ways
            * the first is when a user specifies fewer classes than indices.
            * the second is when y and y_pred contain zero values for one of the specified classes
    * Now let's look at when there are fewer classes than indices
        * This occurs when the user is trying to filter classes

Test names: test_integration_class_prediction_error & test_class_prediction_error_quickmethod
Simply put, the visualizer must complete without error
    * Assert produced images are similar to baseline images
Again, simply put, the visualizer's quick method must work without error
    * Assert produced images are similar to baseline images
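The test_class_type idea above can be sketched as follows (the error class and the 1-D check are illustrative assumptions, not yellowbrick's actual validation):

```python
# Sketch of test_class_type: a multilabel (2-D indicator) target should
# be rejected, while 1-D binary/multiclass targets are accepted.
import numpy as np

class YellowbrickValueError(ValueError):
    """Stand-in for yellowbrick.exceptions.YellowbrickValueError."""

def validate_target(y):
    """Accept 1-D binary/multiclass targets; reject 2-D multilabel ones."""
    if np.asarray(y).ndim != 1:
        raise YellowbrickValueError("multilabel targets are not supported")

validate_target([0, 1, 2, 1])          # binary/multiclass: OK
try:
    validate_target([[1, 0], [0, 1]])  # multilabel indicator matrix
    multilabel_rejected = False
except YellowbrickValueError:
    multilabel_rejected = True
```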

@lwgray lwgray referenced this pull request Mar 16, 2018

Merged

ClassPredictionError Tests #342


lwgray added a commit to lwgray/yellowbrick that referenced this pull request Mar 17, 2018

If applied, this commit creates docs for the ClassPredictionError visualizer
This commit is related to DistrictDataLabs#321 and DistrictDataLabs#342

ClassPredictionError is a new visualizer and lacks documentation

This commit creates both the python file and the rst file demonstrating
how to create a ClassPredictionError visualizer. An image was also
produced from this demonstration, for its inclusion in the rst file.

bbengfort added a commit that referenced this pull request Mar 18, 2018

Creates docs for ClassPredictionError Visualizer (#347)
This commit is related to #321 and #342

ClassPredictionError is a new visualizer and lacks documentation

This commit creates both the python file and the rst file demonstrating
how to create a ClassPredictionError visualizer. An image was also
produced from this demonstration, for its inclusion in the rst file.

bbengfort added a commit that referenced this pull request Mar 18, 2018

Adds tests for ClassPredictionError (#342)
Extends the new visualizer, ClassPredictionError #321, with tests.

In the PR you will find a detailed outline of the tests that I have already created. Writing tests is not my strong suit. For the following group of tests, I mainly wrote tests for image similarity and asserted raised errors.

lwgray added a commit to lwgray/yellowbrick that referenced this pull request Mar 19, 2018

The purpose of this commit is to address several review statements from PR DistrictDataLabs#321

    1. This commit changes the visualizer class name from ClassBalanceHeatMap to ClassPredictionError
    2. ClassPredictionError now resides as its own visualizer in the class_balance.py module
    3. I've suffixed estimator attributes, such as self.classes_, with an underscore
    4. I've used the _check_target func to check that y_true and y_pred belong to the same classification task

lwgray added a commit to lwgray/yellowbrick that referenced this pull request Mar 19, 2018

If applied, this commit adds potential tests for ClassPredictionError DistrictDataLabs#321

The following tests primarily check for image similarity and raised errors.

Test name: test_class_type
Not all data types are appropriate for this classifier.
Our classifier handles both multiclass and binary-classed data.
We test this fact by supplying our classifier with multilabel data,
which should raise a YellowbrickValueError.

Test names: test_classes_less_than_indices & test_classes_greater_than_indices
There can be a mismatch between the number of classes and the number of indices in the data.
There are two sides to the mismatch. The first is when there are more
classes than indices; the second is the opposite, when there are fewer classes than indices.

    * Let's look at when there are more classes than indices
        * This can occur in two different ways
            * the first is when a user specifies fewer classes than indices.
            * the second is when y and y_pred contain zero values for one of the specified classes
    * Now let's look at when there are fewer classes than indices
        * This occurs when the user is trying to filter classes

Test names: test_integration_class_prediction_error & test_class_prediction_error_quickmethod
Simply put, the visualizer must complete without error
    * Assert produced images are similar to baseline images
Again, simply put, the visualizer's quick method must work without error
    * Assert produced images are similar to baseline images