Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend DataVisualizer and Update DataVisualizer Subclasses #927

Merged
merged 13 commits into from
Jul 23, 2019
Merged

Extend DataVisualizer and Update DataVisualizer Subclasses #927

merged 13 commits into from
Jul 23, 2019

Conversation

bbengfort
Copy link
Member

@bbengfort bbengfort commented Jul 17, 2019

This PR is primarily to fix #923 - which I thought would be a simple and short update to get @naresh-bachwani moving forward on his ProjectionVisualizer project. Nothing is simple, however, and the cleanup caused me to have to extend functionality and one thing led to another so here we are.

Please see my inline comments in the PR for more detail, but broadly I have made the following changes:

  1. I have extended the DataVisualizer with better color handling capability
  2. I have updated ParallelCoordinates and RadViz to make use of this functionality
  3. I have cleaned up some other areas in the code that were affected by the changes.
  4. Fixed technical debt created by Detect continuous/discrete target vals for color util #73

These changes pave the way to make ParallelCoordinates and RadViz more general data visualizers for different target types (radviz for regressors as discussed in #73!), and could be the start for solutions such as those requested in #677, #476, #556

CHECKLIST

  • Is the commit message formatted correctly?
  • Have you noted the new functionality/bugfix in the release notes of the next release?
  • Included a sample plot to visually illustrate your changes?
  • Do all of your functions and methods have docstrings?
  • Have you added/updated unit tests where appropriate?
  • Have you updated the baseline images if necessary?
  • Have you run the unit tests using pytest?
  • Is your code style correct (are you using PEP8, pyflakes)?
  • Have you documented your new feature/functionality in the docs?
  • Have you built the docs using make html?

The class labels that define the discrete values in the target. Only
available if the target type is discrete. This is guaranteed to be
strings even if the classes are a different type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit worried about consistency in the docstring of ParallelCoordinates, RadViz, and DataVisualizer ... then eventually in PCA and Manifold as well. Any advice would be welcome!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question... perhaps we should open up a follow-on task. @Kautumn06 — thoughts? You're always so good at helping with consistency in our communications across similar visualizers!

if color==None and colormap==None:
colormap=palettes.DEFAULT_SEQUENCE
if colors is None and colormap is None:
colormap = palettes.DEFAULT_SEQUENCE
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I did look into moving this to resolve_colors -- right now if both are None, resolve_colors just returns the default color cycle. We do still need to do something better than this though.

CONTINUOUS = 'continuous'
UNKNOWN = "unknown"

def __eq__(self, other):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Required so you can do TargetType.AUTO == 'auto'

Copy link
Member

@rebeccabilbro rebeccabilbro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whew! This is a big contribution @bbengfort that not only clears a lot of debt but also clears a really nice path forward for simplifying existing visualizers that perform similar computations and transformations, and for creating new visualizers that can leverage nice, well-documented base classes. I particularly like the new get_colors method!

I've noted a few minor things, suggestions and questions (for instance, do you think we should update the coloration FAQ answer?) — let me know what you think and thanks again for taking on such a big effort with such aplomb!

tests/test_contrib/test_scatter.py Show resolved Hide resolved
yellowbrick/contrib/scatter.py Show resolved Hide resolved
yellowbrick/contrib/scatter.py Show resolved Hide resolved
tests/test_features/test_base.py Show resolved Hide resolved
tests/test_features/test_base.py Show resolved Hide resolved
yellowbrick/features/pcoords.py Show resolved Hide resolved
yellowbrick/features/pcoords.py Show resolved Hide resolved
yellowbrick/features/pcoords.py Show resolved Hide resolved
yellowbrick/features/projection.py Show resolved Hide resolved
yellowbrick/features/radviz.py Show resolved Hide resolved
@rebeccabilbro
Copy link
Member

Have just been reading through old issues — wondering if this (or perhaps one of the upcoming Manifold PRs that you and @naresh-bachwani are working on) will also close #437?

visualizer = ParallelCoordinates(
classes=classes, features=features, colormap="PRGn"
visualizer = Manifold(
manifold="isomap", target="continuous", colormap="YlOrRd"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@naresh-bachwani let's make sure this is true in #930!

@bbengfort
Copy link
Member Author

@rebeccabilbro as requested, I've updated the FAQ.

@rebeccabilbro
Copy link
Member

Thank you @bbengfort! The updated FAQ looks terrific — I just pushed up a tiny typo correction, so will get this merged in as soon as the tests finish running.

@rebeccabilbro rebeccabilbro merged commit b10fc17 into DistrictDataLabs:develop Jul 23, 2019
@bbengfort bbengfort deleted the finalize-datavisualizer branch July 31, 2019 10:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove duplicate code from sub classes of DataVisualizer.
2 participants