Detect continuous/discrete target vals for color util #73

Closed
bbengfort opened this issue Oct 8, 2016 · 7 comments
Labels: level: expert (deep knowledge of packages required), priority: high (should be done before next release), type: bug (something isn't working), type: feature (a new visualizer or utility for yb)


@bbengfort
Member

bbengfort commented Oct 8, 2016

Right now ParallelCoordinates and RadViz do not work for regression problems, because they only support class-based visualization.

Give these visualizers the option of handling a continuous target by using a sequential colormap and coloring the instance lines according to target value.

@bbengfort bbengfort added this to the Backlog milestone Oct 8, 2016
@bbengfort bbengfort added level: intermediate python coding expertise required priority: high should be done before next release type: bug something isn't working type: feature a new visualizer or utility for yb ready labels Oct 8, 2016
@bbengfort bbengfort modified the milestones: Version 0.3.2, Backlog Oct 13, 2016
@bbengfort bbengfort added in progress label for Waffle board and removed ready labels Jan 13, 2017
@bbengfort
Member Author

Now that we have the color_sequence function, we can create a binning method that assigns continuous-valued points to the number of bins (one per color) specified by the colormap.
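
A minimal sketch of such a binning method, assuming equal-width bins over the observed range of the target. The function name `bin_values` and the list of hex colors are hypothetical; the list only stands in for whatever colors the colormap provides and is not the output of color_sequence.

```python
def bin_values(values, n_bins):
    """Map each continuous value to a bin index in [0, n_bins - 1].

    Bins are equal-width over [min(values), max(values)]; the maximum
    value is clamped into the last bin.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant target has no range to split; use the first bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical sequential palette with four colors (light to dark).
colors = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]
y = [0.0, 2.5, 5.0, 7.5, 10.0]

# One color per instance, chosen by the bin its target value falls in.
instance_colors = [colors[i] for i in bin_values(y, len(colors))]
```

Equal-width bins keep the color scale linear in the target value; quantile bins would be an alternative if the target is heavily skewed.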

@bbengfort
Member Author

bbengfort commented Jan 14, 2017

Finding a solution to this has been tricky because discrete classes have been embedded into the logic of both RadViz and ParallelCoordinates. There are two primary features that continuous data requires:

  1. A colormap that will bin values into a consistent space.
  2. Instead of a label legend, a colorbar that shows the ranges of the instances.

To handle this for now, I'm going to add a principal argument to both visualizers called "target_type", whose value should be "auto", "discrete", "continuous", or None. A value of "discrete" or "continuous" will select the corresponding methodology directly. For "auto" or None, I will create two utility functions:

  • is_continuous(y)
  • is_discrete(y)

Each will accept a vector and use rules to decide whether the data is continuous or discrete. I'm not exactly sure how to implement this yet.
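
One possible set of rules, as a rough sketch only: treat non-numeric labels as discrete, treat any non-integer number as continuous, and treat integer-valued targets as discrete only when there are few distinct values. The `max_classes=12` cutoff and the `resolve_target_type` helper are assumptions made for illustration, not anything that exists in the code base.

```python
from numbers import Real


def is_discrete(y, max_classes=12):
    """Heuristic check for a discrete (class-like) target vector."""
    values = list(y)
    # Non-numeric entries (e.g. string labels) are treated as classes.
    if not all(isinstance(v, Real) for v in values):
        return True
    # Any fractional (or non-finite) value means the target is continuous.
    if not all(float(v).is_integer() for v in values):
        return False
    # Integer-valued targets count as discrete only with few distinct values.
    return len(set(values)) <= max_classes


def is_continuous(y, max_classes=12):
    """Complement of is_discrete under the same heuristic."""
    return not is_discrete(y, max_classes)


def resolve_target_type(y, target_type="auto"):
    """Resolve the target_type argument to 'discrete' or 'continuous'."""
    if target_type in ("discrete", "continuous"):
        return target_type
    if target_type in ("auto", None):
        return "discrete" if is_discrete(y) else "continuous"
    raise ValueError("unknown target_type: {!r}".format(target_type))
```

With this sketch, `resolve_target_type([1.5, 2.5], None)` falls through to the heuristic, while an explicit "discrete" or "continuous" is returned unchanged without inspecting the data.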

This should solve the issue for now, but we should revisit it when we develop RadViz and ParallelCoordinates with more features and functionality.

@bbengfort bbengfort self-assigned this Jan 14, 2017
@bbengfort
Member Author

Ok, this ticket has been extremely difficult to work on: the RadViz and ParallelCoordinates classes are entirely set up for class-based visualization only (thanks to their being pulled from matplotlib).

This needs to be looked at a bit more closely, and I stashed my changes (which touched nearly every module in yellowbrick). We'll have to push this back to another sprint.

@bbengfort bbengfort modified the milestones: Backlog, Version 0.3.2 Jan 17, 2017
@bbengfort bbengfort added level: expert deep knowledge of packages required and removed in progress label for Waffle board level: intermediate python coding expertise required labels Jan 17, 2017
@bbengfort
Member Author

@bbengfort bbengfort modified the milestones: PyCon Sprints, Backlog May 11, 2017
@rebeccabilbro rebeccabilbro changed the title Sequential ParallelCoordinates and RadViz Detect continuous/discrete target vals for color util Aug 16, 2017
@bbengfort
Member Author

Sklearn apparently does have a target type checker that returns continuous or a classification type as one of its outputs:

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/multiclass.py#L175

@bbengfort
Member Author

A very lightweight version of this has been implemented in #399

@bbengfort
Member Author

See #334 for the package where this should be placed.

@bbengfort bbengfort self-assigned this Dec 28, 2018
@bbengfort bbengfort added the review PR is open label Dec 28, 2018
bbengfort added a commit that referenced this issue Dec 31, 2018
Implements a helper function that returns continuous or discrete
depending on the type of target variable, `y`. This function is similar
to the functionality in the Manifold visualizer but makes use of
sklearn.utils.multiclass.type_of_target to make its determination, along
with a limit to the number of discrete colors that can be drawn.

Was undecided if this belonged in `yellowbrick.utils` or in
`yellowbrick.target` -- am open to discussion on this topic.

Fixes #73
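
For readers following the thread, a helper along the lines the commit message describes might look roughly like the sketch below. It is not the committed code: the function name `color_type_of_target`, the string return values, and the limit of 12 discrete colors are all assumptions. Only `sklearn.utils.multiclass.type_of_target` is a real scikit-learn call.

```python
from sklearn.utils.multiclass import type_of_target

# Assumed cap on how many distinct colors can be drawn legibly.
MAX_DISCRETE_CLASSES = 12


def color_type_of_target(y, max_classes=MAX_DISCRETE_CLASSES):
    """Return 'continuous', 'discrete', or 'unknown' for a target vector y."""
    ttype = type_of_target(y)
    if ttype.startswith("continuous"):
        return "continuous"
    if ttype == "binary":
        return "discrete"
    if ttype.startswith("multiclass"):
        # Too many classes to give each its own color: fall back to a
        # sequential colormap by treating the target as continuous.
        if len(set(y)) > max_classes:
            return "continuous"
        return "discrete"
    return "unknown"
```

The class-count limit matters because `type_of_target` reports integer-valued regression targets as multiclass, which would otherwise request one legend entry per distinct value.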
@bbengfort bbengfort removed the review PR is open label Dec 31, 2018