[Feature suggestion] Add `shared_limits=True` option to PredictionError #263

Closed
ianozsvald opened this Issue Jun 20, 2017 · 9 comments

ianozsvald commented Jun 20, 2017

Depending on the quality of a regressor, the range of predicted values can differ from the range of measured values. The slope of the PredictionError scatter plot may then not be 45 degrees, yet the current plot will look as though there's a 45 degree relationship in the data. Here's an example:
image

An option for shared_limits=True might be sensible, maybe even default, so that relationships in the data are more apparent.

If I set the axes to share their limits then I get quite a different plot. Here it looks as though my higher measured values (e.g. >130) are consistently under-predicted; in fact my prediction range doesn't look brilliant at all!
image

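import matplotlib.pyplot as plt

# model is assumed to be a PredictionError visualizer that has already been fit
fig, ax = plt.subplots(figsize=(8, 6))
model.score(X_test, y_test)
# Force both axes onto the same range so a 1:1 relationship sits at 45 degrees
ax.set_xlim((70, 170))
ax.set_ylim((70, 170))
model.poof(ax=ax)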

bbengfort Jun 22, 2017

Member

@ianozsvald you're absolutely right, and thank you for this outstanding feature request. We'll add it to the backlog and get to it ASAP; though if you're interested in doing a PR for it, we'd be more than happy to accept it!

We generally have difficulty setting the range of the axes in many of our figures, precisely for the reasons you describe, and it's easy to rely on the auto axes. I propose the following to address your feature request; let me know what you think:

  • Add an argument to __init__, shared_limits=True
  • Set the axes x and y limits in PredictionError.finalize()
  • If shared_limits is True (the default), set both the xlim and ylim to (min(xlim[0], ylim[0]), max(xlim[1], ylim[1])), i.e. from the lowest limit to the highest limit on either axis.

This should ensure that the 45 degree line is, in fact, 45 degrees.
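
The limit computation proposed above can be sketched in a few lines. This only illustrates the arithmetic and is not Yellowbrick's implementation; `shared_limits` is a hypothetical helper name, and the example tuples stand in for what matplotlib's `ax.get_xlim()` and `ax.get_ylim()` would return.

```python
def shared_limits(xlim, ylim):
    """Return one (low, high) pair that spans both axis ranges."""
    low = min(xlim[0], ylim[0])
    high = max(xlim[1], ylim[1])
    return (low, high)


# Illustrative ranges (assumed values): measured values span 70-170,
# while the predictions only span 85-140.
xlim = (70.0, 170.0)
ylim = (85.0, 140.0)
limits = shared_limits(xlim, ylim)
print(limits)  # (70.0, 170.0)

# With a matplotlib Axes, the same pair would be applied to both axes:
#   ax.set_xlim(limits)
#   ax.set_ylim(limits)
```

Because both axes receive the same pair, equal data distances map to equal axis fractions, which is what lets a 1:1 relationship appear on the diagonal.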

ianozsvald Jun 22, 2017

Contributor

@bbengfort that looks lovely!

bbengfort Jun 22, 2017

Member

I even went so far as to set the aspect ratio to square:

shared_limits=False:

slfalse

shared_limits=True:

sltrue

Let me know what you think or if I should remove that.

TODOs:

  • write tests for shared_limits
  • create visual tests for PE plots
  • update the documentation
  • handle the +/- 1 bounds in draw
  • add r2 to title.
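
For the last item, a rough sketch of how an R2 value could be computed and placed in a title string. `r_squared` and the title format are assumptions made for illustration, not the code that ended up in Yellowbrick, and the data values are made up.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up measured and predicted values
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]

score = r_squared(y_true, y_pred)
title = "Prediction Error (R2 = {:.3f})".format(score)
print(title)  # Prediction Error (R2 = 0.800)

# With a matplotlib Axes the string would be applied via ax.set_title(title).
```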

ianozsvald Jun 22, 2017

Contributor

Square looks sensible. How about (perhaps) adding a greyed-out 45 degree line to give a visual reference for the 1:1 prediction ideal vs the black-dashed fitted line (which above is at approx. 50 degrees or so)?
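
To make the comparison between the 1:1 reference and the fitted line concrete: on a square plot with shared limits, a line with slope m is drawn at atan(m) degrees, so the identity line sits at 45 degrees and a steeper fit sits above it. The sketch below uses made-up data and hypothetical helper names (`best_fit_slope`, `apparent_angle_degrees`); it is not taken from Yellowbrick.

```python
import math


def best_fit_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def apparent_angle_degrees(slope):
    """Angle at which a line of this slope is drawn when both axes share
    the same limits and the plot area is square."""
    return math.degrees(math.atan(slope))


# Made-up data: predictions are 1.2 times the measured values
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.2, 2.4, 3.6, 4.8]

fit_angle = apparent_angle_degrees(best_fit_slope(measured, predicted))
identity_angle = apparent_angle_degrees(1.0)
print(round(fit_angle, 1), round(identity_angle, 1))  # roughly 50.2 and 45.0
```

Without shared limits and a square aspect these angles carry no meaning, since the drawn slope then depends on how each axis is scaled.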

bbengfort Jun 22, 2017

Member

Your wish, as they say:

slfalse

sltrue

ianozsvald Jun 22, 2017

Contributor

I'll enjoy giving this a go, thanks!

bbengfort Aug 11, 2017

Member

@ianozsvald these changes are now in version 0.5, so you should be able to update yellowbrick and take advantage of them.

@bbengfort bbengfort closed this Aug 11, 2017

ianozsvald Aug 12, 2017

Contributor

@bbengfort just to note: I've had master working with a client for a little while, and I've been using this change to help explain a regression model. Many thanks! https://github.com/ianozsvald/data_science_delivered/blob/master/ml_explain_regression_prediction.ipynb (see PredictionError against the truth)

bbengfort Aug 14, 2017

Member

Very cool - looks like there are a couple of potential yellowbrick features/visualizers we could add from that notebook! Please let us know if any stand out that you'd be interested in us pursuing.
