
Addressing issue #590: Adding train alpha and test alpha to residual #806

Merged
merged 7 commits into DistrictDataLabs:develop on Apr 8, 2019

Conversation

@naresh-bachwani (Contributor) commented Apr 5, 2019

The following changes, referenced in #590, were made to the Residuals visualizer:

  1. Added new parameters train_alpha and test_alpha to residuals.py to specify the opacity of the train and test data respectively
  2. Modified test_residuals.py accordingly

The result obtained is shown below:
[Figure 1: residuals plot with separate train and test opacities]

-    alpha : float, default: 0.75
-        Specify a transparency where 1 is completely opaque and 0 is completely
-        transparent. This property makes densely clustered points more visible.
+    train_alpha : float, default: 1
Contributor:

Any particular reason you didn't keep the alphas set to the default of 0.75?
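For context, the reviewers' suggestion is that both new parameters share the old single-alpha default of 0.75. A minimal hypothetical sketch of that idea (the helper name and return shape are illustrative, not the actual Yellowbrick signature):

```python
def residuals_style(train_color='b', test_color='g',
                    train_alpha=0.75, test_alpha=0.75):
    """Bundle per-split styling, with both alphas defaulting to 0.75
    as the review suggests (hypothetical helper, not Yellowbrick API)."""
    return {
        "train": {"color": train_color, "alpha": train_alpha},
        "test": {"color": test_color, "alpha": test_alpha},
    }

style = residuals_style()
print(style["train"]["alpha"])  # 0.75
print(style["test"]["alpha"])   # 0.75
```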

Contributor Author:

No particular reason; it was more of a personal choice. It can be set per the Yellowbrick standards.

@lwgray (Contributor) left a comment:

I have gone through residuals.py and made a few comments, but overall you did a very good job. I will look over the test file and get back to you as soon as possible.

     visualizer = ResidualsPlot(
         model=model, ax=ax, hist=hist, train_color=train_color,
-        test_color=test_color, line_color=line_color, alpha=alpha,
+        test_color=test_color, line_color=line_color, train_alpha=train_alpha, test_alpha=test_alpha,
Contributor:

The third line is a bit long; you could shorten it like this:

 visualizer = ResidualsPlot(
     model=model, ax=ax, hist=hist, train_color=train_color,
     test_color=test_color, line_color=line_color,
     train_alpha=train_alpha, test_alpha=test_alpha,
     **kwargs
 )

-    test_color='g', line_color=LINE_COLOR, alpha=0.75,
-    **kwargs):
+    test_color='g', line_color=LINE_COLOR, train_alpha=1,
+    test_alpha=1, **kwargs):
Contributor:

Again here, we might want to set the alphas to 0.75.

@@ -422,7 +429,10 @@ def __init__(self, model, ax=None, hist=True, train_color='b',
# Store labels and colors for the legend ordered by call
self._labels, self._colors = [], []

-        self.alpha = alpha
+        self.alpha = {
Contributor:

This is a minor suggestion, and if you don't want to change it, that's OK. Maybe we could rename the variable to self.alphas to indicate that it contains multiple alphas.
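The rename being suggested could look like this; a minimal hypothetical sketch of the dict-per-split pattern (class and method names are illustrative, not the actual residuals.py code):

```python
class ResidualsSketch:
    """Stores one opacity per data split under a single ``self.alphas``
    attribute, as the review suggests (hypothetical class)."""

    def __init__(self, train_alpha=0.75, test_alpha=0.75):
        # One dict instead of two attributes; keys match the split names.
        self.alphas = {"train": train_alpha, "test": test_alpha}

    def alpha_for(self, train=True):
        """Pick the opacity for the split currently being drawn."""
        return self.alphas["train" if train else "test"]

viz = ResidualsSketch(train_alpha=0.3)
print(viz.alpha_for(train=True))   # 0.3
print(viz.alpha_for(train=False))  # 0.75
```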

@@ -593,7 +605,8 @@ def residuals_plot(model,
test_color='g',
line_color=LINE_COLOR,
random_state=None,
-    alpha=0.75,
+    train_alpha=1,
Contributor:

train_alpha and test_alpha should default to 0.75.

@lwgray (Contributor) commented Apr 6, 2019

How did you produce the images above? I noticed that they are missing legends.

@lwgray (Contributor) commented Apr 6, 2019

A side note for future contributions: you merged develop into develop, while our branching convention is to merge from a feature branch. See https://www.scikit-yb.org/en/develop/contributing/getting_started.html#branching-convention

@naresh-bachwani (Contributor Author):

While producing the above results, I did not call poof(). The legends will show up once poof() is called.

@naresh-bachwani (Contributor Author):

I have made all the recommended changes.

@lwgray (Contributor) left a comment:

You have done an excellent job with this PR. We still need to address why the tolerance was changed. There will be a secondary review by @rebeccabilbro.

tests/test_regressor/test_residuals.py (outdated, resolved)
)

alpha = {
Contributor:
Great job adding this here.

Contributor Author:
Thanks!

@@ -369,4 +372,4 @@ def test_alpha_param(self, mock_sca):
# Test that alpha was passed to internal matplotlib scatterplot
_, scatter_kwargs = visualizer.ax.scatter.call_args
assert "alpha" in scatter_kwargs
-    assert scatter_kwargs["alpha"] == 0.3
+    assert scatter_kwargs["alpha"] == 0.75
Contributor:

Awesome; this shows a good understanding that alpha is 0.3 after visualizer.fit, then changes to 0.75 after visualizer.score.
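The test pattern being praised here can be sketched with the standard library's mock; a simplified stand-in for the actual test_residuals.py (the `draw` helper is hypothetical):

```python
from unittest.mock import MagicMock

def draw(ax, x, y, alpha):
    """Hypothetical draw step: forward the split's alpha to scatter."""
    ax.scatter(x, y, alpha=alpha)

# A stand-in axes object whose scatter() records how it was called.
ax = MagicMock()
draw(ax, [1, 2], [0.1, -0.2], alpha=0.75)

# Recover the keyword arguments the mocked scatter received,
# the same call_args pattern the real test uses.
_, scatter_kwargs = ax.scatter.call_args
assert "alpha" in scatter_kwargs
assert scatter_kwargs["alpha"] == 0.75
```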

Contributor Author:

Thanks!

@rebeccabilbro (Member) left a comment:

Thanks for the contribution @naresh-bachwani — I think these adjustments to the residuals plot will make it much easier to distinguish the training and test points independently, particularly when they are very densely clustered! I've added a note to the question about the test tolerances and had another minor request. Once those edits are in, this should be good to merge!

tests/test_regressor/test_residuals.py (outdated, resolved)
yellowbrick/regressor/residuals.py (outdated, resolved)
@lwgray (Contributor) commented Apr 8, 2019

@naresh-bachwani I am getting ready to merge your PR. In doing so, I've updated your branch with recent changes from yellowbrick develop. You'll need to pull in order to incorporate these changes locally before you push anything else to this branch.

@lwgray lwgray merged commit e4f9689 into DistrictDataLabs:develop Apr 8, 2019
@lwgray lwgray removed the ready label Apr 8, 2019
4 participants