-
-
Notifications
You must be signed in to change notification settings - Fork 554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Q-Q plot to the ResidualsPlot (#853) #1042
Conversation
Adds functionality to plot Q-Q plot side by side with residuals plot and verifies that it works as expected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for submitting this PR. I had a question about your use of the probplot
from scipy and whether it is valid for creating a Q-Q plot. Can you give your reason as to why you choose probplot
@VladSkripniuk For each visualizer there is a "Quick Method". It allows for the Visualizer to be called on a single line. Since you modified the ResidualsPlot Class you will need to alter the quick method also. The quickmethod begins as Line 766 Tasks
|
@VladSkripniuk Another, note... Your branch is out-of-date with the base branch... Please Update this branch before pushing any changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VladSkripniuk Great job on this PR. There are just a couple of things that need to be done before I can test and merge this PR.
- Update Quick method - See comments on main PR page
- Update Documentation to showcase Q-Q Plot here
@lwgray thank you for your help with this PR! |
@@ -719,6 +776,7 @@ def residuals_plot( | |||
y_test=None, | |||
ax=None, | |||
hist=True, | |||
qqplot=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this addition!
@@ -772,6 +830,12 @@ def residuals_plot( | |||
If set to 'density', the probability density function will be plotted. | |||
If set to True or 'frequency' then the frequency will be plotted. | |||
|
|||
qqplot : {True, False}, default: False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this here. We do our best to keep our docstring up-to-date because this is how our docs are generated.
@@ -52,6 +52,19 @@ Note that if the histogram is not desired, it can be turned off with the ``hist= | |||
|
|||
.. warning:: The histogram on the residuals plot requires matplotlib 2.0.2 or greater. If you are using an earlier version of matplotlib, simply set the ``hist=False`` flag so that the histogram is not drawn. | |||
|
|||
Histogram can be replaced with a Q-Q plot, which is a common way to check that residuals are normally distributed. If the residuals are normally distributed, then their quantiles when plotted against quantiles of normal distribution should form a straight line. The example below shows, how Q-Q plot can be drawn with a ``qqplot=True`` flag. Notice that ``hist`` has to be set to ``False`` in this case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great explanation. Awesome on notifying the user that hist has to be set to False
We fixed the test failures.... i will do a final review and if test past then I will merge this PR. I have updated your branch so do a pull before you make anymore commits |
@bbengfort or @rebeccabilbro Travis has Image Comparison failures for the Q-Q plot related tests although they are passing locally. I had a couple of questions:
python -m tests.images -C tests/test_regressor/test_residuals.py |
|
||
visualizer.fit(self.data.X.train, self.data.y.train) | ||
visualizer.score(self.data.X.test, self.data.y.test) | ||
visualizer.finalize() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lwgray I recommend removing this call to finalize and seeing if that helps fix the tests on Travis - there is a known problem that the image similarity tests aren't removing text from the alternate axes (as you can see in the baseline image) - and text is one of the primary reasons we have a divergence between AppVeyor and Travis.
Were the baseline images created on a Windows machine or Mac/Linux? That might also contribute to this; we have an xfail for windows or conda, but if the image were generated on Windows than it would fail anyway on Travis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VladSkripniuk can you remove the call to finalize as described in the above comment from @bbengfort ?
I updated your branch so make sure you pull before you commit and new changes.
Also, did you create the baseline images on Windows or Mac?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @VladSkripniuk your baseline image is out-of-date. We have since merged fixes issues we were having with matplotlib, so you will need to recreate your baseline images. You can perform the following commands to reset your baseline images.
python -m tests.images -C tests/test_regressor/test_residuals.py
pytest -vs tests/test_regressor/test_residuals.py
python -m tests.images tests/test_regressor/test_residuals.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lwgray I executed the commands you mentioned:
python -m tests.images -C tests/test_regressor/test_residuals.py
pytest -vs tests/test_regressor/test_residuals.py
python -m tests.images tests/test_regressor/test_residuals.py
and ran python -m pytest tests/test_regressor/test_residuals.py
and tests passed successfully on my machine. I upload the updated version of tests/baseline_images/test_regressor/test_residuals/test_residuals_plot_QQ_plot.png
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VladSkripniuk great call on upgrading mpl to 3.2.1 - I forgot to tell you to do that.
…als_plot_QQ_plot.png
@VladSkripniuk Thank you for your patience and your contribution. I will start working on your other PRs. |
This PR fixes #853 which requested a feature to allow the user to create Q-Q plot beside the residual plot as a mean of normality checking of the residuals.
I have made the following changes:
scipy.stats.probplot
Sample Code and Plot
CHECKLIST
pytest
?make html
?