Conda packages and image test-driven text removal for Miniconda on Appveyor for #744 and #690 #823
Conversation
@nickpowersys thank you so much for taking a look at this and diving in with the update! We don't have very many Windows/Anaconda users on the core contributor side, so we struggle with supporting it and really appreciate your PR. Also, I apologize if I end up asking basic questions for clarification as a result. To summarize the solution to make sure we're on the same page - if using miniconda (identified by the

It looks like the tests are now failing due to increased RMSE in the conda environment. The failing tests all appear to have an elevated tolerance if

To answer your questions:
Thank you again for taking a look into this - it's a relief that we're moving forward and may be able to test miniconda on AppVeyor again!
@bbengfort Thank you for your feedback! I'm glad this is useful. About the conda environment, the imports and DLLs, and using pip for non-conda VMs: yes. I would be happy to look at any aliasing that may be occurring; it may take me until next weekend to investigate. It is possible to determine within Python whether a regular Python or an Anaconda Python is being used. I can reply with more detail after ensuring the method is reliable.
I tried uncommenting several settings for antialiasing in matplotlibrc, which did not result in any change in the number of tests failing (16) on a Windows machine with a conda environment. I looked at a pair of images (test_binary_probability.png) and could not see aliasing. It is possible to detect from within Python whether the active interpreter is in a conda environment, based on whether a directory specific to conda installations ('conda-meta') is part of the active environment path. Using an environment variable for the backend seems straightforward to apply for testing purposes.
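The conda-environment check described above can be sketched as a small helper. The function name here is hypothetical; the 'conda-meta' directory test is the one referenced in this thread:

```python
import os
import sys

def in_conda_env():
    """Return True when the active interpreter lives in a conda environment.

    Conda installations ship a 'conda-meta' directory inside the
    environment prefix; a plain CPython or venv install does not.
    """
    return os.path.isdir(os.path.join(sys.prefix, "conda-meta"))
```

This works regardless of platform, so the same check can drive both the Windows and Linux conda cases discussed later in the thread.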
@bbengfort Would you recommend an approach of xfails for the failing tests in conda environments for now? For this PR, I would resolve those that are failing on Windows (there are 16). The RMSE values are well below the tol of 10 mentioned in #393. To understand possible causes for the image comparison failures, I ran builds on Travis for Python 3.6 and 3.7 Miniconda environments with conda packages on Ubuntu Xenial, and image comparison tests failed there as well (those were the only failing tests on Linux). Therefore, something associated with conda packages or dependencies appears to be causing the comparisons to fail. The xfails would be based on the Anaconda interpreter being active.
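A conditional xfail along these lines might look like the following sketch. The test name and body are hypothetical placeholders; only the conda-detection condition comes from this thread:

```python
import os
import sys

import pytest

# True when the interpreter is running inside a conda environment
IS_CONDA = os.path.isdir(os.path.join(sys.prefix, "conda-meta"))

@pytest.mark.xfail(
    IS_CONDA,
    reason="image comparison RMSE is elevated under conda-built dependencies",
    strict=False,  # still passes if the RMSE happens to be within tolerance
)
def test_some_image_comparison():
    # hypothetical placeholder for an image comparison test body
    assert True
```

With `strict=False`, the mark records an expected failure without turning an unexpected pass into an error, which suits tests that fail only under some dependency builds.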
There is a possibility that Freetype is the reason. I can investigate this before going to xfails. |
While there are some differences in fonts between baseline and test images in the failing tests, text is generally not present in either the baseline or the test images. I will need to investigate further to see why there is text in the failing tests. Before comparisons are done, the ImageComparison class has a cleanup() function that removes the title and the ticks on the x, y, and z axes with a NullFormatter: https://github.com/DistrictDataLabs/yellowbrick/blob/develop/tests/base.py As an example of a passing test, the baseline image at https://github.com/DistrictDataLabs/yellowbrick/blob/develop/tests/baseline_images/test_classifier/test_class_prediction_error/test_class_prediction_error_quickmethod.png has all text removed. The images in the corresponding actual_images directory also have no text.
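The cleanup step described above can be approximated as follows. This is a minimal sketch, not the actual ImageComparison implementation; it assumes the Agg backend and a 2D axes:

```python
import matplotlib
matplotlib.use("agg")  # headless backend, as used on CI

import matplotlib.pyplot as plt
from matplotlib.ticker import NullFormatter

def cleanup(ax, remove_legend=True, remove_labels=True):
    """Strip text that varies across platforms before image comparison."""
    ax.set_title("")
    ax.xaxis.set_major_formatter(NullFormatter())
    ax.yaxis.set_major_formatter(NullFormatter())
    if remove_labels:
        ax.set_xlabel("")
        ax.set_ylabel("")
    if remove_legend and ax.get_legend() is not None:
        ax.get_legend().remove()

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1], label="line")
ax.legend()
ax.set_title("demo")
ax.set_xlabel("x")
cleanup(ax)
```

The idea is that font rendering differs across platforms and freetype builds, so any remaining text raises the RMS difference between otherwise identical plots.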
@nickpowersys I'm sorry it's taken me so long to respond; we've been bogged down in PRs and issues, and many of us were at PyCon, so your messages got lost in all of the GitHub notices; I apologize. I really do appreciate you taking a look at this for us. We really do want to support our Windows/Anaconda users but, unfortunately, none of us are Windows/Anaconda users, so it makes it tricky. To answer some of your earlier questions:
How can I assist you in getting this over the finish line with the xfails? Do you want to investigate the style overrides for anti-aliasing? Also, sorry if I missed this, but do you have the tests passing locally on your Windows machine? I'd be interested in the difference between the baseline and actual images on Windows, if you have been looking at them.
@nickpowersys - on #690 there was some discussion about using the testing version of matplotlib to unify the freetype dependency with anaconda. It can be installed using conda-forge as follows:
The version may be a bit behind (currently 3.0.2), but it might be worth it so we don't have to xfail all these tests. Would you be willing to try this in this PR?
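The actual conda-forge command was elided in this thread; a plausible form of it (the package pin is an assumption matching the 3.0.2 version mentioned above) would be:

```shell
# assumption: install matplotlib's conda-forge build so the freetype
# dependency matches the one used to render the baseline images
conda install -c conda-forge matplotlib=3.0.2
```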
X and y labels were not being removed for the image comparison tests. The failing Windows Miniconda tests stood at 16 after addressing the backend for conda. I will update the rest of the baseline images to remove the text using Linux. Legends are not removed by default. [Note: This PR changed the default to True.] If you have any thoughts on removing legends either way, please let me know. I could look at Freetype if tests were still failing and there was a need to preserve legend text. Aside: before using Windows on some projects, I used Ubuntu until hardware issues affected my previous laptop; otherwise, I do development on a Mac.
@nickpowersys thanks for sticking with this! I realize that any changes to the image similarity tests generally affect every single baseline image. If you need help regenerating the images, please let @ndanielsen or me know!
I have
Then, there were 5 failing tests in tests/test_features/test_jointplot.py on the Miniconda builds. Within that test file, I marked these as xfail conditional on a conda environment, independent of platform. The diffs created through test_jointplot.py show vertical gridlines that were not aligned, and some subtle aliasing otherwise. I did not see yellowbrick.style imported in that module. This PR partially addresses #690.
@nickpowersys thanks for pinging me. I'm getting caught up on this issue at the moment. I plan to deep dive into this over the next couple of days.
This PR helps to address conda on Travis and #690. Miniconda on Travis: after the x and y labels were removed, 73 image comparison tests were failing with conda on Linux, so there are RMSE differences between conda environments and regular Python on Linux as well. More analysis is needed to quantify the tests failing in Linux conda environments with the new baseline images that have legends removed; there should be significantly fewer failing image comparisons. An assessment can then determine whether the causes of the image comparison failures on Travis are the same as those on Appveyor for Windows environments with RMSE > 0.1.
Freetype: I ran a Travis build after installing the testing version of matplotlib 3.0.2 into a conda environment. However, I saw no change in the number of tests passing/failing.
@nickpowersys Thanks for your patience and comments over the weekend. Please accept my apologies that we were unable to respond immediately. As I mentioned in #823 (comment), I'm doing a deep dive on these Windows testing issues today, and hopefully I can offer some guidance soon. I definitely want to acknowledge that you've done a ton of work on this issue, and we're grateful for it.
@nickpowersys I spent a good chunk of today diving into several of the issues that this PR addresses. A similar issue also popped up in PR #854, due to changes in matplotlib==3.1.0 where it seems the backend can no longer be set programmatically as is done here: Line 24 in e68e28e
The

I think that the best way to manage the backend for testing is creating a

When I did this on one of my branches, tests were passing on Travis, Appveyor, my MacBook, my Ubuntu box, and Windows 10 in a VirtualBox (the stars seemed to be aligned on this). With matplotlib==3.1.0, there were some failing image comparison tests in

What do you think of this as a solution? I know that you've done a ton of work on these issues, and I want you to get your credit as a contributor. If you're okay with this, do you mind opening a separate PR for #690 adding a

After that, let's put our heads together on a highly focused PR for #744 and address the miniconda issues. I like your clever solution for removing the x and y labels. I've also been tinkering around with a separate

Thanks again for your efforts on these issues. A robust test suite and cross-platform CI is highly important for a growing library like yellowbrick.
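Setting the backend once for the whole test session, rather than programmatically inside the library, could be sketched in a conftest.py like this. This is a sketch of the idea discussed above, not the exact file that landed:

```python
# conftest.py (sketch) -- pytest imports this before collecting tests,
# so the backend is fixed before any test module imports pyplot
import matplotlib

matplotlib.use("agg")
```

Because pytest loads conftest.py ahead of the test modules, no test can accidentally trigger an interactive backend first, which is what broke the programmatic `mpl.use()` call under matplotlib 3.1.0.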
@ndanielsen and @nickpowersys I know you guys are cranking away at this and I want to say thank you to you both! @ndanielsen is the original author of our image similarity tests ported from matplotlib, so you're in good hands now @nickpowersys - good luck and let me know if you need me to chime in at any point!
@nickpowersys thanks for the response. I hope that you are enjoying the conference; they definitely take a lot out of you. I feel like I'm just starting to get my normal levels of energy back after spending about a week at PyCon. =) I am blocking out time to work on this and other yellowbrick testing-related issues on Friday. For setting the MPL backend and related configuration for testing, I think that part of the solution needs to keep the
I think one of the first goals is to have a painless

As for structuring this effort, let's try to keep the PRs small, focused, and minimally complex so that we can incrementally add value. The best approach might be to slowly "paint these problems into a corner" and incrementally (re)build cross-platform test suite support. Again, thanks for your continued follow-up on these issues. This is probably one of the most important issues on the board, as it directly impacts how easy it is to maintain the project and to contribute to it. Enjoy the rest of your conference. 🥇
@nickpowersys I agree with your suggestion for limiting this just to
@ndanielsen I pulled develop through and including the PR for the added rankd test (#866), then did an interactive rebase (I confirmed that PR #866 is in the git log for the appveyorminiconda branch). The updates reflect the removal of both x and y labels, as well as legends. Before proceeding further, I wanted to confirm that it will be okay to push with a regular force to origin/appveyorminiconda (not force-with-lease). I would do

The only baseline image that I did not understand and did not update (when I tried, the image comparison did not fail as intended) is test_scatter_image_fail. The test is passing with the existing baseline image, which still has labels and a legend. Thank you!
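For reference, the safer variant of the force push being discussed is `--force-with-lease`, which refuses to overwrite remote commits you have not yet fetched (branch name as used in this thread):

```shell
# refuses the push if origin/appveyorminiconda has moved since last fetch
git push --force-with-lease origin appveyorminiconda

# plain force, overwriting the remote unconditionally
git push --force origin appveyorminiconda
```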
There were Appveyor build failures for regular Python and Miniconda due to test collection failures in tests/test_features/test_base.py. By excluding pytest 4.6.0 through
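The exact requirement line in the PR was elided above; a version exclusion of this kind is expressed in a requirements file with a pip version specifier, along these assumed lines:

```
# exclude only the release that breaks test collection
pytest>=4.0,!=4.6.0
```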
Although tests pass with mpl 3.0.0, the reason for it being excluded was that images broke silently. Therefore, the last commit in the PR recognizes this and maintains the exclusion.
@nickpowersys great job on this so far.
This is really close. I'm getting this error when I run it on Ubuntu with pyenv python-3.6.3.
______________________ DispersionPlotTests.test_dispersion_plot_annotate_docs _______________________
self = <tests.test_text.test_dispersion.DispersionPlotTests testMethod=test_dispersion_plot_annotate_docs>
def test_dispersion_plot_annotate_docs(self):
"""
Assert no errors occur during DispersionPlot integration
with annotate_docs parameter turned on
"""
text = [doc.split() for doc in corpus.data]
target_words = ['girl', 'she', 'boy', 'he', 'man']
visualizer = DispersionPlot(target_words, annotate_docs=True)
visualizer.fit(text)
visualizer.ax.grid(False)
> self.assert_images_similar(visualizer, tol=25)
E yellowbrick.exceptions.ImageComparisonFailure: images not close (RMS 25.070)
E tests/actual_images/test_text/test_dispersion/test_dispersion_plot_annotate_docs.png
E vs
E tests/baseline_images/test_text/test_dispersion/test_dispersion_plot_annotate_docs.png
tests/test_text/test_dispersion.py:108: ImageComparisonFailure
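For context, the RMS figure in the failure message above is the root-mean-square pixel difference that `assert_images_similar` compares against `tol`. Conceptually (a simplified pure-Python sketch, not yellowbrick's actual implementation):

```python
import math

def rms_difference(expected, actual):
    """Root-mean-square difference between two equal-length pixel sequences."""
    if len(expected) != len(actual):
        raise ValueError("images must have the same number of pixels")
    total = sum((e - a) ** 2 for e, a in zip(expected, actual))
    return math.sqrt(total / len(expected))

# a comparison fails when rms_difference(baseline, actual) > tol,
# which is why RMS 25.070 fails against tol=25 in the traceback above
```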
Sounds good! Understanding the state post-this-PR is the best way to move forward. I will finalize this PR shortly. I really appreciate your feedback.
On Tue, Jun 4, 2019, 10:09 AM, Nathan commented on this pull request, in tests/test_features/test_jointplot.py (#823 (comment)):
> @@ -47,6 +48,13 @@
pd = None
+def is_win_conda_env():
+ return (os.name == 'nt' and
+ os.path.exists(os.path.join(sys.prefix, 'conda-meta')))
+
+
+win_tol = 5.5 if is_win_conda_env() else None
@nickpowersys <https://github.com/nickpowersys> thanks for the thoughtful
and detailed analysis on tolerances across operating systems.
My sense is that this is a bigger conversation. In theory, having
different tolerances across osenvs is suboptimal as we hope that the
library could perform identically in each osenv. We probably should do a
deeper dive into why the images produced are much different and draw
inspiration from the approach that matplotlib takes to this very issue.
I recommend that you open up an issue
<https://github.com/DistrictDataLabs/yellowbrick/issues/new/choose>
essentially with your proposal here. Perhaps you could list out the state
of the image similarity tests across each of the osenvs to help steer the
conversation and eventual work.
For the moment, though, let's close out this PR knowing that we have
windows conda and venv coverage in a much stronger place. It'll allow many
other contributors to directly benefit from this work while the next task
at hand is getting to the bottom of these bigger issues.
What do you think about this approach?
If I could merge from develop (I understand the conflict needs to be resolved first?), then I could regenerate the new baseline images from test_draw.py that were added, and the PR will hopefully be finalized with that update.
@nickpowersys I'll see what I can do on this tomorrow. |
@nickpowersys I think that I figured this one out and got it passing in a PR on my personal fork. Following the GitHub instructions...

the only change I made was that instead of

Followed by a

Hope this helps!
if self.remove_legend:
    ...
if self.remove_labels:
    self.ax.set_xlabel("")
    self.ax.set_ylabel("")
Just note that this solves the xlabel/ylabel issue that has popped up in mpl 3.x.
@nickpowersys I've encountered this git problem a couple of times where there are protected branches. I think that the problem context is simply that tests/baseline_images/test_draw/test_manual_legend.png was updated ~2 days ago, and your current branch doesn't have those changes. The solution is to merge in the

Outside of that, thanks again for your work on this PR and others. I'm starting to actively reference some of your changes in other PRs with other contributors, and it has solved matplotlib 3.x problems that we didn't know we had.
I went through steps 1-6 (through
📢 A huge thank you @ndanielsen! |
@nickpowersys thanks for your big efforts on this! We sincerely hope to keep you as a contributor on yellowbrick!
Summary
To address the failing imports in Windows VMs with Miniconda:
The Matplotlib backend configuration has been addressed by #862. mpl.use('agg') has been working as intended, as explained in that PR.
To minimize the probability of image comparison tests failing in general:
Before removing the labels and legends, approximately 25% of Windows image comparison tests required an elevated value of tol in order to pass: 227 of the 894 image comparison tests have the check

if sys.platform == 'win32'

When 'win32' is changed to 'wwwww' purely for analysis purposes, ignoring the Windows-specific tol values, the number of failing tests with regular Python on Appveyor is 24, and with Miniconda it is 31. Therefore, the number of Windows tests with RMSE values > 0.1 has dropped roughly 90%. Follow-up efforts can remove the instances of Windows-specific tol values.

Builds for the above analysis:
PyPI/pip Python: https://ci.appveyor.com/project/nickpowersys/yellowbrick/builds/25047957
Anaconda: https://ci.appveyor.com/project/nickpowersys/yellowbrick/builds/25049306
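The per-platform tolerance pattern counted above follows a shape like this. The helper name and default values are hypothetical; the 5.5 echoes the Windows tolerance discussed earlier in the thread:

```python
import sys

def image_tolerance(default=0.1, windows=5.5):
    """Pick an elevated RMSE tolerance only on Windows builds."""
    if sys.platform == "win32":
        return windows
    return default
```

Centralizing the check in one helper would make the follow-up cleanup (removing the 227 scattered `sys.platform == 'win32'` checks) a single-site change.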
To create the conda environment, I used tests/requirements.txt.
The conda packages were determined by the conda solver.
The name of the environment is yellowbrick. It is activated directly after the packages are installed so that any further commands are run from within the new environment.
In general, conda environments can be expected to be used primarily with conda packages, and therefore using conda packages in the Miniconda builds helps to verify conda users' and developers' experience through tests in these environments, including those involving the backend.