
Inconsistent GraphViz output on AppVeyor/Windows/Anaconda #213

Open
peternowee opened this issue Aug 9, 2019 · 11 comments

@peternowee
Member

Tests are failing on AppVeyor, even though I see them passing on my Linux laptop. See #211 for an example of failing AppVeyor tests. The current AppVeyor setup is Windows with Anaconda/Miniconda.

I investigated a bit by setting up an AppVeyor build worker with the same configuration as used here for pydot/pydot. I also logged into the build worker VM through RDP as described here, added some debug code to print the hashes, and then ran the tests manually a few more times without changing any settings. I found that the hashes for these tests can differ on each run:

Setup            Test name                    Python   hexdigest  hexdigest_original
AppVeyor-pydot   test_graph_with_shapefiles   2.7      7fb17df    2685aa6
AppVeyor-pydot   test_graph_with_shapefiles   3.6      e4f9322    2685aa6
AppVeyor-pydot   test_graph_with_shapefiles   3.7      848057b    e4f9322
AppVeyor-pn      test_graph_with_shapefiles   2.7      2685aa6    e4f9322
AppVeyor-pn      test_graph_with_shapefiles   2.7      e4f9322    2685aa6
AppVeyor-pn      test_graph_with_shapefiles   3.6      7fb17df    848057b
AppVeyor-pn      test_graph_with_shapefiles   3.7      848057b    e4f9322
my linux box     test_graph_with_shapefiles   2.7      a3c1ea7    a3c1ea7

Setup            Test name                    Python   pydot_sha  graphviz_sha
AppVeyor-pydot   regression_tests (b53.dot)   2.7      355a929    e19d6e0
AppVeyor-pydot   regression_tests (b53.dot)   3.6      355a929    94bef6d
AppVeyor-pydot   regression_tests (b53.dot)   3.7      94bef6d    355a929
AppVeyor-pn      regression_tests (b53.dot)   2.7      e19d6e0    355a929
AppVeyor-pn      regression_tests (b53.dot)   2.7      355a929    2e7c174    
AppVeyor-pn      regression_tests (b53.dot)   2.7      355a929    e19d6e0
AppVeyor-pn      regression_tests (b53.dot)   2.7      e19d6e0    94bef6d
AppVeyor-pn      regression_tests (b53.dot)   3.6      e19d6e0    2e7c174
AppVeyor-pn      regression_tests (b53.dot)   3.7      6765c50    94bef6d
my linux box     regression_tests (b53.dot)   2.7      37dfa75    37dfa75

(Sometimes a test still passes, when the hashes happen to coincide.)

Notice how the hash values seem to be drawn at random from a small pool of possible outcomes, and that the same value can occur in both the left and the right column.

So, there seems to be some source of slight random noise in the current AppVeyor setup that makes the test outcomes unpredictable. Maybe some imaging or font library? Some race condition?
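For context, these digests come from hashing the rendered image bytes: the tests compare a SHA-256 of what pydot produces against a SHA-256 of the output of calling Graphviz directly. A minimal sketch of that kind of check (the helper name is illustrative, not the actual test code):

import hashlib
import pydot

def render_sha256(dot_path, fmt="jpe"):
    # Render a .dot file through pydot/Graphviz and hash the output bytes.
    # Illustrative sketch only; the real test suite compares such a digest
    # against one obtained by invoking the dot executable directly.
    # Assumes the file contains a single graph.
    (graph,) = pydot.graph_from_dot_file(dot_path)
    rendered = graph.create(format=fmt)  # runs the dot executable
    return hashlib.sha256(rendered).hexdigest()

# On an affected AppVeyor worker, repeated calls can yield different digests:
print(render_sha256("test/graphs/b53.dot")[:7])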

@peternowee
Member Author

peternowee commented Aug 9, 2019

Ok, I checked some more and found that even running Graphviz dot directly on the AppVeyor VM produces varying results from run to run:

C:\> set "MINICONDA_DIRNAME=C:\Miniconda"
C:\> set "PATH=%MINICONDA_DIRNAME%;%MINICONDA_DIRNAME%\\Scripts;%PATH%"
C:\> activate conda-env-2.7
(conda-env-2.7) C:\> cd C:\projects\pydot\test\graphs
(conda-env-2.7) C:\projects\pydot\test\graphs>dot -Tjpe b53.dot >b53.jpg
(conda-env-2.7) C:\projects\pydot\test\graphs>dot -Tjpe b53.dot >b53-2.jpg
(conda-env-2.7) C:\projects\pydot\test\graphs>dot -Tjpe b53.dot >b53-3.jpg
(conda-env-2.7) C:\projects\pydot\test\graphs>dot -Tjpe b53.dot >b53-4.jpg
(conda-env-2.7) C:\projects\pydot\test\graphs>dot -Tjpe b53.dot >b53-5.jpg
(conda-env-2.7) C:\projects\pydot\test\graphs>dot -Tjpe b53.dot >b53-6.jpg
(conda-env-2.7) C:\projects\pydot\test\graphs>dot -Tjpe b53.dot >b53-7.jpg
(conda-env-2.7) C:\projects\pydot\test\graphs>dir *jpg
 Volume in drive C is Windows
 Volume Serial Number is D4AB-4044

 Directory of C:\projects\pydot\test\graphs

08/09/2019  09:54 AM            77,324 b53-2.jpg
08/09/2019  09:54 AM            74,625 b53-3.jpg
08/09/2019  09:54 AM            77,324 b53-4.jpg
08/09/2019  09:54 AM            77,324 b53-5.jpg
08/09/2019  09:55 AM            74,625 b53-6.jpg
08/09/2019  09:55 AM            74,625 b53-7.jpg
08/09/2019  09:53 AM            77,324 b53.jpg
               7 File(s)        533,171 bytes
               0 Dir(s)  38,632,243,200 bytes free

Notice the different file sizes. Here are the hashes:

$ sha256sum *jpg
355a929097e5cd5a11204dd33a5746eba5cbab5802d0c35b8a4aa0c4624600c0  b53-2.jpg
94bef6d00c4d81ccf6740bf8df1ab7001a74c929ef345d7f6cf0aeeed4063014  b53-3.jpg
355a929097e5cd5a11204dd33a5746eba5cbab5802d0c35b8a4aa0c4624600c0  b53-4.jpg
355a929097e5cd5a11204dd33a5746eba5cbab5802d0c35b8a4aa0c4624600c0  b53-5.jpg
94bef6d00c4d81ccf6740bf8df1ab7001a74c929ef345d7f6cf0aeeed4063014  b53-6.jpg
94bef6d00c4d81ccf6740bf8df1ab7001a74c929ef345d7f6cf0aeeed4063014  b53-7.jpg
355a929097e5cd5a11204dd33a5746eba5cbab5802d0c35b8a4aa0c4624600c0  b53.jpg

Notice that these hashes also appeared in the pydot_unittest results shown in my earlier comment.

Finally, notice that the images themselves are visibly different from each other:

b53-2.jpg:
[image: b53-2]

b53-3.jpg:
[image: b53-3]

This is the image that is consistently produced on my Linux setup:
[image: b53-linux]
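For anyone wanting to reproduce this without keeping an RDP session open, the same check can be scripted. A Python 3.7+ sketch, assuming dot is on PATH and the script is run from the test/graphs directory:

import hashlib
import subprocess

# Render the same graph repeatedly with the dot executable and count how many
# distinct outputs appear; on a deterministic setup this should print 1.
digests = set()
for _ in range(20):
    result = subprocess.run(
        ["dot", "-Tjpe", "b53.dot"], capture_output=True, check=True
    )
    digests.add(hashlib.sha256(result.stdout).hexdigest())
print(len(digests), "distinct output(s) across 20 runs")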

So, this issue is not caused by pydot, but I'm still not sure what the cause is. I did a few searches in the Graphviz issue list, but could not immediately find anything. (Update: See my comment of 2021-02-23 below for some issues opened after this post.) It might just disappear after some upgrades. For example, the current Anaconda setup uses Graphviz 2.38, while 2.40.1 is already available in the repository. If the problem persists, it should probably be reported at Graphviz next.

As I don't have a Windows setup myself and I don't want to depend on the RDP connection to the AppVeyor build worker VMs all the time (they get shut down after 60 minutes), I don't think I will be spending more time on this issue myself. I will just ignore the AppVeyor CI results for now. Anyone should feel free to pick this up, though.

@peterjc
Contributor

peterjc commented Sep 25, 2020

I would suggest actively skipping these checks on Windows, so that AppVeyor can usefully test all the other functionality on Windows.

@peternowee
Member Author

That would be the easy way out. :) When is it ever going to be solved then? I was actually waiting for a Windows user to step up.

Also note that we have not seen any report of this problem from other Windows users (desktop/cloud/etc.). If it really is AppVeyor-specific, and there are no Windows developers willing to take this up with AppVeyor, we might as well stop running our tests on AppVeyor completely and use only Travis CI, until hopefully one day AppVeyor changes some setting on its side and the problem is suddenly gone.

@peterjc
Contributor

peterjc commented Sep 25, 2020

My guess is that this is something like a seed setting being different in the Windows packages, or simply a different version of GraphViz?

I was looking at how conda-forge builds the Windows packages, and they have been using the binaries provided by GraphViz (rather than building their own, as they do on Linux), but they don't have the latest version on Windows yet. There are signs of progress, though - see conda-forge/graphviz-feedstock#47 (comment).

@peternowee
Member Author

Good, we will just wait for that then. Otherwise, I was thinking about your idea of disabling the test: we could maybe detect the AppVeyor environment and disable those specific tests only on AppVeyor. But as I also stated in #233 (comment), I really do not want to disable tests too easily.
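If we do go that route, detecting AppVeyor from the test suite should be straightforward, since AppVeyor sets the APPVEYOR environment variable on its build workers. A minimal unittest sketch (the names are illustrative, not the actual pydot test code):

import os
import unittest

# Skip the hash-comparison tests only when running on an AppVeyor worker.
SKIP_ON_APPVEYOR = unittest.skipIf(
    os.environ.get("APPVEYOR", "").lower() == "true",
    "Graphviz output is not reproducible on AppVeyor (see issue #213)",
)

class RenderingTests(unittest.TestCase):
    @SKIP_ON_APPVEYOR
    def test_graph_with_shapefiles(self):
        ...  # unchanged test body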

Btw, I hope that before releasing their package, the conda-forge maintainers will also look at some of their own packaging issues, like their numbers 34 and 43. End users are reporting problems resulting from those everywhere, including here at pydot. We even have a special Graphviz-through-Conda-on-Windows workaround in our code, which I hope we can remove again one day.
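Purely as an illustration of what such conda-on-Windows special-casing can look like (not pydot's actual workaround, whose details may differ), one might probe for the Graphviz executables inside the active environment when they are not reachable via PATH, assuming they end up under Library\bin\graphviz as in older Anaconda packages:

import os
import sys

def find_conda_graphviz_dot():
    # Hypothetical helper: look for dot.exe inside the active conda
    # environment on Windows when it cannot be found via PATH.
    if sys.platform != "win32":
        return None
    prefix = os.environ.get("CONDA_PREFIX")
    if not prefix:
        return None
    for subdir in (("Library", "bin", "graphviz"), ("Library", "bin")):
        candidate = os.path.join(prefix, *subdir, "dot.exe")
        if os.path.isfile(candidate):
            return candidate
    return None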

@jarrodmillman

We (networkx / pygraphviz) are having similar problems with anaconda/conda not updating things. I created two new issues:

I am also planning to follow up with Anaconda once conda-forge updates things:

It would be great if you could comment on and/or thumbs-up the conda-forge issues I created. Hopefully we can get things fixed soon.

@peternowee
Member Author

@jarrodmillman Ok, upvoted the Graphviz one. I hope you can help them move it forward, and that a new release will then also include a fix for their issues 34 and 43.

@peternowee peternowee added this to the Unplanned milestone Dec 26, 2020
@peternowee
Member Author

@peterjc Just an update: The problem of the inconsistent Graphviz output on Windows AppVeyor has not been solved by the new conda-forge Graphviz 2.46.1:

So perhaps the problem is somewhere else in the AppVeyor setup.

I currently do not have the time to do much more than this, but I thought I would let you know this finding already.

@peterjc
Contributor

peterjc commented Feb 22, 2021

Hmm, with GraphViz now being built from source for conda-forge on Linux and Windows (yay), that is a shame - perhaps it is an issue with GraphViz itself (rather than how it was built/installed)?

@peternowee
Member Author

Yeah, could be, but only in combination with some other factors then. I have not yet seen any other report of this from other Windows users, which is why I think it could be something AppVeyor-specific. Or maybe there are no Windows users who run our test suite? I don't know.

@peternowee
Member Author

Here are some possibly related issues in the Graphviz issue tracker opened since 2019:
