Switch to content based comparison #854
Conversation
Force-pushed from 9bc5e9b to a2776ee (compare)
Just hope things are working as expected on my machine. Signed-off-by: Stephen Finucane <stephen@that.guru>
There are three issues to resolve here, one with an input file and two with reference PDFs.

- `rst2pdf/tests/input/test_issue_175.txt`: This attempts to insert a spacer large enough to force a header to the bottom of the first page and the text onto the next page, in order to ensure the header will be moved to the next page to group it with its text. However, the spacer wasn't large enough, so both the header and the text ended up on the first page.
- `rst2pdf/tests/reference/test_aafigure.pdf`: It doesn't appear possible to format a backtick as a literal without including the escape character. The escape character wasn't included in the reference PDF.
- `rst2pdf/tests/reference/test_issue_288.pdf`: This contained a different number of block elements on Python 3 than on Python 2. Update the PDF but disable the build until Python 2 is dropped.

Signed-off-by: Stephen Finucane <stephen@that.guru>
Previously, we assessed whether a PDF was correct by comparing a checksum of the file against a list of known valid checksums. When a different checksum was seen, which was quite often given the number of moving pieces here, it was up to the user to determine whether there was a bug or simply a small but valid change in the output PDF.

Let's simplify this. Instead of storing checksums, use the PyMuPDF library to compare the PDFs programmatically. We need a bit of fuzziness here, since blocks can get moved around and text rewrapped, but it should be mostly bulletproof otherwise.

Signed-off-by: Stephen Finucane <stephen@that.guru>
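A minimal sketch of what such a fuzzy, content-based comparison might look like. The helper names and tolerance value here are hypothetical, not the PR's actual implementation; PyMuPDF's `page.get_text("blocks")` does, however, really return `(x0, y0, x1, y1, text, block_no, block_type)` tuples per text block.

```python
# Hypothetical sketch of a fuzzy block comparison. Block tuples follow
# PyMuPDF's page.get_text("blocks") layout: (x0, y0, x1, y1, text, ...).
# The tolerance value of 5 points is made up for illustration.

def blocks_match(block_a, block_b, tol=5.0):
    """Compare two text blocks, allowing coordinates to drift by `tol` points."""
    coords_a, text_a = block_a[:4], block_a[4]
    coords_b, text_b = block_b[:4], block_b[4]
    if any(abs(a - b) > tol for a, b in zip(coords_a, coords_b)):
        return False
    # Collapse whitespace so rewrapped text still compares equal.
    return " ".join(text_a.split()) == " ".join(text_b.split())


def pages_match(blocks_a, blocks_b, tol=5.0):
    """Compare the block lists extracted from two corresponding pages."""
    if len(blocks_a) != len(blocks_b):
        return False
    return all(blocks_match(a, b, tol) for a, b in zip(blocks_a, blocks_b))
```

In practice this would be driven by opening each document with `fitz.open(path)` and calling `page.get_text("blocks")` on each page pair.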
Resolve issues with the Sphinx tests. Signed-off-by: Stephen Finucane <stephen@that.guru>
a2776ee
to
4a795de
Compare
This is failing because there isn't a Python 2.7 wheel available for PyMuPDF at the moment. I've reported the issue (pymupdf/PyMuPDF#511) but it seems it's blocked on Travis. @akrabat @lornajane How close are we to dropping Python 2.7 support? Perhaps I could simply drop testing here and side-step the whole issue.
We can drop Python 2 support. Please do everything else you need to do to get this working for 3, and if that's the only reason the build is failing, we can review with that in mind and/or bite the bullet to remove support for 2.

At a bigger-picture level (and I probably said this already), I'm not at all sold on doing only image comparisons, since not all the tests are purely visual. But having the ability to have "unknown md5 but honestly this PDF looks fine to the machine" as a test output: incredibly valuable!!!! We had some imagemagick thing that did this before (it was a bit half-baked). Image comparison is a very welcome addition to our testing setup, but I don't think it replaces what we have.

Side question: how does the performance compare for running the test suites through the image comparison versus the existing md5 process? (The answer might be: run it yourself and see, I suppose!)
#841 takes us to Python 3 only. I'll update PR 841, though, as its dependencies probably need checking.
Okay, I'll drop Python 2.7 tests from Travis and get this passing.
So, interestingly enough, this isn't actually doing image-based comparison at all (that was #849 which I've abandoned in favour of this approach). Instead, it's inspecting the generated PDF using the PyMuPDF library. If you look at the
A run before this change yields the following:
After this change, I see:
So that's roughly a 14% increase in runtime (albeit with one additional skipped test). I think that's acceptable given that this approach is more deterministic. Also, yes, this still isn't passing completely on my machine, but I'm working on it. It's passing in the gate, which is key for now.
Turns out I wasn't constraining my versions, which was resulting in a number of those failures. I ran again with the requirements from
This is very variable, though. Yet another run gave me a run time of 265.66s (0:04:25), so I wouldn't put a whole lot of weight in it.
These numbers are reliable enough to answer my original concerns. I was expecting a performance hit, but I had no idea if it was going to make the tests take ten times as long to run or what. Less than double seems like a perfectly good ballpark figure and I am reassured!
Force-pushed from 46b1dfa to 4a795de (compare)
It seems there are issues with the Python 2.7 wheels for PyMuPDF [1]. We could wait for a fix, but given that the next version of rst2pdf is expected to be Python 3 only, it's just as easy to drop testing for this now. [1] pymupdf/PyMuPDF#511 Signed-off-by: Stephen Finucane <stephen@that.guru>
Bump 😄 I'm available in case there's anything more you need from me.
This works on my Mac and seems logical. Do we know if it's more resistant to changes in ReportLab versions? Also, presumably we no longer need
It should be, yes. I'm planning to extend this further to do things like combining/splitting blocks due to ReportLab differences (seen locally on newer versions) [*], but I figured that wasn't necessary for a first pass. The

[*] In short, consider the case where there are two blocks in PDF A - blocks A1 and A2 - and one block in PDF B - block B. If coordinates x0 and y0 of block A1 equal x0 and y0 of block B, and coordinates x1 and y1 of block A2 equal x1 and y1 of block B, then we can say the combination of blocks A1 and A2 == block B. Again, I'll explain this better in a future PR.
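The block-combination rule described in the footnote can be sketched as follows. This is a hypothetical illustration of the idea, not the planned implementation; the function name and tolerance are made up, and blocks are assumed to be `(x0, y0, x1, y1, text)` tuples as returned by PyMuPDF.

```python
# Hypothetical sketch of the footnote's rule: two blocks A1 and A2 in PDF A
# may correspond to a single block B in PDF B if A1's top-left corner (x0, y0)
# matches B's top-left corner and A2's bottom-right corner (x1, y1) matches
# B's bottom-right corner. The tolerance value is invented for illustration.

def combines_to(block_a1, block_a2, block_b, tol=2.0):
    """Return True if blocks A1 + A2 plausibly merge into block B."""
    def close(p, q):
        return abs(p - q) <= tol

    return (
        close(block_a1[0], block_b[0])      # A1.x0 == B.x0
        and close(block_a1[1], block_b[1])  # A1.y0 == B.y0
        and close(block_a2[2], block_b[2])  # A2.x1 == B.x1
        and close(block_a2[3], block_b[3])  # A2.y1 == B.y1
    )
```

A fuller version would also check that A1's text followed by A2's text matches B's text, but the geometric test above is the core of the idea.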
That makes sense. The interesting bit is when we can avoid invalidating all the tests because a minor release of ReportLab reorganised the internals of the PDF file. We do need to ensure that the tests pass on multiple ReportLab versions (or whatever other dependencies change the PDF internals), though, as we currently have that feature with the multiple hashes.
👍
🙌
Time to drop the JSON-based testing. MuPDF is a "lightweight PDF, XPS, and E-book viewer [that] consists of a software library, command line tools, and viewers for various platforms" [1]. The PyMuPDF library provides Python bindings for MuPDF and gives us the ability to introspect documents [2]. Start using this to provide a content-based comparison of the PDFs, comparing reference PDFs with those generated by our test suite.
This adds one more dependency, but should result in far more reliable, deterministic test behavior across different environments. Fuzzy matching is used to abstract minor differences.
Note that there is some overlap between this PR and #853, since that was mostly created from patches that I had previously included in this series but decided were non-essential. Expect a merge conflict (which I can resolve) for whichever ends up being merged later.
Also note that I haven't actually deleted the `rst2pdf/tests/md5` directory. This PR is already large enough and that can be done in a follow-up.