Compatibility with ghostscript 9.28 #425

spwhitton · 2019-09-05T14:49:17Z

It seems that ocrmypdf is not compatible with ghostscript 9.28. I am seeing test suite errors like these when I try to run ocrmypdf's test suite in Debian unstable (with the latest pikepdf):

/usr/lib/python3/dist-packages/ocrmypdf/exec/ghostscript.py:297: SubprocessOutputError
----------------------------- Captured stderr call -----------------------------
Scan: 100%|██████████| 1/1 [00:00<00:00, 376.24page/s]
OCR: 100%|██████████| 1.0/1.0 [00:07<00:00,  7.63s/page]
------------------------------ Captured log call -------------------------------
ERROR    ocrmypdf:ghostscript.py:167    1:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:280    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
                                     GPL Ghostscript RELEASE CANDIDATE 1 9.28: Setting Overprint Mode to 1
                                      not permitted in PDF/A-2, overprint mode not set
                                     
                                     Error: /invalidfileaccess in --file--
                                     Operand stack:
                                        --nostringval--   --nostringval--   (/usr/lib/python3/dist-packages/ocrmypdf/data/sRGB.icc)   (r)
                                     Execution stack:
                                        %interp_exit   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   --nostringval--   false   1   %stopped_push   1974   1   3   %oparray_pop   1973   1   3   %oparray_pop   1961   1   3   %oparray_pop   1817   1   3   %oparray_pop   --nostringval--   %errorexec_pop   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--
                                     Dictionary stack:
                                        --dict:736/1123(ro)(G)--   --dict:1/20(G)--   --dict:76/200(L)--
                                     Current allocation mode is local
                                     Last OS error: Permission denied
                                     Current file position is 576
                                     GPL Ghostscript RELEASE CANDIDATE 1 9.28: Unrecoverable error, exit code 1

I can provide the full log by e-mail if you need that (it's big).

jbarlow83 · 2019-09-05T21:18:12Z

Should be fixed in v9.0.3. I didn't test your exact configuration, but the change removes the external file access Ghostscript was complaining about.

spwhitton · 2019-09-05T23:28:57Z

Thank you for v9.0.3. Unfortunately, there are still other failures with it:

============================= test session starts ==============================
platform linux -- Python 3.7.4+, pytest-4.6.5, py-1.8.0, pluggy-0.12.0
rootdir: /tmp/autopkgtest.zPRMaR/build.urp/src, inifile: setup.cfg, testpaths: tests
plugins: helpers-namespace-2019.1.8, cov-2.7.1
collected 237 items

tests/test_completion.py x.                                              [  0%]
tests/test_ghostscript.py ..                                             [  1%]
tests/test_graft.py ..                                                   [  2%]
tests/test_hocrtransform.py .                                            [  2%]
tests/test_lept.py ..........                                            [  7%]
tests/test_main.py .F................................................... [ 29%]
ss...............ss......................s..............                 [ 53%]
tests/test_metadata.py .....ssss....ss...                                [ 60%]
tests/test_optimize.py ....sss                                           [ 63%]
tests/test_page_numbers.py ...............                               [ 70%]
tests/test_pdfinfo.py ..............                                     [ 75%]
tests/test_qpdf.py .                                                     [ 76%]
tests/test_rotation.py FssF..Fssssssssssssssss.                          [ 86%]
tests/test_stdio.py ..ss...                                              [ 89%]
tests/test_tess4.py ......                                               [ 91%]
tests/test_unpaper.py ......                                             [ 94%]
tests/test_userunit.py ...                                               [ 95%]
tests/test_validation.py ..........                                      [100%]

=================================== FAILURES ===================================
_________________________________ test_deskew __________________________________

spoof_tesseract_noop = {'ADTTMP': '/tmp/autopkgtest.zPRMaR/autopkgtest_tmp', 'ADT_ARTIFACTS': '/tmp/autopkgtest.zPRMaR/test-suite-artifacts',...TS': '/tmp/autopkgtest.zPRMaR/test-suite-artifacts', 'AUTOPKGTEST_TMP': '/tmp/autopkgtest.zPRMaR/autopkgtest_tmp', ...}
resources = PosixPath('/tmp/autopkgtest.zPRMaR/build.urp/src/tests/resources')
outdir = PosixPath('/tmp/pytest-of-spwhitton/pytest-0/test_deskew0')

    def test_deskew(spoof_tesseract_noop, resources, outdir):
        # Run with deskew
        deskewed_pdf = check_ocrmypdf(
            resources / 'skew.pdf', outdir / 'skew.pdf', '-d', env=spoof_tesseract_noop
        )
    
        # Now render as an image again and use Leptonica to find the skew angle
        # to confirm that it was deskewed
        log = logging.getLogger()
    
        deskewed_png = outdir / 'deskewed.png'
    
        ghostscript.rasterize_pdf(
            deskewed_pdf,
            deskewed_png,
            xres=150,
            yres=150,
            raster_device='pngmono',
            log=log,
            pageno=1,
        )
    
        pix = Pix.open(deskewed_png)
        skew_angle, _skew_confidence = pix.find_skew()
    
        print(skew_angle)
>       assert -0.5 < skew_angle < 0.5, "Deskewing failed"
E       TypeError: '<' not supported between instances of 'float' and 'NoneType'

tests/test_main.py:116: TypeError
----------------------------- Captured stdout call -----------------------------
None
----------------------------- Captured stderr call -----------------------------
Scan: 100%|██████████| 1/1 [00:00<00:00, 345.52page/s]
OCR: 100%|██████████| 1.0/1.0 [00:00<00:00,  2.42page/s]
JPEGs: 0image [00:00, ?image/s]
JBIG2: 0item [00:00, ?item/s]
------------------------------ Captured log call -------------------------------
ERROR    ocrmypdf:ghostscript.py:167    1:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
WARNING  ocrmypdf:_pipeline.py:743 Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF's XMP metadata.
_________________________ test_monochrome_correlation __________________________

resources = PosixPath('/tmp/autopkgtest.zPRMaR/build.urp/src/tests/resources')
outdir = PosixPath('/tmp/pytest-of-spwhitton/pytest-0/test_monochrome_correlation0')

    def test_monochrome_correlation(resources, outdir):
        # Verify leptonica: check that an incorrect rotated image has poor
        # correlation with reference
        corr = check_monochrome_correlation(
            outdir,
            reference_pdf=resources / 'cardinal.pdf',
            reference_pageno=1,  # north facing page
            test_pdf=resources / 'cardinal.pdf',
            test_pageno=3,  # south facing page
        )
        assert corr < 0.10
        corr = check_monochrome_correlation(
            outdir,
            reference_pdf=resources / 'cardinal.pdf',
            reference_pageno=2,
            test_pdf=resources / 'cardinal.pdf',
            test_pageno=2,
        )
>       assert corr > 0.90
E       assert 0.0 > 0.9

tests/test_rotation.py:98: AssertionError
------------------------------ Captured log call -------------------------------
ERROR    root:ghostscript.py:167    **** Error reading a content stream. The page may be incomplete.
                                                Output may be incorrect.
                                    **** Error: File did not complete the page properly and may be damaged.
                                                Output may be incorrect.
ERROR    root:ghostscript.py:167    **** Error reading a content stream. The page may be incomplete.
                                                Output may be incorrect.
                                    **** Error: File did not complete the page properly and may be damaged.
                                                Output may be incorrect.
ERROR    root:ghostscript.py:167    **** Error reading a content stream. The page may be incomplete.
                                                Output may be incorrect.
                                    **** Error: File did not complete the page properly and may be damaged.
                                                Output may be incorrect.
ERROR    root:ghostscript.py:167    **** Error reading a content stream. The page may be incomplete.
                                                Output may be incorrect.
                                    **** Error: File did not complete the page properly and may be damaged.
                                                Output may be incorrect.
_______________ test_autorotate_threshold[1-correlation > 0.80] ________________

spoof_tesseract_cache = {'ADTTMP': '/tmp/autopkgtest.zPRMaR/autopkgtest_tmp', 'ADT_ARTIFACTS': '/tmp/autopkgtest.zPRMaR/test-suite-artifacts',...TS': '/tmp/autopkgtest.zPRMaR/test-suite-artifacts', 'AUTOPKGTEST_TMP': '/tmp/autopkgtest.zPRMaR/autopkgtest_tmp', ...}
threshold = '1', correlation_test = 'correlation > 0.80'
resources = PosixPath('/tmp/autopkgtest.zPRMaR/build.urp/src/tests/resources')
outdir = PosixPath('/tmp/pytest-of-spwhitton/pytest-0/test_autorotate_threshold_1_co0')

    @pytest.mark.parametrize(
        'threshold, correlation_test',
        [
            ('1', 'correlation > 0.80'),  # Low thresh -> always rotate -> high corr
            ('99', 'correlation < 0.10'),  # High thres -> never rotate -> low corr
        ],
    )
    def test_autorotate_threshold(
        spoof_tesseract_cache, threshold, correlation_test, resources, outdir
    ):
        out = check_ocrmypdf(
            resources / 'cardinal.pdf',
            outdir / 'out.pdf',
            '--rotate-pages-threshold',
            threshold,
            '-r',
            # '-v',
            # '1',
            env=spoof_tesseract_cache,
        )
    
        correlation = check_monochrome_correlation(
            outdir,
            reference_pdf=resources / 'cardinal.pdf',
            reference_pageno=1,
            test_pdf=outdir / 'out.pdf',
            test_pageno=3,
        )
>       assert eval(correlation_test)  # pylint: disable=w0123
E       AssertionError: assert False
E        +  where False = eval('correlation > 0.80')

tests/test_rotation.py:155: AssertionError
----------------------------- Captured stderr call -----------------------------
Scan: 100%|██████████| 4/4 [00:00<00:00, 407.68page/s]
OCR: 100%|██████████| 4.0/4.0 [00:01<00:00,  3.90page/s]
JPEGs: 0image [00:00, ?image/s]
JBIG2: 0item [00:00, ?item/s]
------------------------------ Captured log call -------------------------------
ERROR    ocrmypdf:ghostscript.py:167    3:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    2:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    1:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    4:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    3:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    2:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    4:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    1:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:281 GPL Ghostscript RELEASE CANDIDATE 1 9.28: Setting Overprint Mode to 1
                                      not permitted in PDF/A-2, overprint mode not set
                                     
                                        **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
                                        **** Error: Recursive XObject detected, ignoring "Im0", object number 14
                                                    Output may be incorrect.
                                        **** Error: Recursive XObject detected, ignoring "Im0", object number 14
                                                    Output may be incorrect.
                                        **** Error: Recursive XObject detected, ignoring "Im0", object number 14
                                                    Output may be incorrect.
WARNING  ocrmypdf:_pipeline.py:743 Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF's XMP metadata.
ERROR    root:ghostscript.py:167    **** Error reading a content stream. The page may be incomplete.
                                                Output may be incorrect.
                                    **** Error: File did not complete the page properly and may be damaged.
                                                Output may be incorrect.
__________________________ test_rotate_deskew_timeout __________________________

resources = PosixPath('/tmp/autopkgtest.zPRMaR/build.urp/src/tests/resources')
outdir = PosixPath('/tmp/pytest-of-spwhitton/pytest-0/test_rotate_deskew_timeout0')

    def test_rotate_deskew_timeout(resources, outdir):
        check_ocrmypdf(
            resources / 'rotated_skew.pdf',
            outdir / 'deskewed.pdf',
            '--rotate-pages',
            '--rotate-pages-threshold',
            '0',
            '--deskew',
            '--tesseract-timeout',
            '0',
            '--pdf-renderer',
            'sandwich',
        )
    
        correlation = check_monochrome_correlation(
            outdir,
            reference_pdf=resources / 'ccitt.pdf',
            reference_pageno=1,
            test_pdf=outdir / 'deskewed.pdf',
            test_pageno=1,
        )
    
        # Confirm that the page still got deskewed
>       assert correlation > 0.50
E       assert 0.0 > 0.5

tests/test_rotation.py:219: AssertionError
----------------------------- Captured stderr call -----------------------------
Scan: 100%|██████████| 1/1 [00:00<00:00, 389.26page/s]
OCR: 100%|██████████| 1.0/1.0 [00:00<00:00,  1.94page/s]
JPEGs: 0image [00:00, ?image/s]
JBIG2: 0item [00:00, ?item/s]
------------------------------ Captured log call -------------------------------
ERROR    ocrmypdf:ghostscript.py:167    1:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    1:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
WARNING  ocrmypdf:_pipeline.py:743 Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF's XMP metadata.
======== 4 failed, 198 passed, 34 skipped, 1 xfailed in 404.87 seconds =========
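On the test_deskew failure above: the captured stdout shows that skew_angle is None, so the chained comparison raises a TypeError before the "Deskewing failed" message is ever used. A minimal sketch of a more explicit check follows. The helper name check_skew and the tolerance parameter are hypothetical and not part of ocrmypdf's test suite, and the hint that the rendered page may be blank is only an assumption about why no angle was measured.

```python
def check_skew(skew_angle, tolerance=0.5):
    """Fail with a readable message when no skew angle could be measured.

    Hypothetical helper for illustration only; it is not ocrmypdf code.
    """
    if skew_angle is None:
        # Assumption: a missing angle suggests the rasterized page had no
        # usable content, e.g. because the renderer dropped the image.
        raise AssertionError(
            "No skew angle measured; the rendered page may be blank"
        )
    assert -tolerance < skew_angle < tolerance, "Deskewing failed"


# A small angle within tolerance passes silently.
check_skew(0.1)
```

With a check like this, the failure would be reported as an AssertionError that points at the missing measurement rather than as a TypeError from the comparison.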

jbarlow83 · 2019-09-06T00:36:34Z

I looked into this and my conclusion is that Ghostscript 9.28rc1 is quite broken.

All of those errors Ghostscript is producing are nonsense. The one that actually causes the trouble is 'Recursive XObject detected, ignoring "Im0", object number 14'. Ghostscript (incorrectly, afaict) discards the image Im0 and renders without it, which breaks these tests. These errors are independent of ocrmypdf. The file in question, cardinal.pdf, passes validation with qpdf, verapdf and Acrobat.
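For anyone triaging similar logs, here is a rough sketch of how the Recursive XObject message could be picked out of captured Ghostscript output. The function name find_recursive_xobjects is hypothetical and not part of ocrmypdf. The pattern is based only on the message text quoted in the logs in this thread, so other Ghostscript versions may word it differently.

```python
import re

# Pattern based on the message as it appears in the logs in this thread.
RECURSIVE_XOBJECT = re.compile(
    r'Recursive XObject detected, ignoring "(?P<name>[^"]+)", '
    r'object number (?P<objnum>\d+)'
)


def find_recursive_xobjects(log_text):
    """Return sorted, de-duplicated (name, object number) pairs from a log."""
    found = {
        (m.group("name"), int(m.group("objnum")))
        for m in RECURSIVE_XOBJECT.finditer(log_text)
    }
    return sorted(found)


sample = '''
    **** Error reading a content stream. The page may be incomplete.
    **** Error: Recursive XObject detected, ignoring "Im0", object number 14
    **** Error: Recursive XObject detected, ignoring "Im0", object number 14
'''
print(find_recursive_xobjects(sample))  # prints [('Im0', 14)]
```

De-duplicating the matches keeps the result short even when Ghostscript repeats the same message once per page.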

Ghostscript 9.28 rc2 was released upstream but has not made it to Debian. Might as well wait for that before raising the issue with Artifex, since it may go away.

jbarlow83 · 2019-09-06T03:40:10Z

I reported the issue against 9.28rc1 as Debian bug 939530. I don't know how to link that bug to the related ocrmypdf issue in Debian's bug tracker.

If -rc2 doesn't resolve it then I'll take it up with Artifex.

spwhitton · 2019-09-06T04:21:51Z

Thanks. I've done the linking!

spwhitton · 2019-09-07T00:27:21Z

Alright, against rc2, only four tests fail:

============================= test session starts ==============================
platform linux -- Python 3.7.4+, pytest-4.6.5, py-1.8.0, pluggy-0.12.0
rootdir: /tmp/autopkgtest.anuEUK/build.Mo8/src, inifile: setup.cfg, testpaths: tests
plugins: helpers-namespace-2019.1.8, cov-2.7.1
collected 237 items

tests/test_completion.py x.                                              [  0%]
tests/test_ghostscript.py ..                                             [  1%]
tests/test_graft.py ..                                                   [  2%]
tests/test_hocrtransform.py .                                            [  2%]
tests/test_lept.py ..........                                            [  7%]
tests/test_main.py .F................................................... [ 29%]
ss...............ss......................s..............                 [ 53%]
tests/test_metadata.py .....ssss....ss...                                [ 60%]
tests/test_optimize.py ....sss                                           [ 63%]
tests/test_page_numbers.py ...............                               [ 70%]
tests/test_pdfinfo.py ..............                                     [ 75%]
tests/test_qpdf.py .                                                     [ 76%]
tests/test_rotation.py FssF..Fssssssssssssssss.                          [ 86%]
tests/test_stdio.py ..ss...                                              [ 89%]
tests/test_tess4.py ......                                               [ 91%]
tests/test_unpaper.py ......                                             [ 94%]
tests/test_userunit.py ...                                               [ 95%]
tests/test_validation.py ..........                                      [100%]

=================================== FAILURES ===================================
_________________________________ test_deskew __________________________________

spoof_tesseract_noop = {'ADTTMP': '/tmp/autopkgtest.anuEUK/autopkgtest_tmp', 'ADT_ARTIFACTS': '/tmp/autopkgtest.anuEUK/test-suite-artifacts',...TS': '/tmp/autopkgtest.anuEUK/test-suite-artifacts', 'AUTOPKGTEST_TMP': '/tmp/autopkgtest.anuEUK/autopkgtest_tmp', ...}
resources = PosixPath('/tmp/autopkgtest.anuEUK/build.Mo8/src/tests/resources')
outdir = PosixPath('/tmp/pytest-of-spwhitton/pytest-0/test_deskew0')

    def test_deskew(spoof_tesseract_noop, resources, outdir):
        # Run with deskew
        deskewed_pdf = check_ocrmypdf(
            resources / 'skew.pdf', outdir / 'skew.pdf', '-d', env=spoof_tesseract_noop
        )
    
        # Now render as an image again and use Leptonica to find the skew angle
        # to confirm that it was deskewed
        log = logging.getLogger()
    
        deskewed_png = outdir / 'deskewed.png'
    
        ghostscript.rasterize_pdf(
            deskewed_pdf,
            deskewed_png,
            xres=150,
            yres=150,
            raster_device='pngmono',
            log=log,
            pageno=1,
        )
    
        pix = Pix.open(deskewed_png)
        skew_angle, _skew_confidence = pix.find_skew()
    
        print(skew_angle)
>       assert -0.5 < skew_angle < 0.5, "Deskewing failed"
E       TypeError: '<' not supported between instances of 'float' and 'NoneType'

tests/test_main.py:116: TypeError
----------------------------- Captured stdout call -----------------------------
None
----------------------------- Captured stderr call -----------------------------
Scan: 100%|██████████| 1/1 [00:00<00:00, 297.74page/s]
OCR: 100%|██████████| 1.0/1.0 [00:00<00:00,  2.43page/s]
JPEGs: 0image [00:00, ?image/s]
JBIG2: 0item [00:00, ?item/s]
------------------------------ Captured log call -------------------------------
ERROR    ocrmypdf:ghostscript.py:167    1:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
WARNING  ocrmypdf:_pipeline.py:743 Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF's XMP metadata.
_________________________ test_monochrome_correlation __________________________

resources = PosixPath('/tmp/autopkgtest.anuEUK/build.Mo8/src/tests/resources')
outdir = PosixPath('/tmp/pytest-of-spwhitton/pytest-0/test_monochrome_correlation0')

    def test_monochrome_correlation(resources, outdir):
        # Verify leptonica: check that an incorrect rotated image has poor
        # correlation with reference
        corr = check_monochrome_correlation(
            outdir,
            reference_pdf=resources / 'cardinal.pdf',
            reference_pageno=1,  # north facing page
            test_pdf=resources / 'cardinal.pdf',
            test_pageno=3,  # south facing page
        )
        assert corr < 0.10
        corr = check_monochrome_correlation(
            outdir,
            reference_pdf=resources / 'cardinal.pdf',
            reference_pageno=2,
            test_pdf=resources / 'cardinal.pdf',
            test_pageno=2,
        )
>       assert corr > 0.90
E       assert 0.0 > 0.9

tests/test_rotation.py:98: AssertionError
------------------------------ Captured log call -------------------------------
ERROR    root:ghostscript.py:167    **** Error reading a content stream. The page may be incomplete.
                                                Output may be incorrect.
                                    **** Error: File did not complete the page properly and may be damaged.
                                                Output may be incorrect.
ERROR    root:ghostscript.py:167    **** Error reading a content stream. The page may be incomplete.
                                                Output may be incorrect.
                                    **** Error: File did not complete the page properly and may be damaged.
                                                Output may be incorrect.
ERROR    root:ghostscript.py:167    **** Error reading a content stream. The page may be incomplete.
                                                Output may be incorrect.
                                    **** Error: File did not complete the page properly and may be damaged.
                                                Output may be incorrect.
ERROR    root:ghostscript.py:167    **** Error reading a content stream. The page may be incomplete.
                                                Output may be incorrect.
                                    **** Error: File did not complete the page properly and may be damaged.
                                                Output may be incorrect.
_______________ test_autorotate_threshold[1-correlation > 0.80] ________________

spoof_tesseract_cache = {'ADTTMP': '/tmp/autopkgtest.anuEUK/autopkgtest_tmp', 'ADT_ARTIFACTS': '/tmp/autopkgtest.anuEUK/test-suite-artifacts',...TS': '/tmp/autopkgtest.anuEUK/test-suite-artifacts', 'AUTOPKGTEST_TMP': '/tmp/autopkgtest.anuEUK/autopkgtest_tmp', ...}
threshold = '1', correlation_test = 'correlation > 0.80'
resources = PosixPath('/tmp/autopkgtest.anuEUK/build.Mo8/src/tests/resources')
outdir = PosixPath('/tmp/pytest-of-spwhitton/pytest-0/test_autorotate_threshold_1_co0')

    @pytest.mark.parametrize(
        'threshold, correlation_test',
        [
            ('1', 'correlation > 0.80'),  # Low thresh -> always rotate -> high corr
            ('99', 'correlation < 0.10'),  # High thres -> never rotate -> low corr
        ],
    )
    def test_autorotate_threshold(
        spoof_tesseract_cache, threshold, correlation_test, resources, outdir
    ):
        out = check_ocrmypdf(
            resources / 'cardinal.pdf',
            outdir / 'out.pdf',
            '--rotate-pages-threshold',
            threshold,
            '-r',
            # '-v',
            # '1',
            env=spoof_tesseract_cache,
        )
    
        correlation = check_monochrome_correlation(
            outdir,
            reference_pdf=resources / 'cardinal.pdf',
            reference_pageno=1,
            test_pdf=outdir / 'out.pdf',
            test_pageno=3,
        )
>       assert eval(correlation_test)  # pylint: disable=w0123
E       AssertionError: assert False
E        +  where False = eval('correlation > 0.80')

tests/test_rotation.py:155: AssertionError
----------------------------- Captured stderr call -----------------------------
Scan: 100%|██████████| 4/4 [00:00<00:00, 550.72page/s]
OCR: 100%|██████████| 4.0/4.0 [00:01<00:00,  3.54page/s]
JPEGs: 0image [00:00, ?image/s]
JBIG2: 0item [00:00, ?item/s]
------------------------------ Captured log call -------------------------------
ERROR    ocrmypdf:ghostscript.py:167    3:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    1:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    2:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    4:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    2:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    4:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    1:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    3:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:281 GPL Ghostscript RELEASE CANDIDATE 2 9.28: Setting Overprint Mode to 1
                                      not permitted in PDF/A-2, overprint mode not set
                                     
                                        **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
                                        **** Error: Recursive XObject detected, ignoring "Im0", object number 14
                                                    Output may be incorrect.
                                        **** Error: Recursive XObject detected, ignoring "Im0", object number 14
                                                    Output may be incorrect.
                                        **** Error: Recursive XObject detected, ignoring "Im0", object number 14
                                                    Output may be incorrect.
WARNING  ocrmypdf:_pipeline.py:743 Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF's XMP metadata.
ERROR    root:ghostscript.py:167    **** Error reading a content stream. The page may be incomplete.
                                                Output may be incorrect.
                                    **** Error: File did not complete the page properly and may be damaged.
                                                Output may be incorrect.
__________________________ test_rotate_deskew_timeout __________________________

resources = PosixPath('/tmp/autopkgtest.anuEUK/build.Mo8/src/tests/resources')
outdir = PosixPath('/tmp/pytest-of-spwhitton/pytest-0/test_rotate_deskew_timeout0')

    def test_rotate_deskew_timeout(resources, outdir):
        check_ocrmypdf(
            resources / 'rotated_skew.pdf',
            outdir / 'deskewed.pdf',
            '--rotate-pages',
            '--rotate-pages-threshold',
            '0',
            '--deskew',
            '--tesseract-timeout',
            '0',
            '--pdf-renderer',
            'sandwich',
        )
    
        correlation = check_monochrome_correlation(
            outdir,
            reference_pdf=resources / 'ccitt.pdf',
            reference_pageno=1,
            test_pdf=outdir / 'deskewed.pdf',
            test_pageno=1,
        )
    
        # Confirm that the page still got deskewed
>       assert correlation > 0.50
E       assert 0.0 > 0.5

tests/test_rotation.py:219: AssertionError
----------------------------- Captured stderr call -----------------------------
Scan: 100%|██████████| 1/1 [00:00<00:00, 359.41page/s]
OCR: 100%|██████████| 1.0/1.0 [00:00<00:00,  1.95page/s]
JPEGs: 0image [00:00, ?image/s]
JBIG2: 0item [00:00, ?item/s]
------------------------------ Captured log call -------------------------------
ERROR    ocrmypdf:ghostscript.py:167    1:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
ERROR    ocrmypdf:ghostscript.py:167    1:    **** Error reading a content stream. The page may be incomplete.
                                                    Output may be incorrect.
                                        **** Error: File did not complete the page properly and may be damaged.
                                                    Output may be incorrect.
WARNING  ocrmypdf:_pipeline.py:743 Some input metadata could not be copied because it is not permitted in PDF/A. You may wish to examine the output PDF's XMP metadata.
======== 4 failed, 198 passed, 34 skipped, 1 xfailed in 408.89 seconds =========

jbarlow83 · 2019-09-07T20:21:23Z

Looks identical to 9.28rc1 to me.

I tried to report it, but bugs.ghostscript.com has been down (HTTPS misconfigured?) for about half a day.

jbarlow83 · 2019-09-18T21:26:41Z

Reported as
https://bugs.ghostscript.com/show_bug.cgi?id=701552

They traced it to how Debian was compiling its version of Ghostscript. I believe this means all of the upstream packages should work.
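
For anyone checking which Ghostscript they are actually running, here is a minimal sketch. It assumes the executable is named `gs` and is on `PATH`; the helper names `parse_gs_version` and `installed_gs_version` are made up for illustration and are not part of ocrmypdf. The version number alone does not show whether a distribution's build has the problem described above, so treat this only as a first diagnostic step.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_gs_version(text: str) -> Tuple[int, ...]:
    """Extract the first dotted version number (e.g. '9.28') as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def installed_gs_version(executable: str = "gs") -> Optional[Tuple[int, ...]]:
    """Return the installed Ghostscript version, or None if it is not on PATH.

    Assumption: the executable is called 'gs'; pass another name if yours differs.
    """
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        [executable, "--version"],  # prints just the version number
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return parse_gs_version(result.stdout)
```

Calling `installed_gs_version()` returns a tuple that can be compared directly, for example `installed_gs_version() >= (9, 28)`, after checking that the result is not `None`.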

spwhitton · 2019-09-18T23:32:59Z

Hello,

On Wed 18 Sep 2019 at 02:26PM -07, jbarlow83 wrote:

> Reported as https://bugs.ghostscript.com/show_bug.cgi?id=701552
>
> They traced to how Debian was compiling its version of Ghostscript. I believe this means all of the upstream packages should work.

And it's now fixed in Debian unstable, so far as I can tell. Thank you for your help with this.

-- Sean Whitton

jbarlow83 closed this as completed Sep 18, 2019
