Error on lossless compression #69

ken-huston · 2020-06-28T20:25:14Z

Hi,

With lossy compression I get fantastic results (more than a 10-fold reduction in size for a PDF made from .jpg images). The reduction in quality is bearable, but since the size reduction is so large, I wanted to try lossless compression to compare the results.

From other issues I read that this is done without the -s option, but if I do that, I get this error:

Processing "pages-000.jpg"...
source image: 708 x 1121 (8 bits) 0dpi x 0dpi, refcount = 1
thresholded image: 708 x 1121 (1 bits) 0dpi x 0dpi, refcount = 1
[binary output printed to the terminal]
/usr/local/bin/pdf.py: symbol table output.sym not found!
Usage: /usr/local/bin/pdf.py [file_basename] > out.pdf

I do not know how to deal with it. Any help will be very much appreciated.

DingoDog · 2020-07-02T23:31:04Z

A solution was already found in 2016 by klivens:
#24 (comment)

I use a sort of one-liner that does the same task, but without requiring modifications of pdf.py

ken-huston · 2020-07-04T15:29:51Z

Thanks for the response Dingo, it works.

I'm trying to compress a PDF composed of JPG text images. I extracted them with pdfimages and used the jbig2 compression. With the lossy option I can reduce the size of the PDF from 40 MB to 2.5 MB, with lossless to 5.5 MB. But I see almost no difference in quality between the two outputs (both reduce the quality of the original PDF).

Am I doing something wrong, or are these results what is expected?

Thanks again.

joshuakraemer · 2022-10-06T16:11:31Z

> I use a sort of one-liner that does the same task, but without requiring modifications of pdf.py

@DingoDog, would you please be so kind as to share your one-line solution?

> I'm trying to compress a PDF composed of JPG text images. I extracted them with pdfimages and used the jbig2 compression. With the lossy option I can reduce the size of the PDF from 40 MB to 2.5 MB, with lossless to 5.5 MB. But I see almost no difference in quality between the two outputs (both reduce the quality of the original PDF).
>
> Am I doing something wrong, or are these results what is expected?

@ken-huston, lossy and lossless compression are expected to look similar, because lossy JBIG2 compression is good at preserving visual quality. You should be aware, though, that lossy compression can also lead to letter substitutions (see https://en.wikipedia.org/wiki/JBIG2#Disadvantages). Always check your results when using lossy compression.

I assume your sources are grayscale or color JPG files. You might be able to improve final quality and file size by using a different program to convert the JPG files to black-and-white image files first. I often use ImageMagick with Otsu binarization, e.g.:

magick in.jpg -auto-threshold OTSU out.pbm
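
For a PDF with many pages, the same command can be applied to every extracted image. The following is only a rough Python sketch, not something from this thread: the helper names `binarize_command` and `binarize_pages` and the `dry_run` switch are made up for illustration, and the `pages-*.jpg` pattern is assumed from the file name in the log above. It expects ImageMagick 7 (`magick`) on the PATH.

```python
import subprocess
from pathlib import Path


def binarize_command(src: Path) -> list:
    """Build the ImageMagick call shown above for a single page image."""
    dst = src.with_suffix(".pbm")  # pages-000.jpg -> pages-000.pbm
    return ["magick", str(src), "-auto-threshold", "OTSU", str(dst)]


def binarize_pages(folder: str, dry_run: bool = True) -> list:
    """Collect one command per pages-*.jpg in `folder`, in sorted order.

    With dry_run=True (hypothetical switch for this sketch) nothing is
    executed; the commands are only returned so they can be inspected.
    """
    pages = sorted(Path(folder).glob("pages-*.jpg"))
    commands = [binarize_command(page) for page in pages]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first page that fails to convert
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Print the commands for the current directory without running them.
    for cmd in binarize_pages("."):
        print(" ".join(cmd))
```

Running it with `dry_run=False` would write one .pbm file next to each .jpg; those files can then be passed to jbig2 instead of the original JPGs.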
