DejaVu Serif glyph widths problem #615

Open
bsweeney opened this issue Apr 5, 2013 · 6 comments

@bsweeney
Member

bsweeney commented Apr 5, 2013

Original author: freecorvette (September 27, 2012 17:17:30)

What steps will reproduce the problem?

I built the attached file using DomPDF and the DejaVu Serif UTF-8 font. As you can see, the layout looks fine and the text is justified. The problem with the documents produced by DomPDF is that they are very large, so I usually run the PDF through Ghostscript, which reduces the file size while preserving the look and layout. With this particular document, however, after running:

gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=output.pdf -dBATCH input.pdf

the output contains a lot of randomly added spaces, some removed spaces and the justify alignment of the text is lost.
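For anyone who wants to script this shrink step and compare file sizes, here is a minimal Python sketch. The flags are exactly the ones from the command above; the function names `build_gs_command` and `shrink_pdf` are made up for illustration, and the sketch assumes `gs` is on the PATH.

```python
import os
import subprocess


def build_gs_command(input_pdf, output_pdf):
    """Build the Ghostscript command quoted in this report as an argument list."""
    return [
        "gs",
        "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        f"-sOUTPUTFILE={output_pdf}",
        "-dBATCH",
        input_pdf,
    ]


def shrink_pdf(input_pdf, output_pdf):
    """Run Ghostscript (assumed to be installed) and return (input_size, output_size) in bytes."""
    # check=True raises CalledProcessError if gs exits with a non-zero status.
    subprocess.run(build_gs_command(input_pdf, output_pdf), check=True)
    return os.path.getsize(input_pdf), os.path.getsize(output_pdf)
```

Calling `shrink_pdf("input.pdf", "output.pdf")` would rerun the conversion and return both sizes, which makes it easy to compare against the size obtained with font subsetting.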

I reported this as a Ghostscript bug, since the input file looks fine and the output file doesn't. The bug is filed here:
http://bugs.ghostscript.com/show_bug.cgi?id=693825

However, after they investigated the problem, they said that the font information stored in the PDF file is incorrect. I quote:

The Widths array for the DejaVuSans CIDFont does not appear to match the actual widths of the glyphs in the font, which is causing pdfwrite to emit most of the characters with small horizontal kerning.

Of course, this is not a DomPDF bug in itself, but I thought that if you were aware of the font problem, you might find a fix that is easier than changing Ghostscript to accommodate the broken font.

As a side note, I'm not having this problem with DejaVu Sans Serif, only with Serif.
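To illustrate the kind of mismatch the Ghostscript team describes, here is a rough Python sketch of a consistency check between the widths declared in a PDF and the advance widths in the font. PDF glyph widths are expressed in thousandths of an em, while a TrueType font stores advances in design units, so the advances have to be scaled before comparing. Everything here is an assumption for illustration: the function names are hypothetical, the sample numbers are invented rather than read from the attached PDF, and the 2048 units-per-em value should be checked against the font's own `head` table.

```python
def scale_to_pdf_units(advance, units_per_em):
    """Convert a glyph advance from font design units to PDF units (1000 per em)."""
    return advance * 1000.0 / units_per_em


def find_width_mismatches(declared_widths, font_advances, units_per_em, tolerance=1.0):
    """Return {glyph_id: (declared, actual)} for glyphs whose declared PDF width
    differs from the scaled font advance by more than `tolerance`."""
    mismatches = {}
    for gid, declared in declared_widths.items():
        if gid not in font_advances:
            continue  # no advance available for this glyph, so nothing to compare
        actual = scale_to_pdf_units(font_advances[gid], units_per_em)
        if abs(actual - declared) > tolerance:
            mismatches[gid] = (declared, actual)
    return mismatches


# Invented sample data: glyph id -> width declared in the PDF (1/1000 em)
declared = {36: 722, 37: 735, 38: 600}
# Invented sample data: glyph id -> advance width in font design units
advances = {36: 1479, 37: 1505, 38: 1567}

# Assumes 2048 design units per em; only glyph 38 is off by more than the tolerance.
print(find_width_mismatches(declared, advances, units_per_em=2048))
```

A check like this would not say whether the font file or the generated width table is at fault, only that the two disagree.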

What is the expected output? What do you see instead?

The expected output of the gs command should look the same as the input file. Instead, after running the PDF through gs, the layout is broken. The Ghostscript team say there's a problem with the font information stored in the PDF.

What version of dompdf are you using? What version of PHP? On what operating system?

@version $Id: dompdf_config.inc.php 486 2012-04-02 18:50:11Z fabien.menager $

Please provide the HTML source code you want to convert, or any additional information.

See attached file. Run the above command. See the output. Is there any way to fix the font?

input.pdf

Original issue: http://code.google.com/p/dompdf/issues/detail?id=563

@bsweeney
Member Author

bsweeney commented Apr 5, 2013

I don't know if it's the font itself or the information generated by php-font-lib that's causing the issue for ghostscript. You might try updating your dompdf and php-font-lib code. There are newer versions of both.

Also, with the latest code you can enable font subsetting, which will significantly decrease your file sizes. Look for the DOMPDF_ENABLE_FONTSUBSETTING setting.

@freecorvette

Switching to the latest version did not fix the font problem and introduced several other artefacts (broken images, extra lines over the text, missing text in the footer), so for now it is not an option for me.

Setting DOMPDF_ENABLE_FONTSUBSETTING to true however, which was also available in the version I'm using, did make the document a lot smaller (less than 10x smaller than without the option set, and about 4x smaller than the file produced by gs) -- therefore it fixed my original problem (having to use gs in the first place to make the document smaller). Thanks A LOT for that hint! You should post that piece of information somewhere easy to find, I spent some time searching for "dompdf large file size" and didn't come across it.

Thanks again!

@bsweeney
Member Author

bsweeney commented Apr 5, 2013

I'm not sure why you would have broken images or missing text. Could be related to your settings.

The extra lines could have to do with some font metrics issues we've encountered (see issue #609). Can you post the original HTML so we can do some testing when we look into this issue?

Glad that font subsetting helped. I have run across some occasional oddities related to it (e.g. the new PDF reader built in to Firefox as of v19 has trouble with fonts embedded by dompdf), but it seems to be working quite well otherwise.

@freecorvette

Brian,

I ran some more tests today and got different results on different runs. The relevant findings are below:

  • the extra line was actually a font underline which at some point was showing over the actual text instead of below it. I couldn't reproduce it lately so for now let's forget about it
  • the missing text in the footer was due to not having inline PHP enabled; enabling the setting fixed it, so again, false alarm
  • the broken images are something I'm getting consistently. The images themselves aren't actually broken; parts of them look scratched. Take a look at the attached files, scratched_images_1 through 3.pdf, on the last page. I got them on different runs and, as you can see, some of them look worse than others, but all show some sort of "erased areas" effect on the images
  • finally, enabling the font subsetting did create some spacing issues, see subsetting_off.pdf and subsetting_on.pdf, on the last page, inside the red box, where "fără TVA" and the following "(preţ vechi" overlap when subsetting is on and the spacing inside "LiftMaster - euro" looks different, too.

I'm posting an archive with:

  • the scratched images problem (scratched*.pdf)
  • the font spacing problem (subsetting*.pdf)
  • the original HTML + PHP to generate the PDF (source.html)
  • two directories containing the image assets (images, tmp)
  • the dompdf config file (dompdf_config.inc.php)

Unfortunately, I'm only allowed to post image files, so please take the zip file from here:
http://bit.ly/10sBCa2

Thanks!

@bsweeney
Member Author

bsweeney commented Apr 9, 2013

The underline issue may be related to #609, so follow up there if you're able to reproduce the problem.

The spacing issue is ... odd. The character appears to be there. If you copy the line it copies correctly. I don't know if maybe the glyph is missing or something else is going on. But in the latest version of dompdf I'm not seeing the problem, so whatever the issue was it looks like it has been addressed.

I have been unable to reproduce the image issue. Since the image is a GIF, it may be due to the conversion to PNG for placement in the PDF. You might try converting these images to PNG yourself and see if that helps.

@freecorvette

Changing the image from GIF to PNG did not help. Also, oddly enough, regenerating the same PDF produced different artefacts in the images. Meanwhile, the guys at Ghostscript fixed the issue with the font spacing, but I can't get gs to compile under CentOS 5.9 (needs newer libraries), so I still don't have a perfect solution to the original problem...
