Text overlap and Overshoot in html #108

PunkBuster opened this Issue · 7 comments


I've been converting PDFs to HTML to speed up their rendering in the browser, but I want the output to look exactly the same as the PDF. I converted many PDFs and it worked well until my friend tried the PDF posted below.
When this PDF is converted, the text is somehow ruined in some places.

Please advise on how to fix it.



Confirmed. Here's the HTML from my machine using the latest master.

The problems are:

  • p4 - large "Be" text too low
  • p5 - large "abg" text has wrong spacing
  • p8 - missing/incorrectly placed hyperlink text
  • p12 - vertical text has wrong spacing

Also: p6, p7, p12, p13 - overlapping text due to clipping path (this is a known issue, see #39)

@coolwanglu, any thoughts on what's going wrong?


@coolwanglu the problems I found are all regressions. I tried it with 3ed576b and it works fine except for the vertical text on p12. Maybe a problem with the newer state tracking code?


I've disabled a recently added function.
I can confirm the missing links (p8) and the wrong letter spacing for vertical text (p12).
I also observe a number of "boxes" that are supposed to be invisible.


@PunkBuster Is it possible for you to tell me the password, or provide me with a decrypted version? I need to inspect the links, which are not typical annotation links.


I think I've fixed the formatting problem. It was in the line merging procedure, which had been very wrong...
Please try the latest master branch.

I'm going to deal with the blank boxes.


@jahewson The missing links seem to be Widget Annotations. Maybe we can support them in the future.


@PunkBuster Actually I didn't solve the overlapping text. There are two parts:

  • Text with clipping path, as John mentioned, a duplicate of #39
  • Text covered by images, similar to #39 but harder.

But other problems should have been fixed. Please reopen the issue if the problem still exists.

@coolwanglu closed this