Text overlap and Overshoot in html #108

Closed
PunkBuster opened this Issue · 7 comments

3 participants

@PunkBuster

Hi,
I've been trying to convert PDFs to HTML to speed up their rendering in the browser, but I want the output to look exactly the same as the original. I converted many PDFs and it worked well until a friend tried the PDF posted below.
When this PDF is converted, the text overlaps or overshoots in some places.

Please advise on how to fix this.

Thanks

https://www.dropbox.com/s/b6gy9n1l4de261y/f.pdf

@jahewson

Confirmed. Here's the HTML from my machine using the latest master.

The problems are:

  • p4 - large "Be" text too low
  • p5 - large "abg" text has wrong spacing
  • p8 - missing/incorrectly placed hyperlink text
  • p12 - vertical text has wrong spacing

Also: p6, p7, p12, p13 - overlapping text due to a clipping path (this is a known issue, see #39)

@coolwanglu, any thoughts on what's going wrong?

@jahewson

@coolwanglu the problems I found are all regressions: I tried it with 3ed576b and it works fine except for the vertical text on p12. Maybe it's a problem with the newer state tracking code?

@coolwanglu
Owner

I've disabled a recently added function.
I can confirm the missing links (p8) and the wrong letter spacing for vertical text (p12).
I also see a number of "boxes" that are supposed to be invisible.

@coolwanglu
Owner

@PunkBuster Is it possible for you to tell me the password, or provide me with a decrypted version? I need to inspect the links, which are not typical annotation links.

@coolwanglu
Owner

I think I've fixed the formatting problem; it was in the line merging procedure, which has been very wrong...
Please try the latest master branch.

I'm going to deal with the blank boxes.

@coolwanglu
Owner

@jahewson The missing links seem to be Widget Annotations. Maybe they will be supported in the future.

@coolwanglu
Owner

@PunkBuster Actually I didn't solve the overlapping text. There are two parts:

  • Text with a clipping path, as John mentioned, a duplicate of #39
  • Text covered by images, similar to #39 but harder.

But the other problems should be fixed. Please reopen the issue if any of them still exist.

@coolwanglu closed this