
Conversion Issues

Tyler Clemens edited this page Nov 6, 2013 · 13 revisions

Capitalization of letters is off at times


*SS taken from MFT-TEST-ASSEMBLED-LINKED-RGB.pdf

Word selection is broken due to spans that are inserted to correct for kerning

pdf2htmlEX applies a width and a margin to spans to correct for kerning.


*SS taken from MFT-TEST-ASSEMBLED-LINKED-RGB.pdf

Word selection is broken due to lack of spaces at the end of divs

This is not a problem on the Kindle.


*SS taken from MFT-TEST-ASSEMBLED-LINKED-RGB.pdf

Word selection is broken due to placement of divs


*SS taken from MFT-TEST-ASSEMBLED-LINKED-RGB.pdf

Justification of text is slightly off, making the text look ragged right

This happens because we use the optimize text command-line option to remove some spans that interfere with word selection. Optimize text reduces the number of spans in a line and adjusts the letter spacing and word spacing of the entire line to compensate for this reduction. It's an imperfect approximation.
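
To illustrate why a single per-line value cannot reproduce every per-gap adjustment, here is a minimal Python sketch. It is not pdf2htmlEX's actual algorithm: the function names, the averaging strategy, and the sample numbers are all assumptions made for illustration.

```python
def uniform_letter_spacing(gap_offsets):
    """Collapse per-gap offsets into one value for the whole line (their mean)."""
    return sum(gap_offsets) / len(gap_offsets)


def glyph_positions(advances, gap_offsets):
    """Left edge of each glyph, given its advance width and the offset after it."""
    positions, x = [], 0.0
    # The last glyph has no gap after it, so pad the offsets with 0.0.
    for advance, offset in zip(advances, gap_offsets + [0.0]):
        positions.append(x)
        x += advance + offset
    return positions


# Hypothetical line: four glyphs of equal width, one tightly kerned pair.
advances = [5.0, 5.0, 5.0, 5.0]
original_gaps = [0.0, -1.5, 0.0]

spacing = uniform_letter_spacing(original_gaps)
exact = glyph_positions(advances, original_gaps)
approx = glyph_positions(advances, [spacing] * len(original_gaps))

# Difference between approximated and original glyph positions.
drift = [a - e for a, e in zip(approx, exact)]
```

In this sketch the first and last glyphs land where they did originally, but the glyphs in between shift, which is the kind of error a per-line approximation introduces.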


*SS taken from Generation Kill.pdf

Text selection is broken because a word is split by a space in a span

pdf2htmlEX guesses when to insert a space in its offset spans. The guess is based on the width of a space and the kerning of characters. If a false positive occurs, a word will be broken by a space character.
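
A width-based guess of this kind can be sketched in Python as follows. This is an assumed heuristic, not the pdf2htmlEX implementation: `guess_spaces`, the `ratio` threshold, and the gap values are hypothetical.

```python
def guess_spaces(chars, gaps, space_width, ratio=0.5):
    """Join chars, inserting a space wherever the gap before a character
    is at least ratio * space_width.

    gaps[i] is the horizontal gap between chars[i] and chars[i + 1].
    """
    if not chars:
        return ""
    out = [chars[0]]
    for ch, gap in zip(chars[1:], gaps):
        if gap >= ratio * space_width:
            out.append(" ")  # gap looks wide enough to be a space
        out.append(ch)
    return "".join(out)


# A real word boundary: the gap equals a full space width.
two_words = guess_spaces(list("abcd"), [0.0, 2.5, 0.0], space_width=2.5)

# A false positive: a wide gap inside one word crosses the threshold.
split_word = guess_spaces(list("Wave"), [1.6, 0.1, 0.0], space_width=2.5)
```

Here `two_words` is `"ab cd"` as intended, while `split_word` becomes `"W ave"`. Raising the threshold would avoid this false positive but would risk missing narrow real spaces.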


*SS taken from Fire-in-My-Belly-TEST-RGB-LINKED.pdf

Text selection is broken because a word is split by a space character

pdf2htmlEX guesses when to insert spaces between characters when it reduces spans with optimize text. The guess is based on the width of a space and the kerning of characters. When the guess is a false positive, an extra space appears in the text output, sometimes breaking up words.


*SS taken from GS-26-pdftk.pdf

PDFs Referenced
