coolwanglu edited this page Jan 31, 2013 · 12 revisions

pdf2htmlEX started out as little more than a toy.

When my friend Hongliang Tian complained that there were no handy tools for presenting PDF documents online, I replied without a second thought, "Why not convert them to HTML directly?", and took that as a challenge.

Hongliang wanted all the text from a PDF to be available on the web page. At that time he had made a poppler-based tool that produced PDF information as JSON, which could be further processed and rendered with JavaScript. That's where I started reading poppler and the PDF spec; I knew nothing about either of them at that time.

Before long I realized that fonts were the big boss of this stage. Thanks to FontForge I didn't have to go too deep into the TTF spec (though I still had to read it once), but I had to learn about fonts from zero, and fight against FontForge's coding style (not bad, just unfamiliar) and its bugs.

The rest was not so hard, as I had a little background in C/C++/HTML/CSS/JavaScript/Python. It was actually a chance to think about and compare these languages.

  • Lu Wang

I'd like to thank Wanmin Liu for helping to test and promote this tool, and of course all the contributors to this one-man spare-time project. I'm quite happy to see that it's useful to others.
