A PDF to HTML converter

pdf2htmlEX


A beautiful demo is worth a thousand words:

Browser requirements

Introduction

pdf2htmlEX renders PDF files in HTML using modern Web technologies. It aims to provide accurate rendering while staying optimized for Web display.

pdf2htmlEX works best for text-based PDF files, for example scientific papers with complicated formulas and figures. Text, fonts, and formatting are preserved natively in HTML, so you can still search and copy text. The generated HTML file is static, with optional features powered by JavaScript.
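As a rough illustration of how a conversion might be scripted, here is a minimal Python sketch that shells out to the `pdf2htmlEX` binary. The helper names `build_command` and `convert` are hypothetical and not part of pdf2htmlEX. The sketch assumes the binary is installed and on `PATH`, that it takes the input PDF as a positional argument, and that it accepts a `--zoom` option as described in the upstream usage text. Check `pdf2htmlEX --help` for the options your build actually supports.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(pdf_path: str, zoom: Optional[float] = None) -> List[str]:
    """Assemble the argument list for a pdf2htmlEX call (hypothetical helper)."""
    cmd = ["pdf2htmlEX"]
    if zoom is not None:
        # --zoom is assumed from the upstream usage text; verify with --help.
        cmd += ["--zoom", str(zoom)]
    cmd.append(pdf_path)
    return cmd


def convert(pdf_path: str, zoom: Optional[float] = None) -> None:
    """Run the conversion; raises if the binary is missing or exits non-zero."""
    if shutil.which("pdf2htmlEX") is None:
        raise FileNotFoundError("pdf2htmlEX is not on PATH")
    subprocess.run(build_command(pdf_path, zoom), check=True)
```

Calling `convert("paper.pdf", zoom=1.3)` would then run the converter on `paper.pdf`. Building the argument list separately from running it keeps the command easy to inspect or log before anything is executed.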

Learn more about who should use pdf2htmlEX and why

Features

  • Precise and native text in HTML
  • Flexible output
  • Moderate size
  • More PDF features that you love: links, outlines, and printing

Learn more
Compare with others

Wiki Portals

LICENSE

GPLv3 with additional terms (see below) for most parts; MIT License for share/*

Read LICENSE for more detail.

For Online Services

You are free and welcome to modify pdf2htmlEX for your online services, but you should credit pdf2htmlEX if your service involves "online conversion" facilitated by pdf2htmlEX. You are also encouraged to send me a name and a URL for statistical purposes.

Read LICENSE for more detail.

Resources

Acknowledgements

pdf2htmlEX is made possible thanks to the following projects:

pdf2htmlEX is inspired by the following projects:

  • pdftops & pdftohtml from poppler
  • MuPDF
  • PDF.js
  • Crocodoc
  • Google Docs

Special Thanks

  • Hongliang Tian
  • Wanmin Liu