jojolebarjos/pdf2htmlEX-webservice

pdf2htmlEX webservice container

pdf2htmlEX is a precise PDF to HTML converter. Since the original repository is no longer maintained, many forks have appeared. This container is based on an up-to-date fork, whose author has also published a Dockerfile.

This container provides a small web service (~54 MB).

Usage

docker build -t jojolebarjos/pdf2htmlex .
docker run -d -p 8080:8080 jojolebarjos/pdf2htmlex

HTTP POST /convert
file: <my.pdf>

<!DOCTYPE html>
<!-- Created by pdf2htmlEX (https://github.com/coolwanglu/pdf2htmlex) -->
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <meta charset="utf-8"/>
...
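
The request above can be sketched as a small Python client using only the standard library. This is an illustration under assumptions, not the service's documented API: it assumes the PDF is uploaded as multipart/form-data in a form field named `file` (as the `file: <my.pdf>` line suggests), that the container is reachable at `http://localhost:8080` (from the `docker run` port mapping), and that the response body is the converted HTML. The helper names `build_multipart` and `convert_pdf` are hypothetical.

```python
import uuid
import urllib.request


def build_multipart(field_name, filename, payload, content_type="application/pdf"):
    """Encode a single file as a multipart/form-data body.

    Returns a tuple (body_bytes, content_type_header).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def convert_pdf(pdf_path, base_url="http://localhost:8080"):
    """POST a PDF to /convert and return the response body as text.

    Assumption: the server expects the upload in a form field named "file".
    """
    with open(pdf_path, "rb") as handle:
        payload = handle.read()
    body, header = build_multipart("file", "document.pdf", payload)
    request = urllib.request.Request(
        f"{base_url}/convert",
        data=body,
        headers={"Content-Type": header},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Example call (requires the container to be running):
#     html = convert_pdf("my.pdf")
```

If the server expects a different field name or encoding, only the arguments passed to `build_multipart` need to change.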

ToDo list

  • Probably needs poppler data (--poppler-data-dir="")
  • Properly handle errors in the server
  • Add a version endpoint