An open-source version of pdfconvert.me
Perl Python Shell Elixir
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
bin
etc
LICENSE
README.md

README.md

pdfconvertme-public

A simplified, open-source version of pdfconvert.me for self-hosting in environments where privacy/security is a concern.

By default, it will generate the PDF but not return it via email. To change this, make sure you have a working MTA and edit bin/pdfconvertme.pl, set $email_result = 1

Required packages (available in Ubuntu 12.04/14.04):

  • libtext-markdown-perl
  • libhtml-fromtext-perl
  • libemail-mime-perl
  • libfile-slurp-perl
  • libxml-feed-perl
  • mpack
  • poppler-utils (for pdf2text conversions)
  • pandoc (for pdf2word conversions)
  • A copy of a compiled wkhtmltopdf binary (I have used this version in the past, but now compile my own as described here)

The following Perl modules need to be installed from CPAN:

For attachment conversions:

  • libreoffice (requires universe in sources.list)
  • libgxps-utils (for .xps files)
  • imagemagick (for images)

Other optional packages:

  • a2ps (raw text2pdf, no HTML-ification) - not enabled by default

Installation/Usage:

  • Copy bin/ and etc/ to /usr/local/
  • Pipe an email message to pdfconvertme.pl over stdin (I use procmail)
  • Options to pdfconvertme.pl:
    • --no-headers - Don't include email headers at top of converted PDF (From/To/Subject/Date)
    • --convert-attachment - Attempt to convert the first valid attachment found in an email
    • --force-url - Convert the first URL found in a message body
    • --force-rss-url - Treat the first URL found in a message body an an RSS feed and convert the body found in the fist entry
    • --force-content-url - Only retrieve the main content of the URL, similar to Readability, implies --no-headers
    • --no-javascript - Will perform web conversion requests with javascript disabled
    • --papersize <size> - Change the papersize on the resulting PDF (defaults to A4)
    • --pdf-to-text - Attempts to convert a PDF back to text only
    • --pdf-to-word - Attempts to convert a PDF to Microsoft Word format
    • --force-from <address> - Force the response to come from a specific address
    • --force-markdown - Treat inline text as Markdown, implies --no-headers
    • --blurb-include-orig - Include the body of the original message in the response body