Thai PDF to text script
.gitignore
LICENSE
README.md
adjust_text_from_pdf.rb
mark_invalid_char.rb
pdf2txt_th.sh

pdf2txt_th

A script that converts a PDF file to plain text and corrects some of the Thai text that comes out wrong.

History

Drafts of the Thai constitution are published online in PDF only, which makes them hard to process and search. Arthit Suriyawongkul therefore wrote the first version.

The original article is here.

Prerequisite

  • pdftotext command (from XPDF)
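
Before running the script, it may help to confirm that the prerequisite is installed. The following is a minimal sketch, not part of this repository; the helper name `have_cmd` is hypothetical, and it only assumes a POSIX shell.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository):
# succeeds if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether the pdftotext prerequisite is available.
if have_cmd pdftotext; then
    echo "pdftotext found: $(command -v pdftotext)"
else
    echo "pdftotext not found; install Xpdf or another package that provides it" >&2
fi
```

If the check reports that pdftotext is missing, install it first and then run the script as shown in the usage example.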

Usage example

./pdf2txt_th.sh

Authors

  • The original version, written in Python, was created by Arthit Suriyawongkul
  • Vee Satayamas