A Ruby wrapper around the command-line tool antiword, which converts Word documents (97/2003) to text or DocBook.
Just antiword, Ruby (1.8.6+ as far as I know) and a few gems.
gem install antiwordr
require 'antiwordr'
require 'nokogiri'
file = DocFilePath.new([Path to Source Word Document])
string = file.convert()
xml = file.convert_to_docbook()
doc = file.convert_to_docbook_document()
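To show roughly what a wrapper like this does, here is a minimal sketch that calls antiword directly using only Ruby's standard library. It is not the gem's actual implementation. The helper names antiword_command and run_antiword are hypothetical, and the sketch assumes antiword is on the PATH and that its "-x db" option selects DocBook output while the default output is plain text.

```ruby
require 'open3'

# Hypothetical helper (not part of antiwordr): builds the antiword argument list.
# format is :text (antiword's default output) or :docbook (assumed to map to "-x db").
def antiword_command(path, format: :text)
  case format
  when :text    then ['antiword', path]
  when :docbook then ['antiword', '-x', 'db', path]
  else raise ArgumentError, "unsupported format: #{format.inspect}"
  end
end

# Hypothetical helper: runs antiword without a shell and returns its standard output.
# Raises Errno::ENOENT if the antiword binary is not installed.
def run_antiword(path, format: :text)
  output, status = Open3.capture2(*antiword_command(path, format: format))
  raise "antiword exited with status #{status.exitstatus}" unless status.success?
  output
end

# Example (requires antiword and a real .doc file):
#   text = run_antiword('report.doc')
#   xml  = run_antiword('report.doc', format: :docbook)
```

Passing the command as an argument list rather than a single string keeps file names with spaces or shell metacharacters from being interpreted by a shell.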
See included test cases for more usage examples.
MIT (See included MIT-LICENSE)