ConvertOtherFormatsToDocBook

Norman Walsh edited this page Oct 1, 2015 · 1 revision

A variety of free and commercial tools exist for doing "up conversion" of non-XML formats to DocBook. The SGML Buyer's Guide calls these types of conversion tools N-converters (short for "Non-SGML/XML source document converters").

  • doclifter, converts troff documents (e.g. man pages); from Eric Raymond
  • DocBook Doclet, converts HTML and generates DocBook from Java source code
  • Html2DocBook, converts XHTML
  • table.el, converts ASCII tables to CALS (DocBook) and HTML tables
  • wt2db, converts plain text (Wiki text); see the Wt2Db page at this site for more info
  • DocBookClass.py, converts StructuredText
  • AsciiDoc, converts text documents to DocBook articles, books and man pages.
  • txt2docbook, converts plain text
  • The makeinfo tool converts GNU Texinfo documents to a variety of formats, including DocBook
  • texi2db, converts Texinfo; see the Texi2Db page at this site for more info
  • A method for going from LaTeX to Texinfo is described at http://mail.gnu.org/pipermail/help-texinfo/2002-December/000851.html. Combined with either of the two Texinfo tools above, this gives a conversion path from LaTeX to DocBook.
  • TeX4ht can transform LaTeX documents to DocBook. The command for this is dblatex.
  • ooo2sdbk, converts OpenOffice Writer documents to Simplified DocBook (site in French)
  • SO-to-DocBook converts StarOffice/OpenOffice.org Writer documents to DocBook; written in Common Lisp
  • MajiX, converts Microsoft Word documents to Simplified DocBook. Version 2.0 released 2004-10-29
  • Logictran R2Net, converts RTF files to DocBook, XHTML, and OEB format (commercial)
  • upCast converts Microsoft Word documents to XML. It comes with a filter (an XSLT stylesheet) that outputs the document as a DocBook article. upCast is commercial, but a free version is offered for non-commercial use. The free non-commercial license is somewhat hidden on the website: go to the Licenses page and scroll down to the "Private Licences" heading.
  • Wordplay is a Word-to-XML converter. It comes with an XSLT stylesheet for exporting DocBook-compliant XML from Word (commercial).
  • YAWC is a plug-in for Microsoft Word (97, 2000 and XP) that can generate XML according to any XML DTD. YAWC ships with support for the Simplified DocBook XML DTD, but any other DTD can be supported with a small amount of configuration work. Pro (commercial) and Lite (free) versions available.
  • x4o is a plug-in for Microsoft Word (97, 2000 and XP) that can generate XML according to any XML DTD. Commercial.
  • ROBODoc generates DocBook from documentation headers extracted from source files (these can be in C, Perl, Lisp, TclTk, C++, FORTRAN, and many other languages).
  • Exegenix converts any file that can be printed to PostScript into DocBook-based XML.
  • LyX can export a LyX document of the "DocBook SGML article/book" document class to DocBook. See also Document processing with LyX and SGML.
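
As an illustration of how two of the command-line tools listed above might be driven from a script, here is a minimal Python sketch. It assumes that makeinfo and asciidoc are installed and on the PATH, and that their usual options apply (`makeinfo --docbook` and `asciidoc -b docbook`, each with `-o` for the output file). The `CONVERTERS` table and the `build_command` and `convert` helpers are hypothetical names made up for this example; they are not part of either tool.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical mapping for this sketch: source suffix -> command prefix.
# The suffix choices are an assumption; adjust them to your own files.
CONVERTERS = {
    ".texi": ["makeinfo", "--docbook"],      # GNU Texinfo -> DocBook
    ".txt": ["asciidoc", "-b", "docbook"],   # AsciiDoc text -> DocBook
}


def build_command(source):
    """Return the argument list that would convert `source` to DocBook XML."""
    src = Path(source)
    if src.suffix not in CONVERTERS:
        raise ValueError(f"no converter configured for {src.suffix!r}")
    # Write the result next to the source, with an .xml suffix.
    output = src.with_suffix(".xml")
    return [*CONVERTERS[src.suffix], "-o", str(output), str(src)]


def convert(source):
    """Run the converter for `source` and return the output path."""
    cmd = build_command(source)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} is not installed or not on PATH")
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(cmd[cmd.index("-o") + 1])


if __name__ == "__main__":
    # Only print the command here, so the sketch runs without the tools.
    print(build_command("manual.texi"))
```

Keeping `build_command` separate from `convert` makes it possible to inspect the command line before anything is executed, which is useful because the available options can differ between versions of these tools.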