This is, essentially, a bash wrapper around a custom pandoc writer and template, a simple regular expression (using sed), and an XSL script. It will convert a markdown file to an XML file conforming to the TEI Lite standard. Issues and pull requests are welcome.


In order to run, this script depends on:

Header Fields

For now, this script recognizes a limited subset of the elements of a TEI header. Each one is translated into a field in the tei-lite.template file using the pandoc template system. (Links below will take one to the documentation for TEI Lite.) The fields currently implemented favor metadata related to document transcription: they cover the author and title of the electronic file, a bibliographic citation of its source, a list of editors, and information about sources.

Currently, it requires only:

  • title: A title for the document. (For the titleStmt.)
  • author: At least one author. Each author's name is stored as two variables: forename and surname. (For the titleStmt.)
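
As an illustration, a minimal YAML header with only the required fields might look like the sketch below. The names and title are placeholders, and the nesting of forename and surname under each author entry is an assumption drawn from the description above, not a confirmed schema.

```yaml
---
# Required: a title for the document (used in the titleStmt).
title: A Sample Transcription
# Required: at least one author. Nesting forename/surname under each
# list entry is an assumption based on the field description.
author:
  - forename: Jane
    surname: Doe
---
```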

Additionally, it also recognizes the optional fields:

  • editor: One or more "editors."
  • publicationStmt: Some prose describing the publication/distribution, contained in the publicationStmt. If no publicationStmt is provided, the template simply inserts "Generated by pandoc."
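
A hedged sketch of the optional fields follows. The publicationStmt value is placeholder prose. The shape of each editor entry is not specified here, so the forename/surname layout (mirroring author) is purely an assumption and should be checked against tei-lite.template.

```yaml
# Optional: one or more editors. The forename/surname structure is an
# assumption mirroring the author field; the template may expect otherwise.
editor:
  - forename: John
    surname: Roe
# Optional: prose for the publicationStmt. If omitted, the template
# inserts "Generated by pandoc."
publicationStmt: Transcribed and distributed as an example document.
```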

The following (optional) fields are all stored as part of a bibliographic entry (bibl) under the source description (sourceDesc).

  • citation.title: Stored as <title level='a'>, that is, as an analytic title.
  • citation.container-title: For works (essays, articles, etc) which originally appeared as part of a larger work, container-title contains the name of the larger work. It is stored in the TEI header as <title>.
  • A date, presumably of publication. Format is not specified.
  • citation.publisher: A publisher.
  • citation.publisher-place: Place of publication, stored as pubPlace.
  • A page range, stored as biblScope.
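
The dotted names above follow pandoc's template convention, in which citation.title refers to the title key nested under citation. A sketch of that nesting, with placeholder values, could look like this. The date and page-range fields are left out because their field names are not given above.

```yaml
citation:
  # Analytic title, emitted as <title level='a'>.
  title: An Essay on Examples
  # Name of the larger work in which the piece originally appeared.
  container-title: Collected Essays
  publisher: Example Press
  # Place of publication, emitted as pubPlace.
  publisher-place: London
```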

Any sources used for a document or transcription can be described as one or more sources. These will be stored in a list.

Finally, one can describe the source for a document in unstructured prose in the citation.note field, which is converted to a <p> under the sourceDesc.

Additional metadata fields in the YAML header will simply be ignored. There is currently no validation done on the header, so invalid field names or other problems will simply be passed over (unless they generate a YAML error). In principle, anything possible in a TEI Lite header should be capable of being represented in YAML.

