This is essentially a bash wrapper around a custom pandoc writer and template, a simple regular expression (using sed), and an XSL script. It converts a markdown file to an XML file conforming to the TEI Lite standard. Issues and pull requests are welcome.
In order to run, this script depends on:
For now, this script recognizes a limited subset of elements for a TEI header. These are all translated into fields in the tei-lite.template file using the pandoc template system. (Links below will take one to the documentation for TEI Lite.) The fields currently implemented privilege metadata related to document transcription: they cover the author and title of the electronic file, a bibliographic citation of its source, a list of editors, and information about sources.
Currently, it requires only:
- title: A title for the document. (For the titleStmt.)
- author: at least one author. Each author's name is stored as two variables:
It also recognizes the following optional fields:
- editor: One or more "editors."
- publicationStmt: Some prose describing the publication/distribution, contained in the publicationStmt. If no publicationStmt is provided, the template simply inserts "Generated by pandoc."
- citation.title: Stored as <title level='a'>, that is, as an analytic title.
- citation.container-title: For works (essays, articles, etc.) which originally appeared as part of a larger work, container-title contains the name of the larger work. It is stored in the TEI header as
- citation.date: A date, presumably of publication. Format is not specified.
- citation.publisher: A publisher.
- citation.publisher-place: Place of publication, stored as pubPlace.
- citation.page: A page range, stored as biblScope.
Any sources used for a document or transcription can be described as one or more sources. These will be stored in a list.
Finally, one can describe the source for a document in unstructured prose in the citation.note field, which is converted to a <p> under the sourceDesc.
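
As a rough illustration of the fields described above, the following sketch writes a markdown file with a YAML header. The file name `example.md` and every value in it are invented for illustration. The nesting under `citation` assumes pandoc's usual convention that a dotted template variable such as citation.title refers to a key inside a YAML map. The author entry is left as a comment because the names of its two sub-fields are not given here, so this header is incomplete until an author is added.

```shell
#!/bin/sh
# Write a sample markdown file (hypothetical name: example.md) whose YAML
# header uses the fields described above. All values are placeholders.
cat > example.md <<'EOF'
---
title: "A Sample Transcription"
# author: required; add at least one author here (sub-field names not shown)
publicationStmt: "Transcribed and distributed for classroom use."
citation:
  title: "An Essay"
  container-title: "Collected Essays"
  date: "1890"
  publisher: "Example Press"
  publisher-place: "London"
  page: "12-34"
  note: "Transcribed from a library copy."
---

Body text of the document goes here.
EOF
```

Leaving out publicationStmt should, per the description above, cause the template to fall back to its default text.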
Additional metadata fields in the YAML header will simply be ignored. There is currently no validation done on the header, so invalid field names or other problems will simply be passed over (unless they generate a YAML error). In principle, anything possible in a TEI Lite header should be capable of being represented in YAML.
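
Because misspelled or missing fields are passed over silently, a small check before conversion can catch the most obvious problem: a missing required field. The sketch below is not part of this script; `check_header` is a hypothetical helper that only looks for top-level `title:` and `author:` keys in the first `---` delimited block, and it does not parse YAML properly.

```shell
#!/bin/sh
# check_header FILE
# Hypothetical helper (an assumption, not part of the wrapper script).
# Returns 0 if the leading YAML block has top-level title: and author: keys.
check_header() {
  file=$1
  # Collect the lines between an opening '---' on line 1 and the next
  # '---' or '...' line; prints nothing if the file has no such header.
  header=$(awk '
    NR == 1 && $0 == "---"                { in_h = 1; next }
    in_h && ($0 == "---" || $0 == "...")  { exit }
    in_h                                  { print }
  ' "$file")
  status=0
  for key in title author; do
    if ! printf '%s\n' "$header" | grep -q "^${key}:"; then
      echo "missing required field: $key" >&2
      status=1
    fi
  done
  return $status
}
```

It could be called as `check_header myfile.md || exit 1` before handing the file to the conversion step.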