E. F. Haghish edited this page Jun 11, 2019 · 82 revisions

MarkDoc is a general-purpose literate programming package for Stata. It is simple and intuitive to use, and it supports creating dynamic documents interactively. The software focuses on making literate programming easy for beginners. It also values the readability of the source code and therefore provides several options for keeping the source code as plain as possible. As a result, MarkDoc can be taught to undergraduate students in introductory statistics courses to boost active learning, document code, and practice statistical reporting. In my personal experience of teaching statistics, students enjoy taking notes in their script files and writing dynamic documents.

Students are not the only ones who benefit: lecturers can use MarkDoc to create dynamic presentation slides directly from Stata, which makes the slides easy to create, update, and reuse. Finally, advanced Stata programmers can use MarkDoc to generate Stata help files (sthlp) or a PDF package vignette from their source code.

Resources

MarkDoc package vignette (PDF)

Journal Article

Examples

Torture tests

Release notes

MarkDoc engine structure

Need help? Ask your questions on statalist.org

Need more help? Contact the author to plan a workshop in your department or company


Features

MarkDoc has several unique features that make it an ideal package for practicing literate programming at any level, from complete beginner to advanced programmer. In brief, it can:

  • Highlight the syntax of Stata commands in HTML and PDF
  • Produce dynamic documents in several formats by:
    • converting a smcl log file to any of the supported formats
    • actively reproducing a dynamic document from a do-file
  • Produce presentation slides in PDF or HTML
  • Produce Stata package help files (sthlp) or package vignette (pdf, html, latex, docx, ...) from the source code
  • Render LaTeX mathematical notations in Microsoft Word docx, OpenOffice odt, html, and pdf
  • Capture a graph automatically from Stata and include it in the dynamic document or slides
  • Include a stored image in the dynamic document
  • Create dynamic tables
  • Write dynamic text for interpreting the results
  • Specify what commands or results should be included in the dynamic document
  • Include external text files (Markdown, LaTeX, html, smcl, etc.) in the dynamic document
  • Provide a "standard" template with descriptions for creating template help files

Formats

The main idea of MarkDoc is that a single documentation format and a markup language should be able to produce a variety of document formats from the same source. For example, a graduate student should be able to produce HTML output from source code documented in Markdown and also use the same source to create a PDF analysis report, presentation slides, a LaTeX document, or a Microsoft Word document. This range of supported formats makes reusing the documentation easy, because a single source can produce all of them.

In addition to Markdown, MarkDoc recognizes three other markup languages for documentation: LaTeX, HTML, and SMCL. MarkDoc applies the same documentation format throughout and can produce:

  1. dynamic analysis document (pdf, docx, tex, html, odt, epub, markdown)
  2. Stata package vignette (pdf, docx, tex, html, odt, epub, markdown)
  3. dynamic presentation slides (pdf, slidy)
  4. Stata help files (sthlp, smcl)

MarkDoc produces these document formats in several ways. Classically, MarkDoc takes a Stata smcl log file and converts it to any of the supported formats.

In this case, MarkDoc processes the smcl log file, which has a .smcl suffix, and produces the document. The documentation in the log can be written in any of the supported markup languages: Markdown (the default), html, or latex. A smcl log file written with Markdown can be converted to pdf, docx, tex, html, odt, epub, and markdown. If the log is written in html, only pdf and html documents can be exported; if it is written in latex, only pdf and latex documents can be exported. Therefore, for maximum format compatibility, writing with Markdown is recommended over LaTeX or html.
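
As a minimal sketch of this log-based workflow, the do-file below opens a smcl log, documents the analysis in Markdown inside a `/*** ... ***/` block, closes the log, and converts it with `markdoc`. The log name `example`, the heading text, and the choice of `export(html)` are illustrative assumptions; depending on the setup, the full engine may also need third-party software such as Pandoc to be installed.

```stata
* Minimal sketch, not a definitive recipe.
* "example" is a hypothetical log name chosen for illustration.
quietly log using example, replace smcl

/***
Price and mileage
=================

This paragraph is written in Markdown inside a MarkDoc
documentation block and becomes body text in the output.
***/

sysuse auto, clear
regress price mpg

quietly log close

* Convert the smcl log to an HTML document
markdoc example, export(html) replace
```

Changing `export(html)` to another supported format, such as `export(docx)`, should produce a different document type from the same log.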


To ensure the analysis is truly reproducible, converting a smcl file to a document is not enough. Instead, a strict procedure is required to ensure that the do-file loads the dataset it uses for the analysis. In other words, the literate programming package should assume that no dataset is loaded in Stata and execute the do-file in a cleared workspace. When MarkDoc is given a Stata do-file, it executes the script file in a new workspace and produces the dynamic document or presentation slides in any of the supported formats.
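
The do-file route can be sketched in the same way. Assuming a hypothetical script named `report.do` that loads its own data, a call along the lines shown in the final comment would ask MarkDoc to execute the script and export the result. The file name, its contents, and the `docx` target are assumptions made for illustration.

```stata
* --- contents of report.do (hypothetical file name) ---
* The script loads its own dataset, so it does not depend on
* whatever happens to be in memory when it is run.

/***
Fuel efficiency report
======================

Summary statistics for the auto dataset that ships with Stata.
***/

sysuse auto, clear
summarize price mpg

* --- typed in the Command window, not inside report.do ---
* markdoc report.do, export(docx) replace
```

Because the script begins by loading its data, it should produce the same document regardless of what was in memory beforehand.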


In addition, MarkDoc can create a dynamic document, presentation slides, or package documentation from Stata script files with the .do, .ado, or .mata suffix. As before, the documentation can be written in Markdown, html, or latex. For exporting Stata help files (.sthlp), however, only Markdown, smcl, or a combination of the two can be used. While html and latex can be used for creating package vignettes, it is more practical to write the documentation by combining smcl and Markdown: this greatly simplifies writing Stata help files while keeping the flexibility of the smcl language when it is needed.
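
For help files, a rough sketch might look like the following. The program `myprog`, its documentation text, and the file name `myprog.ado` are hypothetical, and the exact header layout that MarkDoc expects for help files is not reproduced here; the "standard" help-file template listed under Features describes it.

```stata
* --- contents of myprog.ado (hypothetical program) ---

/***
Description
===========

__myprog__ prints a short greeting. This text is written in Markdown
and could be mixed with smcl directives where finer control is needed.
***/

program define myprog
    version 14
    display "hello from myprog"
end

* --- typed in the Command window to build the help file ---
* markdoc myprog.ado, export(sthlp) replace
```

Keeping the documentation next to the program means the help file can be regenerated whenever the source changes.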

Dialog box

To further facilitate using MarkDoc in classrooms, a dialog box was written for Stata, which also shows the options and features of MarkDoc. The dialog box currently supports only the dynamic document engine of MarkDoc.

To use the dialog box, type:

db markdoc

Recently, a new engine was developed for MarkDoc that runs independently of any third-party software. To launch the GUI for the new engine, which is called mini, type:

db mini

You can read more about the dialog box here.