An implementation of markdown in C, using a PEG grammar


What is this?

This is an implementation of John Gruber's "markdown" in C.
It uses a PEG grammar to define the syntax. This should allow easy
modification and extension.

It is pretty fast. A 179K text file that takes 5.7 seconds for Gruber's
reference implementation (v. 1.0.1) to parse takes only 0.2 seconds for this
markdown.  It does, however, use a fair amount of memory.


This program is written in portable ANSI C. For convenience, two
required dependencies are included in the source directory:

  * bsittler's my_getopt option parsing library

  * Ian Piumarta's peg/leg PEG parser generator

These will be built automatically.

To make the 'markdown' executable, run `make` in the source directory.


Then, for usage instructions:

    ./markdown -h

To run John Gruber's Markdown 1.0.3 test suite:

    make test

The test suite will fail on one of the list tests.  Here's why. Gruber's
reference implementation encloses "item one" in the following list in `<p>` tags:

    1.  item one
        * subitem
        * subitem
    2.  item two

    3.  item three

peg-markdown does not enclose "item one" in `<p>` tags unless it has a
following blank line. This is consistent with the official markdown
syntax description, and lets the author of the document choose whether
`<p>` tags are desired.


peg-markdown supports extensions to standard markdown syntax.
These can be turned on using the command line flag `-x`.  `-x`
by itself turns on all extensions; to turn on extensions selectively,
specify their names after `-x`, for example: `-xsmart`.

The `smart` extension provides "smart quotes", dashes, and ellipses.

The `notes` extension provides a footnote syntax like that of
Pandoc or PHP Markdown Extra.


It should be pretty easy to modify the program to produce output formats
other than HTML or LaTeX, and to parse syntax extensions.  A quick guide:

  * `markdown_peg.h` contains declarations for both `markdown_parser.leg`
    and `markdown_output.c`.

  * `markdown_parser.leg` contains the grammar itself, the `markdown()`
    function, and some utility functions used by the parser actions.

  * `markdown_output.c` contains functions for printing the `Element`
    structure in various output formats.  (This includes calling
    `markdown()` again when needed to parse list items and blockquotes,
    which are stored initially as raw strings.)

  * To add an output format, add the format to `formats`, modify
    `print_element`, and add functions `print_XXXX_string`,
    `print_XXXX_element`, and `print_XXXX_element_list`. Also add an
    option in the main program that selects the new format. Don't forget
    to add it to the help message.

  * To add syntax extensions, define them in the PEG grammar (bottom part
    of `markdown_parser.leg`), using existing extensions as a guide.
    New inline elements will need to be added to `Inline =`; new block
    elements will need to be added to `Block =`. If you need to add new
    types of elements (e.g. `FOOTNOTE`), modify the `keys` enum. By using
    `&{ }` rules one can selectively disable extensions depending
    on command-line options. For example, `&{ extension(EXT_SMART) }`
    succeeds only if the `EXT_SMART` bit of the global
    `syntax_extensions` is set.  Add your option to `markdown_extensions`,
    and modify the option parsing in `markdown.c` so that your option gets
    set appropriately.

  * Note:  Avoid using `[^abc]` character classes in the grammar, because they
    cause problems with non-ASCII input.  Instead, use:  `( !'a' !'b' !'c' . )`
