

Some improvements of the markdown -> html export #5

merged 2 commits into from

4 participants


I currently use ocaml-cow's markdown export for a small project, and I tried to bring some improvements to the exported html.

  • Headings, lists, quotes and code blocks (<pre><code>...) shouldn't be wrapped in <p>...</p> blocks. The W3C HTML validator complains about that. One commit fixes this.
  • By adding two more lines following the same pattern as the existing ones, we can support h5 and h6 titles, which markdown is supposed to support (they are written with ##### and ######).
  • I added an option to the function Xml.to_string (and thus Html.to_string) to specify an indentation level. The code was already written in the Xml lib, but the indentation level was hardcoded to None in the to_string function. This helps to get pretty xml output (unfortunately, it kind of breaks html output that uses <pre> sections, by adding an extra '\n' between all content lines).
  • It seems to me (and to the people I asked) that putting the attributes of a heading anchor on the heading tag itself is much better than putting them in an empty link inside the heading (moreover, the current solution causes some css problems). Example of the change brought by the commit, for a markdown # Title: before: <h1><a name="h1-Title" class="anchor-toc"> </a> Title </h1> now: <h1 name="h1-Title" class="anchor-toc"> Title </h1>
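
To make the first two points concrete, here is a minimal standalone sketch of the intended behaviour. It is not the cow code: the `block` type and the `render` / `render_all` names are hypothetical, plain strings stand in for the camlp4 `<:html< ... >>` quotations, and no HTML escaping is done. Only plain paragraphs get `<p>...</p>`, and headings map to h1 through h6.

```ocaml
(* Hypothetical, simplified block type; the real cow Markdown AST differs. *)
type block =
  | Normal of string         (* plain paragraph text *)
  | Heading of int * string  (* level, title *)
  | Quote of block list      (* nested blocks *)
  | Pre of string            (* code block *)

let rec render = function
  (* Only plain paragraphs are wrapped in <p>. *)
  | Normal s -> "<p>" ^ s ^ "</p>"
  | Heading (n, s) ->
    (* Assumption of this sketch: out-of-range levels are clamped to 1..6. *)
    let n = max 1 (min 6 n) in
    Printf.sprintf "<h%d>%s</h%d>" n s n
  (* Block-level elements are emitted as-is, never inside <p>. *)
  | Quote bs -> "<blockquote>" ^ render_all bs ^ "</blockquote>"
  | Pre s -> "<pre><code>" ^ s ^ "</code></pre>"

(* The list renderer no longer adds <p> around every block. *)
and render_all bs = String.concat "" (List.map render bs)
```

With this, `render_all [Heading (5, "T"); Normal "text"]` yields a heading followed by a paragraph, with no `<p>` around the heading.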

Cheers !


Thanks for your patch.

Just to clarify some points:

.anchor-toc {
position: relative;
top: -50px;
}

I remember trying a similar hack with the anchors inside the section, without success. So I'm happy to accept your patch if you can provide a similar hack.

  • I'm very reluctant to break <pre> tags in the output. But I guess there is no magic here, as Xmlm is not HTML-aware, and the indent option is not mandatory.


  • Well, I'm sorry, I hadn't taken that into account. I should have thought about it before submitting the patch. I tried to find a similar hack with no empty tag, without satisfying results: I guess the current solution is the best in this case. Can you just discard the related commit?
  • I do not intend to break the <pre> tags, in fact. I added this option when I was trying to get pretty html output. I think it's fine to have it for xml output, where it is optional, but we can remove it for html output. I can write a corrective patch for that if you want. It would prevent users from outputting broken html.

With a view to improving the style of the html output with indentation, I see a few possibilities:

  • Do not touch the Xml module and do not use its indentation option, which breaks <pre> tags. To get indentation, write it manually into the html by adding "\n" and spaces at the end of the data strings. That would work, but it's kind of horrible: it would amount to redoing parts of the Xml module's indentation "algorithm" by hand.
  • Modify the Xml to_string function so that it stops adding '\n' at the end of data strings when indent is turned on. If we want a '\n' we have to add it to the data strings ourselves, so we control where the line breaks are while still having automatic indentation.
  • Implement the previous point, and add an option to specify whether we want '\n' added at the end of data lines or not.

Is it okay to merge my patches with these modifications (without commit e81f9dd and without the indent option for Html.to_string in commit 4711a8d)? What do you think about these ideas for pretty html output? (Of course I'll try to implement one as soon as you are okay with it.)



I'm happy to merge your patches (without the anchor fix), and if patching Xml.to_string is easy, let's do that. However, you can at least add a WARNING in the comments of Html.to_string to tell the user that <pre> is broken.



Ok, I tried hard, but it appears that it is in fact not possible to export html (to a string) just as if it were xml while using Xmlm's indentation system.
This is because, especially with <pre> tags, the indentation of html partially depends on semantics. More specifically, you don't want the content between <pre> tags to be indented like other content, because that would add a number of spaces (depending on the indentation level of the block) at the beginning of it. For the content between other tags, you want normal indentation.
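
To illustrate what "depends on semantics" means here, below is a minimal standalone sketch of a printer that indents everything except the content of <pre>. It does not use Xmlm or the cow Xml module: the `node` type and the `to_string` function are hypothetical names made up for this example, and attributes and escaping are ignored.

```ocaml
(* Hypothetical minimal tree; this is not the Xmlm or cow representation. *)
type node =
  | Data of string
  | El of string * node list  (* tag name, children; attributes omitted *)

let to_string ?(indent = 2) root =
  let buf = Buffer.create 64 in
  (* [raw] is true while we are inside a <pre> element. *)
  let rec go ~raw depth node =
    let pad d =
      if not raw then Buffer.add_string buf (String.make (d * indent) ' ') in
    let nl () = if not raw then Buffer.add_char buf '\n' in
    match node with
    | Data s -> pad depth; Buffer.add_string buf s; nl ()
    | El (tag, children) ->
      pad depth;
      Buffer.add_string buf ("<" ^ tag ^ ">");
      (* The printer has to know about HTML here: <pre> switches layout off. *)
      let raw' = raw || tag = "pre" in
      if not raw' then Buffer.add_char buf '\n';
      List.iter (go ~raw:raw' (depth + 1)) children;
      if not raw' then Buffer.add_string buf (String.make (depth * indent) ' ');
      Buffer.add_string buf ("</" ^ tag ^ ">");
      nl ()
  in
  go ~raw:false 0 root;
  Buffer.contents buf
```

The `tag = "pre"` test is exactly the kind of html knowledge that a generic xml printer does not have, which is why the existing indent option cannot be reused as is.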

Sooo, without deep changes in either the Xml module or the Html module, I don't see how to use the indentation system inside Xmlm to correctly indent html content.
So I guess it's fine to discard the related commit (4711a8d). I may look at that again later, but for the moment I would just like the first two commits to be merged ;).



@Armael can you please rebase/edit your pull request to reflect what we have discussed here?


@dbuenzli would be quite interested in your HTML generator actually :-)


@samoht Okay, how can I do that without breaking everything?
Do I just have to git reset --hard (or rebase) to remove the last two commits, and then push -f? Maybe git pull at some step of the process?
I'm not very comfortable with github's pull requests, in fact...


@dbuenzli thx, I'll have a look
@Armael you can do whatever you want on your branch (even force push) and that pull request will be automatically updated.


@samoht Done !



@samoht merged commit 3a812d5 into mirage:master
Showing with 5 additions and 3 deletions.
  1. +5 −3 lib/
8 lib/
@@ -409,14 +409,16 @@ and para p =
<:html<$par_text pt$<a name="$str:id_of_heading h$" class="anchor-toc">&nbsp;</a>&>>
match p with
- Normal pt -> <:html<$par_text pt$>>
+ Normal pt -> <:html<<p>$par_text pt$</p>&>>
| Html html -> <:html<<p>$html$</p>&>>
(* XXX: we assume that this is ocaml code *)
| Pre (t,kind) -> <:html<$ Code.ocaml t$>>
| Heading (1,pt) as h -> <:html<<h1>$heading_content h pt$</h1>&>>
| Heading (2,pt) as h -> <:html<<h2>$heading_content h pt$</h2>&>>
| Heading (3,pt) as h -> <:html<<h3>$heading_content h pt$</h3>&>>
- | Heading (_,pt) as h -> <:html<<h4>$heading_content h pt$</h4>&>>
+ | Heading (4,pt) as h -> <:html<<h4>$heading_content h pt$</h4>&>>
+ | Heading (5,pt) as h -> <:html<<h5>$heading_content h pt$</h5>&>>
+ | Heading (_,pt) as h -> <:html<<h6>$heading_content h pt$</h6>&>>
| Quote pl -> <:html<<blockquote>$paras pl$</blockquote>&>>
| Ulist (pl,pll) -> let l = pl :: pll in <:html<<ul>$li l$</ul>&>>
| Olist (pl,pll) -> let l = pl :: pll in <:html<<ol>$li l$</ol>&>>
@@ -428,7 +430,7 @@ and li pl =
<:html< $ aux pl$ >>
and paras ps =
- let aux p = <:html<<p>$para p$</p>&>> in
+ let aux p = <:html<$para p$&>> in
<:html< $ aux ps$ >>
let to_html ps = paras ps