Skip to content

Bad performance with larger documents #48

@osmarks

Description

@osmarks

I didn't really want to just post this without also providing a good fix, but it's causing me problems, and I lack the understanding of the large amount of code in this necessary to do much to it.

It seems that this library parses and renders Markdown quite slowly, especially on larger documents. I noticed this while testing with ~10KB documents but have mostly tried testing it on ~100KB ones, for which rendering takes about 4 seconds, while a Python CommonMark implementation seems to manage 400ms and a JavaScript one ~50ms. This is proving problematic for an application I want to use this in, and while one fix might be to switch to cmark bindings, they don't provide options to customize parsing, so I would have to reimplement a lot of logic.

By switching most of the internals away from doubly linked lists, I was able to get a roughly 25% improvement (very rough, as I just ran time over it a few times with a big test file as input) but I think bigger changes would be needed to get a significant difference; I think it needs to avoid allocating as many objects (maybe use an ADT or something instead) and to handle chunks of unformatted text better. This also breaks quite a few of the test cases (mostly only spacing in lists as far as I can tell): https://gist.github.com/osmarks/c7d8db89896047368d6512f6284cd7c2

Here is a test input file (without any formatting) and profiling output from it:

out2.txt
profile_results.txt

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions