-
-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bad performance with larger documents #48
Comments
Performance is indeed a problem. I intend to solve it in 2021. ⌨️ |
This does seem to improve performance a bit on my test case, but it still takes 3 seconds to work for some reason. I'm just using cmark-gfm for now. |
@osmarks Although there are still plenty to optimize, I'm satisfied with the current result (as of v0.8.4):
Nimprof profile result also shows a significant reduce on the number of instruction calls, 3277 from the profile_results.txt you provided v/s 323 on the one run against v0.8.4:
LMK if the performance improvement works on your side. Meanwhile, I'll keep working on improving the performance as tracked in #51. Cheers. |
I didn't really want to just post this without also providing a good fix, but it's causing me problems, and I lack the understanding of the large amount of code in this necessary to do much to it.
It seems that this library parses and renders Markdown quite slowly, especially on larger documents. I noticed this while testing with ~10KB documents but have mostly tried testing it on ~100KB ones, for which rendering takes about 4 seconds, while a Python CommonMark implementation seems to manage 400ms and a JavaScript one ~50ms. This is proving problematic for an application I want to use this in, and while one fix might be to switch to cmark bindings, they don't provide options to customize parsing, so I would have to reimplement a lot of logic.
By switching most of the internals away from doubly linked lists, I was able to get a roughly 25% improvement (very rough, as I just ran
time
over it a few times with a big test file as input) but I think bigger changes would be needed to get a significant difference; I think it needs to avoid allocating as many objects (maybe use an ADT or something instead) and to handle chunks of unformatted text better. This also breaks quite a few of the test cases (mostly only spacing in lists as far as I can tell): https://gist.github.com/osmarks/c7d8db89896047368d6512f6284cd7c2Here is a test input file (without any formatting) and profiling output from it:
out2.txt
profile_results.txt
The text was updated successfully, but these errors were encountered: