Error without stacktrace when parsing long, complex document inside a Html comment tag #49

Open
pietroppeter opened this issue Feb 18, 2021 · 3 comments


I ran into a weird behaviour that I was able to minimize in the following example:

import markdown

let text = """
## title

some text:
- one point, and a [link](to_here)
- two points and **emphasis**
- three points, _really_?
  + sub point
  + another

"""  # removing any single line or inline element (e.g. link, emphasis, ...) and error will disappear

var longText = "<!--\n" # if this is removed, the error disappears
for _ in 1 .. 30: # with fewer than 30 iterations, the error disappears
  longText &= text
longText.add "\n-->"  # this can be removed and the error will persist
echo markdown(longText)

When I run this (Nim 1.4.0, markdown #head), the program errors out without a stack trace.
If I reduce the number of iterations, or remove any line or element from text, the error disappears.

The behaviour seems to be related to a long and fairly complex (from a parsing perspective) text appearing inside an HTML comment tag (it is sufficient that the text starts with <!--).

Apart from this, I have to say this library is excellent. I have been using it extensively, and this is the first time it has failed me (not too harmful, the workaround is simple: just split the text; it was only a bit tricky to minimize the error).
I take the opportunity to thank you for the work you did with nim-markdown and also nim-mustache, which are core dependencies of something I am working on and I am about to release (hopefully) soon: https://github.com/pietroppeter/nimib


soasme commented Feb 18, 2021

Cool. Glad that the library I wrote has become a building block of your library. I have been working on another project recently and have had little time for nim-markdown. I'll definitely take a look at the issue this weekend!


soasme commented Feb 19, 2021

I've tested the example locally; the iteration count needs to go up to 125 before it fails.
The error is a segfault:

[1]    4194 segmentation fault  ./issue49

I believe this is related to issues #42 and #48; I need to optimize the performance of the library so it can handle complex documents.

pietroppeter commented

Yes, I guess those issues are probably related. Below are some more remarks from my side, in case they are helpful.

I also ran a test on WSL (the test above was on Windows), and there too I had to raise the iteration count to 121 before it failed. On WSL I could also see the segmentation fault being reported (on Windows I did not see it).

Looking at this forum discussion, I think this may be due to a stack overflow (most things are ref objects, so maybe the issue is too many calls to procs?). I tried to change the stack size to see whether the iteration limit varies, but I was not successful:

  • on Windows I used --passC="-Wl,--stack,16777216 " but nothing seemed to change
  • on WSL I was not able to change the stack size with ulimit -S -s 16384 (there is a WSL issue that should be fixed in WSL2, but it seems it is not; I am using WSL2)

Specific to this issue and related to performance (though restricted to the specific case of HTML comments): looking at the source, I noticed that an HTML comment could probably be dealt with differently from other Markdown blocks (parseHtmlComment uses parseHTMLBlockContent), since its content does not need any further parsing (in particular, in the above example the output is the same as the input).
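
To make the HTML comment idea concrete, here is a minimal sketch of how a comment block could be skipped with a plain substring search instead of being parsed. The proc name scanHtmlComment is hypothetical and not part of nim-markdown; it only uses continuesWith and find from std/strutils. It assumes a comment runs from <!-- to the end of the line containing the first -->, or to the end of the document if it is never closed.

```nim
import strutils

# Hypothetical helper, NOT part of nim-markdown's API: an illustration only.
proc scanHtmlComment(doc: string, start: int): int =
  ## Returns the index just past the HTML comment block that begins at
  ## `start`, or -1 if `doc` does not contain "<!--" at that position.
  ## The comment body is never inspected beyond searching for "-->".
  const opener = "<!--"
  const closer = "-->"
  if not doc.continuesWith(opener, start):
    return -1
  let closeAt = doc.find(closer, start + opener.len)
  if closeAt < 0:
    # Unclosed comment: assume the block extends to the end of the document.
    return doc.len
  # The block ends with the line that contains the closer.
  let eol = doc.find('\n', closeAt + closer.len)
  result = if eol < 0: doc.len else: eol + 1

when isMainModule:
  let doc = "<!--\n## title\n- a [link](to_here)\n-->\nafter"
  let stop = scanHtmlComment(doc, 0)
  # The raw slice could be emitted unchanged as the block's output.
  echo doc[0 ..< stop]
```

With this approach the work is one linear scan and does not recurse, so the size or complexity of the commented-out text would not affect stack depth. Whether this fits the existing block-parsing structure of the library is an open question.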
