Marked.js doesn't parse links in front matter headers correctly #13

Open
NicolasMassart opened this issue Nov 12, 2020 · 2 comments
Labels: bug, dependencies

@NicolasMassart
Contributor

Description of the issue

As reported in tcort/markdown-link-check#128, links in the YAML front matter are parsed incorrectly: the extracted link includes every character after the end of the URL, so it picks up the closing quote (quotes are valid in YAML for delimiting string values).
This seems to be a deliberate choice on the Marked.js side not to support front matter: markedjs/marked#485

Possible solutions

We first need to check whether the latest Marked.js release handles this better.

Then there are two options:

  1. exclude the front matter header from Marked.js parsing and parse it separately for links
  2. switch to a parser that handles front matter and returns the correct result

The first option is clearly the easiest in my opinion, as we don't know how switching to a new parser would affect existing user projects.
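
To make the first option concrete, here is a minimal, dependency-free sketch of what "parse the front matter separately" could look like. The helper names `splitFrontMatter` and `extractFrontMatterLinks` are hypothetical and not part of markdown-link-extractor or Marked.js. It assumes the front matter is delimited by `---` lines at the very top of the file and that only `http(s)` URLs are of interest.

```javascript
// Hypothetical helpers for illustration only; not an existing API.

// Split a document into its YAML front matter (if any) and the Markdown body.
// Assumes the front matter starts on the first line with '---' and ends
// with a line containing only '---'.
function splitFrontMatter(text) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---[ \t]*(?:\r?\n|$)/.exec(text);
  if (!match) {
    // No front matter found: treat the whole input as the body.
    return { frontMatter: '', body: text };
  }
  return { frontMatter: match[1], body: text.slice(match[0].length) };
}

// Collect http(s) URLs from the front matter text.
// The character class stops at whitespace, quotes, and angle brackets,
// so a YAML closing quote is not swallowed into the URL.
function extractFrontMatterLinks(frontMatter) {
  const urlPattern = /https?:\/\/[^\s"'<>]+/g;
  return frontMatter.match(urlPattern) || [];
}

const sample = [
  '---',
  'title: "Example page"',
  'source: "https://example.com/docs"',
  "mirror: 'https://example.org/mirror'",
  '---',
  '',
  'See [the guide](https://example.net/guide).',
  ''
].join('\n');

const { frontMatter, body } = splitFrontMatter(sample);
const frontMatterLinks = extractFrontMatterLinks(frontMatter);

console.log(frontMatterLinks);
```

The remaining `body` could then be handed to the existing Marked.js-based extraction unchanged, and the two link lists concatenated. A regex scan like this is deliberately naive (it ignores YAML structure entirely), so a real implementation might prefer a proper YAML parser.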

Expectations

Markdown-link-extractor is expected to extract all the links in Markdown files, including those in a front matter header.

Linked issue

#7 also asks for links to be extracted from HTML code embedded in Markdown. This is the same kind of request, so maybe both could be handled at the same time?

@NicolasMassart added the bug and dependencies labels Nov 12, 2020
@NicolasMassart
Contributor Author

Looking further at Marked.js, there's markedjs/marked#1716, which seems to cover exactly what we need fixed here.

@wesley-dean-flexion

I'm experiencing the same issue.

I also came across the front-matter library, which provides methods to extract the front matter and the body (i.e., the Markdown without the front matter):
https://www.npmjs.com/package/front-matter#fmstring--allowunsafe-false-

Would it be possible to add a call that uses the body, so that only the Markdown is parsed and the front matter is skipped, possibly here:
https://github.com/tcort/markdown-link-check/blob/master/markdown-link-check#L166
