Markdown: add comment parsing to lexer #2316

Open
wants to merge 1 commit into
base: master
stelcodes

Prior to this commit, the Markdown lexer did not tokenize comments. This patch adds comments to the Markdown lexer output, so comments such as `<!-- comment -->` are now highlighted as comments. This includes multi-line comments.

The regex strings for Markdown headings have been altered to allow for comments at the end of heading lines.

Fixes #2273
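
To illustrate the behavior described above, here is a minimal standalone sketch using Python's `re` module rather than the lexer itself. The pattern is the one added in the diff; the `find_comments` helper and the sample text are hypothetical, and compiling with `re.MULTILINE` alone is an assumption.

```python
import re

# Comment pattern from this PR's diff. Compiling it with re.MULTILINE
# only is an assumption about the lexer's flags.
COMMENT = re.compile(r'(<!--[\s\S]*?-->)', re.MULTILINE)


def find_comments(text):
    """Return every HTML-style comment found in text, in order (hypothetical helper)."""
    return COMMENT.findall(text)


sample = "intro <!-- one --> middle\n<!-- two\nspans lines -->\nend"
comments = find_comments(sample)
print(comments)
```

Because `[\s\S]` matches any character including a newline, the second comment is captured whole even though it spans two lines.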

@@ -176,3 +176,33 @@ def test_invalid_code_block(lexer):
     for fragment in fragments:
         for token, _ in lexer.get_tokens(fragment):
             assert token != String.Backtick
+
+
+def test_comment(lexer):
Contributor

The testing method used in this file is legacy. Instead, please use the method described here: https://pygments.org/docs/lexerdevelopment/#adding-and-testing-a-new-lexer

@@ -540,13 +540,17 @@ def _handle_codeblock(self, match):
     tokens = {
         'root': [
             # heading with '#' prefix (atx-style)
-            (r'(^#[^#].+)(\n)', bygroups(Generic.Heading, Text)),
+            (r'(^#[^#].+?)(<!--.*?-->)?(\n)',
+             bygroups(Generic.Heading, Comment, Text)),
Contributor

Can a comment only appear at the end, or also in the middle? Same question on other changes.
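
As a quick way to see what the heading pattern in this hunk does with each placement, here is a minimal sketch using plain `re`. The `split_heading` helper and the sample headings are hypothetical, and `re.MULTILINE` is assumed as the only flag. It only shows how the regex behaves, not what Markdown itself permits.

```python
import re

# Heading pattern from the diff; re.MULTILINE is an assumed flag.
HEADING = re.compile(r'(^#[^#].+?)(<!--.*?-->)?(\n)', re.MULTILINE)


def split_heading(line):
    """Return (heading_text, trailing_comment) or None (hypothetical helper)."""
    m = HEADING.match(line)
    return (m.group(1), m.group(2)) if m else None


trailing = split_heading("# Title <!-- note -->\n")
middle = split_heading("# A <!-- c --> B\n")
print(trailing)
print(middle)
```

With this pattern, a comment directly before the newline lands in its own group, while a comment in the middle of the line stays inside the heading group and would be emitted as `Generic.Heading`.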

@@ -594,6 +598,9 @@ def _handle_codeblock(self, match):
              bygroups(Text, Name.Tag, Text, Text, Name.Label, Text)),
             (r'^(\s*\[)([^]]*)(\]:\s*)(.+)',
              bygroups(Text, Name.Label, Text, Name.Attribute)),
+            # comments (can span multiple lines)
+            (r'(<!--[\s\S]*?-->)',
Contributor

Just

Suggested change
-            (r'(<!--[\s\S]*?-->)',
+            (r'(<!--.*?-->)',

(note that the lexer has re.MULTILINE in its flags).
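
Before applying the suggestion, it may be worth checking how the two patterns treat a comment that spans lines. In Python's `re` module, `re.MULTILINE` only changes where `^` and `$` match; it is `re.DOTALL` that lets `.` match a newline. The sketch below assumes the lexer's flags are exactly `re.MULTILINE`, as stated in the comment above; the sample text and variable names are hypothetical.

```python
import re

text = "<!-- a\nb -->"  # a comment spanning two lines

# Original pattern from the diff: [\s\S] matches newlines under any flags.
explicit = re.compile(r'(<!--[\s\S]*?-->)', re.MULTILINE)
# Suggested pattern under re.MULTILINE only (assumed lexer flags).
dot_multiline = re.compile(r'(<!--.*?-->)', re.MULTILINE)
# Suggested pattern with re.DOTALL added, for comparison.
dot_dotall = re.compile(r'(<!--.*?-->)', re.MULTILINE | re.DOTALL)

results = {
    'explicit': bool(explicit.search(text)),
    'dot_multiline': bool(dot_multiline.search(text)),
    'dot_dotall': bool(dot_dotall.search(text)),
}
print(results)
```

Under these assumptions, the `.*?` form matches the multi-line comment only when `re.DOTALL` is also in effect.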

jeanas commented Jan 29, 2023

I'm really sorry, I made this review from the start, and somehow didn't click "Submit" so it just stayed "pending" for 3 weeks :(

Successfully merging this pull request may close these issues.

Feature request: parse markdown comments