nested html links are broken with parse_block_html #137

Closed
jvanasco opened this Issue Sep 6, 2017 · 3 comments

jvanasco commented Sep 6, 2017

this is a variant of #81

import mistune

example_in = '<div><a href="https://example.com">Example.com</a></div>'
mistune.markdown(example_in, escape=False, parse_block_html=True)

will generate:

<div><a href="<a href="https://example.com"&gt;Example.com">https://example.com"&gt;Example.com</a></a></div>

if escape is toggled to True, it is also broken:

'<div><a href="<a href="https://example.com"&gt;Example.com"&gt;https://example.com"&gt;Example.com&lt;/a&gt;&lt;/a&gt;&lt;/div>

lepture (Owner) commented Oct 12, 2017

The current implementation of parse_block_html=True does not work well.

frostming (Contributor) commented Nov 29, 2017

After a deep look into the source code, I found the root cause: when the HTML body is parsed with the inline lexer, it uses a subset of the default rules that doesn't contain the inline_html rule. As a result, a URL inside the body (including one inside an href attribute) is captured by the url rule and turned into a link.
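
The effect can be sketched with a toy lexer. This is not mistune's code: the two regexes, the `RULES` table and the `render` helper are simplified stand-ins made up for illustration. It only shows how dropping a raw-tag rule from the active rule list lets an autolink rule fire inside an attribute value.

```python
import html
import re

# Toy patterns for illustration only; these are NOT mistune's real rules.
RULES = {
    "inline_html": re.compile(r"<[^>]+>"),    # pass a raw tag through untouched
    "url": re.compile(r"https?://[^\s<]+"),   # autolink a bare URL
}


def render(text, active_rules):
    """Scan text left to right, applying the first active rule that matches."""
    out = []
    pos = 0
    while pos < len(text):
        for name in active_rules:
            m = RULES[name].match(text, pos)
            if m:
                token = m.group(0)
                if name == "url":
                    # Autolink: the matched text becomes both href and label.
                    link = html.escape(token, quote=False)
                    out.append('<a href="%s">%s</a>' % (link, link))
                else:
                    out.append(token)
                pos = m.end()
                break
        else:
            # No rule matched here: copy one character and move on.
            out.append(text[pos])
            pos += 1
    return "".join(out)


body = '<a href="https://example.com">Example.com</a>'

# With the raw-tag rule active, the anchor passes through unchanged.
full = render(body, ["inline_html", "url"])

# With only the url rule, the URL inside href is linked again.
subset = render(body, ["url"])
print(full)
print(subset)
```

In this sketch the `subset` result has the same nested-anchor shape as the broken output reported above, which suggests that including an inline_html-style rule when parsing the block body would avoid the problem.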

I'd be glad to send a pull request.

lepture (Owner) commented Nov 29, 2017

@frostming yes, please.
