mermaid extension erase all unicode in html output #2

retsyo · 2020-10-19T03:36:49Z

>>> import markdown
>>>
>>> md = markdown.Markdown( extensions=['md_mermaid'] )
>>> strMd = '# hello world'
>>> print(md.convert(strMd))
<h1>hello world</h1>
>>> strMd = 'a你好b'
>>> print(md.convert(strMd))
<p>ab</p>
>>> strMd = u'a你好b'
>>> print(md.convert(strMd))
<p>ab</p>
>>>

orobardet · 2020-10-21T08:10:06Z

Same here, even with accented characters. It removes all non-ASCII characters from the whole Markdown file, not only from the mermaid code.
This seems to be a major regression introduced in commit 9f279f1 (I don't understand why this code was added). As that commit is the only difference between versions 0.1.0 and 0.1.1, a workaround for now is to pin the package to version 0.1.0.

gmat · 2021-06-28T16:11:40Z

@oruelle why is it useful to remove all characters except ASCII with strip_notprintable()? This breaks Unicode pages. Can you give us some cases where it is useful? Maybe strip_notprintable() should only be called under certain conditions, which remain to be defined.

Thanks
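
For readers following along, here is a minimal sketch of why such a filter produces the output reported at the top of this issue. It assumes strip_notprintable() keeps only the characters found in Python's string.printable (an ASCII-only set); the function name below is hypothetical and the extension's actual code may differ.

```python
import string


def strip_to_printable_ascii(text):
    # Assumed behaviour: drop every character that is not in string.printable,
    # which contains only ASCII letters, digits, punctuation and whitespace.
    return "".join(ch for ch in text if ch in string.printable)


print(strip_to_printable_ascii("a你好b"))      # ab
print(strip_to_printable_ascii("caf\u00e9"))  # caf
```

Under that assumption, any filter of this kind applied to every input line discards CJK text, accented letters and emoji alike, whether or not the line belongs to a mermaid block.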

mEDI-S · 2022-04-18T19:01:52Z

I have a lot of problems with the current version too. As a fix, I apply strip_notprintable(line) only inside the re.match calls and leave the raw lines unmodified. This helps a lot in finding the correct start and end of a code block.

I removed all other strip_notprintable() calls
and added only:

m_start = MermaidRegex.match( strip_notprintable(line) )
m_end = re.match(r"^["+mermaid_sign+"]{3}[\ \t]*$", strip_notprintable(line) )

Hope this helps.
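
The approach described above can be sketched as a self-contained function: the filtered copy of each line is used only to detect the fences, while the original line is what gets kept. The fence patterns, the split_mermaid() helper and the body of strip_notprintable() are assumptions made for illustration, not the extension's actual code.

```python
import re
import string


def strip_notprintable(text):
    # Assumed implementation: keep only characters in string.printable (ASCII).
    return "".join(ch for ch in text if ch in string.printable)


# Hypothetical fence patterns; the real MermaidRegex may be different.
FENCE_START = re.compile(r"^[~`]{3}[ \t]*mermaid[ \t]*$")
FENCE_END = re.compile(r"^[~`]{3}[ \t]*$")


def split_mermaid(lines):
    """Label each line as 'text', 'fence' or 'mermaid' without altering it."""
    labelled = []
    in_block = False
    for line in lines:
        probe = strip_notprintable(line)  # filtered copy, used for matching only
        if not in_block and FENCE_START.match(probe):
            in_block = True
            labelled.append(("fence", line))
        elif in_block and FENCE_END.match(probe):
            in_block = False
            labelled.append(("fence", line))
        else:
            labelled.append(("mermaid" if in_block else "text", line))
    return labelled


source = ["a你好b", "```mermaid", "mindmap", "  root", "    主题", "```", "done"]
for kind, line in split_mermaid(source):
    print(kind, repr(line))
```

Because the raw lines are passed through untouched, both the non-ASCII text and the leading whitespace inside the block survive.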

rayalan · 2023-09-20T08:39:36Z

I'm running into a variant of this with the latest mindmap syntax, which relies heavily on the leading whitespace. The current line.strip() call wipes out the whitespace, turning

mindmap
  root
    topic
      subtopic

into

mindmap
root
topic
subtopic

which won't render.

As others before me have said, I'm not sure what problem we're trying to solve by stripping when we are inside mermaid code -- why can't the mermaid code be sent as-is to the mermaid parser?
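
A small sketch of the difference, using the mindmap lines from this comment: str.strip() removes whitespace on both sides, whereas str.rstrip() trims only the trailing side and leaves the indentation alone. Using rstrip() here is just one possible alternative chosen for illustration, not something the extension currently does.

```python
mindmap = ["mindmap", "  root", "    topic", "      subtopic"]

# strip() removes leading and trailing whitespace, flattening the hierarchy.
flattened = [line.strip() for line in mindmap]

# rstrip() removes only trailing whitespace, so the indentation is preserved.
preserved = [line.rstrip() for line in mindmap]

print(flattened)
print(preserved)
```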

retsyo · 2023-12-23T10:58:00Z

Same here, even with accented characters. It removes all non-ASCII characters from the whole Markdown file, not only from the mermaid code. This seems to be a major regression introduced in commit 9f279f1 (I don't understand why this code was added). As that commit is the only difference between versions 0.1.0 and 0.1.1, a workaround for now is to pin the package to version 0.1.0.

This commit should be reverted, since it gives bad results.

@rayalan

Fix a long-known bug in Unicode handling that has puzzled and affected some projects. You can find the details in oruelle#2. Why do I leave two `new_lines.append(line)` calls? Because in oruelle#2, @rayalan says his app depends on leading whitespace, so I can't judge exactly what we should do. However, let's fix this so that most projects run without problems again.

This was referenced Feb 8, 2022

Special characters being removed by mermaid plugin obsidian-html/obsidian-html#68

Closed

fix special characters being removed #4

Merged

githubwua added a commit to githubwua/md_mermaid that referenced this issue Jan 18, 2023

Show non-ascii characters

2091675

Apply oruelle#2 (comment) to unfilter non-ascii characters such as Japanese or Emoji.

retsyo mentioned this issue Dec 23, 2023

not compatible with Chinese Characters #11

Open

retsyo mentioned this issue Dec 23, 2023

Update md_mermaid.py #12

Open
