Additional uneccessary characters appear in markdown output #50

Closed
cydnax opened this issue Feb 4, 2021 · 2 comments

Comments


cydnax commented Feb 4, 2021

Extra backslash characters are added before underscores in the Markdown output. Each _ is replaced with \_.

Example:
Source:
tp_fan /proc/acpi/ibm/fan
hwmon /sys/class/thermal/thermal_zone0/temp

MarkDownload Markdown result:
tp\_fan /proc/acpi/ibm/fan
hwmon /sys/class/thermal/thermal\_zone0/temp

Browser: Firefox Linux 85.0

cydnax changed the title from "Additional uneccessry chareacters appear in markdown output" to "Additional uneccessary characters appear in markdown output" Feb 4, 2021
deathau (Owner) commented Feb 16, 2021

There's more than just underscores at work here: Turndown escapes a lot of things. Here is their explanation: https://github.com/domchristie/turndown#escaping-markdown-characters

Turndown uses backslashes (\) to escape Markdown characters in the HTML input. This ensures that these characters are not interpreted as Markdown when the output is compiled back to HTML. For example, the contents of <h1>1. Hello world</h1> needs to be escaped to 1\. Hello world, otherwise it will be interpreted as a list item rather than a heading.

To avoid the complexity and the performance implications of parsing the content of every HTML element as Markdown, Turndown uses a group of regular expressions to escape potential Markdown syntax. As a result, the escaping rules can be quite aggressive.

I'm not sure how to best resolve this. I could provide a custom override of the escape function, but it seems complex...
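
For illustration only, here is a minimal sketch of what a narrower escape function might look like. It is not what the extension ships. The Turndown README documents `TurndownService.prototype.escape` as the hook to override; it receives the text of each element and returns the escaped version. The name `escapeLight` and the rule list below are hypothetical and much smaller than Turndown's own set. The underscore rule assumes that an underscore between two alphanumeric characters cannot start or end emphasis, which is how CommonMark treats it.

```javascript
// Hypothetical lighter escape function. The rule list is an illustrative
// assumption, not Turndown's actual set of regular expressions.
const lightRules = [
  [/\\/g, '\\\\'],            // backslashes first, so later escapes are not doubled
  [/\*/g, '\\*'],             // asterisks can open emphasis anywhere
  [/^(\d+)\. /gm, '$1\\. '],  // "1. " at the start of a line would become a list item
  // escape an underscore only when it is NOT between two alphanumeric characters
  [/(?<![A-Za-z0-9])_|_(?![A-Za-z0-9])/g, '\\_'],
];

function escapeLight(text) {
  // Apply each [pattern, replacement] pair in order.
  return lightRules.reduce(
    (acc, [pattern, replacement]) => acc.replace(pattern, replacement),
    text
  );
}

// With Turndown loaded, the documented hook would be:
//   TurndownService.prototype.escape = escapeLight;

console.log(escapeLight('hwmon /sys/class/thermal/thermal_zone0/temp'));
console.log(escapeLight('1. Hello world'));
```

With these rules the paths from the report pass through unchanged, while a leading `1. ` and a standalone `_word_` are still escaped. The trade-off is that any Markdown syntax not covered by the rule list goes through unescaped.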

deathau added a commit that referenced this issue Feb 16, 2021
deathau (Owner) commented Feb 21, 2021

There's now an option to disable Turndown's escaping, which should prevent underscores and other characters from being escaped.
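
A toggle of this kind could be wired roughly as in the sketch below. The actual change is in the referenced commit, which is not shown in this thread. The option name `escapeMarkdown` and the helpers `defaultEscape`, `noEscape`, and `chooseEscape` are hypothetical, and `defaultEscape` is a stand-in that handles underscores only, not Turndown's real escaping.

```javascript
// Stand-in for aggressive escaping; only underscores are handled here.
const defaultEscape = (text) => text.replace(/_/g, '\\_');

// Identity function: returns the text untouched, i.e. escaping disabled.
const noEscape = (text) => text;

// Hypothetical option name; selects which escape function is used.
function chooseEscape(options) {
  return options.escapeMarkdown ? defaultEscape : noEscape;
}

// With Turndown loaded, the chosen function would be assigned to the
// documented hook: TurndownService.prototype.escape = chooseEscape(options);

const line = 'hwmon /sys/class/thermal/thermal_zone0/temp';
console.log(chooseEscape({ escapeMarkdown: true })(line));
console.log(chooseEscape({ escapeMarkdown: false })(line));
```

With escaping disabled, text that happens to look like Markdown (for example a line starting with `1. `) is emitted as-is and may be rendered as Markdown syntax.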

Let me know if you have issues with it, but for now this issue is closed.

deathau closed this as completed Feb 21, 2021