This repository was archived by the owner on Dec 16, 2025. It is now read-only.

Markdown: Fix UTF8 chars breaking elements #366

Merged
mco-gh merged 3 commits into googlecodelabs:master from nlepage:fix-unicode-chars-breaking-elements
Dec 24, 2020

@nlepage
Contributor

@nlepage nlepage commented Nov 14, 2019

Using multi-byte UTF-8 characters such as emojis in markdown input may create "offsets" in the HTML output (element boundaries land in the wrong place) and may even break the characters themselves.

Example 1

With the following markdown:

## test
this is a test `🙂 this is a test` this is a test

HTML output is:

<google-codelab-step label="test" duration="0">
  <p>this is a test <code>🙂 this is </code>a test this is a test</p>
</google-codelab-step>

Example 2

With the following markdown:

## test
this is a test `this is a test 🙂` this is a test

HTML output is:

<google-codelab-step label="test" duration="0">
  <p>this is a test <code></code>��� this is a test</p>
</google-codelab-step>

Bug

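splitSpaceRight() uses a rune index to split a string, but Go slices strings by byte offset, so any multi-byte character before the split point shifts the cut.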

splitSpaceRight() and splitSpaceLeft() also return their values in reversed order in their final return statement.
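
To illustrate the mismatch described above, here is a minimal standalone sketch. It is not code from this repository: `RuneIndex` is a hypothetical helper written only for this example. It shows that a position counted in runes differs from the byte offset as soon as a multi-byte character precedes it, and that slicing with the rune position can cut a character in half.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// RuneIndex is a hypothetical helper for this illustration only.
// It returns the position of the first occurrence of sub in s,
// counted in runes rather than bytes, or -1 if sub is absent.
func RuneIndex(s, sub string) int {
	i := strings.Index(s, sub) // byte offset
	if i < 0 {
		return -1
	}
	return utf8.RuneCountInString(s[:i])
}

func main() {
	s := "a🙂 b" // 'a' is 1 byte, the emoji is 4 bytes

	byteIdx := strings.Index(s, " ") // byte offset of the space
	runeIdx := RuneIndex(s, " ")     // rune position of the space
	fmt.Println(byteIdx, runeIdx)    // 5 2

	// Slicing by byte offset keeps the emoji intact.
	fmt.Printf("%q\n", s[:byteIdx]) // "a🙂"

	// Slicing with the rune position cuts through the emoji's bytes.
	fmt.Printf("%q\n", s[:runeIdx])            // "a\xf0"
	fmt.Println(utf8.ValidString(s[:runeIdx])) // false
}
```

The same kind of cut through an emoji's bytes would explain the replacement characters in Example 2.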

Fix

splitSpaceRight() now uses a byte index to split the string.
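
As a rough sketch of what byte-index splitting can look like, the following standalone example separates leading or trailing whitespace from a string. The names `splitLeadingSpace` and `splitTrailingSpace`, their signatures, and their return order are assumptions made for this illustration; they are not the repository's splitSpaceLeft() and splitSpaceRight().

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// splitTrailingSpace (hypothetical) returns s without its trailing
// whitespace, followed by that whitespace. i is always a byte offset.
func splitTrailingSpace(s string) (text, space string) {
	i := len(s)
	for i > 0 {
		// Decode the last rune before i and step back by its byte size.
		r, size := utf8.DecodeLastRuneInString(s[:i])
		if !unicode.IsSpace(r) {
			break
		}
		i -= size
	}
	return s[:i], s[i:]
}

// splitLeadingSpace (hypothetical) returns the leading whitespace of s,
// followed by the rest. i is always a byte offset.
func splitLeadingSpace(s string) (space, text string) {
	i := 0
	for i < len(s) {
		// Decode the rune at i and step forward by its byte size.
		r, size := utf8.DecodeRuneInString(s[i:])
		if !unicode.IsSpace(r) {
			break
		}
		i += size
	}
	return s[:i], s[i:]
}

func main() {
	text, space := splitTrailingSpace("this is a test 🙂  ")
	fmt.Printf("%q %q\n", text, space) // "this is a test 🙂" "  "

	space, text = splitLeadingSpace("  🙂 test")
	fmt.Printf("%q %q\n", space, text) // "  " "🙂 test"
}
```

Naming the return values in the signature documents the intended order, which may help guard against the kind of swapped return mentioned in the Bug section.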

@jlandure

jlandure commented May 4, 2020

Any news on this enhancement @jatorre @KeyboardNerd ? 👍

Contributor

@mco-gh mco-gh left a comment

thanks!

@nlepage
Contributor Author

nlepage commented Dec 24, 2020

Just updated this to upstream's master:

  • Unicode chars breaking was already fixed
  • Kept the tests
  • Return params of splitSpaceRight() and splitSpaceLeft() were still reversed...

@marcacohen I had also opened #367 but don't have time to update it, so do whatever you want with it.

@mco-gh
Contributor

mco-gh commented Dec 24, 2020

thanks!

@mco-gh mco-gh merged commit 58af023 into googlecodelabs:master Dec 24, 2020
@nlepage nlepage deleted the fix-unicode-chars-breaking-elements branch December 24, 2020 10:25
5 participants