Enhancement: 😀 Support for YAML non-printing characters #538

Closed
estruyf opened this issue Mar 15, 2023 · 21 comments

@estruyf
Owner

estruyf commented Mar 15, 2023

Related to the following Discord thread: https://discord.com/channels/992409023607476224/1085255612939640832

Very often, the YAML front matter includes text that will display on a page, like the following:

title: Hello World 🌎

However, the proper YAML would be:
title: Hello World \U0001F30E

FM currently writes the non-printable character (such as an emoji) into the text as-is, which makes the build fail because it is not properly formatted YAML.

I can understand that FM would not want to modify an existing file, but there is an opportunity to catch non-printable characters when creating/modifying metadata in the Front Matter rail

Author: @BillRaymond

estruyf added the enhancement label Mar 15, 2023
@estruyf
Owner Author

estruyf commented Mar 20, 2023

Adding the encodeEmoji property to the string field. When enabling this property on the field, it encodes all emojis.

Example: Hello World 🌎 becomes Hello World \u1F30E.

Verified against: https://unicodeplus.com/U+1F30E
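
For readers who want to see what such an encoding step could look like, here is a minimal Python sketch. It is not Front Matter's implementation; `encode_emoji` is a hypothetical helper written for illustration. It assumes that "emoji" means any character outside the Basic Multilingual Plane (so BMP symbols such as U+2600 are left alone), and it emits the 8-hex-digit `\U` form that YAML defines for double-quoted scalars, which differs from the shorter form shown in the example above.

```python
def encode_emoji(text: str) -> str:
    """Replace characters above U+FFFF with a \\UXXXXXXXX escape.

    Hypothetical helper for illustration only; it treats every
    character outside the Basic Multilingual Plane as an emoji.
    """
    return "".join(
        f"\\U{ord(ch):08X}" if ord(ch) > 0xFFFF else ch
        for ch in text
    )


# The globe emoji is U+1F30E, so it is written as \U0001F30E.
print(encode_emoji("Hello World \U0001F30E"))
```

The result only decodes back to the emoji if the value ends up inside a double-quoted YAML scalar.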

estruyf added a commit that referenced this issue Mar 20, 2023
@davidsneighbour
Contributor

Isn't

title: "Hello World 🌏"

possible/valid too?

@estruyf
Owner Author

estruyf commented Mar 20, 2023

Yes, @davidsneighbour, but it seems that is not the case for @BillRaymond. That is why an extra property is introduced: if your SSG/framework does not support it, you can let FM encode it for you.

@estruyf
Owner Author

estruyf commented Mar 20, 2023

By default, it will keep using Hello World 🌏. Only if you need the emoji encoded do you have to tell FM to do this.

@BillRaymond

Isn't

title: "Hello World 🌏"

possible/valid too?

While you can add emojis and other non-printable characters to YAML, they are technically not allowed. See here. Some YAML processors handle this for you, but others do not. In my specific situation, I use Jekyll (also the default for GitHub Pages), which errors out. That is why I requested that FM provide a conversion option.

@BillRaymond

Thank you!

@BillRaymond

BillRaymond commented Mar 20, 2023

Since you are still working on this enhancement, I did some testing on my local machine using Front Matter. I noticed that you only sometimes add quotation marks around strings. However, you need the quotes if you add escape codes to display non-printable characters.

For example:

title: \U0001F30E The title of my post

Outputs:

\U0001F30E The title of my post

But, if you put the title in quotes, you get the non-printable character. For example:

title: "\U0001F30E The title of my post"

Outputs:

🌎 The title of my post

I hope this will help you during your development process.
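
The quoting behaviour described above can be modelled with a small Python sketch. This is a toy model, not a YAML parser: `render_scalar` is a hypothetical function that only looks at whether the value is wrapped in double quotes, and it only understands the `\U` (8 hex digits) and `\\` escapes.

```python
import re

# Inside double quotes, match either an escaped backslash or the
# 8-hex-digit escape form, e.g. \U0001F30E.
_ESCAPE = re.compile(r"\\(\\|U[0-9A-Fa-f]{8})")


def _decode(match: re.Match) -> str:
    token = match.group(1)
    if token == "\\":
        return "\\"  # an escaped backslash becomes one literal backslash
    return chr(int(token[1:], 16))  # drop the 'U', decode the code point


def render_scalar(raw: str) -> str:
    """Toy model of how a YAML loader treats a single scalar value.

    Escapes are decoded only when the value is double-quoted;
    unquoted (plain) values are returned as literal text.
    """
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        return _ESCAPE.sub(_decode, raw[1:-1])
    return raw


print(render_scalar(r'\U0001F30E The title of my post'))    # stays literal
print(render_scalar(r'"\U0001F30E The title of my post"'))  # emoji decoded
```

In this model, a double backslash inside quotes (`"\\U0001F30E"`) comes back as the literal text `\U0001F30E` rather than the emoji.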

@estruyf
Owner Author

estruyf commented Mar 21, 2023

Thanks, @BillRaymond. It is the YAML parser that decides whether to add quotes; I will take a look at why and when it does so.

estruyf added a commit that referenced this issue Mar 21, 2023
@estruyf
Owner Author

estruyf commented Mar 21, 2023

I have done some tests, and in my environment, when double quotes are added, it will use a double backslash, which is how it should be to correctly parse the string.

Screen.Recording.2023-03-21.at.09.51.32.mov

@BillRaymond

In this example, I add the Unicode escape character into the title. FM does not automatically put the title in quotes.

If the YAML is not in quotes, the Unicode is processed as regular text and does not display the non-printable character.

However, if I manually put the title in quotes, the proper non-printable character is processed and displays on the website.

emoji-no-yaml-quotes

@estruyf
Owner Author

estruyf commented Mar 22, 2023

Although it makes it work for your use case, it is not valid for a JSON object. We will have to think about how we can get this supported, as the double backslash is the right way to encode it.
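
The JSON side of this can be checked with Python's standard `json` module. The sketch below is only an illustration of JSON string rules, not of how Front Matter handles the value internally; `is_valid_json_string` is a hypothetical helper name.

```python
import json


def is_valid_json_string(literal: str) -> bool:
    """Return True if `literal` parses as a JSON string value."""
    try:
        return isinstance(json.loads(literal), str)
    except json.JSONDecodeError:
        return False


single = r'"\U0001F30E"'   # one backslash: \U is not a JSON escape
double = r'"\\U0001F30E"'  # two backslashes: an escaped literal backslash

print(is_valid_json_string(single))  # False
print(is_valid_json_string(double))  # True
print(json.loads(double))            # \U0001F30E (plain text, not the emoji)
```

JSON itself only defines the 4-digit `\uXXXX` escape, so a character such as U+1F30E is written there as a surrogate pair.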

@estruyf
Owner Author

estruyf commented Mar 22, 2023

Like I thought, it becomes invalid YAML for the parser:

image

@estruyf
Owner Author

estruyf commented Mar 22, 2023

I just did a test on Jekyll, and I can just use 🌎 in the title:

image

---
layout: post
title:  🌎 Welcome to Jekyll!
date:   2022-04-21 11:44:23 +0200
categories: jekyll update
---

Can it be related to your theme?

@BillRaymond

I do not have a theme (I started from the default Minima, but it does not use anything custom). Was the way I added the emoji a problem? I am unsure about that question, so I will run some tests and remove any plugins I may not use.

This still seems valid to look at since YAML does not support non-printable characters and could cause portability issues.

@estruyf
Owner Author

estruyf commented Mar 22, 2023

It seems that backslashes in YAML are a big topic; most people recommend avoiding them. That said, the only valid way to have a backslash in the JS and Python YAML parsers is to use two backslashes when using quotes. This is what the parser will automatically add while parsing.

@BillRaymond

Ah, so it still needs to be in quotes, but using \\ is the proper approach?

@estruyf
Owner Author

estruyf commented Mar 22, 2023

image

It is also in the YAML docs.

@BillRaymond

Okay, I am sorry I led you astray with the single \. That is what I get for using SO and not digging properly into the YAML docs for that one.

@estruyf
Owner Author

estruyf commented Mar 22, 2023

No worries, I learned something from it as well. I might just keep the setting as is right now, in case anyone else might ever need it.

@BillRaymond

Is there a feature in the beta I should be testing, or are you still in development?

@estruyf
Owner Author

estruyf commented Mar 23, 2023

All is fine. I am keeping it in progress, as I still need to document it.
