Remove emojis from core? #112

Closed
jgm opened this issue Nov 17, 2022 · 7 comments

Comments

@jgm
Owner

jgm commented Nov 17, 2022

Currently djot includes a giant table of emoji aliases so it can do substitutions for :smile:.
While this is undoubtedly useful for some purposes, I wonder if it would be better implemented as an optional filter.
It should be relatively easy to create an emoji filter -- performance would have to be tested.

@matklad
Contributor

matklad commented Nov 17, 2022

Another problem with emojis is that their set changes over time. So there's a (theoretical) forward-compatibility hazard with :emoji_which_will_exist_in_2025:

To clarify, is this about removing just the emoji table (so the user can still write :smile:, it's just that it doesn't get replaced) or about removing the :ident: syntax from djot?

@jgm
Owner Author

jgm commented Nov 17, 2022

We could go either way. I guess it would make sense to keep the emoji parsing and just remove the replacement with a Unicode character. Then the filter would be quite easy and efficient; it would just have to match on the emoji elements rather than doing pattern recognition on each str.
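
A minimal sketch of the kind of filter described above, written in Python rather than against djot's actual filter API. The AST shape (dicts with `tag`, `alias`, `text`, `children`), the `resolve_emojis` name, and the two-entry alias table are all assumptions made for illustration. The point is that the filter only inspects nodes already tagged as emoji by the parser and never scans string content.

```python
from typing import Any

# Hypothetical alias table; a real filter would load a much larger list.
EMOJI_ALIASES = {"smile": "\U0001F604", "dog": "\U0001F436"}

# Hypothetical AST node: {"tag": ..., "children": [...]} for containers,
# {"tag": "str", "text": ...} for text, {"tag": "emoji", "alias": ...} for emojis.
Node = dict[str, Any]


def resolve_emojis(node: Node, aliases: dict[str, str] = EMOJI_ALIASES) -> Node:
    """Return a copy of the tree with known emoji nodes replaced by str nodes."""
    if node.get("tag") == "emoji":
        char = aliases.get(node["alias"])
        if char is None:
            # Unknown alias: leave the node alone so a renderer can decide.
            return dict(node)
        return {"tag": "str", "text": char}
    out = dict(node)
    if "children" in node:
        out["children"] = [resolve_emojis(child, aliases) for child in node["children"]]
    return out
```

Because unknown aliases pass through unchanged, a document that uses an alias added in a later emoji release still parses the same way; only the substitution is skipped.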

@jgm
Owner Author

jgm commented Nov 17, 2022

Thinking about this a bit more, a case could be made for redesignating the emoji element as something more general-purpose, which could be resolved to emojis but could be put to another use. E.g. special.

We have extensible containers for blocks, inlines, verbatim block, verbatim inline. But we don't have an extensible leaf node -- this could be it.

@bpj

bpj commented Nov 18, 2022

This is a good idea, but please note my comment at #44 (comment) that {:some text:} should probably also be supported, and that a decimal or hexadecimal number should be allowed between the colons, as that is probably the best way of representing arbitrary characters by codepoint.
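
One way the numeric form could be resolved, as a rough Python sketch. The comment above does not say how decimal and hexadecimal would be told apart, so the `x` prefix for hex used here is an assumption, as are the `resolve_symbol` name and the alias-table fallback.

```python
import re
from typing import Optional

_DECIMAL = re.compile(r"[0-9]+")
# Assumption: hexadecimal codepoints are written with an "x" prefix, e.g. :x1F436:
_HEX = re.compile(r"[xX]([0-9A-Fa-f]+)")


def resolve_symbol(name: str, aliases: dict[str, str]) -> Optional[str]:
    """Resolve the text between the colons to a character, or None if unknown."""
    hex_match = _HEX.fullmatch(name)
    if _DECIMAL.fullmatch(name):
        codepoint = int(name)
    elif hex_match:
        codepoint = int(hex_match.group(1), 16)
    else:
        # Not numeric: fall back to the (hypothetical) alias table.
        return aliases.get(name)
    # Reject values outside Unicode's range and the surrogate block.
    if codepoint > 0x10FFFF or 0xD800 <= codepoint <= 0xDFFF:
        return None
    return chr(codepoint)
```

Under this assumed syntax, `:128054:` and `:x1F436:` would both resolve to 🐶. An alias that itself looks like `x` followed by hex digits would be shadowed by the numeric rule, which a real design would need to settle.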

@jgm
Owner Author

jgm commented Dec 5, 2022

https://github.com/kitsunies/emoji.lua seems to be a nice emoji library.
We could remove our djot.emoji module and instead either

  1. handle emojis using a filter that uses this rock, or
  2. make emoji support optional in the HTML renderer and depend on this rock if it is selected

The tests would no longer contain resolved emojis. Either way, we could compile the support into the web playground (amalg.lua will include the module if it's installed and you ask it to).

@rhysd
Contributor

rhysd commented Dec 27, 2022

Let me copy my thoughts about emoji notation from #175 here.

I don't think emoji notation is necessary because

  • If we want to write emoji characters, we can write them directly in the document. For example, if I want a dog emoji, I can write 🐶 in the text instead of :dog:. Input methods support emoji input these days.
  • Emoji notation requires readers to map symbols to the actual emoji from memory when they read the document as plain text. That is not easy, because there are so many emoji characters these days.
  • For parsers, it is not easy to parse emoji symbols. As the djot parser does, a parser must maintain a list of emoji symbols, and that list needs to be updated constantly (e.g. with each new Unicode version).
  • It is actually not possible to map all emoji characters to simple emoji symbols (e.g. emoji modifiers like skin tone 👍🏻, compound emojis like 👪).
  • Some emoji symbols are confusing. For example, according to Emojipedia, 🚶🏼‍♂️ is :walking_man: on GitHub and :man-walking: on Slack. It's not easy for users to remember which one is correct, so parsers need to support both patterns. But supporting both makes the emoji symbol list bigger and messier.
  • Adding dedicated notation increases the language's complexity. If the notation is removed, the : character becomes free and the parser gets simpler.
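
To illustrate the point about modifiers and compound sequences, here is a small Python sketch that lists the code points behind an emoji. The `codepoints` helper is made up for this example, and the two sequences are written as escapes on the assumption that they match the thumbs-up and walking-man emojis mentioned above.

```python
def codepoints(text: str) -> list[str]:
    """List the Unicode code points in text, in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Thumbs up followed by a light skin tone modifier.
thumbs_up_light = "\U0001F44D\U0001F3FB"

# Person walking + medium-light skin tone + zero-width joiner
# + male sign + variation selector-16.
man_walking = "\U0001F6B6\U0001F3FC\u200D\u2642\uFE0F"

print(codepoints(thumbs_up_light))
print(len(codepoints(man_walking)))
```

A single alias such as :walking_man: would have to stand for the whole sequence, and each skin tone or gender variant would need its own name or some extra modifier syntax.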

@vassudanagunta
Contributor

vassudanagunta commented Dec 28, 2022

@rhysd's excellent points boil down to: today, 😉 is, for all intents and purposes, plain text just like a, b and c.

jgm added a commit that referenced this issue Dec 28, 2022
@jgm jgm closed this as completed in 3c4e8ab Dec 28, 2022