Render githubemoji to unicode #19

Closed
remram44 opened this issue Dec 13, 2016 · 18 comments
Labels
P: maybe Pending approval of low priority request.

Comments

@remram44

I'd like a way to convert a :github_emoji: into the unicode character, instead of an HTML <img> element. This doesn't seem very complicated given the current code (I can open a pull request if appropriate).

@facelessuser
Owner

Can you explain the motivation behind this? It is called githubemoji, so the fact that it renders the icons as github does should be expected. I see the linked issue, but what I am looking for is the reasoning behind why githubemoji should provide an option to serve things up differently.

@remram44
Author

In this case, the HTML is rendered by a chat client that doesn't allow images (for the usual privacy reasons, like email clients).

It could also allow users of your extension to use another backend for their emoji, if the rest of their site does, for instance emojione.

It could be a simple configuration parameter that lets me replace SimpleEmojiPattern, taking name & unicode codepoint and returning whatever element.

@facelessuser
Owner

If we were to go down this road, it would be a list based off of what Github currently supports. The image names could be parsed and the Unicode code points obtained. I think things like :octocat: obviously would not translate.

I am not entirely sure exactly how this is intended to be used, though. Do you just want an HTML-encoded Unicode code point such as &#x1F3BE;? Are you proposing that it be placed in some kind of wrapper to be targeted by special fonts? I guess, what are the specifications of the proposal?

I guess I kind of want to get an idea of exactly what is wanted before I give an answer. Technically this can already be accomplished with post processing that finds these emoji links and converts them to plain Unicode. It doesn't have to be implemented directly in githubemoji, but it could be. I guess I am curious how flexible this would be, or whether this request only applies to a very narrow case.
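As a rough illustration of the post-processing route, here is a minimal Python sketch. It assumes the emoji are emitted as <img> tags whose src ends in .../emoji/unicode/<hex code points>.png; the function name img_to_unicode and the exact markup shape are assumptions for illustration, not something githubemoji guarantees. Code points found in the file name are simply concatenated, and images outside the unicode/ folder (such as :octocat:) are left untouched.

```python
import re

# Assumed URL shape: .../emoji/unicode/<hex>[-<hex>...].png
EMOJI_IMG = re.compile(
    r'<img\b[^>]*\bsrc="[^"]*/emoji/unicode/([0-9a-f]+(?:-[0-9a-f]+)*)\.png"[^>]*>',
    re.IGNORECASE,
)


def img_to_unicode(html):
    """Replace emoji <img> tags with the characters named in their file name."""
    def repl(match):
        # "1f1ff-1f1fc" -> two code points, joined without any separator.
        return "".join(chr(int(cp, 16)) for cp in match.group(1).split("-"))

    return EMOJI_IMG.sub(repl, html)
```

Because this only inspects the rendered HTML, it could run as a separate step after Markdown conversion without touching the extension itself.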

When I created this specific extension, it was kind of just for fun. It was easy enough to poll github for all their emojis and I decided, why not. What I don't want to do is go down a road where I'm implementing multiple variants of emoji rendering. If we provide an alternate output, I want it to be flexible enough that I don't get a request for yet another rendering 🙂 .

Please provide an example of what you expect the output to be, and how it could be leveraged. I want to fully understand the applications. You mention emojione (which I don't know much about), and I am curious how it could leverage your proposed output.

@remram44
Author

For me personally, I just need it to output the unicode character directly (or I guess it could be encoded as &#128040;).

I understand that you can't support all kinds of backends for this extension; that's why it's probably easier to allow the user to plug in some code that takes the name and unicode codepoint (if there is one) and outputs whatever HTML element is appropriate.
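To make the idea concrete, here is a minimal Python sketch of such a plug-in point. Everything in it is hypothetical (the names convert, render_unicode, render_entity, and the shape of the index mapping); it is not the extension's actual API, only one way a user-supplied callback could receive the name and code points.

```python
import re

SHORTCODE = re.compile(r":([a-z0-9_+\-]+):")


def render_unicode(name, codepoints):
    """Hypothetical renderer: emit the raw Unicode character(s)."""
    return "".join(chr(cp) for cp in codepoints)


def render_entity(name, codepoints):
    """Hypothetical renderer: emit decimal HTML entities."""
    return "".join("&#%d;" % cp for cp in codepoints)


def convert(text, index, renderer):
    """Replace :name: using `renderer`; names without code points are kept."""
    def repl(match):
        name = match.group(1)
        codepoints = index.get(name)
        if not codepoints:
            # Unknown name, or an image-only emoji such as :octocat:.
            return match.group(0)
        return renderer(name, codepoints)

    return SHORTCODE.sub(repl, text)
```

With a mapping such as {"koala": [0x1F428], "octocat": []}, the call convert("hi :koala:", index, render_entity) yields "hi &#128040;", while :octocat: is left as written.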

Markdown extensions already have support for configuration options, so it looked to me like a low hanging fruit 😃 (and would help a lot with my own project, which has to translate markup between Matrix and Gitter, but that's another kind of insane)

I could definitely write my own Markdown extension based off your code that renders emoji as unicode when possible though, if you currently cover all the use cases you want to cover.

@facelessuser
Owner

If I spit it out as an HTML entity wrapped in a span with a class, would you be able to work with that?

<span class="some-emoji-class">&#128040;</span>

I figure if I am outputting this, people might want to target them with a specific font. So if I do this, I would want it to be easy to target with CSS. I know you don't need to do this, but would it pass through fine in your use case?
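For reference, a wrapper like the one proposed could be built as an ElementTree element, which is what Python-Markdown inline patterns typically return. This is only a sketch: the helper name emoji_span is hypothetical, and the class name is the placeholder from the example above.

```python
import xml.etree.ElementTree as etree


def emoji_span(codepoints, css_class="some-emoji-class"):
    """Build <span class="...">emoji</span> from a list of code points."""
    el = etree.Element("span", {"class": css_class})
    el.text = "".join(chr(cp) for cp in codepoints)
    return el


# Serialize to text to inspect the result.
html = etree.tostring(emoji_span([0x1F428]), encoding="unicode")
```

Serialized this way the span contains the raw character; an ASCII serialization should emit it as a numeric character reference instead. Either form can be targeted from CSS through the class.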

@facelessuser
Owner

@remram44, so my proposal won't work for you?

@remram44
Author

This would work for me! I have actually implemented it already. The Unicode character renders fine on my target, and custom <span> elements don't matter.

@facelessuser
Owner

Okay, then if you have a solution already, I won't rush to do anything here. I will probably add it as an option for the future whenever I have time.

@facelessuser facelessuser added the P: maybe Pending approval of low priority request. label Dec 16, 2016
@remram44
Author

remram44 commented Dec 17, 2016 via email

@facelessuser
Owner

> I do think that this particular extension is too tied up with the required output format as it is now

Meh, I think it conformed to specs. What good is software that doesn't conform to specs 😉 ? I never wanted to support every emoji image set out there like Phantom etc. I guess I could have provided an option to feed in your own dictionaries of links, but that is half the extension. At that point, you might as well just fork the original and do what you want. And like you mentioned, it is easy enough for people to fork the code and have it support the image set of their choice if they like, even their own local set if desired. Github made it very easy to poll their emoji and get the links to their images for free, so why not.

With that said, expanding its specs to output the raw Unicode isn't a bad option, and I think I've generally warmed up to this. You get Github's syntax, but you also get something you can use without Github's image CDN. I think this new raw output is very reasonable, flexible, and more general. And if a specific, different image set is desired instead, fork it 😄.

> I will try to PR if time allows.

Anyways, the changes required to support the proposed format with spans are pretty trivial, and I should be able to throw it together pretty quickly. I may get to it this weekend.

> Thanks for all your work!

You're welcome.

@facelessuser
Owner

Looks like the emoji list was outdated too. So there will be a number of new ones in the next release.

@facelessuser
Owner

Okay, this is more complicated than I initially thought. Github emoji now use some newer emoji. So some emoji can be translated directly. For instance, :zimbabwe::

https://assets-cdn.github.com/images/icons/emoji/unicode/1f1ff-1f1fc.png

becomes

&#x1f1ff;&#x1f1fc;

But some of the new emoji use zero-width joiners (U+200D), such as :family_man_man_girl_boy:.

https://assets-cdn.github.com/images/icons/emoji/unicode/1f468-1f468-1f467-1f466.png

translates to

&#x1f468;&#x200D;&#x1f468;&#x200D;&#x1f467;&#x200D;&#x1f466;

This is harder to automate, as there is no way to tell which ones use zero-width joiners (unless you just know). I need to think about all this a little more. There are probably sources such as https://github.com/github/gemoji which I could use to get this information. But this makes it more complicated.
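The two translations above can be reproduced with a short Python sketch. The helper names and the zwj flag are hypothetical; the point is that the file name alone carries no hint about whether the flag should be set, so the caller has to know.

```python
def filename_to_codepoints(url):
    """Extract the code points encoded in an emoji image file name."""
    stem = url.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    return [int(part, 16) for part in stem.split("-")]


def to_entities(codepoints, zwj=False):
    """Render code points as hex entities, optionally joined with U+200D."""
    separator = "&#x200D;" if zwj else ""
    return separator.join("&#x%x;" % cp for cp in codepoints)


flag = filename_to_codepoints(
    "https://assets-cdn.github.com/images/icons/emoji/unicode/1f1ff-1f1fc.png"
)
family = filename_to_codepoints(
    "https://assets-cdn.github.com/images/icons/emoji/unicode/1f468-1f468-1f467-1f466.png"
)

flag_html = to_entities(flag)                # plain concatenation
family_html = to_entities(family, zwj=True)  # needs outside knowledge
```

Calling to_entities(family) without the flag would produce four separate people rather than one family glyph, which is exactly the ambiguity described here.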

@facelessuser
Owner

I will either have to use the gemoji source to extract the info I need instead of the github API, or I will need to push such a request to a separate emoji plugin. With the addition of the aforementioned new emoji, which require zero-width joiner characters, and the fact that there is no way to determine this from what the Github API returns, it is now difficult to automate an update if I am to require the return of actual Unicode code points.

As gemoji is the source behind Github's emoji, I think it could be used. I would have to add logic to comb the source and determine the actual code points on update. I would need to account for Python narrow builds as well, but it is possible. This would eliminate the ability for the plugin to update on the fly, but I am okay with that. It would just have to be officially updated on the repo and then built for a release.
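On the narrow-build point: a narrow Python 2 build stores code points above U+FFFF as UTF-16 surrogate pairs, so a single emoji cannot be produced with one unichr() call there. The sketch below shows the split such logic would need; the helper name is hypothetical, and Python 3.3+ no longer has narrow builds, so this matters only for older interpreters.

```python
def to_surrogate_pair(codepoint):
    """Return the UTF-16 code units for a code point as a tuple."""
    if codepoint < 0x10000:
        # Fits in a single 16-bit unit; no surrogates needed.
        return (codepoint,)
    offset = codepoint - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return (high, low)


koala_units = to_surrogate_pair(0x1F428)
```

For U+1F428 this gives the pair 0xD83D, 0xDC28, which matches the UTF-16 encoding of the character.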

What I don't want to do though is to manually manage a list of Unicode code points. As long as I can come up with an easy automated way to get short names from a common list and extract their Unicode code points, I don't mind doing this. Whether this would be associated with the githubemoji extension or a separate emoji extension is yet to be determined.

@facelessuser
Owner

After experimenting and thinking about this more, I would rather create an emojione extension that could either preserve the short names or output Unicode or formatted image tags. Then the user could have plain Unicode, or use the output to apply emojione js and/or css.

I would leave GitHub as it is. This is only because it is more difficult to tell exactly which version of gemoji they are using. Do they automatically start using prereleases? Or do they only use the latest non-prerelease? The API won't tell me that. And at least with emojione, you can lock it to a specific supported version until you get time to update to the most recent version.

@facelessuser
Owner

I will close this in favor of an emojione extension. I believe that plugin would be more readily used than GitHub in the long run.

@facelessuser
Owner

Reopening this as I am looking to generalize the emoji stuff. I plan on creating a new emoji extension that will have an emojione index and a gemoji (github) index. It will also allow a function to be passed in for the emoji generator, so it will be possible for people to alter the format. I will provide some basic formats, as emojione supports a few, and I will add an HTML entity dump. It will also be a bit more configurable. I will deprecate githubemoji, and eventually remove it.

@facelessuser
Owner

So a couple of things have occurred.

  1. githubemoji was updated for the last time with the latest emoji mapping. Be aware that there are all sorts of new emoji dealing with different skin tones, and also various emoji mixing up genders. These emoji use a Unicode joining character which you cannot extract from the image name. I'm not sure how you approached generating your plain emoji workaround, but if you use the image names to guess their HTML entity, you will have some issues with certain emoji, as none of Github's image names include the join character; they are friendlier names.

    This extension is marked for deprecation and will not receive any more updates to its mapping.

  2. The emoji extension will be the main emoji extension moving forward. It provides both a gemoji (github) and emojione emoji index built from the latest release tag from the respective repos.

    The emoji extension will provide a handful of general purpose settings. It will provide two generator functions that are both github and emojione aware: to_png and to_html_entities. The rest are specific to emojione: to_svg, to_png_sprite, and to_svg_sprite. Users can create their own and have it used instead if none of these are sufficient. This will be documented before the next release.

Theoretically other emoji indexes could be added, but I would need a strong argument to add them. At this point I plan to support gemoji (as I was already supporting it) and emojione because of the open nature and generally up to date support they seem to have. It should hopefully be flexible enough to handle everyone's needs.

@facelessuser
Owner

FYI: Release 1.3.0 is out and emoji docs are here: http://facelessuser.github.io/pymdown-extensions/extensions/emoji/
