Emoji in header breaks generated link #36

Closed
adrianmcli opened this issue Feb 16, 2017 · 18 comments · Fixed by #45

Comments

@adrianmcli
Collaborator

adrianmcli commented Feb 16, 2017

Continued from thlorenz/doctoc#123

In short: If you place an emoji into a header, the generated anchor tag does not work.

Current generated output:

- [Modules 📦](#modules-%F0%9F%93%A6)

The actual link generated by GitHub just leaves out the emoji, but has a dash in there for the space.

So I actually tried it out with a heading like this:

# Modules 📦

And the Markdown used to make it work:

- [Modules 📦](#modules-)

You can see it here: https://github.com/adrianmcli/next-boilerplate/blob/master/README.md

I took out the emoji from the TOC though, just because I didn't think it looked good.
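
For reference, the broken link above is what you get if the heading is lowercased, spaces are turned into dashes, and the result is percent-encoded with the emoji still in it. A minimal sketch, assuming that is roughly what happens (`naiveAnchor` is a hypothetical helper for illustration, not the library's actual code):

```javascript
// Hypothetical helper: lowercases, dashes spaces, then percent-encodes.
// The emoji is never removed, so its UTF-8 bytes end up in the fragment.
function naiveAnchor(heading) {
  const dashed = heading.toLowerCase().replace(/ /g, '-');
  return '#' + encodeURIComponent(dashed);
}

// "\u{1F4E6}" is the package emoji used in the heading above.
console.log(naiveAnchor('Modules \u{1F4E6}')); // #modules-%F0%9F%93%A6
```

The `%F0%9F%93%A6` part is just the UTF-8 encoding of the emoji, which is why the fragment does not match the `#modules-` anchor that GitHub generates.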

@thlorenz
Owner

OK, thanks for that info. So what we'd need to do is replace any emoji with a dash as well.
Could you research how to detect any emoji in a regex? I've got no clue ;)

You can test things by forking and modifying this repo.
You'll get the quickest feedback if you just add a failing test here with an emoji.
Then you just try to modify the used regex until the test passes.

Finally you can PR with your changes. I'd greatly appreciate it.
Thanks.

@adrianmcli
Collaborator Author

adrianmcli commented Feb 16, 2017

I don't think it becomes a space. It kind of just disappears. The dash is there because I had a space in between the emoji and the word.

For example, this:

# Modu📦les

Would become:

- [Modu📦les](#modules)

@adrianmcli
Collaborator Author

So we need to:

  1. Include the emoji as if it was an actual character/word.
  2. Convert spaces to dashes (regular conversion).
  3. Strip out the emoji.
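
The three steps above can be sketched roughly as follows. This is only an approximation under stated assumptions: `githubStyleAnchor` is a hypothetical name, and the Unicode property escape `\p{Extended_Pictographic}` stands in for a real emoji matcher (it is not the regex the library ended up using, and it needs a reasonably recent JavaScript engine).

```javascript
// Hypothetical sketch of the three steps; not the library's implementation.
function githubStyleAnchor(heading) {
  // Steps 1 and 2: keep the emoji in place while spaces become dashes.
  const dashed = heading.toLowerCase().replace(/ /g, '-');
  // Step 3: strip the emoji itself (assumed matcher, see note above).
  const stripped = dashed.replace(/\p{Extended_Pictographic}/gu, '');
  return '#' + stripped;
}

console.log(githubStyleAnchor('Modules \u{1F4E6}')); // #modules-
console.log(githubStyleAnchor('Modu\u{1F4E6}les'));  // #modules
```

Doing the space conversion before the strip is what preserves the trailing dash in `#modules-`.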

@thlorenz
Owner

Sounds good .. basically just add tests that assume it's doing all that and then make'em pass.

Quite easy really :P The main challenge is how to match emojis with a regex.

@adrianmcli
Collaborator Author

Apparently, detecting emojis with regex is actually really hard. Maybe this package can help: https://github.com/mathiasbynens/emoji-regex
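
One reason it is hard: a single visible emoji is not necessarily a single character. As an illustration (the family emoji below is my own example, not one from this thread), a sequence joined with zero-width joiners spans several code points and even more UTF-16 code units, so a simple character class will not capture it as one unit:

```javascript
// Man + ZWJ + woman + ZWJ + girl, rendered as one family emoji.
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}';

// Spreading a string iterates by code point, not by UTF-16 code unit.
const codePoints = [...family].map(
  (ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase()
);

console.log(family.length);     // 8 (UTF-16 code units)
console.log(codePoints.length); // 5 (code points)
console.log(codePoints.join(' ')); // U+1F468 U+200D U+1F469 U+200D U+1F467
```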

@adrianmcli
Collaborator Author

@thlorenz PR submitted, yay! #37

@anoff

anoff commented Aug 14, 2017

There seem to be some emojis that are not completely captured by this solution
image

Didn't get around to analyzing what makes this one special. It is possibly related to this issue in the library you're using: mathiasbynens/emoji-regex#28

@thlorenz
Owner

Hope we can catch those as well .. don't have bandwidth to attack this myself, but maybe you can help (PRs appreciated).
Also @mathiasbynens should be considered the emoji regexpert (new word) so whatever he says in his issue would very likely be correct :)

@mathiasbynens

> Didn't get around to analyzing what makes this one special. It is possibly related to this issue in the library you're using: mathiasbynens/emoji-regex#28

Yeah, that’s likely the issue. emoji-regex follows the Unicode standard, detecting only official emoji sequences. Apple’s macOS emoji picker randomly inserts U+FE0F after certain emoji despite that resulting in a non-standard sequence.

Why would you want to strip emojis, though? They’re perfectly valid in IDs and #foo-style in-page anchors. IMHO, a better fix would leave emojis intact and make sure the links are working instead.
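
To see the U+FE0F issue concretely, it helps to list the code points of a heading's emoji. A small sketch (`codePointsOf` is a hypothetical helper; the two sample strings are the alarm clock, and the watch followed by the variation selector):

```javascript
// Hypothetical helper: lists the code points of a string as U+XXXX labels.
function codePointsOf(text) {
  return Array.from(
    text,
    (ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

console.log(codePointsOf('\u23F0').join(' '));       // U+23F0
console.log(codePointsOf('\u231A\uFE0F').join(' ')); // U+231A U+FE0F

// The trailing selector is what shows up percent-encoded in links.
console.log(encodeURIComponent('\uFE0F')); // %EF%B8%8F
```

If a matcher consumes only the base character, the invisible U+FE0F is left behind in the text.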

@adrianmcli
Collaborator Author

adrianmcli commented Aug 15, 2017

@mathiasbynens unfortunately, that's just how the header links are generated by GitHub.

## my title🕵️here

Generates this anchor:

#my-titlehere

Should we instead try an opt-in method? That might be a big change though.

@anoff

anoff commented Aug 15, 2017

tbh I can live with the current behaviour. Just wanted to add the comment here for future generations to see. Depending on whether the issue gets fixed in the underlying library I'd recommend a Known Issues in the readme though :) I'll try to stick around and PR the docs if needed.

@Miserlou

This never got resolved! Still breaks for me. Can we just drop the emoji or give me an option to?

I use emojis in headers here: https://github.com/Miserlou/dnd-tldr

@ccheever

ccheever commented May 8, 2020

@Miserlou I think your links will work if you just remove the emojis from the links. I tried modifying the URL fragment on one of your broken links and it worked.

@ctsstc

ctsstc commented May 10, 2020

If I remove the emojis from the links then it breaks in other apps like VSCode's markdown preview. I would like it to work on GitHub and other apps 🤔

@basnijholt

The biggest problem is that the behavior is inconsistent. For example:

- the heading "# Alarm clock ⏰" has #alarm-clock- as its link
- the heading "# Apple Watch ⌚️" has #apple-watch-%EF%B8%8F as its link

This gets quite annoying when automatically generating table of contents, such as I do here: https://github.com/basnijholt/home-assistant-config/blob/35f3ae3942c5d343efe133fccd85415d4bdf6501/README.md#automations---table-of-content
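
One way the two links reported above could arise is a stripping step that removes only the base emoji character and leaves a trailing U+FE0F in place. This is a guess at the mechanism, not a confirmed description of what any tool does; `anchorLeavingSelector` is a hypothetical name and `\p{Extended_Pictographic}` is an assumed stand-in for the real emoji matcher.

```javascript
// Hypothetical sketch: strips the base emoji but not a following U+FE0F.
function anchorLeavingSelector(heading) {
  const dashed = heading.toLowerCase().replace(/ /g, '-');
  const stripped = dashed.replace(/\p{Extended_Pictographic}/gu, '');
  return '#' + encodeURIComponent(stripped);
}

// Alarm clock: a single code point, so nothing is left over.
console.log(anchorLeavingSelector('Alarm clock \u23F0'));
// #alarm-clock-

// Watch followed by U+FE0F: the selector survives and gets encoded.
console.log(anchorLeavingSelector('Apple Watch \u231A\uFE0F'));
// #apple-watch-%EF%B8%8F
```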

mcornella added a commit to mcornella/anchor-markdown-header that referenced this issue Sep 23, 2020
dianakao added a commit to benritter522/service-one-client that referenced this issue Mar 27, 2021
I also shortened the Index to be more concise. I transformed the Figma prototype and Data Sources into clickable links. Tried out emojis too. 

Check out this link to add markdown to emoji headers for anchors
thlorenz/anchor-markdown-header#36
@jon424

jon424 commented Mar 29, 2021

Is there any news on this issue? This works in VSCode for me:
[React](#⚛️-React)

but it doesn't on GitHub. It tries to access this URL:
#⚛%EF%B8%8F-React
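
Note that the two fragments above are the same string once percent-decoding is applied; `%EF%B8%8F` is simply the encoded U+FE0F. A quick check, assuming the emoji in the link is U+269B followed by U+FE0F:

```javascript
// The fragment as written in the Markdown link (atom symbol + U+FE0F).
const vscodeFragment = '\u269B\uFE0F-React';
// The fragment GitHub tries to open, with the selector percent-encoded.
const githubFragment = '\u269B%EF%B8%8F-React';

console.log(decodeURIComponent(githubFragment) === vscodeFragment); // true
```

So the encoding itself is not the difference here. The link presumably fails on GitHub because the anchor GitHub generates for the heading differs from this fragment.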

jtbandes added a commit to jtbandes/quicklookjs that referenced this issue Jun 8, 2021
@Alynva

Alynva commented Oct 4, 2021

I'm getting these %EF%B8%8F Unicode sequences, which print as empty characters. In the inspector I see this:

image

But hovering the link it shows this:

image

The same that goes to the URL:

image

This is happening with multiple emojis: ⚠️✍️⏲️🛠️⚙️☝️⚡️ and others...

But as others said, I would prefer that the emojis were just kept as is rather than removed like that, because removing them breaks the links in other software.

@kevin-david
Collaborator

#45 should address this by taking a new version of emoji-regex
