New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace Non emojis with real ones #29
Conversation
Keeping it as a draft for now since its been awhile and I took a few liberties, @keith-turner if this is going in the direction you like then lmk and ill ready-for-review it |
@robindiddams I took a quick look over this and it looks neat. I want to take a more in depth look which will take some time, I will do that soon. |
@robindiddams I looked over the code changes and have not seen any problems so far. I am going to look over the emoji mapping changes next. I created an ecojiv2 branch and pointed this PR to it. I am thinking there are other changes I would like to make RE ecoji v2 (readme changes, possibly API changes, experiment w/ changes to ecoji.io, and possibly command line tool changes) before merging ecojiv2 into the master branch. May also want to request review of the branch from others who have written ecoji impls in other languages before merging it into master. Once all of this is done, I will merge the branch into master. |
@robindiddams now that the ecojiv2 branch exists, are you ok with taking this PR out of draft status? |
@keith-turner thanks for taking the time, yeah that makes sense. Hearing from the other ecoji implementers will be fun too 🎉. Give me shout if theres any way I can help 👍 |
@robindiddams I have been analyzing this and I do not think it has the property that something encoded w/ ecojiv1 will decoded to the same thing in ecojiv2. Below are some experiments I did. The hex
For ecoji v2 to be able o decode ecoji v1, I think that for any emojis the two standards share they must have the same 10-bit ordinal. Looking at what you have done and the table I created, I am thinking that we can not have backwards compatibility and maintain the sort order. Personally I think its better to drop the sort order constraint and have backwards compatibility. Below I took some data from here and modified it real quickly to make the first few have the same ordinals when used by both. I randomly selected a few codepoints not used in ecoji1 to fill in for the code points that are not candidates for ecoji 2. The following encoding example for ecoji 2 would maintain compat and break sorting, which again I think is fine.
I am thinking a next step is to write a little program to list all of the candidates for ecoji2 that are not used by ecoji1. Then we just need to select the candidates that we want to fill in for the ordinals that ecoji 1 needs replaced. |
I modified my little program here to print what emojis are available and not used by ecoji1. I also made it determine how many need to be replaced. There are 131 that need to be replaced and 220 are available to chose from. Below are all that are available to chose from ⌚ ⌛ ⏩ ⏪ ⏫ ⏬ ⏰ ⏳ ◽ ◾ ☔ ☕ ♈ ♉ ♊ |
I was looking at the list of one I posted trying to find the ones I liked and that seemed hard. I found it easier to look for ones I did not like and found the following. ⚪ ⚫ ❌ ❎ ❓ ❔ ❕ ❗ ➕ ➖ ➗ 🟠 🟡 🟢 🟣 🟤 🟥 🟦 🟧 🟨 🟩 🟪 🟫 ⬛ ⬜ ◽ ◾ ✅ Thought the following were redundant. ⌚ ⌛ ⏰ ⏳ 🤍 🤎 🧡 The following are padding ☕ 🙋📑 I also noticed the padding emoji 🏍 is listed as having two code points. |
Ah right because im assuming v2 until I hit v1, since the v2 order is different theyre already incompatible. duh 🥴. My bad 😅, so we're back to the original plan then! At least now we can choose fun emojis like 🪤! Cool, a little later this week ill sub in those, skipping the ones you dont like and picking ones distinctively |
Okay @keith-turner sorry that took longer than I said 😅. Heres what I've come up with: Padding
Emojis
You can view the full list here: https://gist.github.com/robindiddams/943202dbc129f16b64f2113ea91ce180 I conducted your same test (but with different numbers): ❯ echo 0450 | xxd -r -ps | ecoji1
🇦🏎☕☕
❯ echo 0450 | xxd -r -ps | ecoji2
🫀⚓☕☕
❯ echo 0450 | xxd -r -ps | ecoji2 | ecoji2 -d | xxd -ps
0450
❯ echo 0450 | xxd -r -ps | ecoji1 | ecoji2 -d | xxd -ps
0450 Since 2 of the padding chars changed, the code had to change a bit more too. I want to write some unit tests that would cover decoding something with each one but I'm also not that great at computer math so might need some help generating an encoded string for each 😅. |
That happens. I should be able to look at this again this weekend. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@robindiddams this is looking really nice, I partially reviewed the updates. I analyzed the new emojis.txt and emojisv1.txt and the changes look really good. I also looked at the code, but did not complete reviewing it. I have not had a chance to think through the padding changes, I will do that when I circle back to look at this again in a few days.
} else { | ||
if ok := checkRune(c); !ok { | ||
// try to fallback to ecoji v1 | ||
if isV1 := checkRuneV1(c); isV1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may need to be more strict for the following case.
- See a rune that is only used by ecoji v2
- See a run that is only used by ecoji v1
For the above sequence of events, probably want to return an error. I don't think it currently does, but not completely sure.
In other programming languages I might use an enum type for this instead of a bool, but I don't think go has enums. Could have two bools instead of one like ecojiV1 *bool, ecojiV2 *bool
and whenever a rune is seens that is only used by v1 or v2 it will set it to true. When one is set to true then do not expect to see runes only used be the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@robindiddams I pushed a new commit 7c0dda1 to the ecojiv2 branch to attempt to handle this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay awesome!
@robindiddams I am merged this into the ecojiv2 branch so I can make follow on changes. Please feel free to submit more pull request against the ecojiv2 branch. I accidentally typed the preceding message in the commit message 😆 I may try to fix that real quick. |
I fixed the commit message by force pusing a new commit (85f4673) on the ecojiv2 branch |
@robindiddams I have been doing a lot of work in the ecojiv2 branch over the past few days. I was just looking over the new emojis that you selected for ecoji v2. I was looking at some that you did not select that I thought were interesting, which I listed in the following table.
Then I looked for ones in ecoji v2 that I thought were the least interesting that could possibly be replaced with the above. I listed those below.
Do you have any thoughts on making these replacements? I feel like I am getting close to wrapping up the things I want do on the ecoji v2 branch. The remaining things I want to do are review the new emojis in detail (which I did a first pass at leading this post, still want to do a 2nd pass) and update the documentation (readme and docs dir). |
Yeah I actually avoided all the people emojis since they are almost always unioned to a skin tone and/or gender modifier so I preferred emojis that were more standalone recognizable. It certainly wouldn't be an issue to include them though. Also I really like the sparkles and brick emojis, but I know brick looks weird on widows so I get it if you wanna drop em 🤷♀️. |
Thinking about the following smaller set of replacements. The ninja and zombie being my favorite two. Dropped some others after reading your comment since it seems pretty similar to other existing emojis.
There are a lot of existing people emojis in use, I looked over the existing set of emojis and AFAICT the above people emojis are distinguishable from the existing people except for the vampire. Mulling over dropping the vampire as a replacement. If the vampire is dropped could possibly do the following
Well lets keep those then. |
okay sounds good! I like the replacements!
woo hoo! thanks man! 👌 Since this pr is done, do you want to go back to tracking things in #27 ? if theres more work you wanna delegate im happy to take em on. |
It seems like the scope of this issue has expanded beyond simply replacing "non emoji" (if this issue is not the right place I can open a new issue for v2). I have a few suggestions. I would consider also replacing:
and
and
The rationale is that these cannot be described in a language agnostic way. You have to know the language they're representing to accurately describe them. How do you tell someone to enter 🈹 or 🆓 over the phone if you're not a native speaker of either language, for instance? Similarly I think it is hard to describe the different between the following sets:
(^possible color blindness issues)
Should note that not all of these There are also a few that I don't understand why were replaced, like:
Theres are all easy to describe and visually distinct, what was the rationale for why they needed replacement? Finally here's an interesting one
Sun behind cloud gets replaced with teapot, but then
night skyline gets replaced with a different sun behind cloud. There are probably some more to pick through but figure I should get thoughts on this much first. |
@dcow after reading your comment I created docs/emojis.md which is automatically generated by markdown_test.go. I think @robindiddams removed some of the emojis that were used in ecoji V1 because there were not single code point fully qualified emojis. At the end of docs/emojis.md there is a section about candidates there were not used. I need to reconcile all of this w/ your comment but will have to do that later. In general we have a few options for changing the set of emojis for ecoji v2.
I need to go back and look over your suggestions and see where they fall. Also maybe your comment deserves its own issue. |
@dcow I looked through what you posted in that section and I agree those are not interesting at all and would be nice to replace. I looked through the available unused candidates that I thought were interesting and came up with the following. I think you identified ~38 emojis that would be nice replace in your first section. I think there are around ~42 unused candidates below, so maybe could swap those.
The following unused candidates are less interesting than those above IMO but more interesting than the ones you identified.
|
Based on the comment from @dcow made on #29 modfied the emojis used for Ecoji V2. $ sed --file replace.sed -i emojisV2.txt $ cat replace.sed s/1f18e/270b/ s/1f191/1f90c/ s/1f192/1f90f/ s/1f193/270a/ s/1f194/1f9cf/ s/1f195/1f9da/ s/1f196/1f9db/ s/1f197/1f9dc/ s/1f198/1f9dd/ s/1f199/1f9cd/ s/1f19a/1f9ce/ s/1f21a/1f9d6/ s/1f22f/1f9d7/ s/1f232/1f9d8/ s/1f233/1fac2/ s/1f234/1fab4/ s/1f235/1f9ff/ s/1f236/1fa78/ s/1f238/1f6d7/ s/1f239/1f9f4/ s/1f23a/1faa3/ s/1f250/1f9fc/ s/1f251/1faa5/ s/1f3e7/1f9fd/ s/1f4b9/1f9ef/ s/1f4f4/1faa6/ s/1f519/1faa7/ s/1f51a/267f/ s/1f51b/26d4/ s/1f51c/2648/ s/1f51d/2649/ s/1f51e/264a/ s/1f51f/264b/ s/1f520/264c/ s/1f521/264d/ s/1f522/264e/ s/1f523/264f/ s/1f524/2650/ s/1f538/2651/ s/1f539/2652/ s/1f53a/2653/ s/1f53b/26ce/
Fixes #27, Removes and replaces all emojis that are either not emojis (🅾) or are partial emojis (🏛). In trying to figure out one to replace I ended up writing a program to determine which ones needed to go and which ones i could choose from. It turns out a lot of them were multi codepoint emojis (which ecoji doesnt support) so I made it select ones for me from a valid list of sorted single codepoint emojis.
NOTE: I know it looks like I'm replacing a lot of emojis that appear to render fine but go ahead and look up any of them in the unicode emoji spec or emojipedia and you'll find that theyre all 2+ codepoints.
Other things:
go run gen.go
to try it out.go.mod
file since I already changed enough stuff. (but in case you wanted to make this a go module, happy to help there too 👀)Replacement table
Since it appears that the mapping needs to be sorted these replacements arent exactly 1-1 (ie, they wont necessarily fill the same spot that the previous one left because their unicode code point isnt the same). Unless we no longer care about the set being in numerical order then theres really no point in manually selecting replacements since we dont control where theyd be in the set.
Sorry it took me so long to get around to this, been a busy few months 😅, lmk if anything is unclear or if you think I did anything egregious.