Migrate emoji storage to group aliases together for each canonical name. #1084

zee-bit · 2021-07-15T22:24:54Z

What does this PR do?

Migrates our emoji storage structure in unicode_emojis.py to a new structure where
for each canonical_name we keep its aliases grouped together, instead of considering
them as separate emojis.
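To make the change in shape concrete, here is a minimal sketch. The emoji names, the codes, and the field names (`code`, `aliases`) are illustrative assumptions based on the commit descriptions in this PR; they are not copied from unicode_emojis.py.

```python
from collections import OrderedDict

# Old shape (assumed): every name, canonical or alias, is its own entry.
OLD_EMOJI_DATA = OrderedDict(
    [
        ("+1", "1f44d"),
        ("like", "1f44d"),
        ("smile", "1f642"),
        ("thumbs_up", "1f44d"),
    ]
)

# New shape (assumed): one entry per canonical name, with aliases grouped.
NEW_EMOJI_DATA = OrderedDict(
    [
        ("smile", {"code": "1f642", "aliases": []}),
        ("thumbs_up", {"code": "1f44d", "aliases": ["+1", "like"]}),
    ]
)


def names_for(canonical_name):
    """Return the canonical name followed by all of its aliases."""
    entry = NEW_EMOJI_DATA[canonical_name]
    return [canonical_name, *entry["aliases"]]
```

With the grouped shape, code that needs "one emoji, many names" (such as reactions) can look up a single entry instead of scanning for entries that share a code.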

Tested?

Manually
Existing tests (adapted, if necessary)
New tests added (for any new behavior)
Passed linting & tests (each commit)

Notes & Questions

I need to test the reactions PR on top of this to validate this migration.

Interactions

Blocks feature: Add Reactions to messages. #913

setup.cfg

preetmishra

@zee-bit Thanks for working on this. This looks good overall. 👍 However, as discussed on CZO, generating the new file midway through the series, or in a commit separate from the model changes, would leave model.py inconsistent with the emoji data at that commit.

neiljp

@zee-bit You've got the correct principle here. To ease the migration and avoid the mid-commit regression (which should be breaking tests?), a possible refactoring would be to first adjust the emoji data generation function to return an additional set of data, i.e. the pre-prepared emoji names, and adapt the autocomplete to use it. You can then regenerate the emoji file and update the emoji manipulation internal to the model at the same time in the next commit.

tools/convert-unicode-emoji-data

neiljp · 2021-07-22T00:43:12Z

zulipterminal/model.py

        self.active_emoji_data = self.generate_all_emoji_data(
            self.initial_data["realm_emoji"]
        )
+        self.all_emoji_names = self.generate_all_emoji_names()


We update emoji when we get a realm emoji change event, so we at least want to call both in each case, or have one method that returns a tuple and updates both at the same time.
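One way to keep the two attributes from drifting apart, sketched under assumptions: a single helper recomputes both values, and it is called from initialization and from the realm emoji event handler. The class name, the `_refresh_emoji` helper, the handler name, and the simplified name-to-code data shape are all hypothetical.

```python
from typing import Dict, List, Tuple

# Illustrative stand-in for the bundled unicode emoji (name -> code).
UNICODE_EMOJI: Dict[str, str] = {"smile": "1f642", "tada": "1f389"}


def generate_all_emoji_data(
    realm_emoji: Dict[str, str],
) -> Tuple[Dict[str, str], List[str]]:
    """Return (active emoji data, sorted emoji names) in one call."""
    active = {**UNICODE_EMOJI, **realm_emoji}
    return active, sorted(active)


class ModelSketch:
    def __init__(self, realm_emoji: Dict[str, str]) -> None:
        self._refresh_emoji(realm_emoji)

    def _refresh_emoji(self, realm_emoji: Dict[str, str]) -> None:
        # Both attributes are always assigned together from one tuple.
        self.active_emoji_data, self.all_emoji_names = generate_all_emoji_data(
            realm_emoji
        )

    def handle_realm_emoji_event(self, event: dict) -> None:
        # Hypothetical handler: recompute both values on every change.
        self._refresh_emoji(event["realm_emoji"])
```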

This commit adds a new instance variable to the model that stores the list of all emoji names, generated during model initialization by the generate_emoji_data function. The generate_emoji_data function now returns a tuple of all active emoji data and a sorted list of all emoji names. This also migrates the emoji autocomplete function to use the all_emoji_names list returned from it, instead of regenerating the list every time emoji autocomplete is invoked. Tests amended.
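The benefit for autocomplete can be sketched as follows. The function name, the leading `:` convention, and the prefix-matching rule are assumptions for illustration; the point is only that the candidate list is computed once and passed in, rather than rebuilt on each call.

```python
from typing import List


def autocomplete_emoji_sketch(text: str, all_emoji_names: List[str]) -> List[str]:
    """Return ':name:' suggestions whose name starts with the typed prefix."""
    # Drop a single leading colon, if present, to get the typed prefix.
    prefix = text[1:] if text.startswith(":") else text
    # all_emoji_names is assumed to be pre-sorted, so results stay sorted.
    return [f":{name}:" for name in all_emoji_names if name.startswith(prefix)]
```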

This commit updates the convert-unicode-emoji-data script to store all the unicode emoji codepoints in their extended format as received from the server. This script maps each emoji_name (aka canonical_name) to its data, containing the emoji_code and the aliases that share the same emoji_code. We sort this dictionary in ascending order of key (emoji_name) and store it as an OrderedDict to maintain the sort order. We also wrap the file contents in fmt: off/on so that black does not reformat the dictionaries afterwards; since that is a generated file, it should not be modified.
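The grouping and file generation described above might look roughly like this. It is a sketch, not the actual tools/convert-unicode-emoji-data script: the function names, the input shapes (a name-to-code mapping plus a code-to-canonical-name mapping), and the `code`/`aliases` keys are assumptions.

```python
from collections import OrderedDict


def group_by_code(name_to_code, canonical_names):
    """Group every name under the canonical name that shares its code.

    name_to_code: all names (canonical and alias) -> emoji code.
    canonical_names: emoji code -> canonical name (assumed to be known).
    """
    grouped = {}
    for name, code in name_to_code.items():
        canonical = canonical_names[code]
        entry = grouped.setdefault(canonical, {"code": code, "aliases": []})
        if name != canonical:
            entry["aliases"].append(name)
    for entry in grouped.values():
        entry["aliases"].sort()
    # Sort by canonical name; OrderedDict preserves that order.
    return OrderedDict(sorted(grouped.items()))


def render_module(emoji_data):
    """Render the generated module text, wrapped in black's fmt: off/on."""
    lines = [
        "# fmt: off",
        "from collections import OrderedDict",
        "",
        "EMOJI_DATA = OrderedDict([",
    ]
    for name, data in emoji_data.items():
        lines.append(f"    ({name!r}, {data!r}),")
    lines += ["])", "# fmt: on", ""]
    return "\n".join(lines)
```

Rendering each entry on one line is what makes long lines likely in the generated file, which is one reason to keep black and the line-length check away from it.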

E501 (line too long) enforces flake8's max-line-length, but line length is already handled by black. We add this error code to flake8's ignore list in setup.cfg.

This commit generates the EMOJI_DATA from tools/convert-unicode-emoji-data in the new format, which maps each emoji_name to its corresponding emoji code and the aliases that share the same emoji_code, stored in an OrderedDict. The EMOJI_DATA is sorted in ascending order of emoji_name and this ordering is maintained via OrderedDict. The generated EMOJI_DATA is stored in the unicode_emojis.py file. This also updates the helper function in model that generates all emoji data to adapt to this new format of storing emojis, and updates the EmojiData type to indicate that each emoji now includes an aliases field. This also migrates the emoji fixtures in conftest, i.e. realm_emoji_data, unicode_emojis and zulip_emoji, to the new format of storing emoji_data. Tests amended.
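As a rough illustration of the model-side change, the sketch below shows an EmojiData-style type with an aliases field and a helper that derives the sorted name list from grouped data. The field names (including `type`), the helper name, and the choice to include aliases in the name list are assumptions, not the code merged in this PR.

```python
from typing import Dict, List, TypedDict


class EmojiData(TypedDict):
    code: str
    aliases: List[str]
    type: str  # assumed field, e.g. "unicode_emoji" or "realm_emoji"


def all_names_with_aliases(emoji_data: Dict[str, EmojiData]) -> List[str]:
    """Collect canonical names plus their aliases, sorted for autocomplete."""
    names: List[str] = []
    for canonical_name, data in emoji_data.items():
        names.append(canonical_name)
        names.extend(data["aliases"])
    return sorted(names)
```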

neiljp · 2021-07-23T07:20:48Z

@zee-bit This flows much cleaner with no mid-PR regressions - great! Merging this now 🎉

zulipbot · 2021-07-23T07:21:27Z

Hello @zulip/server-refactoring members, this pull request was labeled with the "area: refactoring" label, so you may want to check it out!

zee-bit added the "further discussion required", "PR blocks other PR" and "PR needs review" labels Jul 15, 2021

neiljp reviewed Jul 16, 2021

setup.cfg (resolved)

zee-bit force-pushed the update-emoji-storage-by-name branch from 2fdf218 to 46a7cf3 July 16, 2021 19:32

zulipbot added the "size: XL" label (automatic label added by zulipbot) Jul 16, 2021

neiljp mentioned this pull request Jul 18, 2021

Apply black to tools/ and setup.py #1087

Merged

4 tasks

preetmishra reviewed Jul 18, 2021

neiljp added the "PR awaiting update" label and removed the "PR needs review" label Jul 19, 2021

zee-bit force-pushed the update-emoji-storage-by-name branch 2 times, most recently from d7339ae to 1a32875 July 19, 2021 13:31

zee-bit added the "PR needs review" label and removed the "PR awaiting update" label Jul 19, 2021

zee-bit force-pushed the update-emoji-storage-by-name branch 2 times, most recently from 94c5569 to 23a95d5 July 21, 2021 22:09

neiljp reviewed Jul 22, 2021

neiljp added the "PR awaiting update" label and removed the "PR needs review" label Jul 22, 2021

zee-bit added 4 commits July 22, 2021 19:51

linting: Stop enforcing E501 via flake8; black already handles it.

6a6b022

E501 (line too long) enforces flake8's max-line-length, but line length is already handled by black. We add this error code to flake8's ignore list in setup.cfg.

zee-bit force-pushed the update-emoji-storage-by-name branch from 23a95d5 to 0aecbeb July 22, 2021 15:03

zee-bit added the "PR needs review" label and removed the "PR awaiting update" label Jul 22, 2021

neiljp merged commit 0eaa4f2 into zulip:main Jul 23, 2021

neiljp removed the "PR needs review" label Jul 23, 2021

neiljp added this to the Next Release milestone Jul 23, 2021

neiljp added the "area: refactoring" label and removed the "further discussion required" label Jul 23, 2021
