Add Emojify transformation #164

Merged: 6 commits merged on Aug 14, 2021

@xiaohk (Contributor) commented Jul 25, 2021

Emojify 🔤 → 🦄

This transformation augments the input sentence by swapping words for emojis with similar meanings. For example, it changes the word "movie" to the emoji "🎬".

Author name: Zijie J. Wang

Author email: jayw@gatech.edu

Author Affiliation: Georgia Tech

What type of a transformation is this?

This transformation acts like a translation, used to test the robustness and generalizability of language models. Here, we translate English words into Unicode emoji. The transformed sentence keeps a structure and semantics similar to those of the source sentence.
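The word-to-emoji swap described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual implementation: the `WORD_TO_EMOJI` table, the `emojify` function name, and the regex-based tokenization are all hypothetical, and the handful of entries stands in for the much larger keyword dictionary the PR ships with. Digits are turned into keycap sequences (digit, then U+FE0F, then U+20E3), which is how an emoji such as 1️⃣ is encoded in Unicode.

```python
import re

# Hypothetical, tiny word-to-emoji table for illustration only.
WORD_TO_EMOJI = {
    "apple": "\U0001F34E",      # red apple
    "movie": "\U0001F3AC",      # clapper board
    "brown": "\U0001F7E4",      # brown circle
    "fox": "\U0001F98A",        # fox
    "dog": "\U0001F415",        # dog
    "delicious": "\U0001F60B",  # face savoring food
}

# A keycap emoji is the digit followed by U+FE0F (variation selector-16)
# and U+20E3 (combining enclosing keycap).
KEYCAP_SUFFIX = "\ufe0f\u20e3"

# Match a run of ASCII letters or one ASCII digit; punctuation and
# whitespace are not matched, so they pass through unchanged.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+|[0-9]")


def emojify(sentence: str) -> str:
    """Replace known words and single digits with emoji (assumed behavior)."""

    def swap(match: re.Match) -> str:
        token = match.group(0)
        if token.isdigit():
            return token + KEYCAP_SUFFIX
        # Case-insensitive lookup; unknown words are kept as-is.
        return WORD_TO_EMOJI.get(token.lower(), token)

    return TOKEN_PATTERN.sub(swap, sentence)


if __name__ == "__main__":
    print(emojify("The quick brown fox jumps over the lazy dog."))
    print(emojify("Apple is looking at buying U.K. startup for $132 billion."))
```

Words missing from the table are left untouched, so the output keeps the structure of the input sentence. A real implementation would likely need proper tokenization and handling of inflected forms and multi-word expressions, which this sketch ignores.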

Some examples:

"Apple is looking at buying U.K. startup for $132 billion."

⬇

"🍎 is 👀 at 🛍️ 🇬🇧 startup for $1️⃣3️⃣2️⃣ billion."

"The quick brown fox jumps over the lazy dog."

⬇

"The quick 🟤 🦊 jumps over the lazy 🐕."

"Oh, and their spring rolls and the accompanying peanuts and hot sauces were also delicious."

⬇

"Oh, and their 🌱 🧻 and the accompanying 🥜 and 🌡️ sauces were also 😋."

@tongshuangwu (Collaborator) left a comment

Thanks for the neat PR! I did spot a later PR, #186, that similarly tries to insert emojis. As I mentioned on that one, you two seem to build your emoji mappers in different ways, so it might be good to merge them? @kaustubhdhole thoughts?

@xiaohk (Contributor, Author) commented Aug 6, 2021

@tongshuangwu Thanks for the comment! It seems #186 focuses only on smileys and smiley emojis, with around 60 dictionary keys. My PR is more general and has more than 4k English keywords. I would like to keep the two PRs separate.

@tongshuangwu (Collaborator) left a comment

thanks for the reply, I'm now approving this then!

@xiaohk (Contributor, Author) commented Aug 14, 2021

> thanks for the reply, I'm now approving this then!

Thank you! ٩( ᐛ )و
