For my digital cut-up revisited, I worked on integrating emojis into spaCy text generation. To be honest, I somewhat regretted choosing emojis for my cut-up, because they were difficult to work with. However, after going through the tutorials closely, I thought it would be interesting to see whether I could actually assign 'emoji vectors'.

In [109]:
import json

In [110]:
import numpy as np

In [111]:
import emoji

In [112]:
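# Read the corpus as UTF-8 so the emoji characters decode correctly,
# and close the file once it has been parsed.
with open("emojis.json", encoding="utf-8") as f:
    emoji_data = json.load(f)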

After importing the necessary modules, I loaded a modified JSON file (a copy of the color data corpus) into emoji_data. The modified corpus looked something like this:
        {
            "color": "🟡",
            "hex": "#f1f33f"
        },

        {
            "color": "💄",
            "hex": "#c0022f"
        },

        {
            "color": "🌅",
            "hex": "#fffd37"
        },

        {
            "color": "🐙",
            "hex": "#a442a0"
        },

        {
            "color": "🎲",
            "hex": "#fff4f2"
        },

        {
            "color": "🦚",
            "hex": "#0add08"
        },
        
Looking at this corpus after finishing the assignment, I see a lot of issues with it, chiefly in how the emojis are assigned to the hex values. However, because I wasn't sure what values I could assign to the emojis to vectorize them, I decided to just use the hex values in the end. I categorized each emoji by assigning it the hex value closest to its color, also taking the name of the color into account (e.g., lipstick red is represented by the lipstick emoji).

In [113]:
def hex_to_int(s):
    s = s.lstrip("#")
    return np.array([int(s[:2], 16), int(s[2:4], 16), int(s[4:6], 16)])

This converts each hex value into an array of three integers (red, green, blue).
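
As a quick check of what the conversion does, here is a small worked example using the lipstick entry from the corpus excerpt above. It is only a sketch: `hex_to_rgb` is a hypothetical plain-Python stand-in for `hex_to_int` that returns a tuple instead of a NumPy array, so the arithmetic is easy to follow.

```python
def hex_to_rgb(s):
    # Hypothetical stand-in for hex_to_int: same parsing, tuple output.
    s = s.lstrip("#")
    # Each pair of hex digits is one color channel in the range 0-255.
    return tuple(int(s[i:i + 2], 16) for i in (0, 2, 4))

lipstick = hex_to_rgb("#c0022f")
print(lipstick)  # (192, 2, 47)
```

So "c0" becomes 192 (12 * 16 + 0), "02" becomes 2, and "2f" becomes 47 (2 * 16 + 15).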

In [114]:
emojis = dict()
for item in emoji_data['colors']:
    emojis[item["color"]] = hex_to_int(item["hex"])

This stores the mapped values in the 'emojis' dictionary, keyed by emoji (or color name).

In [115]:
emojis['blue']

array([  0,   0, 255])

In [116]:
import sys

In [117]:
from simpleneighbors import SimpleNeighbors

In [118]:
emoji_lookup = SimpleNeighbors(3, 'euclidean')
for name, vec in emojis.items():
    emoji_lookup.add_one(name, vec)
emoji_lookup.build()

This adds every vector to the index and then builds it.
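
Conceptually, the index answers the question "which stored vectors are closest to this one?". Below is a brute-force sketch of that idea in plain Python. It is only an illustration: SimpleNeighbors builds a proper index (and, depending on its backend, may return approximate results), while the four-entry `toy` dictionary and the `nearest` helper here are made up for the example. The RGB values are the ones shown in this notebook.

```python
import math

# Toy table: 'blue' as printed above, plus three entries from the corpus excerpt.
toy = {
    "blue": (0, 0, 255),
    "💄": (192, 2, 47),
    "🐙": (164, 66, 160),
    "🦚": (10, 221, 8),
}

def nearest(query, table, n=3):
    # Hypothetical helper: rank every name by Euclidean distance to the query
    # and keep the n closest.
    return sorted(table, key=lambda name: math.dist(table[name], query))[:n]

print(nearest(toy["blue"], toy))  # ['blue', '🐙', '💄']
```

The query's own name comes back first because its distance to itself is zero.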

In [119]:
emoji_lookup.nearest(emojis['red'])

['red', '🚒', '🛑', '🍒', '🍅', '🍓', '🍄', '🥤', '🦧', '🍎', '🌶', '🍑']

In [120]:
emoji_lookup.nearest(emojis['white'])

['white', '🥼', '🎲', '🥋', '⚽', '🚑', '🥛', '🍚', '🥚', '🦢', '🐇', '👻']

In [121]:
emoji_lookup.nearest(emojis['black'])

['black', '☕', '🍫', '🌘', '🦇', '🐀', '🦝', '🌑', '🎱', '♠', '🥥', '🌲']

In [122]:
emoji_lookup.nearest(emojis['yellow'])

['yellow', '🌟', '🌻', '🏵', '🌼', '🌅', '🍌', '🍋', '🐤', '🟡', '😺', 'orange']

In [123]:
emoji_lookup.nearest(emojis['pink'])

['pink', '🍭', '👻', '🌸', '🥚', '🎲', '🌺', '🐷', '🥋', '🍣', '🛍', '🎀']

In [124]:
emoji_lookup.nearest(emojis['orange'])

['orange', '🍊', '🎃', '🏵', '🔥', '🧀', '🌻', '🌼', '🦁', '🏀', '🥭', '🌟']

In [125]:
emoji_lookup.nearest(emojis['blue'])

['blue', '🧊', '🦋', '🌌', '🟦', '🧞\u200d', '🔌', '💠', '🔵', '🏙', '🔮', '🥀']

In [126]:
emoji_lookup.nearest(emojis['green'])

['green', '🦚', '👒', '☘', '🟢', '🗽', '🍀', '🐲', '🍃', '🌿', '🐉', '🌱']

There seems to be no issue finding the nearest emojis to the specified colors, so the next step would be to see if I could input a text and receive an emoji output.

In [127]:
import spacy
nlp = spacy.load('en_core_web_md')

In [128]:
def meanv(vecs):
    total = np.sum(vecs, axis=0)
    return total / len(vecs)

This function is from the tutorial. From what I understand, it calculates the mean of all the vectors, which is useful for generating an 'encapsulating emoji' for the text.
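
To make the averaging concrete, here is a tiny worked example. It is a sketch under stated assumptions: `mean_rgb` is a hypothetical plain-Python equivalent of `meanv`, and the matches are made up (pure white and pure black), not taken from any of the texts.

```python
def mean_rgb(vecs):
    # Hypothetical equivalent of meanv: average each color channel separately.
    return [sum(channel) / len(vecs) for channel in zip(*vecs)]

# Made-up matches: a text that mentions "white" three times and "black" once.
matches = [(255, 255, 255)] * 3 + [(0, 0, 0)]
print(mean_rgb(matches))  # [191.25, 191.25, 191.25]
```

Because every occurrence is counted, a color word that appears often pulls the average toward itself.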

In [129]:
doc = nlp(open("alice.txt").read())
alice_emojis = [emojis[word.lower_] for word in doc if word.lower_ in emojis]
avg_emojis = meanv(alice_emojis)
print(avg_emojis)

[226.325 225.675 199.525]


I chose the text 'Alice in Wonderland' for my test run.

In [130]:
emoji_lookup.nearest(avg_emojis)

['🥚', 'pink', '🥋', '🎲', '🍭', '🥛', '👻', '⚽', '🚑', '🐇', '🍚', '🦢']

It was quite interesting to see that 'Alice in Wonderland' was embodied by an overwhelmingly 'white' emoji color scheme.
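
One way to check where that whiteness comes from would be to count which color words actually matched, since each occurrence contributes to the average. A minimal sketch, assuming a simple regex tokenizer instead of spaCy; the `sample` sentence and the `color_words` set are hypothetical stand-ins for the text and the dictionary keys.

```python
import re
from collections import Counter

# Hypothetical subset of the dictionary keys.
color_words = {"white", "pink", "red", "blue"}
# Made-up sample sentence, not a quotation from the book.
sample = "The White Rabbit had pink eyes; the roses were white, then red."

# Lowercase and split into words, roughly like word.lower_ on each spaCy token.
tokens = re.findall(r"[a-z]+", sample.lower())
counts = Counter(t for t in tokens if t in color_words)
print(counts.most_common())  # [('white', 2), ('pink', 1), ('red', 1)]
```

In the notebook, the same Counter could be fed word.lower_ for each token in doc.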

In [131]:
import random
red = emojis['red']
blue = emojis['blue']
for i in range(14):
    rednames = emoji_lookup.nearest(red)
    bluenames = emoji_lookup.nearest(blue)
    print("On the outside I'm " + rednames[0] + ", but really I'm " + bluenames[0])
    red = emojis[random.choice(rednames[1:])]
    blue = emojis[random.choice(bluenames[1:])]

On the outside I'm red, but really I'm blue
On the outside I'm 🍎, but really I'm 🏙
On the outside I'm 🍒, but really I'm 🦋
On the outside I'm red, but really I'm 🔮
On the outside I'm 🌶, but really I'm 🥣
On the outside I'm 🍓, but really I'm 🔷
On the outside I'm 🟠, but really I'm 🔵
On the outside I'm 🍓, but really I'm 🔮
On the outside I'm 🍄, but really I'm 🦋
On the outside I'm red, but really I'm 🧞‍
On the outside I'm 🍄, but really I'm 🔌
On the outside I'm 🛑, but really I'm 🏙
On the outside I'm 💄, but really I'm 🌃
On the outside I'm 🚒, but really I'm 🔮


I also tried creating some emoji poems.

In [135]:
doc = nlp(open("frankenstein.txt").read())
franken_emojis = [emojis[word.lower_] for word in doc if word.lower_ in emojis]
avg_emojis = meanv(franken_emojis)
print(avg_emojis)
print("Frankenstein is " + str(emoji_lookup.nearest(avg_emojis)))

[ 86.7  105.84 116.26]
Frankenstein is ['🐟', '🐋', '🟪', '🥗', '🍠', '🎾', '💚', '🔷', '🥣', '🍆', '🍇', '🔮']


This was another attempt, with 'Frankenstein' as the input text. Interestingly, 'Frankenstein' differed from the all-white of 'Alice in Wonderland', coming out in blue, purple, and green instead. I really wish that I had organized the emojis differently when creating the corpus, perhaps by meaning rather than color. Hopefully, as I experiment more with emojis and vectors, I can create an emoji generator that captures even more of the text's meaning.