# Word2Vec model widgets

This notebook shows several examples of connecting the word2vec model in `ml5_ipynb` to Jupyter widgets (`ipywidgets`) to produce interactive output.  
The model used here contains 300-dimensional embeddings for the 10,000 most common English words. Smaller models are available in the data folder.  

These examples are based on the Word2Vec [example](https://github.com/ml5js/ml5-library/tree/main/examples/p5js/Word2Vec/Word2Vec_Interactive/data) in ml5.js.

**Note:** Using words not in the model embeddings will result in errors.
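
One way to avoid these errors is to check the input against the vocabulary before calling the model. The snippet below is a minimal sketch, not part of `ml5_ipynb`: the helper `missing_words` is hypothetical, and the `{"vectors": {word: [...]}}` layout of the toy JSON is an assumption about how the model file is organized.

```python
import json

# Toy stand-in for a model file. The {"vectors": {word: [...]}} layout
# is an assumption, not something confirmed by this notebook.
toy_model_json = '{"vectors": {"king": [0.9, 0.1], "queen": [0.8, 0.3]}}'
vocabulary = set(json.loads(toy_model_json)["vectors"])


def missing_words(words, vocab):
    """Return the words (whitespace-stripped) that are not in vocab, in input order."""
    cleaned = [w.strip() for w in words]
    return [w for w in cleaned if w not in vocab]


# Hypothetical usage: only call the model when nothing is missing.
unknown = missing_words(["king", "dragon"], vocabulary)
if unknown:
    print("Not in the model:", ", ".join(unknown))
```

A check like this could run at the top of each widget callback, before the word reaches the model.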

In [1]:
from ml5_ipynb import ml5_text
import ipywidgets as widgets

In [2]:
w2v = ml5_text.word2Vec('data/wordvecs10000.json')

VBox(children=(word2Vec(status='deferring flush until render'), Text(value='deferring flush until render', des…

....................Model is ready


## What are the Top 3 nearest words?

The following uses the `nearest(word)` function to compute cosine distances and outputs the 3 words with the smallest distance to the input word.
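
To make the idea concrete, here is a minimal NumPy sketch of a nearest-word lookup by cosine distance. It is an illustration only and does not reproduce the internals of `ml5_ipynb`; the 3-dimensional `toy_vectors` and the helper names are invented for this example.

```python
import numpy as np

# Hypothetical 3-dimensional embeddings, invented for illustration only.
toy_vectors = {
    "king": np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.8, 0.9, 0.2]),
    "man": np.array([0.9, 0.1, 0.1]),
    "woman": np.array([0.8, 0.2, 0.2]),
    "apple": np.array([0.1, 0.1, 0.9]),
}


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def nearest_words(word, vectors, top_n=3):
    """Return the top_n (word, distance) pairs closest to `word`, excluding itself."""
    target = vectors[word]  # raises KeyError for an out-of-vocabulary word
    scored = [
        (other, cosine_distance(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:top_n]


for neighbour, distance in nearest_words("king", toy_vectors):
    print(neighbour, round(distance, 3))
```

With these toy vectors, "queen" ranks closest to "king" because the two vectors point in almost the same direction.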

In [3]:
nearest = widgets.Textarea(
    value='',
    placeholder='Type a word',
#     description='Please type a word',
    disabled=False
)
nearest_output = widgets.HTML(
    value="",
#     placeholder='Some HTML',
#     description='Some HTML',
)
nearest_button = widgets.ToggleButton(
    value=False,
    description='is nearest to',
    disabled=False,
    button_style='', # 'success', 'info', 'warning', 'danger' or ''
    tooltip='Description',
#     icon='check' 
)
def get_nearest(val):
    if val:
        word = nearest.value
        if not word:
            print('Empty word!')
            return
        w2v.nearest(word)
        nearest_list = w2v.nearest_results[-1]
        if not nearest_list:
            print('No nearest word!')
            return
        nearest_words = [i['word'] for i in nearest_list[:3]]
        w_str = '<br>'.join(nearest_words)
        nearest_output.value = w_str
    nearest_button.value = False

In [4]:
out = widgets.interactive_output(get_nearest,{'val':nearest_button})
widgets.VBox([nearest,nearest_button,nearest_output,out])

VBox(children=(Textarea(value='', placeholder='Type a word'), ToggleButton(value=False, description='is neares…

## What are the Top 3 words between two words?

The following uses the `average([word1,word2])` function to average the embeddings of two words and outputs the 3 words most similar to the averaged embedding.
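
The averaging step can be sketched in plain NumPy as follows. This is an assumption-laden illustration rather than the library's implementation: `toy_vectors`, `nearest_to_vector`, and `between` are hypothetical names, and excluding the two input words from the results is a choice made for this sketch.

```python
import numpy as np

# Hypothetical 3-dimensional embeddings, invented for illustration only.
toy_vectors = {
    "king": np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.8, 0.9, 0.2]),
    "man": np.array([0.9, 0.1, 0.1]),
    "woman": np.array([0.8, 0.2, 0.2]),
    "apple": np.array([0.1, 0.1, 0.9]),
}


def nearest_to_vector(target, vectors, exclude=(), top_n=3):
    """Rank vocabulary words by cosine distance to an arbitrary vector."""
    def cosine_distance(a, b):
        return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    scored = [
        (word, cosine_distance(target, vec))
        for word, vec in vectors.items()
        if word not in exclude
    ]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return [word for word, _ in scored[:top_n]]


def between(word1, word2, vectors, top_n=3):
    """Average two embeddings element-wise and return the words nearest to the midpoint."""
    midpoint = (vectors[word1] + vectors[word2]) / 2.0
    return nearest_to_vector(midpoint, vectors, exclude={word1, word2}, top_n=top_n)


print(between("king", "woman", toy_vectors))
```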

In [5]:
w1 = widgets.Text(
    value='',
    placeholder='Type a word',
    disabled=False
)
w2 = widgets.Text(
    value='',
    placeholder='Type a word',
    disabled=False
)
btw_output = widgets.HTML(
    value="",
)
btw_button = widgets.ToggleButton(
    value=False,
    description='is',
    disabled=False,
    button_style='', # 'success', 'info', 'warning', 'danger' or ''
    tooltip='Description',
)
def get_btw(val):
    if val:
        word1 = w1.value
        word2 = w2.value
        if not word1 or not word2:
            print('Please type in both!')
            return
        w2v.average([word1,word2])
        btw_list = w2v.average_results[-1]
        if not btw_list:
            print('No between word!')
            return
        btw_words = [i['word'] for i in btw_list[:3]]
        w_str = '<br>'.join(btw_words)
        btw_output.value = w_str
    btw_button.value = False

In [6]:
btw_out = widgets.interactive_output(get_btw,{'val':btw_button})
widgets.VBox([widgets.HBox([widgets.HTML(value="Between "),
                            w1, 
                            widgets.HTML(value=" and "),
                            w2,btw_button]),
              btw_output,btw_out])

VBox(children=(HBox(children=(HTML(value='Between '), Text(value='', placeholder='Type a word'), HTML(value=' …

## Analogy

An analogy shows how two pairs of things relate to each other in the same way. With word embeddings, an analogy can be expressed through element-wise addition and subtraction of vectors, a kind of "word algebra".   
For example, king is to queen as man is to woman. The resulting word is the one nearest to the vector given by the following formula.
```
vector('queen') - vector('king') + vector('man')
```
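
As a worked instance of the formula, the sketch below evaluates it directly on vectors and then looks up the nearest remaining word. The toy embeddings are invented so that the arithmetic lands on "woman"; the `analogy` helper is hypothetical and is not part of `ml5_ipynb`.

```python
import numpy as np

# Hypothetical 3-dimensional embeddings, invented for illustration only.
toy_vectors = {
    "king": np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.8, 0.9, 0.2]),
    "man": np.array([0.9, 0.1, 0.1]),
    "woman": np.array([0.8, 0.2, 0.2]),
    "apple": np.array([0.1, 0.1, 0.9]),
}


def analogy(a, b, c, vectors, top_n=3):
    """Solve "a is to b as c is to ?" via vector(b) - vector(a) + vector(c)."""
    target = vectors[b] - vectors[a] + vectors[c]

    def cosine_distance(u, v):
        return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    # Leave out the three input words so they cannot be returned as the answer.
    scored = [
        (word, cosine_distance(target, vec))
        for word, vec in vectors.items()
        if word not in {a, b, c}
    ]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return [word for word, _ in scored[:top_n]]


# king is to queen as man is to ...
print(analogy("king", "queen", "man", toy_vectors))
```

Keeping the whole computation in vector space, and looking up a word only at the end, avoids rounding the intermediate difference to a vocabulary word.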

In [7]:
is_word = widgets.Text(
    value='',
    placeholder='Type a word',
    disabled=False
)
to_word = widgets.Text(
    value='',
    placeholder='Type a word',
    disabled=False
)
is_word2 = widgets.Text(
    value='',
    placeholder='Type a word',
    disabled=False
)
analogy_output = widgets.HTML(
    value="",
)
analogy_button = widgets.ToggleButton(
    value=False,
    description='is to',
    disabled=False,
    button_style='', # 'success', 'info', 'warning', 'danger' or ''
    tooltip='Description',
)
def get_analogy(val):
    if val:
        iw = is_word.value
        tw = to_word.value
        iw2 = is_word2.value
        if not iw or not tw or not iw2:
            print('Please finish typing!')
            return
        w2v.subtract([tw,iw])
        sub_list = w2v.subtract_results[-1]
        if not sub_list:
            print('Oops! Please type in other words!')
            return
        sub_w = sub_list[0]['word']
        w2v.add([sub_w,iw2])
        add_list = w2v.add_results[-1]
        if not add_list:
            print('Oops! No analogy for this example!')
            return
        add_word = [i['word']+"("+ str(round(i['distance'],2))+")" for i in add_list[:3]]
        analogy_output.value = " , ".join(add_word)
    analogy_button.value = False

In [8]:
analogy_out = widgets.interactive_output(get_analogy,{'val':analogy_button})
widgets.VBox([widgets.HBox([is_word,
                            widgets.HTML(value=" is to "),
                            to_word,
                            widgets.HTML(value=" as "),
                            is_word2,analogy_button]),
              analogy_output,analogy_out])

VBox(children=(HBox(children=(Text(value='', placeholder='Type a word'), HTML(value=' is to '), Text(value='',…