In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from whatlies import Embedding, EmbeddingSet
import spacy 

## Making Plots ... More Cool 

The `Embedding` object only supports matplotlib, but the `EmbeddingSet` supports Altair too! You can plot it interactively by just passing the names of the tokens you'd like to use as axes.

In [3]:
nlp = spacy.load("en_core_web_md")
words = ["prince", "princess", "nurse", "doctor", "banker", "man", "woman", 
         "cousin", "niece", "king", "queen", "dude", "guy", "gal", "fire", 
         "dog", "cat", "mouse", "red", "blue", "green", "yellow", "water", 
         "person", "family", "brother", "sister"]
emb = EmbeddingSet({t.text: Embedding(t.text, t.vector) for t in nlp.pipe(words)})

In [4]:
orig_chart = emb.plot_interactive('man', 'woman')

new_ts = emb | (emb['king'] - emb['queen'])
new_chart = new_ts.plot_interactive('man', 'woman')

Note that Altair has a convenient syntax for plotting two charts next to each other. This is really handy when you want to compare them. Feel free to zoom in and play around as well!

In [5]:
orig_chart | new_chart
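To get a feel for what happens to the vectors before they are plotted, here is a minimal NumPy sketch. It assumes that `|` removes the component of each embedding along the given vector, and that an axis named after a token uses the scalar projection onto that token's vector. The helper names (`scalar_projection`, `remove_direction`) and the toy 3-d vectors are made up for illustration; they are not part of whatlies.

```python
import numpy as np

def scalar_projection(v, axis):
    """Coordinate of v along axis, measured in units of the axis length."""
    return float(np.dot(v, axis) / np.dot(axis, axis))

def remove_direction(v, direction):
    """Return v with its component along direction subtracted out."""
    return v - scalar_projection(v, direction) * direction

# Toy 3-d vectors standing in for real word vectors (values are invented).
man = np.array([1.0, 0.0, 1.0])
woman = np.array([0.0, 1.0, 1.0])
king = np.array([2.0, 0.5, 1.0])
queen = np.array([0.5, 2.0, 1.0])

direction = king - queen                 # the direction we project away
king_flat = remove_direction(king, direction)

# Nothing of the removed direction is left in the result.
residual = np.dot(king_flat, direction)

# Coordinates a ('man', 'woman') plot could use for the flattened vector.
x = scalar_projection(king_flat, man)
y = scalar_projection(king_flat, woman)
```

In this toy setup the flattened `king` ends up equally far along the `man` and `woman` axes, which is the kind of shift you can look for when comparing the two charts.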

## Transformations

There's something extra too. So far we've been mapping vectors onto other ones in order to plot them. But theoretically we could go a step further.

In [8]:
from whatlies.transformers import pca, umap

orig_chart = emb.plot_interactive('man', 'woman')
pca_emb = emb.transform(pca(2))
umap_emb = emb.transform(umap(2))

pca_emb.plot_interactive('pca_0', 'pca_1') | umap_emb.plot_interactive('umap_0', 'umap_1')

Note that we can increase the number of components and still only plot a few. 

In [7]:
pca_emb = emb.transform(pca(3))

pca_emb.plot_interactive('pca_0', 'pca_1') | pca_emb.plot_interactive('pca_2', 'pca_1')
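For intuition about what a PCA transformation does to the vectors, here is a rough NumPy sketch. It is not how whatlies implements `pca`; the function `pca_project`, the random data, and the shape (27 words, vectors of width 300) are assumptions chosen only for illustration.

```python
import numpy as np

def pca_project(X, n_components):
    """Project the rows of X onto the top principal components."""
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
X = rng.normal(size=(27, 300))    # random stand-in for 27 word vectors

reduced = pca_project(X, 3)       # keep three components ...
first_pair = reduced[:, [0, 1]]   # ... but look at only two of them at a time
second_pair = reduced[:, [2, 1]]
```

Each column of `reduced` plays the role of one `pca_i` axis, so picking any two columns gives one scatter plot.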

But why go with only two plots when you can have an entire matrix? 

In [8]:
pca_emb.plot_interactive_matrix('pca_0', 'pca_1', 'pca_2', annot=True)

What is particularly interesting here are the PCA axes. They seem to encode information, and we can attempt to understand them by glancing at the plots.

But the overlap makes it hard to read. So let's apply one more transformation here.

In [12]:
from whatlies.transformers import noise 

(emb
 .transform(pca(3))
 .transform(noise(1))
 .plot_interactive_matrix('king', 'queen', 'man', 'woman', annot=True, width=200, height=200))
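The idea behind adding noise is that points sitting on top of each other get nudged apart. Below is a minimal NumPy sketch of that idea, assuming the noise is Gaussian with a given standard deviation; the helper `add_noise` and its `sigma` and `seed` parameters are hypothetical and may not match the whatlies transformer.

```python
import numpy as np

def add_noise(X, sigma, seed=None):
    """Return a copy of X with independent Gaussian jitter added to every entry."""
    rng = np.random.default_rng(seed)
    return X + rng.normal(loc=0.0, scale=sigma, size=X.shape)

# Two of these three points overlap exactly, so a plot would hide one of them.
points = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]])

jittered = add_noise(points, sigma=0.01, seed=42)
```

A small `sigma` relative to the spread of the data keeps the overall picture intact while making overlapping labels readable.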