# PheKnowLator Animation
Sometimes, especially when preparing a presentation for a conference or for your colleagues, a good animation can say more than a thousand words.

For this reason, we have prepared a straightforward way to create animations for a number of tasks using GRAPE. Thanks to subsampling, it can be executed on graphs of arbitrary size.

In this brief tutorial, we will show how to compute a PheKnowLator embedding using First-order LINE, then use a TSNE decomposition to reduce its dimensionality and render it as a short video.

The resulting WEBM can be converted using one of many services and can be directly incorporated in Google Slides.
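
As an alternative to an online service, the conversion can also be done locally. The following is a minimal sketch that assumes the `ffmpeg` command-line tool is installed and built with the `libx264` encoder; the helper names and any file names passed to them are hypothetical and not part of GRAPE.

```python
import shutil
import subprocess


def build_ffmpeg_command(source_path: str, target_path: str) -> list:
    """Return the ffmpeg argument list converting a video to H.264 MP4."""
    return [
        "ffmpeg",
        "-y",                   # overwrite the target if it already exists
        "-i", source_path,      # input video, e.g. the WEBM animation
        "-c:v", "libx264",      # encode the video stream as H.264
        "-pix_fmt", "yuv420p",  # widely supported; needs even frame dimensions
        target_path,
    ]


def convert_video(source_path: str, target_path: str) -> bool:
    """Run the conversion; return False when ffmpeg is not installed."""
    if shutil.which("ffmpeg") is None:
        return False
    subprocess.run(build_ffmpeg_command(source_path, target_path), check=True)
    return True
```

Calling `convert_video("animation.webm", "animation.mp4")` would then produce an MP4 next to the original file, where both file names are placeholders for wherever the animation was saved.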

## Retrieving PheKnowLator
First, we retrieve PheKnowLator:

In [2]:
from grape.datasets.pheknowlatorkg import PheKnowLator

graph = PheKnowLator()

Then, let's take a look at its graph report:

In [3]:
graph

## Connected holdout
Since we want to visualize an edge prediction task on this graph, we need to create a connected holdout:

In [4]:
%%time
train, test = graph.connected_holdout(train_size=0.7)
train.enable()

CPU times: user 4.96 s, sys: 127 ms, total: 5.09 s
Wall time: 3.02 s
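
To give an intuition of what "connected" means here, the sketch below splits an edge list so that the training edges alone keep every connected component of the original graph connected. It is a pure-Python illustration under the assumption of an undirected edge list, not the GRAPE implementation; the function name is hypothetical.

```python
import random


def connected_holdout_sketch(edges, train_size=0.7, seed=42):
    """Split undirected edges so the training edges keep every component connected."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)

    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    train, candidates = [], []
    for src, dst in shuffled:
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            # This edge joins two components: it belongs to a random
            # spanning forest and must stay in the training set.
            parent[root_src] = root_dst
            train.append((src, dst))
        else:
            candidates.append((src, dst))

    # Top up the training set with the remaining edges until the requested
    # fraction is reached; when the spanning forest alone already exceeds
    # it, the training set ends up larger than requested.
    missing = max(0, round(train_size * len(shuffled)) - len(train))
    train.extend(candidates[:missing])
    test = candidates[missing:]
    return train, test
```

Because the spanning forest always stays in the training set, no node that had neighbors in the original graph becomes isolated during training, which is what makes the held-out edges meaningful targets for edge prediction.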


## Compute the embedding
Next, we compute the embedding using the First-order LINE method. Note that this implementation is data-race aware and uses plain SGD as its optimizer rather than ADAM or NADAM. Since SGD keeps no per-parameter optimizer state, the memory footprint is limited to the embedding itself.

In [5]:
%%time
from grape.embedders import FirstOrderLINEEnsmallen
embedding = FirstOrderLINEEnsmallen().fit_transform(train)

CPU times: user 13min 41s, sys: 1.07 s, total: 13min 42s
Wall time: 35.3 s
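
The memory argument can be illustrated with a toy, single-threaded version of the idea. The sketch below assumes the usual first-order LINE formulation (a sigmoid over the dot product of the two node vectors, trained with uniformly sampled negative nodes); it is not the Ensmallen implementation, all names are hypothetical, and it ignores parallelism entirely. The only state it allocates is the embedding, which is updated in place.

```python
import math
import random


def sigmoid(x):
    # Clamp the input to keep math.exp within floating point range.
    x = max(-30.0, min(30.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sgd_update(emb, src, dst, label, lr):
    """In-place SGD step on the logistic loss of one (src, dst) pair."""
    # Derivative of the loss with respect to the score is sigmoid(score) - label.
    step = lr * (label - sigmoid(dot(emb[src], emb[dst])))
    old_src = list(emb[src])
    for k in range(len(old_src)):
        emb[src][k] += step * emb[dst][k]
        emb[dst][k] += step * old_src[k]


def first_order_line_sketch(edges, num_nodes, dim=8, epochs=200, lr=0.05, seed=0):
    rng = random.Random(seed)
    # The embedding is the only allocated state: no momentum or moment buffers.
    emb = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_nodes)]
    for _ in range(epochs):
        for src, dst in edges:
            sgd_update(emb, src, dst, 1.0, lr)  # observed edge: raise the score
            negative = rng.randrange(num_nodes)  # uniformly sampled node
            if negative != src:
                sgd_update(emb, src, negative, 0.0, lr)  # sampled pair: lower the score
    return emb
```

An optimizer such as ADAM would additionally need two moment estimates for every embedding value, roughly tripling the memory required for the same embedding size.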


## Visualize the embedding on the test graph
In the last step, we visualize the test graph.

In [6]:
from grape import GraphVisualizer

vis = GraphVisualizer(
    graph=test,
    support=train,
    n_components=4,
    edge_embedding_method="Hadamard",
    rotate=True,
    verbose=True,
    # Since LINE learns a cosine similarity, the visualization tool would
    # by default dispatch a cosine-distance-based TSNE. This would use the
    # sklearn implementation, which is terribly slow. Therefore, we force the
    # Euclidean distance and thus the Multicore TSNE implementation (when available).
    decomposition_kwargs=dict(metric="euclidean")
)
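
For readers unfamiliar with the `"Hadamard"` edge embedding method selected above: the Hadamard product is the element-wise product of two vectors, so an edge is represented by multiplying the embeddings of its two endpoints component by component. The sketch below shows this on toy data; the function name and the values are hypothetical and it does not reproduce how GRAPE computes it internally.

```python
def hadamard_edge_embedding(node_embedding, edges):
    """Return one vector per edge: the element-wise product of its endpoints."""
    return [
        [a * b for a, b in zip(node_embedding[src], node_embedding[dst])]
        for src, dst in edges
    ]


# Toy node embedding with two dimensions per node.
toy_embedding = {"a": [1.0, 2.0], "b": [3.0, -1.0], "c": [0.5, 4.0]}
toy_edges = [("a", "b"), ("b", "c")]

print(hadamard_edge_embedding(toy_embedding, toy_edges))
# [[3.0, -2.0], [1.5, -4.0]]
```

Since multiplication is commutative, the edge `("a", "b")` and the edge `("b", "a")` get the same vector, which suits undirected graphs.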

Then we run the TSNE; this may take a while.

In [7]:
%%time
vis.fit_negative_and_positive_edges(embedding)

Performing t-SNE using 24 cores.
Using no_dims = 4, perplexity = 30.000000, and theta = 0.500000
Computing input similarities...
Building tree...
 - point 2000 of 20000
 - point 4000 of 20000
 - point 6000 of 20000
 - point 8000 of 20000
 - point 10001 of 20000
 - point 12000 of 20000
 - point 14000 of 20000
 - point 16000 of 20000
 - point 18000 of 20000
 - point 20000 of 20000
Done in 0.00 seconds (sparsity = 0.006355)!
Learning embedding...
Iteration 51: error is 101.399671 (50 iterations in 7.00 seconds)
Iteration 101: error is 87.444477 (50 iterations in 6.00 seconds)
Iteration 151: error is 83.274944 (50 iterations in 7.00 seconds)
Iteration 201: error is 81.845552 (50 iterations in 7.00 seconds)
Iteration 251: error is 81.079643 (50 iterations in 6.00 seconds)
Iteration 301: error is 2.939348 (50 iterations in 7.00 seconds)
Iteration 351: error is 2.509616 (50 iterations in 6.00 seconds)
Iteration 400: error is 2.273571 (50 iterations in 7.00 seconds)


CPU times: user 6min 56s, sys: 14min 2s, total: 20min 59s
Wall time: 53.4 s


Fitting performed in 53.00 seconds.
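
The log above reports 20000 points, which is consistent with the subsampling mentioned in the introduction: only a sample of existing (positive) and non-existing (negative) edges is embedded and decomposed. The sketch below illustrates one simple way such a balanced sample could be drawn from an undirected edge list. It is an assumption about the general technique, not the procedure used by `GraphVisualizer`, and the function name is hypothetical.

```python
import random


def sample_positive_and_negative_edges(edges, num_nodes, samples_per_class, seed=0):
    """Sample existing edges and an equal number of non-existing node pairs."""
    rng = random.Random(seed)
    existing = {frozenset(edge) for edge in edges}

    # Positive edges: a uniform sample of the edges that exist.
    positives = rng.sample(list(edges), min(samples_per_class, len(edges)))

    # Never ask for more negatives than there are non-edges in the graph,
    # otherwise the rejection loop below could not terminate.
    available = num_nodes * (num_nodes - 1) // 2 - len(existing)
    target = min(len(positives), max(0, available))

    # Negative edges: random node pairs, rejected when they are self-loops
    # or when they already exist in the graph.
    negatives = set()
    while len(negatives) < target:
        src, dst = rng.randrange(num_nodes), rng.randrange(num_nodes)
        if src != dst and frozenset((src, dst)) not in existing:
            negatives.add((min(src, dst), max(src, dst)))
    return positives, sorted(negatives)
```

Rejection sampling like this is only efficient on sparse graphs, where almost every random node pair is a non-edge.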


In [8]:
%%time
vis.plot_positive_and_negative_edges()

Rendering frames:   0%| …

OpenCV: FFMPEG: tag 0x30387076/'vp80' is not supported with codec id 139 and format 'webm / WebM'


Merging frames:   0%| …

CPU times: user 9min 51s, sys: 12.4 s, total: 10min 4s
Wall time: 1min 35s
