Compute filtered ranks for evaluating ComplEx #1080
Conversation
Code Climate has analyzed commit 667c344 and detected 4 issues on this pull request. Here's the issue category breakdown:
Woooo this is looking really good! Great to see the results for WN18 matching the paper!! Just a couple of minor things.
The FB15K numbers still seem a little low, but I don't think this is a blocking concern seeing as the WN18 results are matching - any idea what's going on?
Also not a blocking concern, just an idea for future work: I think users with a powerful CPU and/or GPU might notice a slowdown when using `rank_edges_against_all_nodes`, because the numpy ops are single-threaded. We might be able to speed this up using tensorflow ops/layers/models for the bulk ops. What do you think?
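As a rough sketch of the idea (not this PR's code; the function names here are made up for illustration), the bulk "is this modified edge's score greater?" comparison could be expressed with TensorFlow ops, which can dispatch to TensorFlow's intra-op thread pool or a GPU, whereas plain NumPy elementwise comparison runs on one thread:

```python
import numpy as np
import tensorflow as tf

def greater_matrix_np(all_scores, true_scores):
    # all_scores: (num_nodes, batch) scores of modified edges;
    # true_scores: (batch,) score of each true test edge.
    # Single-threaded NumPy elementwise comparison with broadcasting.
    return all_scores > true_scores[np.newaxis, :]

def greater_matrix_tf(all_scores, true_scores):
    # The same comparison as a TensorFlow op; for large matrices this can
    # use multiple threads (or an accelerator), then convert back to NumPy.
    return tf.greater(all_scores, true_scores[tf.newaxis, :]).numpy()

scores = np.array([[1.3, 0.2], [1.5, 0.1], [1.0, 0.9]])
true = np.array([1.0, 0.9])

# Both formulations agree on the result.
assert (greater_matrix_np(scores, true) == greater_matrix_tf(scores, true)).all()
```

Whether this is actually faster would need benchmarking; the crossover point depends on matrix size and hardware.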
```diff
@@ -175,27 +172,66 @@ def rank_edges_against_all_nodes(model, test_data, known_edges_graph):

     num_nodes = known_edges_graph.number_of_nodes()

-    def ranks(pred, true_ilocs):
-        batch_size = len(true_ilocs)
+    def ranks(
```
Could you add a docstring for `ranks`?
Also I think `ranks` can be pulled out of `ComplEx` now - it doesn't seem to have anything `ComplEx`-specific, and I think the only outer-scope variables it uses are `knowledge_graph` and `num_nodes`.
I'm generally resistant to adding too much documentation for scoped and private functions, because maintaining documentation is error-prone and it just drifts. However, I will move it out of `ComplEx` to `_ranks_from_score_columns` (or something) and give it a docstring.
Unclear. I tried a few different parameters (e.g. increasing the embedding dimension from 150 to 200) and tweaked the learning rate; they made a bit of difference (e.g. I kept the dimension at 200), but I got bored of waiting to do this hyperparameter optimisation work. If it's deemed important, I can put more effort into it/write a script, or even just switch to the slower Adagrad optimiser (one potential problem is our regularisation doesn't quite match theirs, because Keras's
That's not a bad idea, but I'd prefer to do it as future work, to keep this PR focused on the filtered-ranks computation.
looks good!
I don't think you need to worry about matching the FB15K results exactly - matching all the results from every paper would be extremely time consuming and the WN18 results are enough for me at least. Whadya reckon @PantelisElinas?
@huonw oh one more thing, you should update the changelog because
I haven't been bothering to update the CHANGELOG patch-by-patch because I find it just results in merge conflicts. Since I'm "slacking off" in this respect, I've instead made sure to go over the git log before each release, like #930.
@huonw just letting you know that I won't be able to review this until next week. So, if you are satisfied with the one review, then go ahead and merge. P.
Thanks for letting me know 👍 I'll merge this with just @kieranricardo's review.
This adds the final piece of the evaluation required to make `ComplEx` non-`@experimental`. It extends the ranking procedure performed in #901 to also compute the "filtered" ranks. This gives the rest of the metrics in Table 2 of the ComplEx paper (http://jmlr.org/proceedings/papers/v48/trouillon16.pdf).

As a reminder from #901, the knowledge graph link prediction metrics for a test edge `E = (s, r, o)` connecting nodes `s` and `o` are calculated by ranking the prediction for that edge against all modified-subject `E' = (n, r, o)` and modified-object `E'' = (s, r, n)` edges (for all nodes `n` in the graph). The "raw" ranks are just the ranks of `E` against the `E'` and against the `E''`.

The "filtered" ranks exclude the modified edges `E'` and `E''` that are known, i.e. are in the train, validation or test sets. For instance, if `E = (A, x, B)` has score 1, but the modified edges `(A, x, C)` and `(A, x, D)` have scores 1.3 and 1.5 respectively, `E` has raw modified-object rank 3. If `(A, x, D)` is in the train set (or validation or test) but `(A, x, C)` is not, it is excluded from the filtered ranking, and so `E`
has filtered modified-object rank 2.

This has been a struggle to implement correctly, because it has been difficult to use the right nodes in the right places of the ranking procedure. For modified-object ranking, with `E` and `E''` as above, calculating the score of `E` in the column of scores of every modified-object edge `E''` needs to use `o`, but calculating the known edges similar to `E` needs to use `(s, r, _)`, not `(o, r, _)` (the latter is meaningless). (And similarly for modified-subject ranking.) It sounds obvious when written out like this, but it's somewhat difficult to keep track of which entity needs to go where in practice. (@kieranricardo had this key insight.)

The implementation works by starting with the raw `greater` matrix, where each column represents a test edge `E`, with a row for every node in the graph (i.e. row `n` represents swapping node `n` into the subject or object), and the elements of a column are `True` if the score of that modified edge is greater than the score of `E`. For each edge/column, compute the indices of the similar known edges and set those indices to `False`, leaving only unknown edges with scores greater than `E`'s.

See: #1060
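To make the `greater`-matrix procedure concrete, here is a minimal NumPy sketch of the worked example above (toy data and invented variable names, not the PR's actual code): `E = (A, x, B)` scores 1.0, the modified-object edges `(A, x, C)` and `(A, x, D)` score 1.3 and 1.5, and `(A, x, D)` is a known training edge.

```python
import numpy as np

# Nodes A, B, C, D have ilocs 0, 1, 2, 3. Scores of (A, x, n) for every
# candidate object n; the true edge (A, x, B) has score 1.0.
scores = np.array([0.5, 1.0, 1.3, 1.5])
true_score = scores[1]

# One column of the `greater` matrix: True where the modified edge
# out-scores the true edge. Raw rank = 1 + number of such edges.
greater = scores > true_score
raw_rank = 1 + np.count_nonzero(greater)  # (A, x, C) and (A, x, D) score higher

# Filtered rank: additionally zero out every (A, x, n) that is known from
# the train/validation/test sets. Here that's (A, x, B) itself and (A, x, D).
known_object_ilocs = np.array([1, 3])
filtered = greater.copy()
filtered[known_object_ilocs] = False
filtered_rank = 1 + np.count_nonzero(filtered)
```

This reproduces the numbers in the description: `raw_rank` is 3 and `filtered_rank` is 2, because filtering removes the known `(A, x, D)` but keeps the unknown `(A, x, C)`.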