
🦜 🏴‍☠️ Implement NodePiece representation and model #621

Merged: 93 commits from node-pieces into master on Nov 15, 2021

Conversation

@mberr (Member) commented on Nov 8, 2021

This is a first draft to add NodePiece representations to pykeen.

For now, it uses a simple variant in which each entity is represented by k randomly chosen incident relations.

[Image: One Piece, Volume 61 cover (Japanese)]
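A minimal sketch of this tokenization idea, assuming the mapped triples are given as an (n, 3) LongTensor of (head, relation, tail) IDs; the function name and padding scheme are made up for illustration and do not mirror the PR's actual code:

import torch


def sample_relation_tokens(
    mapped_triples: torch.LongTensor,
    num_entities: int,
    k: int = 2,
    padding_idx: int = -1,
) -> torch.LongTensor:
    """For each entity, sample k relation IDs from the triples in which it occurs."""
    tokens = torch.full(size=(num_entities, k), fill_value=padding_idx, dtype=torch.long)
    for e in range(num_entities):
        # relations of all triples where the entity occurs as head or tail
        mask = (mapped_triples[:, 0] == e) | (mapped_triples[:, 2] == e)
        relations = mapped_triples[mask, 1]
        if relations.numel() > 0:
            # sample k incident relations (with replacement, for simplicity)
            idx = torch.randint(relations.numel(), size=(k,))
            tokens[e] = relations[idx]
    return tokens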

@mberr changed the title from "Node Piece Repreentation" to "Node Piece Representation" on Nov 8, 2021
@mberr (Member, Author) commented on Nov 8, 2021

@migalkin would be great to have your feedback, as you may be familiar with it 😉

@cthoyt (Member) commented on Nov 8, 2021

Can we have a demo on how you would use this representation with a model? Like can we easily implement a TransE with NodePiece?

@mberr (Member, Author) commented on Nov 8, 2021

> Can we have a demo on how you would use this representation with a model? Like can we easily implement a TransE with NodePiece?

Sure.

from typing import Optional

from class_resolver.api import HintOrType

from pykeen.models.nbase import ERModel
from pykeen.nn.emb import EmbeddingSpecification, NodePieceRepresentation
from pykeen.nn.modules import Interaction, TransEInteraction
from pykeen.pipeline import pipeline
from pykeen.triples.triples_factory import CoreTriplesFactory


class NodePieceModel(ERModel):
    def __init__(
        self,
        *,
        triples_factory: CoreTriplesFactory,
        embedding_specification: Optional[EmbeddingSpecification] = None,
        interaction: HintOrType[Interaction] = TransEInteraction,
        **kwargs,
    ) -> None:
        if embedding_specification is None:
            embedding_specification = EmbeddingSpecification(
                shape=(64,),
            )
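        # NodePiece representation: each entity is represented via tokens
        # (here: sampled incident relations), embedded with the given specification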
        entity_representations = NodePieceRepresentation(
            triples_factory=triples_factory,
            token_representation=embedding_specification,
        )
        super().__init__(
            triples_factory=triples_factory,
            interaction=interaction,
            entity_representations=entity_representations,
            relation_representations=embedding_specification,
            **kwargs,
        )


result = pipeline(
    dataset="nations",
    model=NodePieceModel,
    model_kwargs=dict(
        interaction_kwargs=dict(
            p=2,
        ),
    ),
)
print(result.get_metric("hits_at_10"))
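(Note: the interaction_kwargs above are forwarded through ERModel to TransEInteraction, so p=2 should select the L2 norm for the TransE score.)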

EDIT: added in 6b29c4e

@mberr (Member, Author) commented on Nov 14, 2021

fe90b42 - @cthoyt this is not really part of this PR

@@ -931,7 +931,7 @@ def test_score_t(self) -> None:
         try:
             scores = self.instance.score_t(batch)
         except NotImplementedError:
-            self.fail(msg="Score_o not yet implemented")
+            self.fail(msg="score_t not yet implemented")
@mberr (Member, Author):

this typo is not really part of the PR

@@ -950,7 +968,7 @@ def test_score_h(self) -> None:
         try:
             scores = self.instance.score_h(batch)
         except NotImplementedError:
-            self.fail(msg="Score_s not yet implemented")
+            self.fail(msg="score_h not yet implemented")
@mberr (Member, Author):

same here

@cthoyt (Member) commented on Nov 15, 2021

Looks like the issues are now with ConvE's tests

@migalkin (Member) commented:

> Looks like the issues are now with ConvE's tests

I was running the debugger on the ConvE test, and for some reason, after initialization of TestConvE(cases.ModelTestCase), it goes on to initialize NodePiece, although it is not related to the ConvE test in any way 🤔

@migalkin (Member) left a review comment:

The implementation works 🎉
Exposing the ratio param from the MLP encoder sounds like a good idea (with a default value of 2); otherwise everything looks ready!

Comment on lines 110 to 111
:func:`torch.max`, or even trainable aggregations e.g., ``MLP(mean(MLP(tokens)))``
(cf. DeepSets from [zaheer2017]_) if given value ``"mlp"``.
@migalkin (Member) commented:

The current _ConcatMLP is not DeepSets :)

The idea of DeepSets is to project each set member independently through some encoder, then aggregate (e.g., with mean), and then pass the result through another feed-forward net. It would look like this:

enc1 = nn.Sequential(
    nn.Linear(embedding_dim, embedding_dim),
    nn.ReLU(),
    nn.Linear(embedding_dim, embedding_dim)
)

enc2 = nn.Sequential(nn.Linear(embedding_dim, embedding_dim), nn.ReLU(), nn.Linear(embedding_dim, output_dim))

and in the forward pass:

# x: shape (bs, num_elements, embedding_dim)
x = enc1(x)         # same shape (bs, num_elements, embedding_dim)
x = x.mean(dim=-2)  # aggregation over the set dimension -> (bs, embedding_dim)
x = enc2(x)         # final projection -> (bs, output_dim)

It can be added as an option along with mlp though
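For completeness, a self-contained sketch of such a DeepSets-style aggregation packaged as a module might look like the following (illustrative only; the class name is made up and this is not the _ConcatMLP from this PR):

import torch
from torch import nn


class DeepSetsAggregation(nn.Module):
    """Encode each token independently, aggregate, then project (cf. DeepSets)."""

    def __init__(self, embedding_dim: int, output_dim: int):
        super().__init__()
        # per-token encoder, applied to each set element independently
        self.enc1 = nn.Sequential(
            nn.Linear(embedding_dim, embedding_dim),
            nn.ReLU(),
            nn.Linear(embedding_dim, embedding_dim),
        )
        # encoder applied after the permutation-invariant aggregation
        self.enc2 = nn.Sequential(
            nn.Linear(embedding_dim, embedding_dim),
            nn.ReLU(),
            nn.Linear(embedding_dim, output_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch_size, num_tokens, embedding_dim)
        x = self.enc1(x)     # (batch_size, num_tokens, embedding_dim)
        x = x.mean(dim=-2)   # aggregate over tokens -> (batch_size, embedding_dim)
        return self.enc2(x)  # (batch_size, output_dim)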

@mberr (Member, Author) replied:

The correct docstring somehow got lost during the refactoring 😅

here it was still correct: #621 (comment)

@mberr (Member, Author) commented on Nov 15, 2021

> Looks like the issues are now with ConvE's tests

I think the issue is that we have not yet thought about what should be scored in score_r if we have inverse relations. In the baseline implementation in _OldAbstractModel, we use relation_ids to create the individual hrt triples, and these IDs contain only the "real" relations, not the inverse ones.

We could either:

  1. skip the score_r test for models with inverse triples
  2. decide on how to handle this generally

To keep this PR focused on one thing, I tend towards option 1.

@cthoyt (Member) commented on Nov 15, 2021

Yes, let's bump this. So let's override the test in ConvE to be skipped and leave a TODO for later.
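A rough sketch of such an override (the method name test_score_r and the import path are assumptions based on the test names quoted above; the actual test suite may differ):

import unittest

from tests import cases  # assumed import path for the shared test cases


class TestConvE(cases.ModelTestCase):
    """ConvE tests; only the relevant override is shown."""

    @unittest.skip("TODO: clarify what score_r should return when inverse relations are used")
    def test_score_r(self) -> None:
        ...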

Commit: "it is not yet clear what would be the desired output shape"
@mberr marked this pull request as ready for review on November 15, 2021, 10:08
@cthoyt (Member) left a review comment:

:shipit:

@mberr merged commit 9837077 into master on Nov 15, 2021
@mberr deleted the node-pieces branch on November 15, 2021, 12:04