
Embed unseen s/p/o's #205

Closed
MatthewGleeson opened this issue May 11, 2021 · 8 comments
@MatthewGleeson

Thanks for publishing this great repo!

I'm trying to use the pretrained KGE models in this repo to create embeddings for unseen objects and predicates, but I'm having trouble figuring out how to do so.

The "Use a pretrained model in an application" portion of the README has been helpful, but I want to be able to pass a trio of s/p/o strings such as 'dirt' 'component of', 'clay' to a pretrained ComplEx model instead of passing an index to a value in the Wordnet database.

Is there a way to do this? Would I need to do some sort of transfer learning on the embedder first? I've got a dataset I can use if that's the case.

@rufex2001
Member

rufex2001 commented May 11, 2021 via email

@MatthewGleeson
Author

MatthewGleeson commented May 12, 2021

Yep, I'm hoping to adapt some of the pretrained KGE models listed in the README for a knowledge-graph-informed RL application (specifically, https://github.com/minqi/wordcraft). I've already checked, and many of the objects I need embeddings for do not exist in the lookup tables for some of the datasets (which I'm assuming would be required to use the entity_ids.del file you're talking about). So in that case, I'd need to either train the KGE models from scratch or use transfer learning to obtain the embeddings somehow, right? I can't pass these unseen strings to the entity/relation embedders?
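For context, this is roughly how I checked coverage against a dataset's lookup table (the entity set below is illustrative, not the full WordCraft vocabulary, and the path assumes the packaged wnrr dataset):

```python
# Sketch: check which WordCraft entities appear in a dataset's lookup table.
# The "needed" set is illustrative only.
needed = {"dirt", "clay", "water", "fire"}

# entity_ids.del format: "<index>\t<entity name>", one entry per line
with open("data/wnrr/entity_ids.del") as f:
    known = {line.rstrip("\n").split("\t", 1)[1] for line in f}

missing = needed - known
print(f"{len(missing)}/{len(needed)} entities missing:", sorted(missing))
```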

@rufex2001
Member

rufex2001 commented May 12, 2021 via email

@AdrianKs
Collaborator

You could create a new dataset extending the one the model is currently trained on, then train on the new dataset and load the already-trained embeddings with our load_pretrained option.
Additionally, you could also try to freeze the pretrained embeddings, but this option is not yet in the master branch; it's in PR #136.
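A minimal config sketch of that recipe might look as follows; the dataset name is hypothetical, and the exact option keys for loading pretrained embeddings are an assumption (check lookup_embedder.yaml in your LibKGE version):

```yaml
# Sketch only: the pretrain option keys are assumed, not verified
# against a specific LibKGE version.
job.type: train
dataset.name: wordcraft-extended   # hypothetical dataset extending the original
model: complex

complex:
  entity_embedder:
    pretrain:
      model_filename: pretrained-complex.pt   # checkpoint of the original model
  relation_embedder:
    pretrain:
      model_filename: pretrained-complex.pt
```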

@rgemulla
Member

You may either retrain on your dataset or use a KGE model that constructs entity/relation embeddings from textual representations. (We may add such an implementation to LibKGE soon.)

@esulaiman

> You could create a new dataset extending the one the model is currently trained on, then train on the new dataset and load the already-trained embeddings with our load_pretrained option.
> Additionally, you could also try to freeze the pretrained embeddings, but this option is not yet in the master branch; it's in PR #136.

I am training on my own dataset using the models provided by this library to obtain KGEs. Can you kindly help with how to use the load_pretrained option?

@MatthewGleeson
Author

@esulaiman it would be better to open a separate issue for this problem. But take a look at #174 and the code in the README section titled "Use your own dataset"; the basic steps are sketched below.
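Roughly, those steps look like this (the preprocessing script path may differ between LibKGE versions, so check the README):

```sh
# Tab-separated triples, one per line: <subject>\t<relation>\t<object>
mkdir data/mydataset
cp train.txt valid.txt test.txt data/mydataset/

# Generate the index files (entity_ids.del, relation_ids.del, ...);
# script name/location may differ in your LibKGE version
cd data
python preprocess/preprocess_default.py mydataset
```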

@MatthewGleeson
Author

MatthewGleeson commented Jun 6, 2021

I'm closing this issue, but I wanted to leave a note on what I did:

- Created a compatible dataset from the WordCraft environment matching data/toy/*.txt (entities, relations, test, train, valid)
- Trained many KGE models from this repo on this dataset
- Used the kge repo's model.score_spo function in my WordCraft agent to inform its decisions in the MDP (a rough sketch is below)
- This was for a group project for my NLP class; anyone interested can take a look at our demo notebook: https://colab.research.google.com/drive/1bL3U19pmd9l-1nG_lmdGx-AjVKwpmd3N?usp=sharing
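The score_spo call looked roughly like this (the checkpoint path and the example strings are placeholders; score_spo expects long tensors of indices):

```python
import torch
from kge.model import KgeModel
from kge.util.io import load_checkpoint

# Placeholder path: point this at your own trained checkpoint
checkpoint = load_checkpoint("local/experiments/wordcraft-complex/checkpoint_best.pt")
model = KgeModel.create_from(checkpoint)

# Build string -> index lookups from the dataset's id tables
# (assumes the strings were included when the dataset was built)
ent_idx = {e: i for i, e in enumerate(model.dataset.entity_strings())}
rel_idx = {r: i for i, r in enumerate(model.dataset.relation_strings())}

s = torch.tensor([ent_idx["dirt"]])
p = torch.tensor([rel_idx["component of"]])
o = torch.tensor([ent_idx["clay"]])

# Higher score = the model considers the triple more plausible.
# Note: reciprocal-relations models need direction="o" or "s" here.
score = model.score_spo(s, p, o)
print(score.item())
```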
