
About dataset #9

Closed
Syzseisus opened this issue May 31, 2023 · 3 comments
Comments

@Syzseisus

Hi. First of all, I really appreciate your wonderful work and code.

I'm opening this issue to ask a question:

How did you define the node features of each dataset?

Sincerely,
Wooseong Cho.

@GRAPH-0
Owner

GRAPH-0 commented Jun 1, 2023

Hi @Syzseisus .
Thanks for your interest!
We extract node structural features using node2vec and node semantic features using Transformer-XL.
Refer to the paper:
[screenshot of the relevant section of the paper]

@Syzseisus
Author

Thank you for your quick response.

I've already reviewed the provided resources.
However, I'd like a more detailed understanding of the configuration settings for the
node2vec and Transformer-XL models.
Could you please share the specific parameters?

Alternatively, for FB15k, I would appreciate it if you could confirm whether the order of nodes in the pickle file provided on Google Drive is the same as in dgl.
If you are unsure, please let me know the order,
or you can check it by following these steps:

  1. For the order of nodes used in dgl, you can access the raw tgz file from here
    and find the order in the entities.dict file.
  2. However, please note that dgl uses a hash-like ID as the node entity name.
    To find the real names of the nodes, you can refer to the entities2wikidata.json file.
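The two steps above can be sketched in a few lines of Python. The file names come from this thread, but their exact layouts are assumptions here: entities.dict is taken to be `id<TAB>mid` per line, and entities2wikidata.json to map each MID to an object with a "label" field; the sample contents are hypothetical stand-ins. Adjust to the real files.

```python
# Sketch: map dgl node ids to human-readable names (file formats assumed).
import io
import json

# Hypothetical stand-ins for the real entities.dict / entities2wikidata.json.
entities_dict = io.StringIO("0\t/m/027rn\n1\t/m/06cx9\n")
entities2wikidata = json.loads(
    '{"/m/027rn": {"label": "Dominican Republic"},'
    ' "/m/06cx9": {"label": "republic"}}'
)

# Step 1: recover the node order, i.e. node id -> MID.
id_to_mid = {}
for line in entities_dict:
    node_id, mid = line.rstrip("\n").split("\t")
    id_to_mid[int(node_id)] = mid

# Step 2: resolve each MID to its real name via entities2wikidata.json.
id_to_name = {i: entities2wikidata[mid]["label"] for i, mid in id_to_mid.items()}
print(id_to_name[0])  # -> Dominican Republic
```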

Thank you for your assistance.

Sincerely,
Wooseong Cho.

@GRAPH-0
Copy link
Owner

GRAPH-0 commented Jun 4, 2023

Hi,

  1. For node embedding, you can refer to issues #3 (Ask for the code of word embedding) and #6 (Ask for the problem of Transformer-XL to get text embedding).

The node2vec settings (using torch_geometric.nn.Node2Vec):

  • FB and tmdb:

import torch
from torch_geometric.nn import Node2Vec

model = Node2Vec(data.edge_index, embedding_dim=64, walk_length=20,
                 context_size=10, walks_per_node=10, num_negative_samples=1,
                 p=1, q=1, sparse=True).cuda()
loader = model.loader(batch_size=128, shuffle=True, num_workers=4)
optimizer = torch.optim.SparseAdam(list(model.parameters()), lr=0.01)

  • imdb (same as above, but with embedding_dim=128):

model = Node2Vec(data.edge_index, embedding_dim=128, walk_length=20,
                 context_size=10, walks_per_node=10, num_negative_samples=1,
                 p=1, q=1, sparse=True).cuda()
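For readers unfamiliar with these parameters, here is a toy, dependency-free illustration of what walk_length and context_size control (this is not the authors' code, and the tiny graph is made up): with p = q = 1 the walk reduces to a plain uniform random walk, walk_length is the number of nodes per walk, and context_size is the window from which (center, context) skip-gram training pairs are drawn.

```python
# Toy illustration of node2vec's walk_length / context_size (p = q = 1).
import random

random.seed(0)

# Hypothetical tiny graph as an adjacency list.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

def uniform_walk(start, walk_length):
    # With p = q = 1 the biased walk degenerates to a uniform random walk.
    walk = [start]
    for _ in range(walk_length - 1):
        walk.append(random.choice(adj[walk[-1]]))
    return walk

def context_pairs(walk, context_size):
    # Slide a window of `context_size` nodes over the walk; pair the first
    # node of each window with the rest, mirroring how skip-gram pairs are
    # formed from walk windows.
    pairs = []
    for i in range(len(walk) - context_size + 1):
        window = walk[i:i + context_size]
        pairs.extend((window[0], w) for w in window[1:])
    return pairs

walk = uniform_walk(0, walk_length=20)
pairs = context_pairs(walk, context_size=10)
print(len(walk), len(pairs))  # 20 walk steps, (20 - 10 + 1) * 9 = 99 pairs
```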
  2. Sorry, I failed to open the tgz file from dgl to check the order. But you can get the correspondence between node id and MID from datasets/FB15k/fb15k_description.tsv. I hope this helps!
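The order check itself is simple once both files are in hand. A minimal sketch, assuming fb15k_description.tsv has the MID as its first tab-separated field (with line order giving the node id) and entities.dict has `id<TAB>mid` lines; both layouts are guesses from this thread, and the sample contents below are hypothetical stand-ins:

```python
# Sketch: confirm the MID order from fb15k_description.tsv matches dgl's
# entities.dict order (file layouts assumed; adjust to the real files).

def mid_order(tsv_lines):
    # Take the first tab-separated field of each non-empty line as the MID.
    return [line.split("\t", 1)[0] for line in tsv_lines if line.strip()]

# Hypothetical stand-ins for the two files' contents.
description_tsv = ["/m/027rn\tDominican Republic ...", "/m/06cx9\trepublic ..."]
entities_dict = ["0\t/m/027rn", "1\t/m/06cx9"]

order_a = mid_order(description_tsv)
order_b = [line.split("\t", 1)[1] for line in entities_dict]

print("same order:", order_a == order_b)  # -> same order: True
```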

Sincerely,
Han

@GRAPH-0 GRAPH-0 closed this as completed Jun 13, 2023