Meaning of 'features' object #45

Open
MinhAnhL opened this issue Jul 22, 2019 · 3 comments

MinhAnhL commented Jul 22, 2019

Dear @tkipf,
First of all, thank you for the excellent work! Your paper and the provided code are helpful for getting started with GCNs.

I'm currently trying to apply your algorithm to my data.
After looking at the load_data() function, I was able to create an adjacency matrix in the same format as your Cora example.

However, I am struggling with the node feature object, because I don't understand the contents of cora.tx and cora.allx.
The shape is (2709x1433) (#41 nodes x #features), so apparently, there are 1433 node features.
The print is as follows

(0, 19) 1.0
(0, 81) 1.0
(0, 146) 1.0
(0, 315) 1.0
(0, 774) 1.0
(0, 877) 1.0
(0, 1194) 1.0
(0, 1247) 1.0
(0, 1274) 1.0
(1, 19) 1.0
(1, 88) 1.0
(1, 149) 1.0
(1, 212) 1.0
(1, 233) 1.0

How can we interpret the rows? I can't make sense of it.
This format is usually an edge list format, but as we have node features, how do the edges come into play?
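
For what it's worth, the printout above looks like the usual way SciPy prints a sparse matrix: each line is `(row, column) value`, so `(0, 19) 1.0` would mean that node 0 has a 1.0 in feature column 19, and no edges are involved. As far as I know, each Cora column stands for one word of the vocabulary and a 1.0 marks that the paper contains it. A minimal sketch with a made-up 2 x 5 matrix (the values are invented for illustration, not taken from Cora; the exact print layout depends on the SciPy version):

```python
import numpy as np
from scipy import sparse

# Toy dense feature matrix: 2 nodes x 5 features (values invented for illustration).
dense = np.array([[0.0, 1.0, 0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.0, 1.0, 0.0]])

features = sparse.csr_matrix(dense)

# Printing a SciPy sparse matrix lists only the non-zero entries,
# each shown with its (row, column) position and its value.
print(features)

# The same information as explicit (node, feature, value) triples.
coo = features.tocoo()
triples = [(int(r), int(c), float(v)) for r, c, v in zip(coo.row, coo.col, coo.data)]
print(triples)

# Converting back to dense recovers the original node-by-feature table.
assert np.array_equal(features.toarray(), dense)
```

So the rows of the printout are not edges: the first index is the node and the second is the feature column.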

I looked into your other repositories and issues around this topic and couldn't find anything which helps me understand the structure of the .tx and .allx files.

tkipf/gcn#36
tkipf/gcn#125
tkipf/gcn#114
tkipf/gcn#22
#35

I'm planning to use the node degree as a feature, as recommended in tkipf/gcn#22 (and add more features later on).
My current attempt is:

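from scipy import sparse
# G is assumed to be a networkx graph; build a plain list, since dict.values() is only a view in Python 3
node_deg = [deg for _, deg in G.degree()]
features = sparse.csr_matrix(node_deg, dtype=float).T  # shape: (n_nodes, 1)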

As I don't understand the Cora output, I can't really assess if that is correct or not.
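
One way to sanity-check degree features is to compare their shape with the Cora features: one row per node and one column per feature, so a single degree feature should give an n x 1 matrix. A minimal sketch, assuming the adjacency matrix is a symmetric, unweighted SciPy sparse matrix (the toy `adj` and the helper name `degree_features` are made up for illustration):

```python
import numpy as np
from scipy import sparse

def degree_features(adj):
    """Return an (n_nodes, 1) sparse matrix holding each node's degree.

    Assumes `adj` is a square, unweighted SciPy sparse adjacency matrix,
    so that the row sums equal the node degrees.
    """
    deg = np.asarray(adj.sum(axis=1), dtype=np.float32).reshape(-1, 1)
    return sparse.csr_matrix(deg)

# Toy undirected graph with edges 0-1, 0-2, 2-3 (invented for illustration).
rows = np.array([0, 1, 0, 2, 2, 3])
cols = np.array([1, 0, 2, 0, 3, 2])
adj = sparse.csr_matrix((np.ones(6), (rows, cols)), shape=(4, 4))

features = degree_features(adj)

# One row per node, one column per feature.
assert features.shape == (adj.shape[0], 1)
print(features.toarray().ravel())
```

One caveat: if the pipeline row-normalizes the features (I have not checked whether this repo does), a single-column matrix collapses to all ones, so the degree information would be lost.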

Could you provide more guidance and explanation for that?

That would be great :)
Thank you in advance,
Best,
Minh

tkipf (Owner) commented Jul 23, 2019

I recommend using the data loader from https://github.com/tkipf/keras-gcn/blob/master/kegra/utils.py

Then you don’t have to deal with this strange .allx etc. format (which is only supplied in this repo because it was used in a benchmark from an earlier paper by another lab, on which we base our evaluation) :)
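
As far as I can tell, that loader reads two plain-text files: a `.content` file with one line per node (node id, feature values, class label) and a `.cites` file with one edge per line. The sketch below parses data in that layout from in-memory strings. The toy data and the function name `load_simple_graph` are made up for illustration, and this is not the code from kegra/utils.py.

```python
import numpy as np
from scipy import sparse

def load_simple_graph(content_lines, cites_lines):
    """Parse a node table and an edge list into (features, adj, labels).

    Assumed layout (hypothetical, for illustration):
      content: <node_id> <feature_1> ... <feature_k> <label>
      cites:   <node_id_a> <node_id_b>
    """
    rows = [line.split() for line in content_lines if line.strip()]
    node_ids = [r[0] for r in rows]
    labels = [r[-1] for r in rows]
    values = [[float(x) for x in r[1:-1]] for r in rows]
    features = sparse.csr_matrix(np.array(values, dtype=np.float32))

    # Map arbitrary node ids to consecutive row indices.
    index = {node_id: i for i, node_id in enumerate(node_ids)}
    edges = [line.split() for line in cites_lines if line.strip()]
    src = [index[a] for a, _ in edges]
    dst = [index[b] for _, b in edges]

    n = len(node_ids)
    data = np.ones(len(edges), dtype=np.float32)
    adj = sparse.csr_matrix((data, (src, dst)), shape=(n, n))
    # Symmetrize so that every listed edge is treated as undirected.
    adj = adj.maximum(adj.T).tocsr()
    return features, adj, labels

# Toy data: three nodes with three binary features each, and two edges.
content = """p1 1 0 0 A
p2 0 1 0 B
p3 0 1 1 A
"""
cites = """p1 p2
p3 p1
"""

features, adj, labels = load_simple_graph(content.splitlines(), cites.splitlines())
print(features.shape, adj.shape, labels)
```

With your own data in this layout, the feature columns can be anything numeric (for example a single degree column), as long as every node has the same number of them.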


ZJJTSL commented Oct 26, 2020

I have the same question about how to apply this to my own dataset. If you have solved the problem, could you please give me some guidance? Thanks a lot!


ZJJTSL commented Nov 30, 2020

Hi, have you tried using the node degrees as the feature matrix? In that case the feature matrix has dimension n x 1. Does it work in your case?
