About the dataset in GPT_GNN #24

Closed
SKD621 opened this issue Nov 29, 2020 · 6 comments

Comments

@SKD621

SKD621 commented Nov 29, 2020

I notice that OAG actually has three categories (CS/Med/NN), all of which are available as preprocessed graphs. I am interested in the complete datasets for these three categories. Could you provide the raw data for Med and NN, as you did for CS? Thanks in advance for your help.

@acbull
Owner

acbull commented Nov 30, 2020

For the full dataset, you can refer to: https://www.openacademic.ai/oag/

@SKD621
Author

SKD621 commented Nov 30, 2020

Thanks for your reply. Could you provide the code for building MAG_0919_CS.zip and SeqName_CS_20190919.tsv? The URL you provided doesn't explain how to build them. Thanks.

@acbull
Owner

acbull commented Nov 30, 2020 via email

@SKD621
Author

SKD621 commented Nov 30, 2020

Thanks for your reply.

SKD621 closed this as completed Nov 30, 2020
SKD621 reopened this Dec 1, 2020
@SKD621
Author

SKD621 commented Dec 1, 2020

I am sorry to bother you again. It seems that the dataset you provide (CS papers) contains only ~1 million nodes, which differs from the statistics stated in the HGT paper (11,732,027 nodes).

@acbull
Owner

acbull commented Dec 24, 2020

The statistics are for the complete CS graph. During pre-processing, we filter out papers that have fewer than 5 citations to make the graph denser.

The complete raw dataset can be found at: https://drive.google.com/drive/folders/1yDdVaartOCOSsQlUZs8cJcAUhmvRiBSz
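
The citation filter described above could look roughly like the sketch below. This is not the repository's actual preprocessing code: the function name `filter_by_citations`, the `(citing, cited)` edge-list format, and the reading of "citations" as citations received are all assumptions made for illustration, and the filter is applied in a single pass.

```python
from collections import Counter


def filter_by_citations(citation_edges, min_citations=5):
    """Hypothetical helper: keep papers cited at least `min_citations` times.

    `citation_edges` is assumed to be an iterable of (citing, cited) pairs.
    Returns the set of kept paper ids and the edges among kept papers.
    """
    citation_edges = list(citation_edges)
    # Count how many times each paper is cited (its in-degree).
    in_degree = Counter(cited for _, cited in citation_edges)
    # Papers below the threshold are dropped (assumption: one pass only).
    kept = {paper for paper, n in in_degree.items() if n >= min_citations}
    # Retain only edges whose two endpoints both survive the filter.
    kept_edges = [
        (citing, cited)
        for citing, cited in citation_edges
        if citing in kept and cited in kept
    ]
    return kept, kept_edges


# Toy data: "a" is cited 5 times and "b" 6 times; every other paper is never cited.
edges = (
    [(f"p{i}", "a") for i in range(5)]
    + [("a", "b")]
    + [(f"q{i}", "b") for i in range(5)]
)
kept, kept_edges = filter_by_citations(edges)
print(sorted(kept), kept_edges)
```

A filter of this kind drops sparsely cited papers together with their edges, which would account for a preprocessed graph that is much smaller than the complete one.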

SKD621 closed this as completed Mar 9, 2021