Fix source--target node designations for code using Cora dataset #444

Closed
6 of 8 tasks
geoffj-d61 opened this issue Jul 5, 2019 · 3 comments
@geoffj-d61
Contributor

geoffj-d61 commented Jul 5, 2019

Description

The README distributed with the Cora dataset specifies that the column order of cora.cites is: cited-paper; citing-paper. However, at least one Jupyter notebook (namely cora-links-example.ipynb) assumes the column ordering: source; target.

Although this example treats the edges as undirected, the constructed graph should still accurately reflect the direction of the citations in the data.

How to reproduce (the error):

The relevant code snippet in the cora-links-example.ipynb notebook is:

edgelist = pd.read_csv(os.path.join(data_dir, "cora.cites"), sep='\t', header=None, names=["source", "target"])
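For illustration, here is a minimal sketch of a load that follows the README ordering, assuming the notebook's pandas-based approach. The two sample rows are made up for the example rather than taken from the real cora.cites file, and reordering the columns to source, target afterwards is a presentation choice, not something this issue prescribes.

```python
import io

import pandas as pd

# Two made-up rows in the cora.cites layout: <cited paper>\t<citing paper>.
# The IDs are illustrative assumptions, not real rows from the dataset.
sample = io.StringIO("35\t1033\n35\t103482\n")

# Per the README ordering, the first column is the cited paper (edge target)
# and the second column is the citing paper (edge source).
edgelist = pd.read_csv(sample, sep="\t", header=None, names=["target", "source"])

# Reorder the columns so each row reads source -> target,
# i.e. "citing paper cites cited paper".
edgelist = edgelist[["source", "target"]]
print(edgelist)
```

In the notebook itself, the in-memory sample would be replaced by the path to cora.cites, as in the original snippet.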

Aim

To locate all code examples that load the Cora dataset, and ensure that the loaded graphs use the correct target--source (cited--citing) column ordering.

  • Bug fixed
  • Branch and Pull Request build on CI
  • Branch and Pull Request pass unit tests on CI
  • Branch and Pull Request pass integration tests on CI
  • Version number reflects new status
  • Peer Code Review Performed
  • Code well commented
  • CHANGELOG.md updated
@geoffj-d61 geoffj-d61 added the bug (Something isn't working) and ml labels Jul 5, 2019
@geoffj-d61 geoffj-d61 self-assigned this Jul 5, 2019
@geoffj-d61
Contributor Author

geoffj-d61 commented Jul 5, 2019

@adocherty, could you please review this ticket and let me know if I've made any goofs, or if it seems okay?

@adocherty
Contributor

Hi @geoffj-d61,

This looks good to me. Note that the directionality here depends on the semantics of the link: the current links represent a "cited_by" relationship, and you are proposing to change this to a "cites" relationship. When we make this change, it would be good to add a comment about the semantics of the link.

Note also that a lot of the demo notebooks use CORA with the same code, so we should change all of them.
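As a rough sketch of the "cited_by" versus "cites" distinction raised above, the same rows can be read as edges in either direction. The rows and variable names here are hypothetical and chosen only for illustration.

```python
# Hypothetical rows in cora.cites column order: (cited paper, citing paper).
rows = [(35, 1033), (35, 103482)]

# "cited_by" semantics: each edge runs from the cited paper to the citing paper.
cited_by_edges = [(cited, citing) for cited, citing in rows]

# "cites" semantics: each edge runs from the citing paper to the cited paper.
cites_edges = [(citing, cited) for cited, citing in rows]

# Both edge lists describe the same citations; only the direction differs.
assert cites_edges == [(dst, src) for src, dst in cited_by_edges]
```

Whichever reading is chosen, a comment next to the loading code stating the intended semantics keeps the direction unambiguous.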

@geoffj-d61 geoffj-d61 changed the title from "Reverse source--target node designations for code using Cora dataset" to "Fix source--target node designations for code using Cora dataset" Jul 9, 2019
@geoffj-d61
Contributor Author

@adocherty, agreed - I intend to review and fix all occurrences in the demos. Also, I'm not exactly changing to a "cites" relationship - this is already explicitly assumed in the notebooks; I am merely correcting the loaded directionality. While I am about it, I am also testing the demos in Python 3.6/3.7 and trying to understand them.
