[Example][Refactor] Refactor GCN example #4160

Merged
merged 11 commits into from
Jul 5, 2022

chang-l
Collaborator

@chang-l chang-l commented Jun 22, 2022

Description

Following #4186, this PR refactors the GCN example. Only single-GPU training is implemented.

Please note that we have two GCN implementations: one using the DGL module (train.py) and one using a customized module (gcn_mp.py). To show the file diff properly for review, I have not renamed gcn_mp.py; this needs to be addressed before merging.

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change,
    or have been fixed to be compatible with this change
  • Related issue is referred in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

  • Align to our golden example
  • Two separate self-contained files for two implementations

Tests

Included in the updated README file.

@dgl-bot
Collaborator

dgl-bot commented Jun 22, 2022

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@dgl-bot
Collaborator

dgl-bot commented Jun 23, 2022

Commit ID: 680f120

Build ID: 1

Status: ❌ CI test failed in Stage [Torch GPU Example test].

Report path: link

Full logs path: link

@BarclayII BarclayII requested a review from mufeili June 27, 2022 06:45
@jermainewang jermainewang added the "Release Candidate" label (Candidate PRs for the upcoming release) Jun 29, 2022
@dgl-bot
Collaborator

dgl-bot commented Jun 30, 2022

Commit ID: 15c922c

Build ID: 2

Status: ❌ CI test failed in Stage [Torch GPU Example test].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jun 30, 2022

Commit ID: cfd7b15

Build ID: 3

Status: ❌ CI test failed in Stage [Torch GPU Example test].

Report path: link

Full logs path: link

@chang-l
Collaborator Author

chang-l commented Jun 30, 2022

The CI test keeps failing because I removed gcn.py, which contained the GCN module definition. Please let me know if you would prefer to keep the GCN module in a separate file. @jermainewang @mufeili

@dgl-bot
Collaborator

dgl-bot commented Jun 30, 2022

Commit ID: a38b0a6

Build ID: 4

Status: ❌ CI test failed in Stage [Torch GPU Example test].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jun 30, 2022

Commit ID: 7179662

Build ID: 5

Status: ❌ CI test failed in Stage [Torch CPU (Win64) Example test].

Report path: link

Full logs path: link

examples/pytorch/gcn/README.md (outdated, resolved)
examples/pytorch/gcn/gcn_mp.py (outdated, resolved)
examples/pytorch/gcn/gcn_mp.py (outdated, resolved)
examples/pytorch/gcn/gcn_mp.py (outdated, resolved)
@jermainewang
Member

Apart from the inline comments, I feel we could further simplify the example:

  • The example currently shows how to write a custom graph convolution layer using update_all, built-in functions and UDFs. I think we could just call dgl.nn.GraphConv.
  • We can then have only one training script.

@mufeili what do you think?
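
For context, here is a minimal pure-Python sketch of the aggregation step that a GCN layer performs, whether it is written by hand with message passing or provided by a built-in layer. It is not DGL code: `gcn_aggregate` is a hypothetical helper, the weight matrix and bias are omitted, and the symmetric degree normalisation is an assumption based on the standard GCN formulation.

```python
import math


def gcn_aggregate(num_nodes, edges, feats):
    """Sum neighbour features with symmetric degree normalisation.

    Hypothetical helper for illustration only (not part of DGL).
    edges: list of (src, dst) pairs; feats: one feature list per node.
    """
    out_deg = [0] * num_nodes
    in_deg = [0] * num_nodes
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1

    dim = len(feats[0])
    out = [[0.0] * dim for _ in range(num_nodes)]
    for src, dst in edges:
        # Message: source feature scaled by 1 / sqrt(out_deg(src) * in_deg(dst)).
        scale = 1.0 / math.sqrt(out_deg[src] * in_deg[dst])
        for k in range(dim):
            # Reduce: sum the incoming messages at the destination node.
            out[dst][k] += scale * feats[src][k]
    return out


# Path graph 0 - 1 - 2, stored as directed edges in both directions,
# plus one self-loop per node.
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (0, 0), (1, 1), (2, 2)]
feats = [[1.0], [2.0], [3.0]]
agg = gcn_aggregate(3, edges, feats)
```

A built-in layer would hide this loop behind a single call, which is the simplification suggested above.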

@mufeili
Member

mufeili commented Jul 1, 2022

> Apart from the inline comments, I feel we could further simplify the example:
>
>   • The example currently shows how to write a custom graph convolution layer using update_all, built-in functions and UDFs. I think we could just call dgl.nn.GraphConv.
>   • We can then have only one training script.
>
> @mufeili what do you think?

I agree.

@dgl-bot
Collaborator

dgl-bot commented Jul 1, 2022

Commit ID: 65fccf4abf6875a8ef1bb8979d8da691424f90f9

Build ID: 7

Status: ❌ CI test failed in Stage [C++ CPU].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jul 1, 2022

Commit ID: 0298e8370271f8efb78cfc894d9aa32f0976c7af

Build ID: 6

Status: ❌ CI test failed in Stage [Torch GPU Example test].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jul 1, 2022

Commit ID: 6095d0d8da7ec2b7a71b7476347a1c37582a4e7a

Build ID: 8

Status: ❌ CI test failed in Stage [Torch GPU Example test].

Report path: link

Full logs path: link

help="Dataset name ('cora', 'citeseer', 'pubmed').")
args = parser.parse_args()
print(f'Training with DGL intrinsic graph convolution module.')

Member

You might want to create a transform object first and then pass it to the dataset classes below. You also need to compose RemoveSelfLoop and AddSelfLoop, since some datasets already contain self-loops on some of their nodes.

Collaborator Author

Yes, I can do that: create a transform object (transform = AddSelfLoop()) and pass it to the dataset class.
If I understand the doc correctly (https://docs.dgl.ai/generated/dgl.transforms.AddSelfLoop.html), AddSelfLoop with the default allow_duplicate=False already removes existing self-loops first, so it covers RemoveSelfLoop and avoids duplicated self-loops.
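
The behaviour discussed here can be sketched in plain Python. This is only an illustration under the reading of the doc given above: `add_self_loop` is a hypothetical helper that works on a list of edge pairs, not DGL's implementation, and only the parameter name `allow_duplicate` is taken from the linked page.

```python
def add_self_loop(num_nodes, edges, allow_duplicate=False):
    """Return a new edge list with a self-loop on every node.

    Hypothetical helper mirroring the behaviour described above,
    not DGL's code.
    """
    if not allow_duplicate:
        # Drop existing self-loops first so no node ends up with two.
        edges = [(u, v) for u, v in edges if u != v]
    return list(edges) + [(n, n) for n in range(num_nodes)]


# Node 0 already carries a self-loop in the raw graph.
raw = [(0, 0), (0, 1), (1, 2)]
deduped = add_self_loop(3, raw)
duplicated = add_self_loop(3, raw, allow_duplicate=True)
```

With the default setting, node 0 keeps a single self-loop, so composing a separate removal step first would give the same edge list. With `allow_duplicate=True` the original self-loop is kept and a second one is added.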

Member

I see. Thanks.

@mufeili
Member

mufeili commented Jul 4, 2022

Overall a great job. I've left some minor comments.

@yaox12
Collaborator

yaox12 commented Jul 5, 2022

@chang-l Can you add your name to CONTRIBUTORS.md?

Member

@mufeili mufeili left a comment

I'm good.

@dgl-bot
Collaborator

dgl-bot commented Jul 5, 2022

Commit ID: 69c424d4b9d8495c886311317ad84fc0fb11a0bd

Build ID: 9

Status: ❌ CI test failed in Stage [Torch GPU Example test].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jul 5, 2022

Commit ID: 45df38f

Build ID: 10

Status: ❌ CI test failed in Stage [Torch CPU (Win64) Example test].

Report path: link

Full logs path: link

@chang-l
Collaborator Author

chang-l commented Jul 5, 2022

@yaox12 Thanks for the reminder. I just added it.

@dgl-bot
Collaborator

dgl-bot commented Jul 5, 2022

Commit ID: be5d1eb

Build ID: 11

Status: ❌ CI test failed in Stage [Torch CPU (Win64) Example test].

Report path: link

Full logs path: link

@dgl-bot
Collaborator

dgl-bot commented Jul 5, 2022

Commit ID: e9060ed

Build ID: 12

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

@yaox12 yaox12 merged commit 885be17 into dmlc:master Jul 5, 2022
@chang-l chang-l deleted the gcn-example-refactor branch July 13, 2022 20:37
BarclayII pushed a commit to BarclayII/dgl that referenced this pull request Aug 10, 2022
* Refactor GCN example

* Refactor GCN based on graphsage

* Readme update

* Minor update

* update

* Remove user-defined GCN implementation

* README update

* Update

* Update CONTRIBUTORS.md

* update task_example_test

Co-authored-by: Xin Yao <xiny@nvidia.com>
@frozenbugs frozenbugs removed the "Release Candidate" label (Candidate PRs for the upcoming release) Jan 11, 2023
6 participants