Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Dist] enable to specify sort_etype for sample_etype_neighbours #4212

Merged
merged 13 commits into from
Jul 11, 2022

Conversation

Rhett-Ying
Copy link
Collaborator

Description

follow-up for #4150.

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature]])
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change,
    or have been fixed to be compatible with this change
  • Related issue is referred in this PR
  • If the PR is for a new model/paper, I've updated the example index here.

Changes

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 5, 2022

To trigger regression tests:

  • @dgl-bot run [instance-type] [which tests] [compare-with-branch];
    For example: @dgl-bot run g4dn.4xlarge all dmlc/master or @dgl-bot run c5.9xlarge kernel,api dmlc/master

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 5, 2022

Commit ID: 19996630b1cadcd106e2a2b380c1e5d862303ef3

Build ID: 1

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 5, 2022

Commit ID: 39265f6

Build ID: 2

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 5, 2022

Commit ID: 0179d62

Build ID: 3

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

python/dgl/distributed/dist_graph.py Outdated Show resolved Hide resolved
python/dgl/distributed/dist_graph.py Outdated Show resolved Hide resolved
@@ -227,6 +227,7 @@ def initialize(ip_config, num_servers=1, num_workers=0,
'Please define DGL_CONF_PATH to run DistGraph server'
formats = os.environ.get('DGL_GRAPH_FORMAT', 'csc').split(',')
formats = [f.strip() for f in formats]
sort_etypes = os.environ.get('DGL_SORT_ETYPES', '')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the corresponding doc for this new environment variable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have corresponding doc? where is it?

python/dgl/distributed/graph_services.py Outdated Show resolved Hide resolved
@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 5, 2022

Commit ID: 382c066e18a6262c24630de60147803142c7dce8

Build ID: 4

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 5, 2022

Commit ID: 360417511595fe2e0857b281ed2d3e2550c19933

Build ID: 6

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 5, 2022

Commit ID: 42ac91ea04efc8bafe9d88ade2aa68ccb592e3af

Build ID: 5

Status: ❌ CI test failed in Stage [C++ CPU].

Report path: link

Full logs path: link

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 5, 2022

Commit ID: b2fd34fe8bba0e3ef5b1001f5c94aa52ff8657a1

Build ID: 7

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

@zheng-da
Copy link
Collaborator

zheng-da commented Jul 5, 2022

the main question is why we should expose the sort_etypes flag to users? if we provide this interface, users need to specify this flag and the graph format correctly. this will be very confusing to users.
we should enable sort_etypes by default based on the graph formats specified by users if the input graph is a heterogeneous graph.

@Rhett-Ying
Copy link
Collaborator Author

Rhett-Ying commented Jul 6, 2022

Enabling sort_etypes automatically without user's explicit configure is preferable, let me try to figure out a clean way to achieve this. If you have any ideas on how to achieve this, pls let me know. We need to record the etype_sorted info both for DistGraphSever and DistGraph. If we always sort etypes if csc or csr is specified, then we're able to specify etype_sorted=True for sample_etype_neighbors().

@@ -339,6 +344,12 @@ def __init__(self, server_id, ip_config, num_servers,
# Create the graph formats specified the users.
self.client_g = self.client_g.formats(graph_format)
self.client_g.create_formats_()
if sort_etypes == 'csr':
self.client_g = sort_csr_by_tag(
self.client_g, tag=self.client_g.edata[ETYPE], tag_type='edge')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jermainewang @zheng-da Is sorting like this correct ? Is passing tag with g.edata[ETYPE] correct? Do I need to sort the inner edges only?

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 6, 2022

Commit ID: ca443b8bf3edc8ac77c475c723cb2633b8f8f86b

Build ID: 8

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 6, 2022

Commit ID: 8c81d762bb7c5b2c10db38ba0d06e10564b5b644

Build ID: 9

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 11, 2022

Commit ID: 5819bb6cf3e0f9c95c924877e6d642970aee48ff

Build ID: 10

Status: ❌ CI test failed in Stage [Lint Check].

Report path: link

Full logs path: link

@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 11, 2022

Commit ID: 713140e91a7dffe1200f9ccb08307484190ff47b

Build ID: 11

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

@jermainewang jermainewang added the Release Candidate Candidate PRs for the upcoming release label Jul 11, 2022
@dgl-bot
Copy link
Collaborator

dgl-bot commented Jul 11, 2022

Commit ID: 4a2cc01

Build ID: 12

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

@Rhett-Ying Rhett-Ying merged commit 14e4e1b into dmlc:master Jul 11, 2022
@Rhett-Ying Rhett-Ying deleted the sortEtypesInDistGraph branch July 11, 2022 10:37
BarclayII pushed a commit to BarclayII/dgl that referenced this pull request Aug 10, 2022
…#4212)

* [Dist] enable to specify sort_etype for sample_etype_neighbours

* fix lint

* pass argument instead of env

* fix lint and doc string

* refine args

* remove unnecessary lines

* debug only

* debug add sort time log

* change interface

* fix typo

Co-authored-by: Xin Yao <xiny@nvidia.com>
@frozenbugs frozenbugs removed the Release Candidate Candidate PRs for the upcoming release label Jan 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants