Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve MG graph creation #2044

Merged
merged 17 commits into from
Feb 9, 2022

Conversation

seunghwak
Copy link
Contributor

@seunghwak seunghwak commented Jan 27, 2022

Improve multi-node multi-GPU scalability of graph creation (especially the code computing renumber_map)
Fix an overflow bug when creating a graph with more than 2^31 vertices

…e and also seems like having an issue with 2^31 or more elements)
@seunghwak seunghwak added 2 - In Progress improvement Improvement / enhancement to an existing function DO NOT MERGE Hold off on merging; see PR for details non-breaking Non-breaking change labels Jan 27, 2022
@seunghwak seunghwak added this to the 22.04 milestone Jan 27, 2022
@seunghwak seunghwak requested a review from a team as a code owner January 27, 2022 19:49
@seunghwak seunghwak changed the title [WIP][skip-ci] Improve MG graph creation Improve MG graph creation Feb 3, 2022
@seunghwak seunghwak added 3 - Ready for Review bug Something isn't working and removed 2 - In Progress DO NOT MERGE Hold off on merging; see PR for details improvement Improvement / enhancement to an existing function labels Feb 3, 2022
@seunghwak seunghwak self-assigned this Feb 3, 2022
@codecov-commenter
Copy link

codecov-commenter commented Feb 4, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.04@8a5df26). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##             branch-22.04    #2044   +/-   ##
===============================================
  Coverage                ?   73.55%           
===============================================
  Files                   ?      156           
  Lines                   ?    10295           
  Branches                ?        0           
===============================================
  Hits                    ?     7573           
  Misses                  ?     2722           
  Partials                ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8a5df26...f17db5f. Read the comment docs.

Comment on lines +47 to +51
auto lower_it =
thrust::lower_bound(thrust::seq, group_id_first, group_id_last, static_cast<int>(i));
auto upper_it = thrust::upper_bound(thrust::seq, lower_it, group_id_last, static_cast<int>(i));
return thrust::make_tuple(static_cast<int>(i),
static_cast<size_t>(thrust::distance(lower_it, upper_it)));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of casting i to int can you cast it to
typename std::iterator_traits<GroupIdIterator>::value_type while you are using it for binary_search?
It could be cast to int while returning in the tuple.

Copy link
Contributor Author

@seunghwak seunghwak Feb 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah...

Actually, typename std::iterator_traits<GroupIdIterator>::value_type should be int here (Group IDs or GPU IDs or process IDs are all assumed to be int). Using GroupIdIterator intead of int type pointer because I am using thrust::transform_iterator to compute group ID on the fly.

I'll add

static_assert(std::is_same_v(typename std::iterator_traits<GroupIdIterator>::value_type, int)); to make this clearer.

And I may modify operator()(size_t i) to operator()(int i) to avoid type casting... but I have mixed thoughts about this (I assume thrust::tabulate computes the ranges of i from thrust::distance() and i can potentially overflow int. Shouldn't happen unless we have more than 2^31-1 GPUs and that's the assumption for using 'int' here for GPU IDs. I guess explicitly casting this makes this intention more clearer than relying on implicit casting).

What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that works.

@ChuckHastings
Copy link
Collaborator

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 2cbf6c5 into rapidsai:branch-22.04 Feb 9, 2022
rapids-bot bot pushed a commit that referenced this pull request Mar 9, 2022
Pulls updates from PRs #2044, better be reviewed and merged after PR #2044

Mainly experimenting, not ready for review.

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #2062
@seunghwak seunghwak deleted the enh_graph_creation branch August 11, 2022 23:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working non-breaking Non-breaking change
Projects
No open projects
22.04 Release
  
Awaiting triage
Development

Successfully merging this pull request may close these issues.

None yet

4 participants