
[REVIEW] Pattern accelerator based implementation of PageRank, Katz Centrality, BFS, & SSSP #838

Merged: 182 commits into rapidsai:branch-0.16 on Sep 25, 2020

Conversation

@seunghwak (Contributor) commented Apr 28, 2020

OK, I will try to merge this and plan to address multi-GPU extensions & performance tuning in separate PRs.

This PR is already very large, and multiple efforts depend on it, so I think this works better (and since this code is not yet linked to any Python user code, there isn't much risk in merging early).

This API aims to achieve:

  1. a thrust-like API for graph algorithms;
  2. abstracting away target-system differences (single GPU, multi-GPU, ...) inside the pattern accelerator API, Graph, and Handle, so the same analytics code runs on every target system;
  3. minimizing redundancy in the cuGraph codebase and better enforcing consistency.
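
To make point 1 concrete, here is a minimal, runnable analogy (my own sketch, not this PR's API): thrust expresses array computations as per-element functors handed to generic primitives, and the pattern accelerator applies the same contract to vertices and their neighbor lists (e.g. the `update_frontier_v_push_if_out_nbr` primitive that shows up in a backtrace later in this thread).

```cpp
// Sketch only: thrust expresses "what to compute per element" as a functor
// and leaves "how to run it on the target" to the library. The pattern
// accelerator extends this contract to per-vertex/per-edge operations.
#include <thrust/device_vector.h>
#include <thrust/transform_reduce.h>
#include <thrust/functional.h>
#include <cstdio>

struct damped {  // per-element op; PageRank-style damping, for illustration only
  __host__ __device__ float operator()(float r) const { return 0.85f * r; }
};

int main() {
  thrust::device_vector<float> ranks(4, 0.25f);
  float sum = thrust::transform_reduce(ranks.begin(), ranks.end(),
                                       damped{}, 0.0f, thrust::plus<float>());
  std::printf("damped rank mass: %f\n", sum);  // 0.85
  return 0;
}
```

The point of the contract is that the functor stays target-agnostic; whether it runs on one GPU or many is the primitive's problem, which is what lets the same analytics code serve every target system.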

@GPUtester (Contributor)

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

@ChuckHastings (Collaborator) left a comment

I've been using these changes and they seem fine.

I assume more changes will come in a different PR (e.g., implementations for is_multi_gpu reductions).

@afender (Member) left a comment

That's a lot of very useful code 🎉

@BradReesWork (Member)

rerun tests

@codecov-commenter commented Sep 23, 2020

Codecov Report

Merging #838 into branch-0.16 will not change coverage.
The diff coverage is n/a.


@@             Coverage Diff              @@
##           branch-0.16     #838   +/-   ##
============================================
  Coverage        73.44%   73.44%           
============================================
  Files               60       60           
  Lines             2335     2335           
============================================
  Hits              1715     1715           
  Misses             620      620           

Powered by Codecov. Last update 85ec558...c50f85c.

@BradReesWork (Member)

rerun tests

@BradReesWork (Member)

rerun tests

@pgera (Contributor) commented Sep 25, 2020

Just some early feedback on memory usage: I tested BFS with some small graphs and it produces the correct output, but I wasn't able to test twitter (5.6 GB in CSR) on a 12 GB GPU, whereas the current implementation handles it fine. The graph is already in CSR here, so there is no additional memory usage on my side beyond the CSR graph, the distance array, and whatever BFS allocates internally. That is, I'm calling BFS as:

cugraph::experimental::graph_view_t<VT, ET, float, false, false> G(
      handle,
      thrust::raw_pointer_cast(t_vlist.data()),
      thrust::raw_pointer_cast(t_elist.data()),
      nullptr,
      std::vector<VT>(),
      vertex_cnt,
      edge_cnt,
      cugraph::experimental::graph_properties_t(),
      false,
      false);

cugraph::experimental::bfs<VT, ET, float, false>(
      handle,
      G,
      thrust::raw_pointer_cast(t_distances.data()),
      static_cast<VT*>(nullptr),
      static_cast<VT>(src),
      false,
      std::numeric_limits<VT>::max(),
      false);

The last two frames in the backtrace are:

#6  0x0000555555567878 in rmm::mr::cuda_memory_resource::do_allocate(unsigned long, CUstream_st*) ()
#7  0x00007fffedd46940 in void cugraph::experimental::update_frontier_v_push_if_out_nbr<cugraph::experimental::graph_view_t<int, int, float, false, false, void>
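
A minimal probe for narrowing this down, assuming an initialized CUDA context (`report_mem` is a hypothetical helper, not part of cuGraph): bracket the `bfs` call with `cudaMemGetInfo` to see how much headroom the failing `do_allocate` started from. Note this cannot observe the transient peak inside the call itself.

```cpp
#include <cuda_runtime_api.h>
#include <cstdio>

// Hypothetical helper: print current free/total device memory. Call
// immediately before and after cugraph::experimental::bfs to see the
// headroom available to update_frontier_v_push_if_out_nbr's allocation.
void report_mem(const char* tag) {
  size_t free_bytes = 0, total_bytes = 0;
  cudaMemGetInfo(&free_bytes, &total_bytes);
  std::printf("%s: %.2f GiB free / %.2f GiB total\n", tag,
              free_bytes / (1024.0 * 1024.0 * 1024.0),
              total_bytes / (1024.0 * 1024.0 * 1024.0));
}
```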

@BradReesWork (Member)

rerun tests

@seunghwak (Contributor, Author)

> Just some early feedback on memory usage. [...] I wasn't able to test twitter (5.6 GB in CSR) on a 12 GB GPU whereas the current implementation works fine for that.

Thanks for testing this. BFS & SSSP in this PR are not yet optimized for performance or memory footprint (if you read the PR code, you will find several FIXMEs about reducing the footprint). That work will happen in future PRs, and I will make sure the final version's memory requirement is actually smaller than the previous implementation's.
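
Until those FIXMEs land, one possible stopgap, offered as an assumption rather than something this PR provides (verify the header paths against your RMM version), is to back RMM allocations with CUDA managed memory so the 5.6 GB twitter graph plus the BFS workspace can oversubscribe a 12 GB card, at some performance cost:

```cpp
#include <rmm/mr/device/managed_memory_resource.hpp>
#include <rmm/mr/device/per_device_resource.hpp>

int main() {
  // Route all subsequent RMM allocations through cudaMallocManaged so the
  // BFS workspace can spill to host memory instead of failing in
  // cuda_memory_resource::do_allocate.
  rmm::mr::managed_memory_resource managed_mr;
  rmm::mr::set_current_device_resource(&managed_mr);

  // ... construct graph_view_t and call cugraph::experimental::bfs as above ...
  return 0;
}
```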

@BradReesWork merged commit b353e1e into rapidsai:branch-0.16 on Sep 25, 2020.
v0.16 Release automation moved this from PRs to Done on Sep 25, 2020.
@seunghwak deleted the fea_pattern_acc branch on October 3, 2020 04:43.