Update primitives to support DCSR (DCSC) segments (Part 1) #1690
…ename_patterns_to_prims
… (currently, DCSR or DCSC)
Co-authored-by: Paul Taylor <paul.e.taylor@me.com>
…n_device_"view"_t
…_view_t and matrix_partition_device_view_t objects and removed (vertex|matrix)_partition_device_view_t constructors that takes a graph_view_t object
…still not work if DCSR (DCSC) is enabled
…to fea_dcsr_prim
Codecov Report
@@ Coverage Diff @@
## branch-21.08 #1690 +/- ##
===============================================
Coverage ? 59.31%
===============================================
Files ? 80
Lines ? 3557
Branches ? 0
===============================================
Hits ? 2110
Misses ? 1447
Partials ? 0
…to fea_dcsr_prim
{
  if (dcs_nzd_vertices_) {
    // we can avoid binary search (and potentially improve performance) if we add an auxiliary
    // array or cuco::static_map (at the expense of additional memory)
Would a Bloom filter help here? I haven't thought about it too deeply; perhaps it would be too heavyweight for the list sizes we're talking about.
Possibly (at the expense of additional memory), for certain ranges of the fill ratio (# non-zero degree vertices) / (# vertices in the hypersparse segment).
The binary search based approach will be the most memory efficient; Bloom filter + binary search or hash-table based approaches may be faster for certain ranges of the fill ratio (but will require more memory). We need to investigate the trade-offs (how much speedup we can get, considering that this will require additional memory) once this turns out to be a performance bottleneck. It does not seem to be one for most datasets we have tested with a relatively small number of GPUs, but this may change for graphs with a very low average vertex degree and many GPUs (we need to implement a few more optimizations to test this setting).
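For reference, the binary search lookup discussed in this thread can be sketched on the host as below. This is only an illustration under stated assumptions, not the code in this PR: `hypersparse_segment_t`, its fields, and `find_nzd_index` are hypothetical names, and the real primitives run on the device against `dcs_nzd_vertices_`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using vertex_t = std::int32_t;

// Hypothetical host-side stand-in for a hypersparse (DCSR/DCSC) segment.
// Only vertices with non-zero degree are stored, in ascending order.
struct hypersparse_segment_t {
  vertex_t first;                      // first vertex ID covered by the segment
  vertex_t last;                       // one past the last vertex ID covered
  std::vector<vertex_t> nzd_vertices;  // sorted non-zero-degree vertex IDs
};

// Returns the position of `v` in nzd_vertices, or std::nullopt if `v` lies
// outside the segment or has zero degree (and is therefore not stored).
// Cost: O(log n) comparisons, no memory beyond the sorted array itself.
std::optional<std::size_t> find_nzd_index(hypersparse_segment_t const& seg, vertex_t v)
{
  if (v < seg.first || v >= seg.last) { return std::nullopt; }
  auto const begin = seg.nzd_vertices.begin();
  auto const end   = seg.nzd_vertices.end();
  auto const it    = std::lower_bound(begin, end, v);
  if (it == end || *it != v) { return std::nullopt; }
  return static_cast<std::size_t>(it - begin);
}
```

A Bloom filter or hash table in front of this lookup would trade extra memory for fewer probes on misses, which is the trade-off described above; whether that pays off depends on the fill ratio.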
@gpucibot merge
Update graph primitives to support DCSR (DCSC) segments, except for the primitives used by Louvain; those will be updated in a separate PR with thread-divergence optimization and more testing.
DCSR (DCSC) segment support is still disabled (as enabling this will break Louvain).
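For readers unfamiliar with the format, here is a minimal host-side sketch of how a CSR offset array can be compressed into a DCSR-style pair of arrays that keeps only non-zero-degree vertices. The names `dcsr_t` and `compress_offsets` are hypothetical and are not taken from this PR; the actual device-side layout may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using vertex_t = std::int32_t;
using edge_t   = std::int64_t;

// Hypothetical DCSR-style layout: vertices with zero degree are dropped, so
// storage scales with the number of non-zero-degree vertices, not the range.
struct dcsr_t {
  std::vector<vertex_t> nzd_vertices;  // sorted IDs of non-zero-degree vertices
  std::vector<edge_t> offsets;         // nzd_vertices.size() + 1 entries
};

// Compresses a CSR offset array (one entry per vertex, plus one) into the
// layout above. `first_vertex` is the ID of the vertex stored at index 0.
dcsr_t compress_offsets(std::vector<edge_t> const& csr_offsets, vertex_t first_vertex)
{
  dcsr_t out{};
  out.offsets.push_back(csr_offsets.empty() ? edge_t{0} : csr_offsets.front());
  for (std::size_t i = 0; i + 1 < csr_offsets.size(); ++i) {
    if (csr_offsets[i + 1] > csr_offsets[i]) {  // vertex i has at least one edge
      out.nzd_vertices.push_back(first_vertex + static_cast<vertex_t>(i));
      out.offsets.push_back(csr_offsets[i + 1]);
    }
  }
  return out;
}
```

With this layout, finding the edges of a given vertex requires first locating it in `nzd_vertices`, which is why the primitives need a lookup step for hypersparse segments.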