Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restructure Louvain to be more like other algorithms #2594

Merged

Conversation

ChuckHastings
Copy link
Collaborator

Eliminate the Louvain class and restructure the louvain algorithm to be like the other algorithms.

This will involve either inlining functionality from the Louvain member functions, or creating stand-alone detail methods that encapsulate the logic.

@ChuckHastings ChuckHastings self-assigned this Aug 18, 2022
@ChuckHastings ChuckHastings added 2 - In Progress improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Aug 18, 2022
@ChuckHastings ChuckHastings added this to the 22.10 milestone Aug 18, 2022
@codecov-commenter
Copy link

codecov-commenter commented Aug 18, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.10@73e66a1). Click here to learn what that means.
The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-22.10    #2594   +/-   ##
===============================================
  Coverage                ?   61.11%           
===============================================
  Files                   ?      106           
  Lines                   ?     5634           
  Branches                ?        0           
===============================================
  Hits                    ?     3443           
  Misses                  ?     2191           
  Partials                ?        0           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@@ -0,0 +1,370 @@
/*
* Copyright (c) 2020-2022, NVIDIA CORPORATION.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file contains implementation that was formerly in louvain.cuh, so keeping the original copyright range.

@ChuckHastings ChuckHastings marked this pull request as ready for review August 23, 2022 19:48
@ChuckHastings ChuckHastings requested review from a team as code owners August 23, 2022 19:48
@ChuckHastings ChuckHastings changed the title [WIP] Restructure Louvain to be more like other algorithms Restructure Louvain to be more like other algorithms Aug 23, 2022
Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look much more consistent with the rest of the codebase.

I have few minor suggestions to improve code readability.

}
};

template <typename graph_view_t>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to use template <typename graph_view_t> here and template <typename vertex_t, typename edge_t, typename weight_t, bool multi_gpu> in the graph_contraction?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verbosity more than anything.

I tend to prefer template <typename vertex_t, typename edge_t, typename weight_t, bool multi_gpu>. But the edge_src_property_t was the graph_view_type. So that would require a couple of extra lines of parameter definition, using graph_view_t is a bit clearer in that case.

I can switch to graph_view_t for detail methods for consistency if that's what we think is better.

graph_view_t const& graph_view,
typename graph_view_t::weight_type total_edge_weight,
typename graph_view_t::weight_type resolution,
rmm::device_uvector<typename graph_view_t::weight_type>& vertex_weights_v,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this be const?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this one could be const.

typename graph_view_t::weight_type resolution,
rmm::device_uvector<typename graph_view_t::weight_type>& vertex_weights_v,
rmm::device_uvector<typename graph_view_t::vertex_type>& cluster_keys_v,
rmm::device_uvector<typename graph_view_t::weight_type>& cluster_weights_v,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this function updates clustering by delta modularity AND based on the updated clustering, updates per cluster weight sums as well. May better carve out the per cluster weight sum update part from this function?

If we remove

  std::tie(cluster_keys_v, cluster_weights_v) = cugraph::transform_reduce_e_by_src_key(
    handle,
    graph_view,
    edge_src_dummy_property_t{}.view(),
    edge_dst_dummy_property_t{}.view(),
    graph_view_t::is_multi_gpu
      ? src_clusters_cache.view()
      : detail::edge_major_property_view_t<vertex_t, vertex_t const*>(next_clusters_v.data()),
    detail::return_edge_weight_t<vertex_t, weight_t>{},
    weight_t{0});

cluster_keys_v and cluster_weights_v can be const as well, and it might be much easier to understand the input and output.

And if we further carve out the cache update parts, this function can take next_clusters_v as an R-value, and return the updated next_clusters_v. Then, all the input parameters will become const reference or scalars. This might be more intuitive.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @seunghwak. It would make it easier to follow then. Also in that case it would be nicer to change the function name as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think I can refactor that as well. That's probably an artifact of the implementation prior to using the primitives. IIRC, I had an optimization in the original SG implementation that allowed me to skip a few of these steps if I did it all together. Clearly now they can be separated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored in next push

Copy link
Contributor

@naimnv naimnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look to me and I will pull this branch once merged and base my work on top of it.

Comment on lines 260 to 263
::testing::Values(cugraph::test::File_Usecase(
"test/datasets/karate.mtx") //,
// cugraph::test::File_Usecase("test/datasets/dolphins.mtx")
)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any particular reason for not testing with dolphin.mtx?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. I had commented it out to do faster debugging on something that only failed in karate. I'll add that back in.

Comment on lines 917 to 925
return transform_reduce_e(
handle,
*this,
edge_src_dummy_property_t{}.view(),
edge_dst_dummy_property_t{}.view(),
[] __device__(auto, auto, weight_t wt, auto, auto) { return wt; },
weight_t{0});
}

Copy link
Contributor

@naimnv naimnv Aug 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use common functor for both mg sg compute_total_edge_weight?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try making a shared detail method.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in next push

vertex_cluster_weights_v.resize(0, handle.get_stream());
vertex_cluster_weights_v.shrink_to_fit(handle.get_stream());
} else {
thrust::sort_by_key(handle.get_thrust_policy(),
Copy link
Contributor

@naimnv naimnv Aug 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment here ? Something like -
// sort cluster_keys_v_ and cluster_weights_v_ to use them for binary search (lower_bound)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in next push

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe one thing we may consider is whether should we better sort here or in compute_cluster_keys_and_values. I was at the beginning a bit confused why cluster_keys_v and cluster_weights_v are R-values.

cluster_weights_v.begin());

vertex_cluster_weights_v.resize(next_clusters_v.size(), handle.get_stream());
thrust::transform(handle.get_thrust_policy(),
Copy link
Contributor

@naimnv naimnv Aug 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment here ? Something like -

// for each cluster found in next_clusters_v_, lookup its weight in cluster_weights_v_
// and store them in local variable vertex_cluster_weights_v

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in next push

typename graph_view_t::weight_type resolution,
rmm::device_uvector<typename graph_view_t::weight_type>& vertex_weights_v,
rmm::device_uvector<typename graph_view_t::vertex_type>& cluster_keys_v,
rmm::device_uvector<typename graph_view_t::weight_type>& cluster_weights_v,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @seunghwak. It would make it easier to follow then. Also in that case it would be nicer to change the function name as well.

Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ChuckHastings
Copy link
Collaborator Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 421bac0 into rapidsai:branch-22.10 Aug 26, 2022
@ChuckHastings ChuckHastings deleted the create_louvain_detail_methods branch December 2, 2022 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants