Define API for MG random walk #2407

ChuckHastings · 2022-07-13T17:45:24Z

This PR defines the API for MG random walk in the C API and the C++ API.

C and C++ tests are defined, although some of the code is ifdef'ed out since there is not a working implementation here.

codecov-commenter · 2022-07-13T22:41:18Z

Codecov Report

Merging #2407 (6fb67a1) into branch-22.08 (2aad5f2) will increase coverage by 0.27%.
The diff coverage is n/a.

@@               Coverage Diff                @@
##           branch-22.08    #2407      +/-   ##
================================================
+ Coverage         60.11%   60.39%   +0.27%     
================================================
  Files               102      102              
  Lines              5155     5244      +89     
================================================
+ Hits               3099     3167      +68     
- Misses             2056     2077      +21

| Impacted Files | Coverage Δ | |
|---|---|---|
| python/cugraph/cugraph/gnn/graph_store.py | `76.66% <0.00%> (-3.34%)` | ⬇️ |
| python/cugraph/cugraph/structure/property_graph.py | `96.41% <0.00%> (-0.02%)` | ⬇️ |
| python/cugraph/cugraph/__init__.py | `100.00% <0.00%> (ø)` | |
| python/cugraph/cugraph/dask/comms/comms.py | `34.06% <0.00%> (ø)` | |
| python/cugraph/cugraph/dask/common/mg_utils.py | `30.43% <0.00%> (ø)` | |
| python/cugraph/cugraph/dask/common/input_utils.py | `22.13% <0.00%> (ø)` | |
| ...pylibcugraph/pylibcugraph/experimental/__init__.py | `100.00% <0.00%> (ø)` | |
| ...ugraph/cugraph/dask/structure/mg_property_graph.py | `18.56% <0.00%> (+0.07%)` | ⬆️ |
| ...ython/cugraph/cugraph/community/ktruss_subgraph.py | `88.23% <0.00%> (+2.94%)` | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2aad5f2...6fb67a1.

seunghwak · 2022-07-19T22:55:26Z

cpp/include/cugraph/algorithms.hpp

+ * @param handle RAFT handle object to encapsulate resources (e.g. CUDA stream, communicator, and
+ * handles to various CUDA libraries) to run graph algorithms.
+ * @param graph_view graph view to operate on
+ * @param start_span Device span defining the starting vertices


I think span in the variable name is redundant (as the type is device_span). start_vertices might be more informative.

alexbarghi-nv · 2022-07-20

I agree, start_vertices would be clearer to developers higher up the stack.

ChuckHastings · 2022-07-20

We had talked about that and I forgot. Updated in next push.

seunghwak · 2022-07-19T22:56:33Z

cpp/include/cugraph/algorithms.hpp

+ * handles to various CUDA libraries) to run graph algorithms.
+ * @param graph_view graph view to operate on
+ * @param start_span Device span defining the starting vertices
+ * @param max_depth maximum length of random walk


Better be max_length? (depth is more relevant to sampling a tree but I think length makes more sense for random walks).

ChuckHastings · 2022-07-20

Updated in next push.

seunghwak · 2022-07-19T22:59:06Z

cpp/include/cugraph/algorithms.hpp

+ *         vertices in the random walk.  If a path terminates before max_depth,
+ *         the vertices will be populated with invalid_vertex_id
+ *         (-1 for signed vertex_t, std::numeric_limits<vertex_t>::max() for an
+ *         unsigned vertex_t * type)<br>


an unsigned vertex_t * type => unsigned vertex_t

ChuckHastings · 2022-07-20

Fixed in next push
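The padding convention described in the doc comment under review (-1 for a signed vertex_t, std::numeric_limits<vertex_t>::max() for an unsigned one) can be illustrated with a small host-side sketch. The names `invalid_vertex_id()` and `pad_path()` are hypothetical helpers written for this illustration, not the cuGraph definitions, and `std::vector` stands in for device memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <vector>

// Hypothetical helper (assumption, not the cuGraph definition):
// the sentinel used to mark "no vertex" slots in a padded path.
template <typename vertex_t>
constexpr vertex_t invalid_vertex_id()
{
  static_assert(std::is_integral_v<vertex_t>, "vertex_t must be an integer type");
  if constexpr (std::is_signed_v<vertex_t>) {
    return static_cast<vertex_t>(-1);  // signed: -1
  } else {
    return std::numeric_limits<vertex_t>::max();  // unsigned: largest value
  }
}

// Hypothetical helper: pad a path that terminated early so that every path
// occupies the same number of slots (padded_size is chosen by the caller).
template <typename vertex_t>
std::vector<vertex_t> pad_path(std::vector<vertex_t> path, std::size_t padded_size)
{
  assert(path.size() <= padded_size);
  path.resize(padded_size, invalid_vertex_id<vertex_t>());
  return path;
}
```

A caller would compare each returned vertex against `invalid_vertex_id<vertex_t>()` to find where a path ended, rather than hard-coding -1, so the same check works for both signed and unsigned vertex types.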

seunghwak · 2022-07-19T23:00:22Z

cpp/include/cugraph/algorithms.hpp

+  graph_view_t<vertex_t, edge_t, weight_t, false, multi_gpu> const& graph_view,
+  raft::device_span<vertex_t const> start_span,
+  size_t max_depth,
+  uint64_t seed = 0);


How does this function work for unweighted graphs? Still return rmm::device_uvector<weight_t>?

ChuckHastings · 2022-07-20

I don't know that we've explored support for an unweighted graph very much, especially as it relates to return values.

I could wrap the return in std::optional and return std::nullopt if the graph is unweighted. biased_* would fail on an unweighted graph. node2vec_* could assume a weight of 1 on an unweighted graph.

seunghwak · 2022-07-20

Gotcha, and I think it is better to specify how we handle unweighted graphs in the documentation.

ChuckHastings · 2022-07-20

Updated in next push (check all 3 APIs)
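As a rough illustration of the proposal in this thread (wrap the returned weights in std::optional and return std::nullopt for an unweighted graph), here is a minimal host-side sketch. Everything in it is an assumption made for illustration: `toy_graph` and `toy_uniform_random_walks()` are hypothetical names, `std::vector` stands in for rmm::device_uvector, and the output layout (max_length + 1 vertex slots and max_length weight slots per walk, padded with -1 and 0) is one plausible choice, not the layout confirmed by the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <random>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical host-side CSR graph; not a cuGraph type.
struct toy_graph {
  std::vector<std::size_t> offsets;           // size = number of vertices + 1
  std::vector<std::int32_t> indices;          // destination vertex of each edge
  std::optional<std::vector<float>> weights;  // std::nullopt for unweighted graphs
};

// Returns (vertices, weights). The weights are std::nullopt when the graph is
// unweighted. Unused vertex slots hold -1 and unused weight slots hold 0.
std::tuple<std::vector<std::int32_t>, std::optional<std::vector<float>>>
toy_uniform_random_walks(toy_graph const& g,
                         std::vector<std::int32_t> const& start_vertices,
                         std::size_t max_length,
                         std::uint64_t seed = 0)
{
  std::mt19937_64 rng(seed);
  std::size_t const num_walks = start_vertices.size();
  std::vector<std::int32_t> vertices(num_walks * (max_length + 1), -1);
  std::optional<std::vector<float>> weights;
  if (g.weights) { weights.emplace(num_walks * max_length, 0.0f); }

  for (std::size_t w = 0; w < num_walks; ++w) {
    std::int32_t current = start_vertices[w];
    assert(current >= 0 && static_cast<std::size_t>(current) + 1 < g.offsets.size());
    vertices[w * (max_length + 1)] = current;
    for (std::size_t step = 0; step < max_length; ++step) {
      std::size_t const first = g.offsets[current];
      std::size_t const last  = g.offsets[current + 1];
      if (first == last) { break; }  // no outgoing edges: the walk ends early
      // Pick one outgoing edge uniformly at random.
      std::uniform_int_distribution<std::size_t> pick(first, last - 1);
      std::size_t const edge = pick(rng);
      current = g.indices[edge];
      vertices[w * (max_length + 1) + step + 1] = current;
      if (weights) { (*weights)[w * max_length + step] = (*g.weights)[edge]; }
    }
  }
  return std::make_tuple(std::move(vertices), std::move(weights));
}
```

With this shape, callers on unweighted graphs check `has_value()` on the second tuple element instead of receiving a vector of placeholder weights, which is the behavior the documentation would then need to spell out.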

seunghwak · 2022-07-19T23:07:36Z

cpp/src/sampling/random_walks_sg.cu

+  raft::handle_t const& handle,
+  graph_view_t<vertex_t, edge_t, weight_t, false, multi_gpu> const& graph_view,
+  raft::device_span<vertex_t const> start_span,
+  size_t max_depth,


max_length?

cpp/src/sampling/random_walks_mg.cu

cpp/src/sampling/random_walks_sg.cu

seunghwak · 2022-07-19T23:10:22Z

cpp/tests/CMakeLists.txt

 ###################################################################################################
 # - RANDOM_WALKS tests ----------------------------------------------------------------------------
-ConfigureTest(RANDOM_WALKS_TEST sampling/random_walks_test.cu)
+ConfigureTest(RANDOM_WALKS_TEST sampling/sg_random_walks_test.cu)


Should we name SG tests with sg_? Our convention has been omitting sg_ for SG tests. If we decide to use sg_ for SG tests, we need to apply this for all SG tests.

ChuckHastings · 2022-07-20

This is also temporary. There is an existing random walks test for the original implementation (I hesitate to use the word legacy, since we usually reserve it for implementations that use the legacy graph objects). I can't delete the original test until I have a replacement working.

I could rename the existing test as legacy if you think that would be cleaner.

seunghwak · 2022-07-20

OK, no complaint if temporary, just don't forget to fix this later.

ChuckHastings · 2022-07-20

Added a FIXME to remind me

seunghwak · 2022-07-19T23:10:42Z

cpp/tests/CMakeLists.txt

@@ -631,6 +639,7 @@ if(BUILD_CUGRAPH_MG_TESTS)
        ConfigureCTestMG(MG_CAPI_EIGENVECTOR_CENTRALITY c_api/mg_eigenvector_centrality_test.c c_api/mg_test_utils.cpp)
        ConfigureCTestMG(MG_CAPI_HITS c_api/mg_hits_test.c c_api/mg_test_utils.cpp)
        ConfigureCTestMG(MG_CAPI_UNIFORM_NEIGHBOR_SAMPLE c_api/mg_uniform_neighbor_sample_test.c c_api/mg_test_utils.cpp)
+	ConfigureCTestMG(MG_CAPI_RANDOM_WALKS c_api/mg_random_walks_test.c c_api/mg_test_utils.cpp)


Better align indentation.

ChuckHastings · 2022-07-20

Fixed in next push

alexbarghi-nv · 2022-07-20T13:46:58Z

Looks good, just the one change to use "start_vertices" or "source_vertices" to be consistent with the rest of the C API.

ChuckHastings · 2022-07-21T02:31:08Z

rerun tests


seunghwak · 2022-07-21T22:11:22Z

cpp/include/cugraph/algorithms.hpp

+ *         set to weight_t{0}.
+ */
+template <typename vertex_t, typename edge_t, typename weight_t, bool multi_gpu>
+std::tuple<rmm::device_uvector<vertex_t>, std::optional<rmm::device_uvector<weight_t>>>


This does not accept unweighted graphs, and should the returned weight vector here be std::optional? Can this ever be std::nullopt?

ChuckHastings · 2022-07-21

I was keeping the signature the same for consistency. But you are correct, this function would never return std::nullopt.

seunghwak · 2022-07-21T22:13:35Z

cpp/include/cugraph_c/sampling_algorithms.h

+cugraph_error_code_t cugraph_uniform_random_walks(
+  const cugraph_resource_handle_t* handle,
+  cugraph_graph_t* graph,
+  const cugraph_type_erased_device_array_view_t* sources,


sources == start_vertices in the code above? If yes, better use the same variable name?

Same for the functions below.

ChuckHastings · 2022-07-21

Just pushed an update for this. Also updated in implementation and test files.

seunghwak · 2022-07-21T22:15:58Z

cpp/tests/sampling/sg_random_walks_test.cu

+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */


This file should be renamed to random_walks_test.cu in the future. Just a reminder.

seunghwak · 2022-07-21T23:25:04Z

cpp/src/sampling/random_walks_mg.cu

+std::tuple<rmm::device_uvector<vertex_t>, std::optional<rmm::device_uvector<weight_t>>>
+uniform_random_walks(raft::handle_t const& handle,
+                     graph_view_t<vertex_t, edge_t, weight_t, false, multi_gpu> const& graph_view,
+                     raft::device_span<vertex_t const> start_span,


start_span here should be start_vertices, better search for start_span and replace them.

seunghwak · 2022-07-21T23:25:41Z

cpp/src/sampling/random_walks_sg.cu

+std::tuple<rmm::device_uvector<vertex_t>, std::optional<rmm::device_uvector<weight_t>>>
+uniform_random_walks(raft::handle_t const& handle,
+                     graph_view_t<vertex_t, edge_t, weight_t, false, multi_gpu> const& graph_view,
+                     raft::device_span<vertex_t const> start_span,


start_span here as well.

jnke2016

Looks good to me

alexbarghi-nv

👍

ChuckHastings · 2022-07-22T13:45:18Z

@gpucibot merge

first definition of new RW API

b08a221

ChuckHastings added the "2 - In Progress", "improvement" (Improvement / enhancement to an existing function), and "non-breaking" (Non-breaking change) labels Jul 13, 2022

ChuckHastings added this to the 22.08 milestone Jul 13, 2022

ChuckHastings self-assigned this Jul 13, 2022

ChuckHastings added this to PRs in 22.08 Release Jul 13, 2022

fix some small errors

83f6618

ChuckHastings added 3 commits July 14, 2022 07:54

clean up a few things in the API definition

184fb80

implementation stubs and tests for new SG and MG random walks

7917558

fix clang-format issues

98c036b

ChuckHastings marked this pull request as ready for review July 19, 2022 01:26

ChuckHastings requested review from a team as code owners July 19, 2022 01:26

ChuckHastings added 3 - Ready for Review and removed 2 - In Progress labels Jul 19, 2022

ChuckHastings requested review from seunghwak, rlratzel and jnke2016 July 19, 2022 01:27

seunghwak reviewed Jul 19, 2022


ChuckHastings added 2 commits July 20, 2022 11:58

Updates based on PR reviews

3a3a868

clean up some documentation, add/clarify support for unweighted graphs

605cb3f

ChuckHastings added 2 commits July 21, 2022 08:36

update stub implementation to match API change making the weight optional

ca0ae42

fix clang-format issues

0c813a9

seunghwak reviewed Jul 21, 2022


update sources to start_vertices more consistently

5d05cf6

seunghwak reviewed Jul 21, 2022


update start_span to start_vertices more consistently

6fb67a1

seunghwak approved these changes Jul 22, 2022

jnke2016 approved these changes Jul 22, 2022

alexbarghi-nv approved these changes Jul 22, 2022

rapids-bot bot merged commit 5bf07fb into rapidsai:branch-22.08 Jul 22, 2022

22.08 Release automation moved this from PRs to Done Jul 22, 2022

ChuckHastings deleted the mg_random_walk_definition branch August 4, 2022 18:25


