
Migrate RMM usage to CCCL MR design#1990

Merged
gforsyth merged 15 commits into rapidsai:main from bdice:rmm-cccl-migration
Apr 21, 2026

Conversation

Contributor

@bdice bdice commented Apr 3, 2026

Summary

  • Migrate all RMM usage to the new CCCL memory resource design (de-templated resources, device_async_resource_ref instead of device_memory_resource*, value semantics)
  • Replace get_workspace_resource() / get_large_workspace_resource() with _ref() variants across 65 call sites
  • Rewrite cuda_huge_page_resource to satisfy CCCL resource concept directly
  • Remove owning_wrapper / dynamic_cast patterns in C API and benchmarks

Depends on rapidsai/rmm#2361.
Depends on rapidsai/ucxx#636.
Depends on rapidsai/raft#2996.

Changes

  • 33 files changed (~208 insertions, ~221 deletions)
  • device_memory_resource* params → device_async_resource_ref (ivf_common, ivf_pq, naive_knn)
  • get_current_device_resource() → get_current_device_resource_ref()
  • set_current_device_resource() → set_current_device_resource_ref()
  • De-templated pool_memory_resource, failure_callback_resource_adaptor in bench utils
  • Removed &resource pointer patterns (resources are now copyable value types)
  • Removed spurious mr arg from select_k calls (previously compiled due to implicit pointer→bool conversion)
  • C API pool resource management rewritten without owning_wrapper

Adapt cuVS to RMM breaking changes: removal of device_memory_resource
base class, de-templated resource/adaptor types, new per-device resource
ref APIs, and CCCL resource concept requirements.

Key changes:
- get_workspace_resource() -> get_workspace_resource_ref() (44 sites)
- get_large_workspace_resource() -> get_large_workspace_resource_ref() (21 sites)
- get_current_device_resource() -> get_current_device_resource_ref()
- device_memory_resource* params -> device_async_resource_ref
- Remove &resource pointer patterns (resources are now value types)
- Migrate cuda_huge_page_resource to CCCL concept
- De-template pool_memory_resource, failure_callback_resource_adaptor
- Rewrite C API pool resource management without owning_wrapper
- Remove deleted rmm/mr/device_memory_resource.hpp includes

copy-pr-bot Bot commented Apr 3, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@bdice bdice force-pushed the rmm-cccl-migration branch 4 times, most recently from 942f4e8 to 4c59082 on April 17, 2026 07:09
@bdice bdice added the breaking (Introduces a breaking change) and improvement (Improves an existing functionality) labels Apr 17, 2026
@bdice bdice force-pushed the rmm-cccl-migration branch from 4c59082 to ae4f344 on April 17, 2026 07:29
@bdice bdice changed the title from "Migrate RMM usage to CCCL memory resource design" to "Migrate RMM usage to CCCL MR design" Apr 17, 2026
@bdice bdice force-pushed the rmm-cccl-migration branch 5 times, most recently from 4df8847 to e45a83e on April 18, 2026 15:52
Contributor Author

bdice commented Apr 19, 2026

/ok to test

Comment thread c/src/core/c_api.cpp Outdated
@@ -17,37 +19,25 @@

namespace raft::mr {
Contributor Author

@bdice bdice Apr 19, 2026

Why is cuVS defining classes in the raft:: namespace? (This is not in scope for this PR.)

Comment thread cpp/bench/ann/src/common/cuda_huge_page_resource.hpp Outdated
Comment thread cpp/bench/ann/src/cuvs/cuvs_ann_bench_utils.h Outdated
    indices_topk + offset * k,
-   type != cuvs::distance::DistanceType::InnerProduct,
-   mr);
+   type != cuvs::distance::DistanceType::InnerProduct);
Contributor Author

Need to check if this change is intentional or a mistake.

Contributor Author

Seems like this was changed in rapidsai/raft#1786?

Contributor Author

Maybe this was a dead code path. I can't see how it would've been building successfully prior to this migration otherwise.

    indices_topk + offset * k,
-   cuvs::distance::is_min_close(type),
-   mr);
+   cuvs::distance::is_min_close(type));
Contributor Author

Need to check if this API change was intentional.

Contributor Author

Seems like this was changed in rapidsai/raft#1786?

Contributor Author

Maybe this was a dead code path. I can't see how it would've been building successfully prior to this migration otherwise.

@bdice bdice force-pushed the rmm-cccl-migration branch from 1428e7c to 423fa1c on April 19, 2026 15:54
@bdice bdice force-pushed the rmm-cccl-migration branch from 423fa1c to 8f50dc0 on April 19, 2026 16:27
@bdice bdice marked this pull request as ready for review April 19, 2026 16:47
@bdice bdice requested review from a team as code owners April 19, 2026 16:47
@bdice bdice requested a review from AyodeAwe April 19, 2026 16:47
Add explicit operator!= (required when _CCCL_HAS_CONCEPTS is 0, e.g. nvcc
C++17 mode, since operator!= is not auto-synthesized from operator==).
Use constexpr friend for get_property to match the RMM resource pattern.
Add synchronous_resource and resource static_asserts for finer diagnostics.
@bdice bdice removed the request for review from a team April 21, 2026 14:27
Contributor Author

bdice commented Apr 21, 2026

Requesting admin-merge. Most CI is passing; the one failure seems like a fluke. I am not going to wait for the slow L4 C++ job, since all other jobs have passed and this is blocking builds for cuML and cuGraph.

@gforsyth gforsyth merged commit c999208 into rapidsai:main Apr 21, 2026
77 of 78 checks passed
rapids-bot Bot pushed a commit to rapidsai/cuml that referenced this pull request Apr 21, 2026
## Summary
- Migrate all raw RMM `allocate`/`deallocate` calls to the new CCCL 3-argument API that requires explicit alignment
- Replace removed `rmm.librmm.per_device_resource` Cython import with `rmm.pylibrmm.memory_resource` and use `make_any_device_resource` to obtain the resource for `device_buffer` construction

Depends on rapidsai/rmm#2361.
Depends on rapidsai/ucxx#636.
Depends on rapidsai/raft#2996.
Depends on rapidsai/cuvs#1990.

## Changes
- **`cpp/src/genetic/genetic.cu`**: Add explicit `alignof(node)` / `alignof(program)` to all `allocate` and `deallocate` calls in `parallel_evolve` and `symFit`; fix deallocation bug in `parallel_evolve` where `h_nextprogs[i].len` was incorrectly used instead of `tmp.len` to compute the buffer size being freed
- **`cpp/examples/symreg/symreg_example.cpp`**: Use `params.population_size * sizeof(cg::program)` and `alignof(cg::program)` for `allocate`/`deallocate` calls, fixing incorrect byte-size computation; remove unused `<rmm/aligned.hpp>` include
- **`cpp/tests/sg/genetic/evolution_test.cu`**: Add alignment arguments to allocate/deallocate in `SymReg` test
- **`cpp/tests/sg/genetic/program_test.cu`**: Add alignment arguments to `SetUp`/`TearDown` allocate/deallocate calls
- **`python/cuml/cuml/manifold/umap/umap.pyx`**: Replace `get_current_device_resource()` with `make_any_device_resource(get_current_device_resource().get_mr())` for `device_buffer` construction

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Simon Adorf (https://github.com/csadorf)
  - Divye Gala (https://github.com/divyegala)
  - Victor Lafargue (https://github.com/viclafargue)

URL: #7951
rapids-bot Bot pushed a commit to rapidsai/cugraph that referenced this pull request Apr 23, 2026
## Summary

- Replace removed `rmm::mr::device_memory_resource` base class, `owning_wrapper`, `shared_ptr`-based resource management, and deprecated per-device resource APIs with CCCL-native memory resource types
- Use `cuda::mr::any_resource<cuda::mr::device_accessible>` for owning type-erased storage, `rmm::device_async_resource_ref` for non-owning references, and value-typed resources (`cuda_memory_resource`, `pinned_host_memory_resource`)
- Pass the memory resource to `raft::handle_t` as the `workspace_resource` (3rd) constructor argument, matching the new raft API (`stream_view`, `stream_pool`, `std::optional<raft::mr::device_resource>`)

Depends on rapidsai/rmm#2361.
Depends on rapidsai/ucxx#636.
Depends on rapidsai/raft#2996.
Depends on rapidsai/cuvs#1990.

## Files changed

**Headers:**
- `algorithms.hpp`, `dendrogram.hpp`, `legacy/graph.hpp`, `legacy/functions.hpp`: `get_current_device_resource()` → `get_current_device_resource_ref()` in default argument expressions
- `host_staging_buffer_manager.hpp`: Remove `owning_wrapper`, store `pool_memory_resource` by value in a `std::optional`, accept `pinned_host_memory_resource` by value in `init()`
- `large_buffer_manager.hpp`: Store `pinned_host_memory_resource` by value (not `shared_ptr`), return `device_async_resource_ref` from `get()`, `std::move` the resource into storage
- `mtmg/resource_manager.hpp`: Use `cuda::mr::any_resource<device_accessible>` instead of `shared_ptr<device_memory_resource>` for `per_device_rmm_resources_`, use non-deprecated `set_per_device_resource`, pass resource as `workspace_resource` to `raft::handle_t`

**Tests:**
- `base_fixture.hpp`: Return `any_resource<device_accessible>` from `create_memory_resource()`, use value-typed MR factory helpers (`make_cuda`, `make_managed`, `make_pool`, `make_binning`), switch to non-deprecated `set_current_device_resource` / `get_current_device_resource_ref`
- `multi_node_threaded_test.cpp`: Switch to non-deprecated `set_current_device_resource(resource)`
- `mg_graph500_bfs_test.cu`, `mg_graph500_sssp_test.cu`: Store `pinned_mr_` as `optional<pinned_host_memory_resource>` by value, prefer `.value()` over `operator*` for optional access

**Examples:**
- All 4 example files (`sg_graph_algorithms.cpp`, `mg_graph_algorithms.cpp`, `vertex_and_edge_partition.cu`, `graph_operations.cu`): Use value-typed `cuda_memory_resource`, non-deprecated `set_current_device_resource`, pass the resource to `raft::handle_t` as the `workspace_resource` (3rd positional arg, with `nullptr` for the unused `stream_pool`)

Authors:
  - Bradley Dice (https://github.com/bdice)
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #5483

Labels

breaking (Introduces a breaking change), improvement (Improves an existing functionality)

Projects

Status: Done

4 participants