What's Changed
🚨 Breaking Changes
- Simplify shuffle completion tracking by @wence- in #914
- Simplify async shuffle completion notification by @wence- in #916
- Implement streaming allreduce and use it in streaming bloom filter construction by @wence- in #928
- Switch to `std::string_view` where possible in `Backend` by @pentschev in #951
- Split out utilities in allgather and simplify completion tracking by @wence- in #957
- Execute PyActors in asyncio.TaskGroup by @mroeschke in #943
- Make BufferResource parameters required instead of optional by @vyasr in #961
- Replace `cudaMemcpyAsync` with `rapidsmpf::cuda_memcpy_async` by @nirandaperera in #965
- Report failures programmatically in `rrun.bind()` by @pentschev in #971
- Predefine statistics formatters by @madsbk in #976
- Migrate RMM usage to CCCL MR design by @bdice in #940
- Add `serialize`/`deserialize` to Python `Statistics` by @madsbk in #980
- Change cuCascade GIT_TAG to main by @nirandaperera in #988
- Refactor memory resource ownership and type safety by @nirandaperera in #985
- Remove memory resources from Statistics by @nirandaperera in #1003
- Disable pinned memory by default by @nirandaperera in #1020
- Remove py_executor from run_actor_network by @wence- in #1023
- Remove Dask integration by @madsbk in #1049
🐛 Bug Fixes
- [MINOR] Fixing weakref on None by @nirandaperera in #917
- Fix UCXX import for pip installations by @madsbk in #921
- Fix allgather op_id reuse better by @wence- in #918
- Support reuse of op_ids in shuffles by @wence- in #927
- Fix Lineariser::drain() deadlock by @Matt711 in #932
- Avoid defining awaitable in `PyActor.__init__` by @mroeschke in #941
- Add individual timeout for each of ctest nranks run by @pentschev in #945
- Move away from deprecated `ucxx::Address::getString()` by @pentschev in #953
- Forward channel metadata through bloom filter apply by @wence- in #952
- Increase timeout for test_bootstrap_multiple_clients by @pentschev in #962
- Use non-deprecated `cudf::filtered_join` constructor by @pentschev in #973
- Fix `ReceivedChunks::spill()` throwing on control messages by @Matt711 in #975
- More BufferResource lifetime fixes by @vyasr in #970
- Fix `ReceivedChunks::spill()` over-spilling past the amount target by @Matt711 in #984
- Suppress warnings from cuco headers, do not export CMake target by @nirandaperera in #990
- Remove unnecessary `$<BUILD_INTERFACE:...>` from CMake targets by @pentschev in #1001
- Fix `PostBox::spill()` passing stale amount to `std::ranges::lower_bound` by @Matt711 in #983
- Fix librapidsmpf wheel builds with current cuDF packages by @pentschev in #1022
- Remove C language from librapidsmpf wheel builder by @pentschev in #1024
- Skip stack trace if gdb is unavailable by @pentschev in #1004
- Fix large-message initialization in communication benchmark by @pentschev in #1016
- Use new cuCascade::cucascade_topology_discovery target by @KyleFromNVIDIA in #1030
- Remove unnecessary include from `slurm_backend.cpp` by @pentschev in #1034
- Fix socket backend test write handling by @bdice in #1033
- More follow-ups to recent cucascade fixes by @KyleFromNVIDIA in #1035
- Remove CUDA includes from bootstrap sources by @bdice in #1044
- Pin cuCascade for 26.06 release by @pentschev in #1057
📖 Documentation
- Fix formatting and language issues in docs by @pentschev in #929
🚀 New Features
- Move resource binding to callable `bind()` function by @pentschev in #949
- Python bindings for `rrun.bind()` by @pentschev in #950
- Add functions for resource binding verification to `librrun` by @pentschev in #972
- Add skill to reproduce CI locally by @pentschev in #991
- Enable Ray for aarch64 with conda by @pentschev in #1002
- Add `TableChunk.into_packed_data` by @Matt711 in #986
- Add socket-based bootstrap coordination backend for `rrun` by @pentschev in #1000
- [C++] Add OrderScheme partitioning metadata by @rjzamora in #993
- Add hybrid Slurm support to rrun with PMIx-based coordination by @pentschev in #844
- [Python] Add `OrderScheme` partitioning metadata by @rjzamora in #853
- Add support for Ray in Python 3.14 and wheels tests by @pentschev in #1015
🛠️ Improvements
- Avoid `fork` method in dask tests by @TomAugspurger in #901
- Fix compiler warnings for deprecated APIs by @Matt711 in #903
- Forward-merge release/26.04 into main by @jameslamb in #923
- [MINOR] Adding python bindings for `reserve_or_fail` by @nirandaperera in #925
- Simplify Shuffler postboxes by @wence- in #920
- Dask cluster bootstrap with options and enable pinned memory by default by @nirandaperera in #926
- Limit pinned reservations by @nirandaperera in #935
- update pip devcontainers' base image tags by @trxcllnt in #937
- [MINOR] Improve error message for duplicate option keys by @nirandaperera in #938
- Pre-distribute listener addresses during UCXX `barrier()` by @pentschev in #934
- Remove deprecated benchmark::internal::Benchmark usage by @bdice in #939
- [MINOR] Verbose nvtx traces by @nirandaperera in #936
- Report pinned memory resource profile stats by @nirandaperera in #948
- Update to clang 20.1.8 by @bdice in #956
- Add BufferResource lifetime tracking to Python wrapper classes by @vyasr in #960
- Gather statistics by @madsbk in #958
- Integrate `MetadataPayloadExchange` into `Shuffler` by @pentschev in #649
- [MINOR] Use cudf::packed_size to estimate the memory usage of a table. by @nirandaperera in #967
- Migrate from pynvml to cuda.core.system by @mdboom in #969
- Add `to_dict()` to Python `Statistics` by @madsbk in #989
- Implement a sparse alltoall exchange pattern by @wence- in #959
- Harden FileBackend key validation and temp file creation by @pentschev in #992
- fix(ci): resolve all zizmor findings and add zizmor pre-commit checks by @gforsyth in #1007
- Add sql text validation to ndsh.py by @TomAugspurger in #1006
- Use `token.rapids.nvidia.com` when issuing S3 bucket creds in devcontainers by @trxcllnt in #1005
- Topology-aware default pool sizing for `PinnedMemoryResource` by @nirandaperera in #1012
- Simplify `HostBuffer` memory ownership by @nirandaperera in #1013
- Build and test with CUDA 13.2.0 by @bdice in #1021
- Use static linkage for CUDA runtime by @bdice in #725
- skip CuPy 14.1.0 by @jameslamb in #1068
New Contributors
Full Changelog: v26.06.00a...v26.06.00