[kernels] Migrate comm/ from LegacyUnsafePointer to UnsafePointer #5690

Closed
KrxGu wants to merge 4 commits into modular:main from KrxGu:krxgu/migrate-comm-shmem-unsafe-pointer-part1

@KrxGu
Contributor

@KrxGu KrxGu commented Dec 19, 2025

Fixes #5673

Summary

Migrates the comm module and its dependents from LegacyUnsafePointer to the new UnsafePointer API with explicit mutability and origin parameters.

Part 1 of #5673.

Changes

comm/ module:

  • sync.mojo: Updated imports, replaced LegacyOpaquePointer with MutOpaquePointer[MutOrigin.external] for FFI callbacks
  • allgather.mojo: Added explicit mut and MutAnyOrigin to all pointer signatures
  • allreduce.mojo: Migrated all pointer usages to new API

Dependent files:

  • linalg/distributed_matmul.mojo: Updated import and rank_sigs signatures
  • Mogg/MOGGKernelAPI/MOGGKernelAPI.mojo: Added RealUnsafePointer alias for comm API calls while preserving LegacyUnsafePointer for other usages
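
As a rough illustration of the signature style these changes move toward, here is a minimal sketch. It is not taken from the PR diff: copy_elements is a hypothetical helper, and the sketch assumes the new UnsafePointer takes the element type followed by an origin (MutAnyOrigin or ImmutAnyOrigin), with mutability inferred from that origin.

```mojo
# Hypothetical helper for illustration only; not part of the comm/ module.
# Assumption: the new UnsafePointer is parameterized as
# UnsafePointer[ElementType, Origin], and mutability follows from the origin.
fn copy_elements(
    dst: UnsafePointer[Float32, MutAnyOrigin],    # destination buffer, written to
    src: UnsafePointer[Float32, ImmutAnyOrigin],  # source buffer, read only
    count: Int,
):
    # Copy `count` elements from the read-only source into the destination.
    for i in range(count):
        dst[i] = src[i]
```

The point of the sketch is that the read-only versus writable intent of each buffer is visible in the signature itself, which the legacy pointer type did not express.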

Testing

Builds successfully:

  • ./bazelw build //max/kernels/src/comm:comm
  • ./bazelw build //max/kernels/src/linalg:linalg
  • ./bazelw build //max/kernels/src/Mogg/MOGGKernelAPI:MOGGKernelAPI

Next Steps

Follow-up PRs will migrate shmem/ and comm/vendor/ccl.mojo.

Copilot AI review requested due to automatic review settings December 19, 2025 21:09
@KrxGu KrxGu requested a review from a team as a code owner December 19, 2025 21:09
Contributor

Copilot AI left a comment

Pull request overview

This PR migrates the comm/ directory from the legacy LegacyUnsafePointer API to the new UnsafePointer API with explicit mutability and origin parameters. The changes focus on three core communication kernel files for multi-GPU operations.

Key Changes:

  • Replaced LegacyUnsafePointer with UnsafePointer and added explicit mut=True/False parameters throughout
  • Replaced LegacyOpaquePointer with MutOpaquePointer[MutOrigin.external] for FFI callback boundaries
  • Removed the temporary UnsafePointerV2 workaround and consolidated to the new UnsafePointer API

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File descriptions:

  • max/kernels/src/comm/sync.mojo: Updated imports, migrated OpaquePointer types for FFI callbacks, added explicit mutability to Signal pointer types in barrier synchronization functions
  • max/kernels/src/comm/allreduce.mojo: Migrated pointer types across all reduction kernels and helper functions, added explicit mutability for source (mut=False) and destination (mut=True) buffers, updated signal pointer types
  • max/kernels/src/comm/allgather.mojo: Updated pointer types in allgather kernels, added explicit mutability for output (mut=True) and source (mut=False) buffers, migrated signal pointer types

Comment thread max/kernels/src/comm/allreduce.mojo
@KrxGu
Contributor Author

KrxGu commented Dec 19, 2025

@NathanSWard This PR (Part 1) is ready for review.

CI is all green, good to merge.

Contributor

@NathanSWard NathanSWard left a comment

Thanks for tackling this!
Left a handful of comments - mostly around when you do or don't need to manually specify an origin or mut parameter.
Hopefully it makes sense, but if not please ask questions and I'll do my best to help/explain!

Comment thread max/kernels/src/comm/allgather.mojo Outdated
Comment thread max/kernels/src/comm/allgather.mojo Outdated
Comment thread max/kernels/src/comm/allgather.mojo Outdated
Comment thread max/kernels/src/comm/allreduce.mojo Outdated
Comment thread max/kernels/src/comm/allreduce.mojo Outdated
Comment thread max/kernels/src/comm/allreduce.mojo Outdated
Comment thread max/kernels/src/comm/allreduce.mojo
Comment thread max/kernels/src/comm/sync.mojo Outdated
Comment thread max/kernels/src/comm/sync.mojo
@NathanSWard NathanSWard self-assigned this Dec 19, 2025
@KrxGu
Contributor Author

KrxGu commented Dec 20, 2025

@NathanSWard I have addressed all your review comments and made the necessary changes.

CI is all green, good to merge!

@NathanSWard
Contributor

NathanSWard commented Dec 22, 2025

@KrxGu - thanks! I'll take a look in the next few days! Modular is on a holiday break but I'll get this in soon :)

@KrxGu
Contributor Author

KrxGu commented Dec 22, 2025

> @KrxGu - thanks! I'll take a look in the next few days! Modular is on a holiday break but I'll get this in soon :)

No worries, I can wait 😄. Thank you for getting back!
Happy Holidays ✨

@NathanSWard
Contributor

@KrxGu - now that we're back from holidays I'm re-prioritizing getting your PRs merged in!
We recently renamed the open-source directory to oss. Would you be able to rebase your changes onto the current main? I'll start reviewing here today :)

Contributor

@NathanSWard NathanSWard left a comment

After this gets rebased and the last two comments are addressed, this looks great and I'll merge it in :D

Comment thread max/kernels/src/comm/allreduce.mojo
Comment thread max/kernels/src/comm/allreduce.mojo
KrxGu added 4 commits January 6, 2026 22:23
Replace LegacyUnsafePointer with the new UnsafePointer API in sync.mojo,
allgather.mojo, and allreduce.mojo.

Key changes:
- Use explicit mut=True/False and origin parameters
- Update OpaquePointer to MutOpaquePointer for FFI callbacks
- MutAnyOrigin for pointer types that need to work across origins

Part of modular#5673

- Add explicit mut=True/False to pointer signatures in allreduce.mojo
- Migrate distributed_matmul.mojo from LegacyUnsafePointer to UnsafePointer
- Update all rank_sigs type signatures to use new pointer API

Use RealUnsafePointer alias for rank_sigs in allreduce, allgather,
and matmul_allreduce calls while keeping LegacyUnsafePointer for
existing usages in the file.

- Remove unnecessary imports (types are in prelude)
- Use ImmutAnyOrigin for immutable pointers (mut=False)
- Remove mut= parameter when origin is explicit
- Keep origins explicit for GPU kernel parameters (enqueue_function_checked)
- Simplify pointer type signatures across all files
@KrxGu KrxGu force-pushed the krxgu/migrate-comm-shmem-unsafe-pointer-part1 branch from cc579c4 to 70a7bae January 6, 2026 17:04
@KrxGu
Contributor Author

KrxGu commented Jan 6, 2026

The CI failures in MOGGKernelAPI.mojo are pre-existing issues on main after the rebase, not caused by our changes. Our PR only touches the comm/ module migration (sync.mojo, allgather.mojo, allreduce.mojo) and the dependent files (distributed_matmul.mojo and minimal changes to MOGGKernelAPI.mojo for comm API compatibility). The comm module builds successfully. The MOGGKernelAPI errors at lines 6601, 7001, 7061, 9385, 9480 are unrelated to our UnsafePointer migration work.

@NathanSWard Kindly confirm that this is the case; if not, I am ready to make any changes you suggest.

@NathanSWard
Contributor

Correct @KrxGu - these issues are orthogonal to your changes. You can ignore them!

@KrxGu
Contributor Author

KrxGu commented Jan 7, 2026

> Correct @KrxGu - these issues are orthogonal to your changes. You can ignore them!

Gotcha, so now this PR is Good To Merge!!

@NathanSWard
Contributor

!sync

@modularbot modularbot added the imported-internally Signals that a given pull request has been imported internally. label Jan 7, 2026
@modularbot
Collaborator

✅🟣 This contribution has been merged 🟣✅

Your pull request has been merged to the internal upstream Mojo sources. It will be reflected here in the Mojo repository on the main branch during the next Mojo nightly release, typically within the next 24-48 hours.

We use Copybara to merge external contributions.

@modularbot modularbot added merged-internally Indicates that this pull request has been merged internally merged-externally Merged externally in public mojo repo labels Jan 7, 2026
@modularbot
Collaborator

Landed in 4f80969! Thank you for your contribution 🎉

Labels

  • imported-internally: Signals that a given pull request has been imported internally.
  • merged-externally: Merged externally in public mojo repo
  • merged-internally: Indicates that this pull request has been merged internally

Development

Successfully merging this pull request may close these issues.

[Feature Request] Migrate kernels/src/<comm/shmem> off of LegacyUnsafePointer

4 participants