Cherry-pick openxla/xla#44389: [ROCm] Enable float and buffer checker #989

Closed
magaonka-amd wants to merge 1 commit into rocm-jaxlib-v0.10.2 from cherrypick-44389-to-v0.10.2
@magaonka-amd

Motivation

Requested in #985 review: bring openxla#44389 onto rocm-jaxlib-v0.10.2. The upstream PR is approved but not yet copybara-merged, so this is cherry-picked from the PR head commit 9b7baa6dd5 (ROCm:draganm/float_checker) rather than a main commit.

Commit (cherry-picked with -x)

  • 9b7baa6dd5 — [ROCm] Enable float and buffer checker

Faithfulness note

Cherry-picked onto the pinned base 5a9e73cb. Result matches the upstream PR exactly (17 files, +963/-613). The only manual resolution was xla/stream_executor/rocm/BUILD: the 3-way merge dragged in unrelated newer-main targets (rocm_memory_reservation, rocm_vmm_allocator, rocm_device_address_vmm_allocator) that are not part of this PR and whose sources don't exist on this base. Those were discarded; only PR openxla#44389's two hunks were applied (the two :buffer_debug_*_kernel_rocm deps + the two new rocm_library targets). All referenced targets (gpu_kernel_registry, gpu:buffer_debug_*_kernel, runtime:buffer_debug_log_structs) resolve; no duplicate targets.
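The "no duplicate targets" claim can be spot-checked mechanically. The following is a minimal sketch, not part of this PR: it scans BUILD text for `name = "..."` attributes with a regex heuristic (not a real Starlark parser) and reports any name declared more than once. The helper `duplicate_targets` and the `SAMPLE` fragment are hypothetical; the target names are inferred from the new ROCm source file names and may not match the actual BUILD file.

```python
import re
from collections import Counter

# Matches the `name = "..."` attribute of a Bazel rule at the start of a line.
# Deliberately simple heuristic; it does not parse Starlark.
NAME_ATTR = re.compile(r'^\s*name\s*=\s*"([^"]+)"', re.MULTILINE)


def duplicate_targets(build_text: str) -> list[str]:
    """Return target names declared more than once, sorted alphabetically."""
    counts = Counter(NAME_ATTR.findall(build_text))
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical BUILD fragment for illustration only (names are assumptions).
SAMPLE = '''
rocm_library(
    name = "buffer_debug_float_check_kernel_rocm",
    srcs = ["buffer_debug_float_check_kernel_rocm.cu.cc"],
)

rocm_library(
    name = "buffer_debug_xor_checksum_kernel_rocm",
    srcs = ["buffer_debug_xor_checksum_kernel_rocm.cu.cc"],
)
'''

if __name__ == "__main__":
    # An empty list means no target name appears twice in the fragment.
    print(duplicate_targets(SAMPLE))
```

In practice the input would be the contents of xla/stream_executor/rocm/BUILD after conflict resolution. A Bazel query or build remains the authoritative check, since it also confirms that every referenced dep resolves.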

Files changed (highlights)

  • New: xla/stream_executor/gpu/buffer_debug_{float_check,xor_checksum}_kernel_lib.cu.h
  • New: xla/stream_executor/rocm/buffer_debug_{float_check,xor_checksum}_kernel_rocm.cu.cc
  • CUDA kernels refactored to share the gpu/ libs; ROCm BUILD + rocm_executor.cc wired up.
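For readers unfamiliar with the two checkers, here is a host-side sketch in Python of what such checks typically compute. It is an illustration under stated assumptions, not the kernel logic from this PR: it assumes the checksum XORs the buffer as little-endian 32-bit words with a zero-padded tail, and that the float check counts NaN and infinite values. The function names `xor_checksum_u32` and `float_check` are hypothetical.

```python
import math
import struct


def xor_checksum_u32(data: bytes) -> int:
    """XOR all little-endian 32-bit words of `data` (assumed word size)."""
    # Zero-pad so the length is a multiple of 4; XOR with zero is a no-op.
    padded = data + b"\x00" * (-len(data) % 4)
    checksum = 0
    for (word,) in struct.iter_unpack("<I", padded):
        checksum ^= word
    return checksum


def float_check(values):
    """Return (nan_count, inf_count) for an iterable of floats, in one pass."""
    nan_count = 0
    inf_count = 0
    for v in values:
        if math.isnan(v):
            nan_count += 1
        elif math.isinf(v):
            inf_count += 1
    return nan_count, inf_count


if __name__ == "__main__":
    buf = struct.pack("<II", 0x0F0F0F0F, 0xF0F0F0F0)
    print(hex(xor_checksum_u32(buf)))
    print(float_check([1.0, float("nan"), float("inf"), -float("inf")]))
```

A reference like this is mainly useful for comparing a device-side result against a value computed on the host for the same buffer contents.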

Test Plan

  • ROCm jaxlib build on rocm-jaxlib-v0.10.2; release-validation CI.

@magaonka-amd

Author

Superseded by #993, which combines all four ROCm 0.10.2 cherry-pick PRs into a single PR against rocm-jaxlib-v0.10.2.
