
[pull] master from tensorflow:master #775

Merged
pull[bot] merged 14 commits into barkpixels:master from tensorflow:master
Nov 5, 2025

Conversation


@pull pull bot commented Nov 5, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)


shawnwang18 and others added 14 commits November 5, 2025 01:50
Dump command buffer contents to folder specified by --xla-dump-to through dump.h

Imported from GitHub PR openxla/xla#33505

📝 Summary of Changes
This PR routes the dumping of command buffer contents through dump.h.

Copybara import of the project:

--
0b870f229a40435766613743d47f274e06af4763 by Shawn Wang <shawnw@nvidia.com>:

Dump command buffer contents to folder specified by --xla-dump-to through dump.h

--
123c2541701f4d65771b522fbb68e1e0edb7a6e4 by Shawn Wang <shawnw@nvidia.com>:

clang format fix

Merging this change closes #33505

PiperOrigin-RevId: 828357446
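For context, XLA's debug dumps are usually directed at a folder via the user-facing flag `--xla_dump_to`, passed through the `XLA_FLAGS` environment variable. A minimal sketch of how a user would point dumps (now including command buffer contents) at a directory; the directory path here is illustrative:

```python
import os

# Point XLA's debug dumps at a directory. XLA_FLAGS must be set before the
# XLA-backed framework (e.g. JAX or TensorFlow) initializes its backend.
dump_dir = "/tmp/xla_dump"
os.environ["XLA_FLAGS"] = f"--xla_dump_to={dump_dir}"

print(os.environ["XLA_FLAGS"])
```

After a program runs with this set, the dump directory contains the per-module debug artifacts.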
Imported from GitHub PR openxla/xla#32812

Co-author: @kxxt

📝 Summary of Changes:

This pull request adds support for RISC-V 64 architecture across the build system, code generation, and Python packaging infrastructure.

🎯 Justification:

The changes ensure that riscv64 is recognized as a valid target in Bazel build configurations, LLVM toolchain selection, Python manylinux compliance checks, and related tests and patches. This allows the project to build and test components for riscv64 alongside other supported architectures.

🚀 Kind of Contribution: ✨ New Feature

Copybara import of the project:

--
0d02393a6335fb43d67678d0cd15d671e77dc089 by gns <root@infi.wang>:

[XLA:CPU] Add support for riscv64

Co-authored-by: Levi Zim <rsworktech@outlook.com>

--
5d95fb479e45524299ff4193b99bb4db0d74483b by gns <root@infi.wang>:

Refresh `rules_python` riscv64 patch

Co-authored-by: Levi Zim <rsworktech@outlook.com>

Merging this change closes #32812

PiperOrigin-RevId: 828379922
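The essence of the change is that `riscv64` must be treated as a first-class entry in the various architecture allowlists (Bazel configs, LLVM toolchain selection, manylinux compliance checks). A hedged Python sketch of that kind of check; the set below is illustrative, not the project's actual list:

```python
# Illustrative architecture allowlist, as extended by this PR: "riscv64"
# now sits alongside the other supported architectures.
SUPPORTED_ARCHES = {"x86_64", "aarch64", "ppc64le", "s390x", "riscv64"}

def is_supported(arch: str) -> bool:
    """Return True if the given machine architecture is a valid build target."""
    return arch in SUPPORTED_ARCHES

print(is_supported("riscv64"))
print(is_supported("mips64"))
```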
The previous logic emitted code specific to Triton; the CPU -> vector lowering works directly, so this code is moved to the Triton-specific lowering.

It also means that we add another op that supports 0D tensors.

PiperOrigin-RevId: 828381998
The kernel could be unstable and produce NaN values.
To detect such cases, set the flag xla_gpu_experimental_enable_nan_counter_on_thunks to true.
The NaN counter will then be active, and if any NaN values
are found, the execution will crash.

PiperOrigin-RevId: 828389430
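A hedged sketch of the described opt-in behavior: count NaNs in a kernel's output and abort if any are found while the flag is set. The function and flag names below are illustrative stand-ins, not XLA's actual implementation:

```python
import math

# Illustrative stand-in for the opt-in debug flag described above.
xla_gpu_experimental_enable_nan_counter_on_thunks = True

def check_output(values):
    """Count NaNs in a result buffer and crash if the flag is enabled."""
    nan_count = sum(1 for v in values if math.isnan(v))
    if xla_gpu_experimental_enable_nan_counter_on_thunks and nan_count > 0:
        raise RuntimeError(f"NaN counter triggered: {nan_count} NaN value(s)")
    return values

print(len(check_output([1.0, 2.0, 3.0])))
try:
    check_output([1.0, float("nan")])
except RuntimeError as e:
    print("crashed:", e)
```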
…roadcast and dot helpers.

PiperOrigin-RevId: 828398536
We can't easily depend on SYCL, so for now let's disable all SYCL specific targets in the internal build.

PiperOrigin-RevId: 828414906
Use human-readable units in "Out of memory" errors (e.g. 7.34GB)

Imported from GitHub PR openxla/xla#33504

  📝 Summary of Changes

  Improved memory allocation error messages in tf_allocator_adapter.cc to display byte counts with comma separators
  and human-readable units.

  Before:
  Out of memory while trying to allocate 7450374152 bytes.

  After:
  Out of memory while trying to allocate 7,450,374,152 bytes (6.94GiB).

  Changes:
  - Added FormatByteSize() helper function that formats byte counts with commas for readability
  - Leverages existing tsl::strings::HumanReadableNumBytes() utility to append human-readable size (MiB/GiB/TiB)
  - Updated MemoryAllocationError() to use the new formatting

  🎯 Justification

  When debugging out-of-memory errors, users need to quickly understand allocation sizes. Without formatting, it's
  difficult to distinguish between millions and billions at a glance (e.g., is 7450374152 closer to 7 million or 7
  billion?).

  This change improves the developer experience by:
  1. Making large numbers easier to parse with comma separators
  2. Providing immediate intuition about size magnitude (MB vs GB vs TB)
  3. Maintaining exact byte precision for detailed debugging

  This benefits all workloads that encounter memory allocation failures, making error messages more actionable.

  🚀 Kind of Contribution

  ♻️ Cleanup

  🧪 Unit Tests

  No new unit tests added. This change only affects error message formatting and does not alter program logic or
  behavior. The existing allocation failure paths remain unchanged.

  🧪 Execution Tests

  No new execution tests needed. This is a cosmetic improvement to error messages that does not affect execution
  correctness or trigger any new code paths.
Copybara import of the project:

--
f738426970e6f067df3dde2f93bd2736294e7e5d by Ram Rachum <ram@rachum.com>:

Use human-readable units in "Out of memory" errors (e.g. 7.34GB)

Merging this change closes #33504

PiperOrigin-RevId: 828416974
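The formatting described above can be sketched as follows. This is a Python stand-in, not the actual C++ change (which lives in tf_allocator_adapter.cc and reuses `tsl::strings::HumanReadableNumBytes`):

```python
def format_byte_size(n: int) -> str:
    """Format a byte count with comma separators plus a human-readable unit."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    size = float(n)
    for unit in units:
        if size < 1024.0 or unit == units[-1]:
            human = f"{size:.2f}{unit}"
            break
        size /= 1024.0
    return f"{n:,} bytes ({human})"

# Reproduces the PR's before/after example value.
print(format_byte_size(7450374152))
```

Applied to the example above, 7450374152 bytes renders as `7,450,374,152 bytes (6.94GiB)`.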
Also adds a platform_name parameter to the `DeserializeThunkProto` function, since we need it to deserialize this thunk (and it doesn't make sense to store it in the proto).

PiperOrigin-RevId: 828417473
…xing and in the gather emitter

PiperOrigin-RevId: 828439076
Also removes an unused import in `gpu_executable.proto`

PiperOrigin-RevId: 828445423
Imported from GitHub PR openxla/xla#33212

This PR enables the HloEvaluator to handle complex numbers in more operations (trigonometric and hyperbolic). Tests are added to check if folding works for these operations.

It also disables `constant_folding` in the `complex_unary_op_test.cc` test, as that test was intended to check the accuracy of backend implementations. If constant folding remains enabled, the test ends up checking the accuracy of the `libstdc++` implementations of `tan`, `asin`, and `asinh` instead.
Copybara import of the project:

--
af6ae1d31b7d6bab2401a99dc397945c824e8631 by Aleksei Nurmukhametov <anurmukh@amd.com>:

Do not run constant_folding on complex_unary_op_test.cc

--
d3074cc139f07ef6e91723f6290b8294dcc38717 by Aleksei Nurmukhametov <anurmukh@amd.com>:

Enable HloEvaluator for more complex ops

Merging this change closes #33212

PiperOrigin-RevId: 828446740
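A hedged illustration of what constant folding of complex trigonometric and hyperbolic ops effectively computes; Python's `cmath` stands in for the evaluator's own complex math here:

```python
import cmath

# A complex constant operand, folded through trig/hyperbolic ops at
# "compile time" in this stand-in.
z = 0.5 + 0.25j
folded = {
    "tan": cmath.tan(z),
    "asin": cmath.asin(z),
    "asinh": cmath.asinh(z),
}

# Round-trip sanity check: sin(asin(z)) should recover z (principal branch).
roundtrip = cmath.sin(folded["asin"])
print(abs(roundtrip - z) < 1e-12)
```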
…cast.

This will enable us to rewrite the lowering patterns at a higher level if needed and also allow us to reuse buffers. It will also make it possible to vectorize using target-length vectors, e.g. vector<8xf32>, rather than the current scheme of emitting super-vectors and relying on LLVM to split them.

PiperOrigin-RevId: 828448837
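The distinction above can be illustrated with a pure-Python stand-in: instead of one "super-vector" covering the whole row, data is processed in fixed target-length lanes (8 x f32, mirroring vector<8xf32>). The lane width and reduction below are illustrative only:

```python
# Process data in fixed 8-wide lanes rather than one whole-row super-vector.
LANES = 8

def lane_sums(data):
    """Reduce each contiguous 8-element lane to its sum."""
    assert len(data) % LANES == 0, "illustration assumes a multiple of the lane width"
    return [sum(data[i:i + LANES]) for i in range(0, len(data), LANES)]

print(lane_sums([1.0] * 16))
```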
@pull pull bot locked and limited conversation to collaborators Nov 5, 2025
@pull pull bot added the ⤵️ pull label Nov 5, 2025
@pull pull bot merged commit 1f95e41 into barkpixels:master Nov 5, 2025