[pull] master from tensorflow:master #62
Merged
pull[bot] merged 22 commits into noaai:master from tensorflow:master on Jan 15, 2025
…er (NaNs go last). PiperOrigin-RevId: 715826491
Moves
- byte_order.h
- crash_analysis.h
- dynamic_annotations.h
- grpc_credentials.h
- intrusive_ptr.h
- prefetch.h
- ram_file_system.h
- resource.h
- resource_loader.h
- rocm_rocdl_path.h
- stack_frame.h
PiperOrigin-RevId: 715828782
PiperOrigin-RevId: 715831514
This method was renamed but the staging function was kept; switch to the renamed variant. PiperOrigin-RevId: 715859237
Support Lock and Unlock, instantiate MLD cl environment as singleton instance. Added CompileModel CPU test with OpenCL Tensorbuffers as inputs and outputs. PiperOrigin-RevId: 715860165
PiperOrigin-RevId: 715863802
…saBufferInterval are inclusive. Update logging in MSA to indicate as much. PiperOrigin-RevId: 715882309
We had the GIL released when constructing an nb::bytes object, which isn't allowed. In passing, also avoid an unnecessary string copy. PiperOrigin-RevId: 715886008
…calizer`
`DynamicDimensionInference` expects all conditional inputs/outputs to be tuplized so that it can easily add more inputs and `RET_CHECK`-fails otherwise, but `ConditionalCanonicalizer` only canonicalizes the outputs. This CL changes the canonicalizer to tuplize the inputs of conditionals as well.
PiperOrigin-RevId: 715887964
…r::MemoryAllocators. PiperOrigin-RevId: 715890862
PiperOrigin-RevId: 715900904
…lder PiperOrigin-RevId: 715902371
Imported from GitHub PR openxla/xla#21273

`ncclCommInitRankScalable` enables the initialization of communicators via multiple roots which improves the init performance at large scale. The maximum number of ranks associated with a root rank to initialize a NCCL communicator can be tuned via `--xla_gpu_nccl_init_max_rank_per_root_ratio`. Default is 128 ranks per root.

Copybara import of the project:

-- 98ef02dabc0bcb2c8206753bec4873c5f48e269f by Nicolas Castet <ncastet@nvidia.com>:
[XLA:GPU] Add support for NCCL ncclCommInitRankScalable API

-- f146a48fef5f1a1098b5c01ae79c5a0d9a9af8d7 by Nicolas Castet <ncastet@nvidia.com>:
Address review comments

-- dd6362af36a1f4d22532ad15b2007527898b5fa1 by Nicolas Castet <ncastet@nvidia.com>:
Add GpuCliqueKey::GetSubKeys unit test

Merging this change closes #21273

PiperOrigin-RevId: 715903412
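To make the "128 ranks per root" default concrete, here is a minimal sketch of how a per-root rank limit could translate into a root count. It assumes the smallest number of roots that respects the limit is chosen (a ceiling division); the actual XLA logic may differ. `NumRootsForRanks` is a hypothetical helper, not an XLA or NCCL function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of XLA or NCCL): returns the smallest number
// of root ranks such that no root is associated with more than
// `max_ranks_per_root` ranks. The default of 128 mirrors the default stated
// for --xla_gpu_nccl_init_max_rank_per_root_ratio in the commit message.
inline int64_t NumRootsForRanks(int64_t num_ranks,
                                int64_t max_ranks_per_root = 128) {
  assert(num_ranks > 0 && max_ranks_per_root > 0);
  // Ceiling division: round up so a partial group still gets its own root.
  return (num_ranks + max_ranks_per_root - 1) / max_ranks_per_root;
}
```

Under this assumption, a clique of up to 128 ranks uses a single root, and 1024 ranks would use 8 roots.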
+ Correctly (zero/value-)initialize PJRT_ExecuteOptions in tests and pjrt_c_api_client

```
If the number of initializer clauses is less than the number of members or initializer list is completely empty, the remaining members are value-initialized
```

Context: openxla/xla#20429

PiperOrigin-RevId: 715906024
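The quoted aggregate-initialization rule can be illustrated with a small struct. `ExampleOptions` is a hypothetical stand-in with made-up fields; it is not the real `PJRT_ExecuteOptions` layout.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a C API options struct (fields are illustrative).
struct ExampleOptions {
  size_t struct_size;
  void* extension_start;
  int launch_id;
  const void* callbacks;
};

// Empty initializer list: every member is value-initialized (zero / nullptr)
// before the one field we care about is set explicitly.
inline ExampleOptions MakeZeroedOptions() {
  ExampleOptions options = {};
  options.struct_size = sizeof(ExampleOptions);
  return options;
}

// Fewer initializer clauses than members: only struct_size is given, and the
// remaining members are value-initialized.
inline ExampleOptions MakePartiallyInitialized() {
  ExampleOptions options = {sizeof(ExampleOptions)};
  return options;
}
```

By contrast, a plain `ExampleOptions options;` declared as a local variable leaves its members indeterminate, which is the kind of bug this initialization style avoids.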
…lex buffer api. PiperOrigin-RevId: 715918395
…nge the function name
MacOS mangling changes the function name; use a less strict contains check that works on all platforms. PiperOrigin-RevId: 715919685
…(dimensions whose size is 1).
It is meaningless to partition a dimension whose size is 1, and doing so may insert redundant padding and unpadding. To avoid this, we replicate the sharding on these dimensions as a pre-processing step.
Take the following input as an example:
```
ENTRY entry {
%constant.785 = f32[1,8] constant({{0,1,2,3,4,5,6,7}}), sharding={devices=[1,8]<=[8]}
%slice.62 = f32[1,1] slice(%constant.785), slice={[0:1], [0:1]}, sharding={devices=[1,8]<=[8]}
ROOT %reshape.779 = f32[] reshape(%slice.62), sharding={replicated}
}
```
Previous result with redundant instructions
```
ENTRY %entry_spmd () -> f32[] {
%constant.8 = u32[8]{0} constant({0, 1, 2, 3, 4, 5, 6, 7})
%partition-id = u32[] partition-id()
%dynamic-slice.3 = u32[1]{0} dynamic-slice(u32[8]{0} %constant.8, u32[] %partition-id), dynamic_slice_sizes={1}
%reshape.2 = u32[] reshape(u32[1]{0} %dynamic-slice.3)
%constant.9 = u32[] constant(0)
%compare = pred[] compare(u32[] %reshape.2, u32[] %constant.9), direction=EQ
%broadcast = pred[1,1]{1,0} broadcast(pred[] %compare), dimensions={}
%constant.0 = f32[1,8]{1,0} constant({ { 0, 1, 2, 3, 4, 5, 6, 7 } })
%constant.1 = s32[] constant(0)
%constant.2 = s32[8]{0} constant({0, 1, 2, 3, 4, 5, 6, 7})
%dynamic-slice = s32[1]{0} dynamic-slice(s32[8]{0} %constant.2, u32[] %partition-id), dynamic_slice_sizes={1}
%reshape = s32[] reshape(s32[1]{0} %dynamic-slice)
%dynamic-slice.1 = f32[1,1]{1,0} dynamic-slice(f32[1,8]{1,0} %constant.0, s32[] %constant.1, s32[] %reshape), dynamic_slice_sizes={1,1}
%copy = f32[1,1]{1,0} copy(f32[1,1]{1,0} %dynamic-slice.1)
%constant.10 = f32[] constant(0)
%broadcast.1 = f32[1,1]{1,0} broadcast(f32[] %constant.10), dimensions={}
%select = f32[1,1]{1,0} select(pred[1,1]{1,0} %broadcast, f32[1,1]{1,0} %copy, f32[1,1]{1,0} %broadcast.1)
%all-reduce = f32[1,1]{1,0} all-reduce(f32[1,1]{1,0} %select), channel_id=1, replica_groups={{0,1,2,3,4,5,6,7}}, use_global_device_ids=true, to_apply=%add.clone
ROOT %reshape.3 = f32[] reshape(f32[1,1]{1,0} %all-reduce)
}
```
Result with this improvement
```
ENTRY %entry_spmd () -> f32[] {
%constant.0 = f32[1,8]{1,0} constant({ { 0, 1, 2, 3, 4, 5, 6, 7 } })
%slice.0 = f32[1,1]{1,0} slice(f32[1,8]{1,0} %constant.0), slice={[0:1], [0:1]}
ROOT %reshape.1 = f32[] reshape(f32[1,1]{1,0} %slice.0)
}
```
PiperOrigin-RevId: 715924899
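The pre-processing idea described in this commit can be sketched on a simplified model in which a tiled sharding is just a per-dimension tile count. This is an illustration under that assumption, not the XLA SPMD partitioner code; `ReplicateSizeOneDims` is a hypothetical name, and the sketch does not model how the freed devices are folded into replication.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical simplified model: `tiles[i]` is the number of partitions along
// dimension i of a tensor with extents `shape`. Any dimension whose size is 1
// gets its tile count reset to 1, i.e. it is no longer partitioned.
inline std::vector<int64_t> ReplicateSizeOneDims(
    const std::vector<int64_t>& shape, const std::vector<int64_t>& tiles) {
  assert(shape.size() == tiles.size());
  std::vector<int64_t> result = tiles;
  for (size_t i = 0; i < shape.size(); ++i) {
    if (shape[i] == 1) {
      result[i] = 1;  // Partitioning a size-1 dimension only adds padding.
    }
  }
  return result;
}
```

In this model, the `f32[1,1]` slice from the example with tiling `[1,8]` becomes `[1,1]`, while the `f32[1,8]` constant keeps `[1,8]` because its second dimension has size 8.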
PiperOrigin-RevId: 715934702
PiperOrigin-RevId: 715950330
See Commits and Changes for more details.
Created by pull[bot] (v2.0.0-alpha.1)