
[pull] master from tensorflow:master #350

Merged

pull[bot] merged 15 commits into barkpixels:master from tensorflow:master
May 30, 2025
Conversation


pull[bot] commented May 30, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

mkuperst and others added 15 commits May 30, 2025 08:59
PiperOrigin-RevId: 765209533
PiperOrigin-RevId: 765233541
…uild-arm64 container to us-docker.pkg.dev/ml-oss-artifacts-published/ml-public-container/ml-build-arm64.

These containers are the same (same build script); they just live in different repositories. Also switch the remaining uses of the `ml-build` container over to the new one as well.

PiperOrigin-RevId: 765237536
Calling the same subgraph recursively via a CALL_ONCE op creates an infinite recursion that causes a stack overflow.
Added a check so that the same subgraph cannot call itself via a CALL_ONCE op.

PiperOrigin-RevId: 765251354
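The guard described in the commit above can be sketched as follows. This is an illustrative Python sketch, not TFLite's actual C++ implementation; the function and parameter names are hypothetical:

```python
# Hypothetical sketch of the CALL_ONCE guard described above: a subgraph
# must not invoke itself via CALL_ONCE, or execution recurses until the
# stack overflows. Names are illustrative, not TFLite's real API.

def validate_call_once(current_subgraph_index: int,
                       target_subgraph_index: int) -> bool:
    """Return True if the CALL_ONCE target is legal (no self-recursion)."""
    return target_subgraph_index != current_subgraph_index
```

Rejecting the op at model-validation time turns a runtime stack overflow into an explicit, diagnosable error.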
…e us-docker.pkg.dev/ml-oss-artifacts-published/ml-public-container.

The older container in `us-central1-docker.pkg.dev` is no longer maintained.

PiperOrigin-RevId: 765283088
…d updates relevant scripts and configs

PiperOrigin-RevId: 765284686
PiperOrigin-RevId: 765306410
When XProf removed Tensorflow as a dependency, we also renamed @local_xla back to @xla, and likewise @tsl. This broke compatibility with Tensorflow, so we are adding a mapping to mimic the old behavior.

PiperOrigin-RevId: 765326579
…s. This is most relevant for Async Jax PST training, where workers can reconnect on preemption and the training continues on other workers.

The following scenario is addressed:
1. begin loop barrier
2. Run training steps
3. end loop barrier
4. some_other_barrier
5. Perform checkpointing etc.
6. Go back to 1.

If a task is restarted while step 2 is in progress, the restarted task will wait on the begin-loop barrier, while the other tasks will wait on the end-loop barrier or on some_other_barrier (depending on where they are).

A task can wait on only one barrier at a time, so this creates a deadlock. To avoid it, we ignore the restarted task in the end-loop barrier (and any other barrier) until it is synced again at the begin-loop barrier. This lets the other tasks proceed, while the restarted task continues to wait on the begin-loop barrier. When the other tasks loop back to the begin-loop barrier (step 1), the restarted task is synced with them and can be removed from the unsynced_tasks set.

This change allows the model to gracefully recover from preemption when some of the workers are slow and thus get preempted during training.

PiperOrigin-RevId: 765331856
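The recovery rule above can be illustrated with a small sketch. This is not the actual coordination-service code; the class, method, and barrier names are hypothetical:

```python
# Illustrative sketch of the recovery rule described above: a restarted
# task is tracked in `unsynced_tasks` and ignored by every barrier except
# the begin-loop barrier, where it re-syncs with the other tasks.

class BarrierCoordinator:
    def __init__(self, tasks):
        self.tasks = set(tasks)
        self.unsynced_tasks = set()

    def task_restarted(self, task):
        # A restarted task must re-sync at the begin-loop barrier.
        self.unsynced_tasks.add(task)

    def participants(self, barrier_name):
        # Every barrier except begin_loop excludes unsynced tasks, so the
        # remaining tasks are not deadlocked waiting for the restarted one.
        if barrier_name == "begin_loop":
            return set(self.tasks)
        return self.tasks - self.unsynced_tasks

    def barrier_passed(self, barrier_name, task):
        # Once the restarted task reaches begin_loop with the others, it
        # is synced again and removed from unsynced_tasks.
        if barrier_name == "begin_loop":
            self.unsynced_tasks.discard(task)
```

Because every barrier other than begin_loop simply excludes unsynced tasks, the surviving workers can finish the current loop iteration and re-admit the restarted worker at the top of the next one.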
@pull pull bot added the ⤵️ pull label May 30, 2025
@pull pull bot merged commit 6fb7fa5 into barkpixels:master May 30, 2025
8 participants