Conversation

@rocm-repo-management-api-2

Daily sync with upstream

hawkinsp and others added 30 commits March 18, 2025 21:38
This code is Mosaic specific, move it to the Mosaic directory.

PiperOrigin-RevId: 738404429
…MAOp`.

Now that we have full control over strides in the lowering, these attributes
are no longer necessary.

PiperOrigin-RevId: 738418852
This callback functionality is used only by JAX and ships as part of its CUDA and ROCm GPU plugins. Move it into JAX, as part of a wider effort to relocate the xla/python pieces that belong in JAX.

PiperOrigin-RevId: 738426489
This is a GPU-specific target.

PiperOrigin-RevId: 738441625
PiperOrigin-RevId: 738443014
…e more cleanups

PiperOrigin-RevId: 738503430
The new xla_extension_version is 320.

PiperOrigin-RevId: 738522486
Google-ML-Automation and others added 26 commits March 19, 2025 22:19
CUPTI initialization and finalization are somewhat expensive. This gives us the option of avoiding repeated initialization when performing multiple CUPTI timings. Disable kernel activity to ensure we've restored CUPTI to its original state.

PiperOrigin-RevId: 738685851
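
The idea of paying the setup cost once across several timings can be sketched roughly as follows. This is only an illustration of the pattern, not the code from this commit: `FakeProfiler`, `profiling_session`, and `time_many` are hypothetical names, and the stand-in profiler merely counts how often it is initialized and finalized.

```python
import contextlib
import time


class FakeProfiler:
    """Hypothetical stand-in for a profiler with costly setup and teardown."""

    def __init__(self):
        self.init_count = 0
        self.finalize_count = 0
        self.active = False

    def initialize(self):
        self.init_count += 1
        self.active = True

    def finalize(self):
        self.finalize_count += 1
        self.active = False

    def time(self, fn):
        # Timing is only valid while the profiler is initialized.
        if not self.active:
            raise RuntimeError("profiler is not initialized")
        start = time.perf_counter()
        fn()
        return time.perf_counter() - start


@contextlib.contextmanager
def profiling_session(profiler):
    """Initialize once, and always finalize so the original state is restored."""
    profiler.initialize()
    try:
        yield profiler
    finally:
        profiler.finalize()


def time_many(profiler, fns):
    """Time several callables inside a single initialize/finalize pair."""
    with profiling_session(profiler) as session:
        return [session.time(fn) for fn in fns]
```

With this shape, timing N callables costs one initialization rather than N, and the `finally` clause returns the profiler to its inactive state even if a timed callable raises.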
Also add layout inference and lowering rules for it. Its initial use case will
be to fence WGMMA accumulator registers. As a result, transform inference is
not immediately useful for this op, and we omit it here.

PiperOrigin-RevId: 738718000
Initializing and finalizing CUPTI has an overhead.

PiperOrigin-RevId: 738725435
jaxlib no longer includes any lowering logic, so we don't need this module anymore. Users would be better served by the APIs in JAX core like `jax.ffi` or `jax.interpreters.mlir`.

This module isn't covered by JAX's compatibility policy, so no formal deprecation period is required, but there are enough users that we should keep this warning for at least one full release cycle.

PiperOrigin-RevId: 738728721
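
A deprecation warning of the kind described above could look roughly like the sketch below. The helper name `warn_deprecated_module` and the module name passed to it are hypothetical, and the actual warning text and category used in jaxlib may differ.

```python
import warnings


def warn_deprecated_module(module_name: str, alternatives: str) -> None:
    """Emit a DeprecationWarning that points users at replacement APIs.

    Hypothetical helper: the real warning in jaxlib may be worded differently.
    """
    warnings.warn(
        f"{module_name} is deprecated and will be removed in a future "
        f"release. Use {alternatives} instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )
```

Calling `warn_deprecated_module("some_legacy_module", "jax.ffi or jax.interpreters.mlir")` at import time of the deprecated module would surface the message to users for as long as the warning is kept.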
- In order to migrate the GPU FFI handler from the internal API intended for static linking to the external API intended for dynamic linking, we need to migrate both CPU and GPU FFI handlers at the same time.
- Builds break if we include both versions of the FFI APIs.
- Now that py_client_gpu sits in jaxlib, tests that run the new FFI API in jaxlib against the old FFI API in xla (and vice versa) for GPU targets will fail.
- This change lets us update the CPU handler first in XLA and then update the GPU handler second in jaxlib.
- Because the GPU handler depends on new symbols in xla, we need to land the XLA changes first anyway (i.e., there is no point in deleting both the CPU and GPU handlers to try to land jaxlib and xla in one go).

PiperOrigin-RevId: 738730955
It's unused but imposes a significant burden during Triton integrates.

PiperOrigin-RevId: 738744625
…ynamic base offsets

PiperOrigin-RevId: 738762062
PiperOrigin-RevId: 738774175
…nce the underlying issue is now resolved.

PiperOrigin-RevId: 738802372
PiperOrigin-RevId: 738820673
…DUS`'s operand was not sharded properly

PiperOrigin-RevId: 738959282
…hat are sharding-in-types specific errors should raise.

This is so that we can catch this exception in backward_pass/vmap and add an extra message informing users that this is a potential JAX bug and that they should file an issue on the repo.

Currently, we only raise `ShardingTypeError` in one place, but we can expand to all other places in follow-up changes. This change sets the machinery up.

Previous error:

```
jax._src.core.ShardingTypeError: dynamic_update_slice update sharding must be equal to operand sharding, got update sharding float32[2@x]({Explicit: ('x',)}) for operand sharding float32[16]({}).
```

New error:

```
jax._src.core.ShardingTypeError: dynamic_update_slice update sharding must be equal to operand sharding, got update sharding float32[2@x]({Explicit: ('x',)}) for operand sharding float32[16]({}).
This is a potential JAX bug. Please file an issue at https://github.com/jax-ml/jax/issues
```

The newly added `This is a potential JAX bug...` message is important because this error is raised in the backward pass, which is 100% a JAX bug given that the forward pass did not error.

PiperOrigin-RevId: 739053305
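
The catch-and-annotate machinery described in this commit message can be sketched in plain Python. This is a minimal illustration under stated assumptions, not JAX's implementation: the `ShardingTypeError` class below is a local stand-in for the real one, and `run_backward_pass`, `bad_step`, and `describe` are hypothetical names.

```python
class ShardingTypeError(Exception):
    """Local stand-in for the sharding-in-types specific error."""


BUG_HINT = (
    "This is a potential JAX bug. Please file an issue at "
    "https://github.com/jax-ml/jax/issues"
)


def run_backward_pass(step):
    """Hypothetical wrapper: run `step`, annotating sharding type errors."""
    try:
        return step()
    except ShardingTypeError as e:
        # Re-raise the same error type with the hint appended on a new line.
        raise ShardingTypeError(f"{e}\n{BUG_HINT}") from e


def bad_step():
    """Hypothetical step that fails the way the example error above does."""
    raise ShardingTypeError(
        "dynamic_update_slice update sharding must be equal to operand sharding"
    )


def describe(step):
    """Return the annotated error message, or None if `step` succeeds."""
    try:
        run_backward_pass(step)
    except ShardingTypeError as e:
        return str(e)
    return None
```

Using a dedicated exception type is what makes this possible: only sharding type errors pick up the hint, while any other exception passes through the wrapper unchanged.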
@rocm-repo-management-api-2 rocm-repo-management-api-2 bot requested a review from a team as a code owner March 21, 2025 06:03
auto-merge was automatically disabled April 1, 2025 15:36

Pull request was closed
