
CI: 12/23/24 upstream sync #190

Merged
github-actions[bot] merged 52 commits into rocm-main from ci-upstream-sync-68_1
Jan 2, 2025


@github-actions

Daily sync with upstream

Ruturaj4 and others added 30 commits December 19, 2024 09:02
The `build.py` script uses the Clang compiler by default, and JAX does not officially support building with GCC. However, experimental GCC support is still present.

Command examples:

```
python build/build.py build --wheels=jaxlib,jax-cuda-plugin --use_clang=false
python build/build.py build --wheels=jaxlib,jax-cuda-plugin --use_clang=false --gcc_path=/usr/bin/gcc
```

This change addresses the request in jax-ml#25488.

PiperOrigin-RevId: 707930913
PiperOrigin-RevId: 707935849
PiperOrigin-RevId: 707992245
infer-vector-layout won't use the full generality anytime soon, but we could reuse this logic for relayouts

PiperOrigin-RevId: 708011538
…gardless

the value of config.enable_x64.

PiperOrigin-RevId: 708031525
… together

This commit modifies the behavior of the build CLI when building jaxlib and GPU plugin artifacts together (for instance, `python build --wheels=jaxlib,jax-cuda-plugin`).

Previously, CUDA/ROCm build options were passed only when building the CUDA/ROCm artifacts. However, this makes inefficient use of the build cache: Bazel appears to rebuild some targets that have already been built in the previous run. This seems to be because the GPU plugin artifacts have a different set of build options from `jaxlib`, which for some reason causes Bazel to invalidate or ignore certain cache hits. Therefore, this commit keeps the build options the same when `jaxlib` and the GPU artifacts are built together, so that the build cache is better utilized.

As an example, this means that if `python build --wheels=jaxlib,jax-cuda-plugin` is run, the following build options will apply to both `jaxlib` and `jax-cuda-plugin` builds:
```
 /usr/local/bin/bazel run --repo_env=HERMETIC_PYTHON_VERSION=3.10 \
--verbose_failures=true --action_env=CLANG_COMPILER_PATH="/usr/lib/llvm-16/bin/clang" \
--repo_env=CC="/usr/lib/llvm-16/bin/clang" \
--repo_env=BAZEL_COMPILER="/usr/lib/llvm-16/bin/clang" \
--config=clang --config=mkl_open_source_only --config=avx_posix \
--config=cuda --action_env=CLANG_CUDA_COMPILER_PATH="/usr/lib/llvm-16/bin/clang" \
--config=build_cuda_with_nvcc
```

Note that this commit shouldn't affect the content of the wheel itself. It is only meant to give a performance boost when building `jaxlib`+plugin artifacts together.

Also, this removes the code in `build_wheel.py` that was used to build the (now deprecated) monolithic `jaxlib`.

PiperOrigin-RevId: 708035062
Followup to jax-ml#25614.

PiperOrigin-RevId: 708077981
…ut-validation

PiperOrigin-RevId: 708087898
For (1, 128) tiling 32-bit input, it assigns (1, 128) tiling at output, which can be invalid (e.g. it should be (1, 256) for bf16)
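A rough sketch of the arithmetic behind the example above, not the actual layout-inference code. It assumes the lane count of a 1D tiling scales with the packing factor `32 // bitwidth`, which matches the two cases named in the message (32-bit and bf16); the helper name `one_d_tiling` and its `lanes` parameter are hypothetical.

```python
def one_d_tiling(bitwidth: int, lanes: int = 128) -> tuple[int, int]:
    """Return a (1, N) tiling for an element type of the given bit width.

    Assumption (not confirmed by the commit): narrower types pack
    32 // bitwidth elements into each 32-bit lane slot, so the second
    tiling dimension grows by that factor.
    """
    if bitwidth <= 0 or 32 % bitwidth != 0:
        raise ValueError(f"unsupported bitwidth: {bitwidth}")
    packing = 32 // bitwidth
    return (1, lanes * packing)


print(one_d_tiling(32))  # 32-bit input
print(one_d_tiling(16))  # bf16 output
```

Under that assumption, copying the input tiling to a bf16 output yields (1, 128) where (1, 256) is expected, which is the mismatch described above.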

PiperOrigin-RevId: 708112341
…for i1 vector relayout

We can use relayout-insertion pass to insert necessary ops and their layouts for relayout before unrolling in apply-vector-layout pass.

PiperOrigin-RevId: 708143852
…th and fix the bug.

The bug was that the bounds were dropped from ctx.avals_in before they were
extracted. Extract them before dropping them.
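The ordering problem can be illustrated with a small standalone sketch. Everything here is hypothetical: `Aval`, `drop_bounds`, and the two `lower_*` functions are stand-ins invented for illustration, not the types or functions touched by the commit.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Aval:
    # Hypothetical stand-in for an abstract value that may carry bounds.
    shape: Tuple[int, ...]
    bounds: Optional[Tuple[int, int]] = None


def drop_bounds(avals):
    # Return copies of the avals with their bounds removed.
    return [replace(a, bounds=None) for a in avals]


def lower_buggy(avals_in):
    # Bug pattern: the bounds are dropped first, so nothing is left to extract.
    avals_in = drop_bounds(avals_in)
    return [a.bounds for a in avals_in]


def lower_fixed(avals_in):
    # Fix pattern: extract the bounds, then drop them from the avals.
    bounds = [a.bounds for a in avals_in]
    avals_in = drop_bounds(avals_in)
    return bounds


avals = [Aval((8,), bounds=(0, 8)), Aval((4,))]
print(lower_buggy(avals))
print(lower_fixed(avals))
```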

PiperOrigin-RevId: 708266659
jakevdp and others added 22 commits December 20, 2024 04:28
Otherwise, the cycle can be broken by clearing the references of the helper
objects, at which point the deallocation of arrays proceeds through regular
reference counting (and does not trigger logs!). I have not verified that
this is what happens, but the test has been mysteriously failing under a
number of configurations and this seems to fix it.

I added a note to the garbage collection guard to clarify that it's not
guaranteed to report all cycles.
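A minimal CPython-only sketch of the mechanism described above, assuming a helper object that sits in a reference cycle and also holds an array-like payload. `Helper`, `Payload`, and `freed_by_refcount` are hypothetical names; the real test and guard are not reproduced here.

```python
import gc
import weakref


class Payload:
    """Stand-in for an array whose deallocation we want to observe."""


class Helper:
    """Stand-in for a helper object that ends up in a reference cycle."""

    def __init__(self):
        self.me = self          # self-reference creates the cycle
        self.payload = Payload()


def freed_by_refcount(clear_first: bool) -> bool:
    """Report whether the payload is freed without running the cycle collector."""
    was_enabled = gc.isenabled()
    gc.disable()  # ensure only reference counting can free objects
    try:
        helper = Helper()
        ref = weakref.ref(helper.payload)
        if clear_first:
            # Clearing the helper's reference frees the payload immediately.
            helper.payload = None
        del helper  # the helper itself stays alive through its cycle
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up the leftover cycle


print(freed_by_refcount(clear_first=True))
print(freed_by_refcount(clear_first=False))
```

When the reference is cleared first, the payload never reaches the cycle collector, so a guard that watches the collector has nothing to report for it.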

PiperOrigin-RevId: 708320953
… ops.

Only untransformed and unsliced loads/stores are supported for now. The rest will be a follow up.

PiperOrigin-RevId: 708347442
…inus changes in infer-vector-layout

We can enable them later but at least this way the support is available to build on
(e.g. in the new insert relayouts pass)

Reverts 05f3a70

PiperOrigin-RevId: 708397219
`warnings.catch_warnings` is not thread-safe. However, here it is only ever used to suppress complex-to-real conversion warnings, which we can avoid in other ways.
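One way to avoid the warning without touching global warning state, sketched with NumPy rather than the actual JAX code paths: take the real part explicitly before casting. The function names below are hypothetical, and this is only one of the "other ways", not necessarily the one used in the change.

```python
import warnings

import numpy as np


def to_real_with_guard(z):
    # Pattern being avoided: catch_warnings temporarily edits the
    # process-wide warning filters, which is not safe across threads.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return np.asarray(z).astype(np.float64)


def to_real_explicit(z):
    # Taking .real first means the cast never discards an imaginary part,
    # so no complex-to-real warning is emitted in the first place.
    return np.asarray(z).real.astype(np.float64)


print(to_real_explicit([1 + 2j, 3 - 4j]))
```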
In jax-ml#24370, `ffi_call` was updated to return a callable, and the original calling convention was deprecated. This change is part of the deprecation cycle for this calling convention.

PiperOrigin-RevId: 708424223
…le colocated Python call

PiperOrigin-RevId: 708461989
This was causing an issue when building multiple wheels in editable mode.

That is, instead of the wheels being stored as:
```
# jax-cuda12-pjrt   0.4.36.dev20241125           ./dist/jax-cuda-pjrt
# jax-cuda12-plugin 0.4.36.dev20241125           ./dist/jax-cuda-plugin
# jaxlib            0.4.36.dev20241125           ./dist/jaxlib
```

they were being stored as:
```
# jaxlib            0.4.36.dev20241125           ./dist/jaxlib
# jax-cuda12-pjrt   0.4.36.dev20241125           ./dist/jaxlib/jax-cuda-pjrt
# jax-cuda12-plugin 0.4.36.dev20241125           ./dist/jaxlib/jax-cuda-plugin
```

PiperOrigin-RevId: 708468522
…selection driven by constant prop in mosaic lowering.

This CL builds out a simple sketch of constant prop by construction in mosaic - we walk the graph up from cond, collecting the values and either const propping or failing out of const prop. Failure out of const prop is not a bug, but hitting an unimplemented const prop func is for now, in order to drive better coverage.

This then allows us to pick a single branch, and ignore branches which do not have a viable mosaic implementation.

And, finally, for diag, this means we can replace the initial gather-dependent implementation in lax with a mosaic specific one that avoids gather.
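The branch-selection idea can be sketched on a toy graph. This is an illustration of the approach described above, not the Mosaic lowering code: `Node`, `CONST_PROP_RULES`, `NotConstant`, `const_prop`, and `select_branch` are all hypothetical names.

```python
import operator
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    op: str                          # "const", "input", or an operation name
    inputs: Tuple["Node", ...] = ()
    value: Optional[int] = None      # only meaningful when op == "const"


# Operations that have a const prop rule; anything else is unimplemented.
CONST_PROP_RULES = {
    "add": operator.add,
    "mul": operator.mul,
    "eq": lambda a, b: int(a == b),
}


class NotConstant(Exception):
    """The value depends on a runtime input; failing out here is not a bug."""


def const_prop(node: Node) -> int:
    # Walk up from the node, folding constants as we go.
    if node.op == "const":
        return node.value
    if node.op == "input":
        raise NotConstant(node.op)
    rule = CONST_PROP_RULES.get(node.op)
    if rule is None:
        # Treated as an error for now, to drive better rule coverage.
        raise NotImplementedError(f"no const prop rule for {node.op!r}")
    return rule(*(const_prop(i) for i in node.inputs))


def select_branch(pred: Node, branches):
    # Return the single branch to lower, or None if all must be lowered.
    try:
        return branches[const_prop(pred)]
    except NotConstant:
        return None


two = Node("const", value=2)
one = Node("const", value=1)
pred = Node("eq", (Node("add", (one, one)), two))
print(select_branch(pred, ["false_branch", "true_branch"]))
```

When the predicate folds to a constant, only the chosen branch has to be lowered, so the other branches never need a viable implementation.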

PiperOrigin-RevId: 708752566
PiperOrigin-RevId: 708811348
@github-actions github-actions bot enabled auto-merge December 23, 2024 06:02
@charleshofer charleshofer self-requested a review January 2, 2025 16:50
@github-actions github-actions bot merged commit 5f67bea into rocm-main Jan 2, 2025
@charleshofer charleshofer deleted the ci-upstream-sync-68_1 branch January 2, 2025 16:50