torch_musa v2.1.1 bug fix release

torch_musa v2.1.1 is now available. It is an enhanced version of v2.1.0 that fixes issues discovered in projects and improves core features. Despite some known issues, it has passed the complete functional and integration tests on MUSA 4.2.0. The number of natively supported operators has grown to more than 948.

New Features

Added a musagraphs backend for torch.compile, which reduces host overhead and provides end-to-end acceleration through musa-graph.
Integrated muSolver into the backend of several linalg operators, including lu_factor_ex, lu_solve, solve_ex, cholesky_ex...
FusedAdamW/FusedAdam on MUSA now work with DTensor and other Tensor variants that are based on the torch_dispatch mechanism.
Expanded the benchmark module to include more operator cases.
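
As a rough illustration of the torch.compile item above, the sketch below picks a device and backend and compiles a small function. It assumes that the backend is selected with the string "musagraphs" (by analogy with PyTorch's "cudagraphs" backend name) and that importing torch_musa registers the "musa" device; the helper names select_compile_target and run_compiled_example are hypothetical and not part of torch_musa. Without torch_musa installed it falls back to the CPU and PyTorch's "eager" debug backend.

```python
import importlib.util


def select_compile_target():
    """Pick (device, backend) strings to pass to torch.compile.

    Assumption: the musagraphs backend is requested as backend="musagraphs".
    """
    if importlib.util.find_spec("torch_musa") is not None:
        return "musa", "musagraphs"
    # Fallback so the sketch still runs on machines without torch_musa.
    return "cpu", "eager"


def run_compiled_example():
    """Compile a small function and run it once on the selected device."""
    import torch

    device, backend = select_compile_target()
    if device == "musa":
        import torch_musa  # noqa: F401  (assumed to register the "musa" device)

    def residual_block(x, w):
        # A tiny fixed-shape computation to compile.
        return torch.relu(x @ w) + x

    compiled = torch.compile(residual_block, backend=backend)
    x = torch.randn(8, 16, device=device)
    w = torch.randn(16, 16, device=device)
    return compiled(x, w)
```

Calling run_compiled_example() returns an 8x16 tensor on the selected device. Whether a given model benefits from graph capture depends on the workload, so the host-overhead reduction should be measured rather than assumed.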

Enhancements

Fixed the occurrence of 0 values in exponential, inspired by Intel MKL vRngExponential(...)
Ensured early return for some 0-numel op cases
Optimized one-hot by eliminating redundant preprocessing logic
Added rrelu_with_noise and nansum; RoPE now supports multi-latent
Extended SDPA to accept no-batch inputs; mask-grad is enabled only for the math backend
Fixed a scatter_reduce crash and cross-entropy cases in none mode
Improved the bandwidth of binary ops when the rhs is not last-contiguous
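
To illustrate why an exponential sampler can produce exact zeros, here is a minimal standard-library sketch of inverse-transform sampling. It is not the torch_musa or Intel MKL implementation: the function names, the constant _MIN_U, and the clamping strategy are assumptions chosen only to show the failure mode and one possible guard.

```python
import math
import random

# Smallest uniform value the guarded sampler accepts (an illustrative choice).
_MIN_U = 2.0 ** -53


def exponential_naive(u, rate=1.0):
    """Inverse-transform sample for u in [0, 1); u == 0.0 maps to exactly 0.0."""
    return -math.log1p(-u) / rate


def exponential_nonzero(u, rate=1.0):
    """Same transform, but u is clamped away from 0 so the result is > 0."""
    u = max(u, _MIN_U)
    return -math.log1p(-u) / rate


if __name__ == "__main__":
    # The edge case: a uniform draw of exactly 0.0.
    print(exponential_naive(0.0), exponential_nonzero(0.0))

    # Ordinary draws pass through the guard unchanged.
    rng = random.Random(0)
    draws = [rng.random() for _ in range(5)]
    print([exponential_nonzero(u) for u in draws])
```

The guard matters because downstream code often takes the logarithm of a sample or divides by it, and an exact zero then turns into an infinity or a NaN.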
