fa4-v4.0.0.beta17

Pre-release

github-actions released this 10 Jun 09:23
· 4 commits to main since this release
fb02fc8

What's Changed

  • [Triton] Fix graph capture issues and env var by @micmelesse in #2620
  • [CuTe,Bwd,Sm100] allow 2cta with score mod and mask mod in bwd by @reubenconducts in #2557
  • [CuTe] Fix lint failures by @drisspg in #2625
  • [CuTe] Fix lint failure in flash_bwd_sm100.py by @Johnsonms in #2627
  • fix: add weights_only=True to all torch.load call sites by @aryanputta in #2622
  • [Cute,Sm100,Fwd] use correction warps if not tma store; remove outdated packgqa guard by @jayhshah in #2629
  • Add aux-scalars to interface to enable dynamic ints and floats in expressions by @drisspg in #2616
  • fix: build and select cu13.2 prebuilt wheels by @ko3n1g in #2618
  • ci(fa4): enforce cutlass-dsl/quack dep floors and rebake cu130 image by @Johnsonms in #2636

New Contributors

Full Changelog: fa4-v4.0.0.beta16...fa4-v4.0.0.beta17