v0.2.8

@oulgen oulgen released this 23 Dec 20:10
768326f

What's Changed

  • Support passing Triton function object to hl.triton_kernel() by @yf225 in #1263
  • chore: Bump actions/download-artifact from 6 to 7 by @dependabot[bot] in #1270
  • chore: Bump actions/upload-artifact from 5 to 6 by @dependabot[bot] in #1271
  • [Distributed] one_shot_allreduce_bias_rmsnorm example by @yf225 in #1266
  • [Distributed] matmul_reduce_scatter example by @yf225 in #1269
  • feat(benchmarks): add shapes to json output by @fulvius31 in #1273
  • [Autotuner] Log the 'started' state to CSV, for easier user monitoring of kernel hanging at runtime by @yf225 in #1279
  • default pattern search by @v0i0 in #1259
  • Set LFBOPatternSearch as default by @ethche in #1280
  • fix surrogate search for singleton population by @v0i0 in #1281
  • Ignore bzl files in git. by @Myrthan in #1282
  • chunk fused_linear_jsd by @v0i0 in #1277
  • Fix buggy interaction between XYZProgramIDs and L2GroupingProgramIDs by @jansel in #1288
  • Fix bug with torch.rand_like compile error by @jansel in #1289
  • [autotuner] print path to generated Triton code after selection of kernel by @bringlein in #1285
  • Add support for torch.gather by @jansel in #1290
  • [docs] Add more examples to docs by @oulgen in #1301
  • [lint] remove dead ignores by @oulgen in #1302
  • Add proper error handling for torch.split and torch.tensor_split in device loops by @oulgen in #1297
  • Fix BlockReductionStrategy to use existing index variables for argmax/argmin operations by @oulgen in #1298
  • Skip more failing tests on cpu backend by @oulgen in #1304
  • [CI] Fix broken notebook by @oulgen in #1305
  • Fix shape inference for tile indexing on size-1 dimensions and use broadcast_to for block_ptr by @oulgen in #1299
  • Fix codegen broadcasting for tile indexing on size-1 tensor dimensions by @oulgen in #1300
  • Enable tests on py314 by @oulgen in #1306
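
For readers unfamiliar with the operation named in #1290, the sketch below shows the indexing rule that torch.gather follows for a 2-D input, written in plain Python. It is only an illustration of the semantics: `gather_2d` is a hypothetical helper, not part of Helion or PyTorch, and it says nothing about how Helion implements the operation.

```python
def gather_2d(data, index, dim):
    """Pure-Python reference for the 2-D gather rule (hypothetical helper)."""
    if dim not in (0, 1):
        raise ValueError("dim must be 0 or 1 for a 2-D input")
    out = []
    for i, row in enumerate(index):
        out_row = []
        for j, k in enumerate(row):
            # dim=0: the index value selects the row, the column stays j.
            # dim=1: the index value selects the column, the row stays i.
            out_row.append(data[k][j] if dim == 0 else data[i][k])
        out.append(out_row)
    return out


data = [[1, 2], [3, 4]]
index = [[0, 0], [1, 0]]
print(gather_2d(data, index, dim=1))  # [[1, 1], [4, 3]]
print(gather_2d(data, index, dim=0))  # [[1, 2], [3, 2]]
```

The output always has the shape of `index`, which is why the helper iterates over `index` and not over `data`.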

New Contributors

Full Changelog: v0.2.7...v0.2.8