Release v0.2.7 · pytorch/helion

What's Changed

[CI] Skip all failing distributed tests by @yf225 in #1206
Include index_dtype in the printed decorator snippet by @choijon5 in #1207
Add dict comprehension support by @oulgen in #1191
settings: set appropriate dot_precision default by @fulvius31 in #1184
[Interpret Mode] Support custom block size by @yf225 in #1194
[Autotuner] Add autotune_benchmark_fn setting by @yf225 in #1199
jagged_dense_bmm (#1126) by @trieuat in #1213
benchmarks: Include AMD GCN arch in get_device_name() by @fulvius31 in #1214
Fix linter errors by @yf225 in #1218
Fix unit test breakage due to upstream change by @yf225 in #1219
Fix static_shapes setting in test_dot.py by @yf225 in #1220
Fix memory leak when Triton compile error occurs by @yf225 in #1217
[Interpret Mode] Re-enable block-size dependent tests by @yf225 in #1212
[Interpret Mode] Raise error if hl.store is used with duplicate indices by @yf225 in #1221
[Interpret Mode] Fix hl.store automatic dtype conversion by @yf225 in #1226
[Interpret Mode] Fix hl.load with multiple 1D tensor indices by @yf225 in #1227
[CI] Fix NVSHMEM env vars and re-enable distributed CI job by @yf225 in #1201
Move jagged_dense_bmm expected code to the right place by @yf225 in #1232
Reduce log volume by moving output code logging behind HELION_PRINT_OUTPUT_CODE=1 by @yf225 in #1233
Add setup for Helion to compile on MTIA with basic test by @Myrthan in #1169
Make hl.triton_kernel support global var and recursive kernel call by @yf225 in #1234
Make hl.triton_kernel support output_like=None without being DCE'd by @yf225 in #1237
Show errors when pre-commit fails by @oulgen in #1238
example: gated delta net fwd_h by @v0i0 in #1119
Change property name from camel case to snake case. by @Myrthan in #1239
Move distributed examples to examples/distributed/ by @yf225 in #1240
Fix circular dependency by @mengluy0125 in #1236
Fix mask propagation for indexed stores when block_id is 0 by checking is not None instead of truthiness by @oulgen in #1244
Clean up distributed examples path refs by @yf225 in #1241
Fix RNG codegen for constant (specialized) dimensions by @yf225 in #1253
Avoid broadcasting for non-consecutive tensor indexers by @yf225 in #1254
Implement torch.sort support by @oulgen in #1247
Implement torch.topk support by @oulgen in #1248
Allow using hl.specialize to specialize on tensor strides by @yf225 in #1215
Use torch._dynamo.mark_static() API to allow tensor shape specialization outside of the kernel code by @yf225 in #1210
chore: Bump actions/cache from 4 to 5 by @dependabot[bot] in #1257
Fix invalid Triton code for mixed scalar/block indexing in store operations when block dimension has size 1 by @oulgen in #1258
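
The mask propagation fix listed above (#1244) replaces a truthiness check with `is not None`, since a block ID of 0 is a valid ID but evaluates as falsy in Python. The following is a minimal sketch of that pitfall in plain Python. The function names are hypothetical stand-ins for illustration and are not taken from Helion's code.

```python
from typing import Optional

# Hypothetical helpers for illustration only; not Helion internals.

def has_block_buggy(block_id: Optional[int]) -> bool:
    # Truthiness check: block_id == 0 is falsy, so it is wrongly
    # treated the same as "no block" (None).
    return bool(block_id)

def has_block_fixed(block_id: Optional[int]) -> bool:
    # Identity check: only None means "no block"; 0 is a valid ID.
    return block_id is not None

if __name__ == "__main__":
    for block_id in (None, 0, 1):
        print(block_id, has_block_buggy(block_id), has_block_fixed(block_id))
```

The two checks agree for `None` and for nonzero IDs, and differ only for 0, which is the case the release note calls out.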

New Contributors

@trieuat made their first contribution in #1213
@Myrthan made their first contribution in #1169

Full Changelog: v0.2.6...v0.2.7
