v0.1.7
What's Changed
- examples: add N & D args to fwd & bwd examples by @DefTruth in #157
- chore: add decode-attn Nq=1:Nkv>1 to examples by @DefTruth in #158
- feat: support flash-decoding for Nq=1 by @DefTruth in #159
- feat: support cuda flash-decoding for Nq=1 by @DefTruth in #160
- refactor: use triton backend by default by @DefTruth in #161
Full Changelog: v0.1.6...v0.1.7