Release v0.1.3
Highlights
This release improves compatibility across FMHA Forward, DeepGEMM, extensions, wrappers, and CLI tooling. It expands FlashAttention 3 scenario coverage, adds more DeepGEMM API support, introduces additional MoE Fused Gate configurations, and provides new debugging utilities through the mate CLI.
What's Changed
FMHA Forward Compatibility
Enhanced FMHA Forward compatibility for broader FlashAttention 3 scenarios.
Supported QKV input modes:
- Normal
- Ragged
- Padded
- Paged (KV only)
Supported mask modes:
- None
- Causal
- Local
- Local with sink
Supported score modes:
- None
- Softcap
Supported configurations:
- PageSize: arbitrary page size is supported; 64 is recommended.
- DataType: bf16, fp16.
- HeadDim: arbitrary head dimension up to 512.
Optimization knobs:
- SplitKV
- PackGQA
- SchedulerMetadata
Additional compatibility:
- ContextParallel: compatible with VLLM-style usage.
- Compile: JIT enabled.
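For readers unfamiliar with the Causal and Local mask modes and the Softcap score mode listed above, the following pure-Python reference sketches how they are conventionally defined. It is only an illustration under stated assumptions, not MATE's kernel: the function names are hypothetical, the bottom-right mask alignment and the scale-then-softcap order follow common FlashAttention conventions, and the "Local with sink" mode is left out because these notes do not define it.

```python
import math


def softcap(score, cap):
    """Tanh soft-capping: keeps the magnitude of a score bounded by `cap`."""
    return cap * math.tanh(score / cap)


def is_visible(i, j, seqlen_q, seqlen_k, mask="none", window=(0, 0)):
    """Return True if query i may attend to key j under the given mask mode."""
    # Assumed convention: the last query lines up with the last key.
    diag = i + (seqlen_k - seqlen_q)
    if mask == "none":
        return True
    if mask == "causal":
        return j <= diag
    if mask == "local":
        left, right = window
        return diag - left <= j <= diag + right
    raise ValueError(f"unknown mask mode: {mask}")


def attention_row(q, keys, values, i, seqlen_q, mask="none", window=(0, 0), cap=None):
    """Attention output for one query row, computed with plain Python lists."""
    scale = 1.0 / math.sqrt(len(q))
    scored = []
    for j, k in enumerate(keys):
        if not is_visible(i, j, seqlen_q, len(keys), mask, window):
            continue
        s = scale * sum(a * b for a, b in zip(q, k))
        if cap is not None:
            s = softcap(s, cap)  # applied to the scaled score, before softmax
        scored.append((j, s))
    dim = len(values[0])
    if not scored:
        return [0.0] * dim  # no visible key: define the output as zeros
    peak = max(s for _, s in scored)
    weights = [(j, math.exp(s - peak)) for j, s in scored]
    denom = sum(w for _, w in weights)
    return [sum(w * values[j][d] for j, w in weights) / denom for d in range(dim)]
```

A reference like this is mainly useful for checking small cases by hand, for example confirming which keys a given query can see under a chosen window.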
DeepGEMM Compatibility
Enhanced DeepGEMM compatibility with additional API and edge-case support.
Added support for:
- m_grouped_bf16_gemm_nt_* APIs
- m_grouped_fp8_gemm_nt_* APIs
- k_grouped_fp8_gemm_tn_contiguous
- FP8 MQA Logits Prefill / Decode
- NextN=4 scenarios for Decode
- m/n/k = 0 edge cases
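To make the m-grouped NT layout and the m/n/k = 0 edge cases concrete, here is a small pure-Python reference. It is a sketch under assumptions, not the DeepGEMM or MATE API: the function names and the `group_sizes` argument are hypothetical, and it uses ordinary Python numbers rather than bf16 or FP8. "NT" is taken to mean that A is used as stored and B is transposed, so each output is D = A · Bᵀ.

```python
def gemm_nt(a, b):
    """D = A @ B^T for A of shape (m, k) and B of shape (n, k), as nested lists."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b] for row_a in a]


def m_grouped_gemm_nt_reference(a, b_groups, group_sizes):
    """Split the rows of `a` into consecutive groups and multiply group g by b_groups[g]^T.

    `group_sizes[g]` is the number of rows of `a` that belong to group g
    (a hypothetical way of describing the grouping, chosen for clarity).
    """
    if len(group_sizes) != len(b_groups):
        raise ValueError("need one B matrix per group")
    if sum(group_sizes) != len(a):
        raise ValueError("group sizes must cover every row of A")
    out = []
    start = 0
    for size, b in zip(group_sizes, b_groups):
        # An empty group (size 0) contributes no output rows.
        out.extend(gemm_nt(a[start:start + size], b))
        start += size
    return out
```

In this reference the degenerate shapes fall out naturally: m = 0 yields no rows, n = 0 yields empty rows, and k = 0 yields an all-zero m × n result because each dot product is an empty sum.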
Extensions
Added more MoE Fused Gate expert configurations:
- 160 experts
- 384 experts
- 256 experts with 1 group
Wrappers
Added compatibility wrappers to simplify migration and integration.
mate-deep-gemm
- Compatible with the deep-gemm import style.
- Compatible with existing DeepGEMM API usage patterns.
mate-flash-attention
- Compatible with the flash-attention3 import style.
- Compatible with FlashAttention 3 API usage patterns.
- Extended compatibility for SGL / VLLM FA3 fork usage patterns.
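When migrating, it can help to check which import names resolve in the current environment before switching code over. The sketch below does only that. The module names `deep_gemm` and `flash_attn_interface` are the upstream import names and are used here as assumptions; these notes do not state which module names the wrappers install, so adjust the tuple to match your setup.

```python
import importlib.util

# Assumed import names (upstream DeepGEMM and FlashAttention 3); not confirmed by these notes.
CANDIDATE_MODULES = ("deep_gemm", "flash_attn_interface")


def available_modules(names=CANDIDATE_MODULES):
    """Map each top-level module name to True if it can be located, without importing it."""
    found = {}
    for name in names:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found[name] = False
    return found
```

Calling `available_modules()` returns a dictionary such as `{"deep_gemm": ..., "flash_attn_interface": ...}` whose values depend on what is installed.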
MATE CLI
Introduced the mate CLI for environment inspection and debugging.
New commands:
- mate show-config: Displays environment status, commit ID, and related runtime/build information.
- mate env: Displays available MATE-related environment variables.
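For bug reports, the output of both commands can be gathered in one step. The helper below is a minimal sketch: `run_cli` and `collect_mate_report` are hypothetical names, the only CLI invocations used are the two commands listed above, and the output format of those commands is not assumed.

```python
import shutil
import subprocess


def run_cli(executable, *args, timeout=30):
    """Run `executable args...` and return its stdout, or None if it is not on PATH."""
    exe = shutil.which(executable)
    if exe is None:
        return None
    result = subprocess.run(
        [exe, *args], capture_output=True, text=True, timeout=timeout, check=False
    )
    return result.stdout


def collect_mate_report():
    """Concatenate the output of `mate show-config` and `mate env` into one string."""
    sections = []
    for command in ("show-config", "env"):
        output = run_cli("mate", command)
        if output is None:
            output = "(mate CLI not found on PATH)\n"
        sections.append(f"$ mate {command}\n{output}")
    return "\n".join(sections)
```

Printing `collect_mate_report()` gives a block of text that can be pasted into an issue.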
Debugging improvements:
- Added new environment variables for dumping input/output data during debugging.
For more details, please refer to the repository documentation and mate --help.