[pull] main from tile-ai:main #165

Merged: pull[bot] merged 6 commits into botbw:main from tile-ai:main on Dec 24, 2025

pull[bot] commented on Dec 24, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)


lucifer1004 and others added 6 commits December 24, 2025 16:33
* [Misc] add env for default target/backend/verbose

* fix: target_host signature

* fix: move all env logic to kernel_cache

* fix: example

* fix: type hint

* Update example_gqa_decode_varlen_logits.py

---------

Co-authored-by: Lei Wang <34334180+LeiWang1999@users.noreply.github.com>
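
The first commit above adds environment variables that supply defaults for target, backend, and verbosity. A minimal sketch of that pattern follows. The variable names, the `resolve_defaults` helper, and the fallback values are hypothetical and chosen only for illustration; the names the project actually uses are not shown in this thread.

```python
import os

# Hypothetical environment variable names, for illustration only.
ENV_TARGET = "EXAMPLE_DEFAULT_TARGET"
ENV_BACKEND = "EXAMPLE_DEFAULT_BACKEND"
ENV_VERBOSE = "EXAMPLE_DEFAULT_VERBOSE"

_TRUTHY = {"1", "true", "yes", "on"}


def resolve_defaults(target=None, backend=None, verbose=None, environ=None):
    """Fill in unset options from the environment.

    Explicit arguments always win; the environment is consulted only
    when the caller passed None. The final fallbacks ("auto",
    "default", False) are assumptions of this sketch.
    """
    env = os.environ if environ is None else environ
    if target is None:
        target = env.get(ENV_TARGET, "auto")
    if backend is None:
        backend = env.get(ENV_BACKEND, "default")
    if verbose is None:
        verbose = env.get(ENV_VERBOSE, "0").strip().lower() in _TRUTHY
    return {"target": target, "backend": backend, "verbose": verbose}
```

Resolving everything in one helper mirrors the "move all env logic to kernel_cache" commit in spirit: call sites pass `None` and a single place decides what the default is.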
* fp4 related update, require_cu13

* Enhance CUDA type conversion handling and optimize dtype management

- Updated CUDA vectorized cast functions to ensure proper handling of float16, float32, bfloat16, and float8 conversions, adding checks for bit sizes.
- Refactored dtype conversion logic in `cuda_fp4.h` to utilize `cudaRoundZero` for improved accuracy in floating-point conversions.
- Introduced a new method in `KernelParam` to convert TVM DataType to TileLang dtype.
- Adjusted argument binding logic in `arg_binder.cc` to allow for better subtype matching based on total bit counts.
- Enhanced dtype handling in `dtypes.py` to accommodate new float4_e2m1fn types and ensure compatibility with PyTorch.

This update aims to improve type safety and conversion accuracy across the codebase.
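
As a rough illustration of "subtype matching based on total bit counts", the sketch below treats a sub-byte dtype as compatible with a buffer dtype when `bits * lanes` is equal on both sides. The `DType` class and `storage_compatible` function are hypothetical and do not reproduce the logic in `arg_binder.cc`; the rule that only sub-byte expected types get the relaxed check is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DType:
    """Hypothetical stand-in for a (type code, bits, lanes) dtype triple."""
    code: str
    bits: int
    lanes: int = 1

    @property
    def total_bits(self) -> int:
        # Storage footprint of one vector element, in bits.
        return self.bits * self.lanes


def storage_compatible(expected: DType, actual: DType) -> bool:
    """Accept an exact match, or (assumption) a sub-byte expected type
    whose total bit count equals that of the actual buffer dtype."""
    if expected == actual:
        return True
    return expected.bits < 8 and expected.total_bits == actual.total_bits


# Two packed 4-bit floats occupy the same 8 bits as one int8.
fp4x2 = DType("float4_e2m1fn", bits=4, lanes=2)
int8 = DType("int", bits=8)
print(storage_compatible(fp4x2, int8))
```

Under this rule a pair of packed fp4 values can bind to an 8-bit storage buffer, while ordinary mismatches such as float16 against float32 are still rejected.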

* lint fix

* lint fix

* typo fix

---------

Co-authored-by: Zhiwen Mo <zm125@ic.ac.uk>
* Enhance GEMM layout functions to include a fallback for float64 when mat_stride % 8 != 0. Refactor swizzling layout conditions to check mat_stride before mat_continuous, improving layout selection logic for better performance.

* lint fix
* Fix fp4 pointer arithmetic in CUDA codegen

* Fix fp4 pointer arithmetic in CUDA codegen
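
Pointer arithmetic for fp4 is easy to get wrong because two 4-bit elements share one byte, so an element index cannot be used directly as a byte offset. The sketch below shows the index arithmetic in plain Python. It is not the code generated by the CUDA backend; the helper names are hypothetical, and storing the even-indexed element in the low nibble is an assumption of this example.

```python
def fp4_location(index: int) -> tuple[int, int]:
    """Map an fp4 element index to (byte offset, bit shift).

    Assumes two elements per byte with the even-indexed element
    in the low nibble.
    """
    return index // 2, (index % 2) * 4


def load_fp4(buf: bytes, index: int) -> int:
    """Return the raw 4-bit code stored at the given element index."""
    byte, shift = fp4_location(index)
    return (buf[byte] >> shift) & 0xF


def store_fp4(buf: bytearray, index: int, code: int) -> None:
    """Write a raw 4-bit code without disturbing the neighbouring nibble."""
    byte, shift = fp4_location(index)
    cleared = buf[byte] & ~(0xF << shift) & 0xFF
    buf[byte] = cleared | ((code & 0xF) << shift)


buf = bytearray(2)       # room for four fp4 elements
store_fp4(buf, 0, 0x3)
store_fp4(buf, 1, 0xA)
store_fp4(buf, 3, 0xF)
print(buf.hex())         # a3f0
```

The key point is the division by two in `fp4_location`: advancing by `n` elements advances the byte address by `n // 2`, not `n`.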
pull[bot] locked and limited conversation to collaborators on Dec 24, 2025
pull[bot] added the ⤵️ pull label on Dec 24, 2025
pull[bot] merged commit d7e264f into botbw:main on Dec 24, 2025

5 participants