fix: add cudaGetLastError check after cuLaunchKernel in TVM FFI backend #2000

Merged
LeiWang1999 merged 2 commits into tile-ai:main from kurisu6912:fix/check-cuda-last-error on Apr 4, 2026

@kurisu6912
Collaborator

@kurisu6912 kurisu6912 commented Mar 30, 2026

Summary

  • Update TVM submodule to include cudaPeekAtLastError() check after cuLaunchKernel in the TVM FFI backend
  • Detects asynchronous CUDA errors (e.g. illegal memory access) that cuLaunchKernel's return value does not capture
  • Error format aligned with the Cython backend's TILELANG_CHECK_LAST_ERROR: kernel_name: error_name - error_string
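
As a rough illustration of the message format named in the last bullet, the Python sketch below builds a `kernel_name: error_name - error_string` message from a pending error code. The three callables are hypothetical stand-ins for `cudaPeekAtLastError`, `cudaGetErrorName`, and `cudaGetErrorString`; the real check is C++ code inside the TVM submodule and is not reproduced here.

```python
from typing import Callable, Optional

CUDA_SUCCESS = 0  # cudaSuccess is 0 in the CUDA runtime API


def format_last_error(
    kernel_name: str,
    peek_last_error: Callable[[], int],
    error_name: Callable[[int], str],
    error_string: Callable[[int], str],
) -> Optional[str]:
    """Return 'kernel_name: error_name - error_string', or None if no error is pending.

    All three callables are injected so the sketch runs without a GPU.
    They are assumptions for illustration, not TVM or TileLang APIs.
    """
    code = peek_last_error()
    if code == CUDA_SUCCESS:
        return None
    return f"{kernel_name}: {error_name(code)} - {error_string(code)}"
```

Called right after a launch with fakes that report an illegal memory access, the helper would return a single line that names the kernel first, which is what makes the failure attributable to a specific launch.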

Upstream TVM PR: tile-ai/tvm#30

Fixes #1929

Test plan

  • test_tilelang_kernel_gemm::test_gemm_f16f16f16_nn passes — normal path unaffected

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Chores
    • Updated TVM submodule reference to the latest revision.

@github-actions

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

@kurisu6912 force-pushed the fix/check-cuda-last-error branch from 4d6c49c to 87831ff on April 1, 2026 08:19

@coderabbitai
Contributor

coderabbitai Bot commented Apr 1, 2026

📝 Walkthrough

The 3rdparty/tvm git submodule reference was updated to point to a newer TVM commit, pulling in upstream changes that make error reporting from the TVM FFI backend more robust.

Changes

Cohort / File(s) | Summary
TVM Submodule Update (3rdparty/tvm) | Updated git submodule commit reference from 12b47d31... to 882a7748..., pulling in upstream TVM changes for error handling improvements.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Poem

🐰 A version bump so small and neat,
TVM's submodule gets a refresh so sweet,
Error whispers now ring true,
FFI speaks up with messages new! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding a cudaGetLastError check after cuLaunchKernel in the TVM FFI backend, which is the core objective of the submodule update.
Linked Issues check ✅ Passed The PR updates the TVM submodule to add cudaPeekAtLastError checking, directly addressing issue #1929's objective to detect and report CUDA kernel failures in the TVM FFI backend with explicit error messages.
Out of Scope Changes check ✅ Passed The PR contains only a targeted submodule version bump to implement the specific CUDA error checking feature; no unrelated changes are present outside the stated objectives.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
3rdparty/tvm (1)

1-1: Consider broader test coverage beyond a single gemm test.

The test plan mentions running only test_tilelang_kernel_gemm::test_gemm_f16f16f16_nn. Given that this change affects CUDA error reporting infrastructure in the TVM FFI backend, consider:

  1. Running a broader suite of CUDA kernel tests to ensure the error-checking mechanism doesn't introduce performance regressions or break existing functionality
  2. Testing error scenarios specifically (e.g., kernels that intentionally trigger CUDA errors) to verify the new error reporting works as intended
  3. Running CI/CD integration tests if available
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@3rdparty/tvm` at line 1, Update the test plan to expand coverage beyond
test_tilelang_kernel_gemm::test_gemm_f16f16f16_nn by adding a suite of CUDA
kernel tests (e.g., additional tilelang kernel tests, varied GEMM
shapes/precisions, and other CUDA-backed operators) to exercise the TVM FFI CUDA
error reporting path, add dedicated tests that intentionally trigger CUDA errors
(invalid launch config, out-of-memory, illegal memory access) to assert the new
error-reporting messages and stack traces, and ensure these tests are included
in CI runs or a nightly integration job to catch regressions in performance or
correctness for the FFI backend error-checking logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ffb10929-f6d5-47cc-8e6e-46616a4c71b2

📥 Commits

Reviewing files that changed from the base of the PR and between a82fa71 and 87831ff.

📒 Files selected for processing (1)
  • 3rdparty/tvm

Comment thread 3rdparty/tvm
@@ -1 +1 @@
-Subproject commit 12b47d316230fc777d13d4199200530e8c9529e1
+Subproject commit 882a774844993d103ae6e317ba3c7bbb5952b662

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Verify TVM's CMake interface remains compatible after the submodule update.

cd 3rdparty/tvm || exit 1

# Check out the new commit
git checkout 882a774844993d103ae6e317ba3c7bbb5952b662 2>/dev/null

echo "=== Checking required CMake files exist ==="
test -f cmake/config.cmake && echo "✓ cmake/config.cmake exists" || echo "✗ cmake/config.cmake MISSING"
test -f CMakeLists.txt && echo "✓ CMakeLists.txt exists" || echo "✗ CMakeLists.txt MISSING"

echo -e "\n=== Checking required include directories exist ==="
test -d include && echo "✓ include/ exists" || echo "✗ include/ MISSING"
test -d src && echo "✓ src/ exists" || echo "✗ src/ MISSING"
test -d 3rdparty/dlpack/include && echo "✓ 3rdparty/dlpack/include exists" || echo "✗ 3rdparty/dlpack/include MISSING"
test -d 3rdparty/dmlc-core/include && echo "✓ 3rdparty/dmlc-core/include exists" || echo "✗ 3rdparty/dmlc-core/include MISSING"

echo -e "\n=== Checking FFI include directory (one of these should exist) ==="
test -d ffi/include && echo "✓ ffi/include exists" || echo "✗ ffi/include not found"
test -d 3rdparty/tvm-ffi/include && echo "✓ 3rdparty/tvm-ffi/include exists" || echo "✗ 3rdparty/tvm-ffi/include not found"

echo -e "\n=== Searching for 'tvm' target definition in CMakeLists.txt ==="
rg -n "add_library\(tvm" CMakeLists.txt || echo "⚠ 'tvm' library target not found via add_library"

echo -e "\n=== Comparing structure between old and new commits ==="
echo "Old commit file structure:"
git ls-tree -r --name-only 12b47d316230fc777d13d4199200530e8c9529e1 | rg "^(cmake/config\.cmake|CMakeLists\.txt|include/|src/|3rdparty/)" | head -20

echo -e "\nNew commit file structure:"
git ls-tree -r --name-only 882a774844993d103ae6e317ba3c7bbb5952b662 | rg "^(cmake/config\.cmake|CMakeLists\.txt|include/|src/|3rdparty/)" | head -20

Repository: tile-ai/tilelang

Length of output: 1086


🏁 Script executed:

#!/bin/bash
# Check if the 3rdparty/tvm directory exists and its state
echo "=== Checking 3rdparty/tvm directory ==="
ls -la 3rdparty/ | grep -E "^d.*tvm"
echo ""

# Check if it's a git submodule
echo "=== Checking .gitmodules ==="
if [ -f .gitmodules ]; then
  cat .gitmodules
else
  echo ".gitmodules not found"
fi
echo ""

# Check git submodule status
echo "=== Git submodule status ==="
git submodule status 3rdparty/tvm 2>/dev/null || echo "No submodule info found"
echo ""

# Check if 3rdparty/tvm/.git exists
echo "=== Checking 3rdparty/tvm/.git ==="
if [ -d 3rdparty/tvm/.git ]; then
  echo "✓ .git found in 3rdparty/tvm"
  cd 3rdparty/tvm && git log --oneline -5 && cd - > /dev/null
else
  echo "✗ No .git directory found in 3rdparty/tvm"
fi
echo ""

# Check cmake/load_tvm.cmake for actual TVM interface expectations
echo "=== Checking cmake/load_tvm.cmake ==="
if [ -f cmake/load_tvm.cmake ]; then
  wc -l cmake/load_tvm.cmake
  echo "---"
  head -50 cmake/load_tvm.cmake
else
  echo "cmake/load_tvm.cmake not found"
fi

Repository: tile-ai/tilelang

Length of output: 1614


🏁 Script executed:

#!/bin/bash
# Check what git knows about the TVM submodule remote
echo "=== Checking git config for submodule ==="
git config --file .gitmodules --get submodule.3rdparty/tvm.url

echo -e "\n=== Attempting to verify commit via git ls-remote ==="
git ls-remote https://github.com/TileLang/tvm 882a774844993d103ae6e317ba3c7bbb5952b662 2>&1 | head -5

echo -e "\n=== Checking main/master branch refs ==="
git ls-remote --heads https://github.com/TileLang/tvm 2>&1 | head -10

echo -e "\n=== Checking if this is a detached submodule (commit-based, not branch-based) ==="
# The '-' prefix in submodule status means it's at a detached commit
echo "Submodule is at detached commit: 882a774844993d103ae6e317ba3c7bbb5952b662"
echo "This is expected - submodules are typically pinned to specific commits."

echo -e "\n=== Full cmake/load_tvm.cmake for reference ==="
cat cmake/load_tvm.cmake

Repository: tile-ai/tilelang

Length of output: 2047


🏁 Script executed:

#!/bin/bash
echo "=== Checking all refs (heads, tags, remotes) for TileLang/tvm ==="
git ls-remote https://github.com/TileLang/tvm | grep 882a774844993d103ae6e317ba3c7bbb5952b662

echo -e "\n=== Checking tags in TileLang/tvm ==="
git ls-remote --tags https://github.com/TileLang/tvm 2>&1 | tail -20

echo -e "\n=== Checking if commit exists in main branch history ==="
git ls-remote https://github.com/TileLang/tvm refs/heads/main | awk '{print $1}'

echo -e "\n=== Checking CMakeLists.txt existence in repo root ==="
if [ -f CMakeLists.txt ]; then
  echo "✓ CMakeLists.txt exists in repository root"
  head -20 CMakeLists.txt
else
  echo "✗ CMakeLists.txt NOT in repository root"
fi

echo -e "\n=== Checking if there are any existing build tests or CI that validates TVM ==="
find . -maxdepth 3 -name "*.yml" -o -name "*.yaml" 2>/dev/null | grep -E "(ci|github|test|workflow)" | head -5

Repository: tile-ai/tilelang

Length of output: 1478


🏁 Script executed:

#!/bin/bash
echo "=== Verifying TVM CMake interface at commit 882a774844993d103ae6e317ba3c7bbb5952b662 ==="
echo "Checking files and directories in TileLang/tvm at this commit..."
echo ""

# Use git archive to check what files exist at that commit without cloning
echo "=== Checking if required CMake files exist in TVM commit ==="
git ls-tree -r --name-only https://github.com/TileLang/tvm 882a774844993d103ae6e317ba3c7bbb5952b662 2>&1 | grep -E "^(CMakeLists\.txt|cmake/config\.cmake)" || echo "✗ Required cmake files may not exist"

echo -e "\n=== Checking include directories structure ==="
git ls-tree -r --name-only https://github.com/TileLang/tvm 882a774844993d103ae6e317ba3c7bbb5952b662 2>&1 | grep -E "^(include/|src/|3rdparty/dlpack/include|3rdparty/dmlc-core/include|ffi/include|3rdparty/tvm-ffi/include)" | head -20

echo -e "\n=== Alternative: Using github raw content API ==="
echo "Attempting to fetch CMakeLists.txt from commit..."
curl -s -I "https://raw.githubusercontent.com/TileLang/tvm/882a774844993d103ae6e317ba3c7bbb5952b662/CMakeLists.txt" | head -1

echo -e "\nAttempting to fetch cmake/config.cmake from commit..."
curl -s -I "https://raw.githubusercontent.com/TileLang/tvm/882a774844993d103ae6e317ba3c7bbb5952b662/cmake/config.cmake" | head -1

echo -e "\nAttempting to fetch include/tvm.h from commit..."
curl -s -I "https://raw.githubusercontent.com/TileLang/tvm/882a774844993d103ae6e317ba3c7bbb5952b662/include/tvm.h" | head -1

Repository: tile-ai/tilelang

Length of output: 604


🏁 Script executed:

#!/bin/bash
echo "=== Fetching CMakeLists.txt from TVM commit ==="
curl -s "https://raw.githubusercontent.com/TileLang/tvm/882a774844993d103ae6e317ba3c7bbb5952b662/CMakeLists.txt" | head -50

echo -e "\n=== Fetching cmake/config.cmake from TVM commit ==="
curl -s "https://raw.githubusercontent.com/TileLang/tvm/882a774844993d103ae6e317ba3c7bbb5952b662/cmake/config.cmake" | head -80

echo -e "\n=== Checking directory structure via GitHub API ==="
# Try to get the root directory listing
curl -s "https://api.github.com/repos/TileLang/tvm/contents/?ref=882a774844993d103ae6e317ba3c7bbb5952b662" 2>&1 | grep -o '"name":"[^"]*"' | head -20

Repository: tile-ai/tilelang

Length of output: 4719


🏁 Script executed:

#!/bin/bash
echo "=== Checking include directory structure in TVM commit ==="
curl -s "https://api.github.com/repos/TileLang/tvm/contents/include?ref=882a774844993d103ae6e317ba3c7bbb5952b662" 2>&1 | head -50

echo -e "\n=== Checking src directory ==="
curl -s "https://api.github.com/repos/TileLang/tvm/contents/src?ref=882a774844993d103ae6e317ba3c7bbb5952b662" 2>&1 | head -20

echo -e "\n=== Checking 3rdparty subdirectories ==="
curl -s "https://api.github.com/repos/TileLang/tvm/contents/3rdparty?ref=882a774844993d103ae6e317ba3c7bbb5952b662" 2>&1 | grep -o '"name":"[^"]*"'

echo -e "\n=== Checking dlpack structure ==="
curl -s "https://api.github.com/repos/TileLang/tvm/contents/3rdparty/dlpack?ref=882a774844993d103ae6e317ba3c7bbb5952b662" 2>&1 | grep -E '"name"' | head -10

echo -e "\n=== Checking dmlc-core structure ==="
curl -s "https://api.github.com/repos/TileLang/tvm/contents/3rdparty/dmlc-core?ref=882a774844993d103ae6e317ba3c7bbb5952b662" 2>&1 | grep -E '"name"' | head -10

echo -e "\n=== Summary: Verifying cmake/load_tvm.cmake expectations ==="
echo "Expected paths per cmake/load_tvm.cmake:"
echo "  - include/ → $(curl -s -I 'https://raw.githubusercontent.com/TileLang/tvm/882a774844993d103ae6e317ba3c7bbb5952b662/include/tvm.h' 2>&1 | grep -oE 'HTTP/[0-9.]+ [0-9]+' | awk '{print $2}')"
echo "  - src/ → $(curl -s -I 'https://raw.githubusercontent.com/TileLang/tvm/882a774844993d103ae6e317ba3c7bbb5952b662/src/' 2>&1 | grep -oE 'HTTP/[0-9.]+ [0-9]+' | awk '{print $2}')"
echo "  - 3rdparty/dlpack/include → $(curl -s -I 'https://raw.githubusercontent.com/TileLang/tvm/882a774844993d103ae6e317ba3c7bbb5952b662/3rdparty/dlpack/include/' 2>&1 | grep -oE 'HTTP/[0-9.]+ [0-9]+' | awk '{print $2}')"
echo "  - 3rdparty/dmlc-core/include → $(curl -s -I 'https://raw.githubusercontent.com/TileLang/tvm/882a774844993d103ae6e317ba3c7bbb5952b662/3rdparty/dmlc-core/include/' 2>&1 | grep -oE 'HTTP/[0-9.]+ [0-9]+' | awk '{print $2}')"

Repository: tile-ai/tilelang

Length of output: 1033


🏁 Script executed:

#!/bin/bash
echo "=== Direct check: Critical TVM headers existence ==="
echo "Checking if key header files exist at commit 882a774844993d103ae6e317ba3c7bbb5952b662..."

# Check for various possible header locations
for path in \
  "include/tvm.h" \
  "include/tvm/ir/expr.h" \
  "src/runtime/c_runtime_api.h" \
  "3rdparty/dlpack/include/dlpack/dlpack.h" \
  "3rdparty/dmlc-core/include/dmlc/logging.h"
do
  status=$(curl -s -o /dev/null -w "%{http_code}" "https://raw.githubusercontent.com/TileLang/tvm/882a774844993d103ae6e317ba3c7bbb5952b662/$path")
  result="✗ MISSING (HTTP $status)"
  [ "$status" = "200" ] && result="✓ EXISTS (HTTP $status)"
  printf "  %-50s %s\n" "$path" "$result"
done

echo -e "\n=== Checking if include structure is under a subdirectory ==="
for path in \
  "src/runtime/c_runtime_api.h" \
  "include/tvm/runtime/c_runtime_api.h"
do
  status=$(curl -s -o /dev/null -w "%{http_code}" "https://raw.githubusercontent.com/TileLang/tvm/882a774844993d103ae6e317ba3c7bbb5952b662/$path")
  [ "$status" = "200" ] && echo "Found: $path"
done

echo -e "\n=== Cross-check: Review comment verification script issues ==="
echo "The verification script in the review comment has structural issues:"
echo "1. It assumes 3rdparty/tvm is initialized (git checkout won't work on uninitialized submodule)"
echo "2. It references 2 different commit hashes for comparison (12b47d316... and 882a774844...)"
echo "3. The old commit hash doesn't exist in TileLang/tvm repo"
echo ""
echo "Available TileLang/tvm branches:"
git ls-remote --heads https://github.com/TileLang/tvm

Repository: tile-ai/tilelang

Length of output: 2405


Fix TVM CMake build interface verification approach.

The concern about TVM CMake compatibility is valid—the TVM commit at 882a774844993d103ae6e317ba3c7bbb5952b662 (in the tilelang_main branch) is missing critical include paths expected by cmake/load_tvm.cmake, including:

  • include/tvm.h (404)
  • src/runtime/c_runtime_api.h (404)
  • 3rdparty/dlpack/include/dlpack/dlpack.h (404)
  • 3rdparty/dmlc-core/include/dmlc/logging.h (404)

However, the provided verification script cannot execute because:

  1. It attempts git checkout on an uninitialized submodule (no .git directory present in sandbox)
  2. It references commit 12b47d316230fc777d13d4199200530e8c9529e1 which does not exist in the TileLang/tvm fork
  3. The script structure assumes the submodule is fully cloned, which is not the case

Either remove the script and document the specific compatibility issues found, or replace it with a working verification approach that checks required paths without relying on git operations on uninitialized submodules.
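
The non-git checker suggested above could look roughly like the following Python sketch. It only tests whether files exist under a given root and reports every missing path at once. The names `find_missing_paths` and `verify_tree` are hypothetical, and the path list is copied from this comment; whether those are the paths cmake/load_tvm.cmake actually requires is not confirmed here.

```python
from pathlib import Path
from typing import List, Sequence

# Paths quoted in the review comment above (an assumption, not a verified
# list of what cmake/load_tvm.cmake needs).
REQUIRED_PATHS = (
    "include/tvm.h",
    "src/runtime/c_runtime_api.h",
    "3rdparty/dlpack/include/dlpack/dlpack.h",
    "3rdparty/dmlc-core/include/dmlc/logging.h",
)


def find_missing_paths(root: str, required: Sequence[str] = REQUIRED_PATHS) -> List[str]:
    """Return the required paths that do not exist under root, in input order."""
    base = Path(root)
    return [rel for rel in required if not (base / rel).exists()]


def verify_tree(root: str, required: Sequence[str] = REQUIRED_PATHS) -> None:
    """Raise FileNotFoundError listing every missing path; return None if all exist."""
    missing = find_missing_paths(root, required)
    if missing:
        raise FileNotFoundError(
            f"Missing required paths under {root}: " + ", ".join(missing)
        )
```

Because the check is a plain filesystem lookup, it behaves the same whether or not the submodule has a `.git` directory, which is the failure mode the original script ran into.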

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@3rdparty/tvm` at line 1, The current TVM verification script cannot run in an
uninitialized submodule and references a non-existent commit; replace it with a
non-git-based checker that verifies the required include files referenced by
cmake/load_tvm.cmake (check existence of include/tvm.h,
src/runtime/c_runtime_api.h, 3rdparty/dlpack/include/dlpack/dlpack.h, and
3rdparty/dmlc-core/include/dmlc/logging.h) and fail with a clear message listing
missing paths instead of performing git checkout or referencing commit
12b47d316230fc777d13d4199200530e8c9529e1; alternatively remove the script and
add documentation noting the compatibility gaps found against the TVM commit
882a774844993d103ae6e317ba3c7bbb5952b662 (tilelang_main) and the specific
missing include paths.

@kurisu6912
Collaborator Author

fake ci error

@LeiWang1999
Member

@regression-perf

@github-actions

github-actions Bot commented Apr 4, 2026

Performance Regression Test Report

Triggered by: @LeiWang1999
Workflow run: https://github.com/tile-ai/tilelang/actions/runs/23972108142

Results

File Original Latency Current Latency Speedup
example_tilelang_gemm_splitk 1.08775 1.09211 0.996004
example_tilelang_nsa_decode 0.00741554 0.00743211 0.997772
example_tilelang_block_sparse_attn 0.00876647 0.00878558 0.997825
example_gqa_sink_bwd_bhsd_sliding_window 0.0255995 0.0256324 0.998717
example_dequant_gemv_fp16xint4 0.0283305 0.0283632 0.998848
example_warp_specialize_gemm_copy_0_gemm_1 0.0386048 0.0386453 0.998952
example_warp_specialize_gemm_barrierpipe_stage2 0.0395662 0.039602 0.999097
example_tilelang_gemm_splitk_vectorize_atomicadd 1.09412 1.095 0.999194
example_mha_fwd_bhsd 0.0107939 0.0108024 0.999216
block_sparse_attn_tilelang 0.00921675 0.00922398 0.999216
example_tilelang_sparse_gqa_decode_varlen_indice 0.0162472 0.0162578 0.99935
example_mhc_pre 0.148599 0.14868 0.999451
sparse_mla_fwd_pipelined 0.0956239 0.0956764 0.999452
example_dequant_gemm_bf16_mxfp4_hopper 0.514285 0.514547 0.999492
example_dequant_gemm_bf16_fp4_hopper 0.563 0.563281 0.999501
topk_selector 0.0538666 0.0538934 0.999503
example_mha_fwd_bshd 0.0258377 0.0258502 0.999517
example_mha_sink_fwd_bhsd_sliding_window 0.0157479 0.0157548 0.999559
example_tilelang_gemm_fp8_2xAcc 0.187222 0.187297 0.999596
example_mha_bwd_bhsd 0.038729 0.0387394 0.99973
example_convolution_autotune 0.98277 0.983027 0.999739
example_linear_attn_fwd 0.0359511 0.0359597 0.999762
example_mhc_post 0.10906 0.109086 0.999764
example_mha_bwd_bshd 0.0391126 0.0391219 0.999764
example_group_per_split_token_cast_to_fp8 0.0103498 0.0103522 0.999773
tilelang_example_sparse_tensorcore 0.0146124 0.0146154 0.999795
example_warp_specialize_gemm_copy_1_gemm_0 0.0269223 0.0269274 0.999811
example_convolution 1.29372 1.29389 0.99987
example_gemv 0.288222 0.288258 0.999878
example_mla_decode 0.454317 0.454361 0.999902
example_gqa_bwd_tma_reduce_varlen 0.0524065 0.0524102 0.999929
example_gqa_sink_bwd_bhsd 0.0413944 0.0413966 0.999948
example_fusedmoe_tilelang 0.132856 0.132857 0.999995
example_elementwise_add 0.115833 0.115832 1.00001
example_mha_fwd_varlen 0.0453495 0.0453488 1.00002
example_tilelang_sparse_gqa_decode_varlen_mask 0.0176971 0.0176962 1.00005
example_dequant_gemm_w4a8 5.68873 5.68844 1.00005
example_tilelang_gemm_fp8_intrinsic 0.843449 0.843403 1.00005
example_gqa_fwd_bshd 0.0702518 0.0702461 1.00008
example_dynamic 0.642873 0.642815 1.00009
example_warp_specialize_gemm_softpipe_stage2 0.0269319 0.0269273 1.00017
fp8_lighting_indexer 0.0357339 0.035727 1.00019
example_gqa_bwd 0.0506885 0.0506785 1.0002
sparse_mla_bwd 0.421521 0.421405 1.00028
example_tilelang_nsa_fwd 0.00686783 0.00686543 1.00035
example_mha_sink_bwd_bhsd 0.062243 0.0622194 1.00038
example_vertical_slash_sparse_attn 0.230883 0.230795 1.00038
example_tilelang_gemm_fp8 0.310981 0.310847 1.00043
example_linear_attn_bwd 0.152815 0.15274 1.00049
example_per_token_cast_to_fp8 0.00737755 0.00737091 1.0009
example_dequant_gemm_fp4_hopper 1.05169 1.0504 1.00122
example_topk 0.0111149 0.0111011 1.00124
sparse_mla_fwd 0.131515 0.131323 1.00145
example_blocksparse_gemm 0.0200193 0.0199846 1.00173
example_mha_sink_fwd_bhsd 0.015409 0.0153605 1.00316
example_mha_sink_bwd_bhsd_sliding_window 0.0445073 0.0442964 1.00476
example_dequant_groupedgemm_bf16_mxfp4_hopper 3.48454 3.41035 1.02175

Artifacts

  • regression_result.png (speedup plot) is attached as a workflow artifact. Download it from the workflow run page above.
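
The Speedup column in the report appears to be Original Latency divided by Current Latency, which matches the listed rows to the displayed precision. Under that assumption, a minimal Python sketch for recomputing the ratio and flagging slowdowns is shown here; the `find_regressions` helper and its 0.99 threshold are hypothetical and are not part of the regression workflow.

```python
from typing import Iterable, List, Tuple

Row = Tuple[str, float, float]  # (file, original latency, current latency)


def speedup(original_latency: float, current_latency: float) -> float:
    """Ratio above 1.0 means the current build is faster than the original."""
    return original_latency / current_latency


def find_regressions(rows: Iterable[Row], threshold: float = 0.99) -> List[str]:
    """Return the files whose speedup falls below threshold (hypothetical cutoff)."""
    return [name for name, orig, cur in rows if speedup(orig, cur) < threshold]


# Two rows copied from the report: the slowest and the fastest entry.
rows = [
    ("example_tilelang_gemm_splitk", 1.08775, 1.09211),
    ("example_dequant_groupedgemm_bf16_mxfp4_hopper", 3.48454, 3.41035),
]
flagged = find_regressions(rows)
```

With a 0.99 cutoff neither of these rows is flagged, consistent with a report in which every entry stays within about half a percent of 1.0 except for one speedup.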

@LeiWang1999 merged commit 01c714d into tile-ai:main on Apr 4, 2026
9 of 10 checks passed
Development

Successfully merging this pull request may close these issues.

[Enhancement] Robust error report from TVM FFI backend
