[Misc] Improve debug tooling and fix semaphore codegen #112
Merged
yaoyaoding merged 3 commits into main on Apr 9, 2026
495b2b1 to 68acf2c
- Add `disable_ptxas_opt` debug option (replaces `launch_blocking`) to compile with ptxas -O0 for easier PTX/SASS debugging
- Include `disable_ptxas_opt` and `target` in program cache key so different debug/target configs don't share cached builds
- Improve SASS dump to use nvdisasm -g for source-annotated disassembly
- Add env var support for TILUS_CACHE_DIR and TILUS_DUMP_IR options
- Fix LockSemaphoreEmitter to generate correct spin-wait code when inside nested thread groups (where warp-level sync_reduce is unavailable)
- Add `current_thread_group_depth` property to BaseInstEmitter
- Use cluster_sync instead of sync before tmem dealloc in matmul_v8
- Remove unused matmul_v9 example

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Yaoyao Ding <dingyaoyao.cs@gmail.com>
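
To illustrate why the cache key needs `disable_ptxas_opt` and `target`, here is a minimal sketch of one way such a key could be derived. It is not the code from this PR: `program_cache_key` and `resolve_cache_dir` are hypothetical helpers, and the rule that `TILUS_CACHE_DIR` overrides a configured directory is an assumption made for the example.

```python
import hashlib
import json
import os


def program_cache_key(source: str, target: str, disable_ptxas_opt: bool) -> str:
    """Hypothetical helper: hash the program text together with every option
    that changes the compiled binary, so distinct configs get distinct keys."""
    payload = json.dumps(
        {
            "source": source,
            "target": target,
            "disable_ptxas_opt": disable_ptxas_opt,
        },
        sort_keys=True,  # stable field order gives a stable hash
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def resolve_cache_dir(configured: str) -> str:
    """Hypothetical helper: assume TILUS_CACHE_DIR, when set, takes
    precedence over the directory configured in code."""
    return os.environ.get("TILUS_CACHE_DIR", configured)


if __name__ == "__main__":
    optimized = program_cache_key("kernel body", "sm_90", disable_ptxas_opt=False)
    debug = program_cache_key("kernel body", "sm_90", disable_ptxas_opt=True)
    # The same program built with and without ptxas optimization must not collide.
    print(optimized != debug)
```

If either field were left out of the hashed payload, a build compiled with ptxas -O0 could be served from the cache to a later optimized run, or a binary for one target reused on another.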
68acf2c to 4f022e7