TritonParse v0.3.2 Release 🎉

@FindHao FindHao released this 26 Dec 18:12
· 304 commits to main since this release

TritonParse Release Notes v0.3.2 (34 commits)

  • Date range: 2025-11-05 to 2025-12-22
  • Scope: Major feature release - New info CLI subcommand, multi-file call graph analysis for reproducers, unified 0-based indexing, IR extraction tools, and infrastructure improvements.

Highlights

  • 📊 New info CLI Subcommand: Query kernel information from NDJSON trace files without manual parsing. List all kernels with launch counts, view launches for specific kernels, and get fuzzy matching suggestions for kernel names.
  • 🔍 Multi-File Call Graph Analyzer: Advanced AST-based analysis that automatically extracts all transitively-called functions across multiple Python files. Enables self-contained kernel reproducers with all dependencies included.
  • 🎯 Unified 0-Based Indexing: All launch indices throughout the codebase (CLI, website, internal APIs) now use consistent 0-based indexing following Python conventions.
  • ⚡ Enhanced Reproducer: New --kernel and --launch-id arguments eliminate manual line number lookup. Also adds AST-based dependency extraction, an autotune disabler, and code formatting for generated scripts.
  • 🛠️ IR Extraction Tool: New command-line tool to extract Triton IRs (TTIR, TTGIR, LLIR, PTX) from trace logs with flexible output organization.
  • 🔐 PyPI Trusted Publishing: Migrated from API token authentication to OIDC-based Trusted Publishing for improved security and attestations.

Changes by area

📊 New info CLI Subcommand

  • Core query layer (PR #208):
    • New tritonparse/info/ module for kernel information queries
    • KernelSummary and LaunchInfo dataclasses for structured results
    • list_kernels(): List all kernels with launch counts
    • find_launch_index_by_kernel(): Find line index for a kernel's N-th launch
  • CLI interface (PR #210):
    • tritonparseoss info <trace.ndjson> - List all kernels with launch counts
    • tritonparseoss info <trace.ndjson> --kernel <name> - List launches for specific kernel
    • Auto-parsing: Automatically detects and parses raw logs
    • Fuzzy matching suggestions when kernel not found
    • Performance optimization using launch_diff events when available
  • Additional filtering (commit 8134195):
    • Added --args-list filtering to info command
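
The kernel listing and fuzzy-match suggestions described above can be approximated with the standard library alone. This is a minimal sketch, not TritonParse's implementation: the `event_type` and `name` fields, the sample records, and the helper names `count_launches` and `suggest_kernels` are all assumptions made for illustration.

```python
import json
from collections import Counter
from difflib import get_close_matches

# Hypothetical NDJSON trace: one JSON object per line. The field names
# ("event_type", "name") are assumptions, not TritonParse's actual schema.
SAMPLE_TRACE = "\n".join([
    '{"event_type": "launch", "name": "matmul_kernel"}',
    '{"event_type": "compilation", "name": "matmul_kernel"}',
    '{"event_type": "launch", "name": "matmul_kernel"}',
    '{"event_type": "launch", "name": "softmax_kernel"}',
])


def count_launches(ndjson_text):
    """Count launch events per kernel name."""
    counts = Counter()
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        event = json.loads(line)
        if event.get("event_type") == "launch":
            counts[event["name"]] += 1
    return counts


def suggest_kernels(query, known_names, limit=3):
    """Return close matches for a kernel name that was not found."""
    return get_close_matches(query, list(known_names), n=limit, cutoff=0.6)


counts = count_launches(SAMPLE_TRACE)
for name, n in sorted(counts.items()):
    print(f"{name}: {n} launch(es)")

# A misspelled kernel name still yields a usable suggestion.
print(suggest_kernels("matmul_kernl", counts))
```

`difflib.get_close_matches` ranks candidates by similarity ratio, so the best suggestion comes first. The real command may use a different matching strategy.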

🔍 Multi-File Call Graph Analyzer

  • Three-phase implementation (PR #206 Phase 1-3):
    • Phase 1 - ImportResolver: Multi-file call graph analysis foundation
    • Phase 2 - ImportParser: AST-based import statement parsing
    • Phase 3 - MultiFileCallGraphAnalyzer: Complete multi-file traversal with BFS
  • Key features:
    • Automatic extraction of transitively-called functions across Python files
    • Per-file code root tracking (fbcode, Python projects, Git repositories)
    • Graceful fallback for files outside detected roots
    • Integrated into reproducer for self-contained script generation
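
The core idea, an AST-derived call graph traversed breadth-first, can be illustrated on a single module. This sketch is not the MultiFileCallGraphAnalyzer itself: it handles only direct calls by simple name within one source string, and the names `build_call_graph` and `transitive_callees` are hypothetical. Following imports across files, as the release does, would require an extra resolution step that is omitted here.

```python
import ast
from collections import deque

# Hypothetical module source standing in for a kernel file and its helpers.
SOURCE = '''
def scale(x):
    return x * 2

def offset(x):
    return scale(x) + 1

def unused(x):
    return x

def kernel(x):
    return offset(x)
'''


def build_call_graph(source):
    """Map each top-level function name to the simple names it calls."""
    tree = ast.parse(source)
    graph = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            callees = set()
            for sub in ast.walk(node):
                # Only direct calls by simple name, such as foo(...), are tracked.
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    callees.add(sub.func.id)
            graph[node.name] = callees
    return graph


def transitive_callees(graph, entry):
    """Breadth-first traversal from `entry` over functions defined in `graph`."""
    seen = set()
    queue = deque([entry])
    while queue:
        current = queue.popleft()
        for callee in sorted(graph.get(current, ())):
            if callee in graph and callee not in seen and callee != entry:
                seen.add(callee)
                queue.append(callee)
    return seen


graph = build_call_graph(SOURCE)
deps = transitive_callees(graph, "kernel")
print(sorted(deps))
```

Here `kernel` reaches `offset` directly and `scale` transitively, while `unused` is never visited, which is the property that keeps a generated reproducer limited to the code it needs.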

🎯 Unified 0-Based Indexing

  • Breaking change (PR #211):
    • All launch indices now use 0-based indexing
    • Affects: trace processor, website components (KernelOverview, DiffViewer, StackDiffViewer, ArgumentViewer)
    • CLI --line argument changed to 0-based (PR #205)
  • Rationale:
    • Consistency with Python conventions
    • Alignment with existing info and reproduce commands
    • Simpler code without +1/-1 conversions
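
For scripts that stored 1-based launch numbers from earlier releases, the migration is a single subtraction. The helper name `to_zero_based` and the sample list below are hypothetical and only illustrate the convention.

```python
def to_zero_based(one_based_index):
    """Convert a 1-based launch number from an older script to a 0-based index."""
    if one_based_index < 1:
        raise ValueError("1-based indices start at 1")
    return one_based_index - 1


# Hypothetical launch records, in the order they appear in a trace.
launches = ["launch_a", "launch_b", "launch_c"]

# The launch formerly called "launch 1" is now index 0.
first = launches[to_zero_based(1)]
print(first)

# With 0-based indexing, enumerate() output can be used directly.
for index, launch in enumerate(launches):
    print(index, launch)
```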

⚡ Reproducer Enhancements

  • Kernel name lookup (PR #209):
    • New --kernel argument to specify kernel by name instead of line number
    • New --launch-id argument (0-based) to select specific launch
    • Mutual exclusivity with --line argument
    • Example: tritonparseoss reproduce trace.ndjson --kernel matmul_kernel --launch-id 2
  • AST-based dependency extraction (commit 8ad24f6):
    • Automatic extraction of dependent helper functions
    • Call graph analysis for transitive dependencies
    • Self-contained reproducers without manual function hunting
  • Autotune disabler (commit 28486fc):
    • Automatically disable Triton's autotune decorator in generated scripts
    • New utils.py module with _disable_triton_autotune() function
    • Works with both IMPORT and COPY kernel import modes
  • Code formatting (commit 311e016):
    • Generated reproducers are now properly formatted
  • Bug fixes:
    • Fix FileNotFoundError with absolute path templates (commit 458e6e9)
    • Fix kernel signature parsing for return type annotations (commit d24ae1d)
    • Support for Triton dtype parameters (commit 86fa46b)
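
One common way to make two selectors mutually exclusive is argparse's `add_mutually_exclusive_group`. The parser below is a hypothetical sketch and not TritonParse's actual CLI wiring. In particular, the default of 0 for --launch-id and its placement outside the exclusive group are assumptions.

```python
import argparse


def build_parser():
    # Hypothetical parser; TritonParse's real CLI may be structured differently.
    parser = argparse.ArgumentParser(prog="reproduce-sketch")
    parser.add_argument("trace", help="path to an NDJSON trace file")
    selector = parser.add_mutually_exclusive_group()
    selector.add_argument("--line", type=int,
                          help="0-based line index of the launch event")
    selector.add_argument("--kernel", help="kernel name to look up")
    parser.add_argument("--launch-id", type=int, default=0,
                        help="0-based launch index, used with --kernel")
    return parser


parser = build_parser()
args = parser.parse_args(
    ["trace.ndjson", "--kernel", "matmul_kernel", "--launch-id", "2"]
)
print(args.kernel, args.launch_id)

# Combining --line with --kernel is rejected: argparse reports the conflict
# on stderr and raises SystemExit.
try:
    parser.parse_args(["trace.ndjson", "--line", "5", "--kernel", "matmul_kernel"])
    conflict_rejected = False
except SystemExit:
    conflict_rejected = True
print(conflict_rejected)
```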

🛠️ IR Extraction Tool

  • New tool (PR #202):
    • tritonparse/tools/extract_irs.py for extracting Triton IRs from trace logs
    • Supports TTIR, TTGIR, LLIR, PTX, and other IR formats
    • Flexible output: flat or by-kernel directory structure
    • Comprehensive documentation in tritonparse/tools/readme.md
  • Logger fix:
    • Fixed NameError: 'logger' is not defined in generated reproducers
    • Added proper logging initialization to templates
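
The difference between the flat and by-kernel output layouts of the IR extraction tool can be sketched as follows. The record structure, file naming scheme, and the `write_irs` helper are assumptions made for illustration. The actual tool's naming and options are documented in tritonparse/tools/readme.md.

```python
import tempfile
from pathlib import Path

# Hypothetical records: a kernel name plus a mapping of IR kind to IR text.
RECORDS = [
    {"kernel": "matmul_kernel", "irs": {"ttir": "// ttir text", "ptx": "// ptx text"}},
    {"kernel": "softmax_kernel", "irs": {"ttgir": "// ttgir text"}},
]


def write_irs(records, out_dir, by_kernel=False):
    """Write each IR to its own file and return the created paths."""
    out_dir = Path(out_dir)
    written = []
    for record in records:
        for kind, text in record["irs"].items():
            filename = f"{record['kernel']}.{kind}"
            if by_kernel:
                # One subdirectory per kernel.
                target = out_dir / record["kernel"] / filename
            else:
                # Everything in a single directory.
                target = out_dir / filename
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(text)
            written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    flat = write_irs(RECORDS, Path(tmp) / "flat")
    nested = write_irs(RECORDS, Path(tmp) / "nested", by_kernel=True)
    flat_names = sorted(p.relative_to(tmp).as_posix() for p in flat)
    nested_names = sorted(p.relative_to(tmp).as_posix() for p in nested)

print(flat_names)
print(nested_names)
```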

🔐 Infrastructure & CI/CD

  • PyPI Trusted Publishing (PR #219):
    • Migrated from API token to OIDC authentication
    • Enabled package attestations for provenance
    • No secrets management required
  • On-Demand Nightly Publishing (PR #216):
    • Flexible PyPI publishing workflow
  • Website build CI (PR #224):
    • Added CI test for website builds
    • Updated frontend dependencies
  • Usage tracking (commit 89913ff):
    • Extended usage_report_logger to track all subcommands and API calls
    • Entry function detection via call stack traversal
    • Added skip_logger parameter to prevent duplicate logging
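
Entry function detection via call stack traversal can be sketched with the standard `inspect` module. This is an illustration of the general technique only: the `ENTRY_POINTS` set and the function names are hypothetical, and the actual usage_report_logger logic may differ.

```python
import inspect

# Hypothetical set of public entry points that should be attributed in usage logs.
ENTRY_POINTS = {"run_info", "run_reproduce"}


def detect_entry_function():
    """Return the outermost known entry point on the current call stack, if any."""
    # inspect.stack() lists the innermost frame first, so walk it in reverse
    # to find the outermost matching caller.
    for frame_info in reversed(inspect.stack()):
        if frame_info.function in ENTRY_POINTS:
            return frame_info.function
    return None


def shared_helper():
    # An internal helper that wants to know which entry point triggered it.
    return detect_entry_function()


def run_info():
    return shared_helper()


def run_reproduce():
    # Calls another entry point internally; only the outermost one is reported,
    # which avoids counting a single user action twice.
    return run_info()


print(run_info())
print(run_reproduce())
print(shared_helper())
```

Reporting only the outermost entry point serves a purpose similar to the skip_logger parameter described above: nested calls do not produce duplicate log entries.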

🔧 Bug Fixes & Improvements

  • CUDA Graph capture fix (PR #197):
    • Fixed crash during CUDA graph capture in tensor argument extraction
    • Detects capture mode and skips problematic operations
    • Fixes compatibility with triton.testing.do_bench_cudagraph
  • Gzip support (PR #207):
    • Added gzip support for load_ndjson() function
  • Compilation metadata (PR #198):
    • Sort compilation metadata attributes alphabetically
  • Import formatting (commit 86a2229):
    • Format imports following Python style guide
  • Debug message (commit a205e50):
    • Added message for debugging when BlockPingpong exits early
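
Transparent gzip handling for NDJSON can be implemented by checking the two-byte gzip magic number before choosing how to open the file. The function below is a sketch named `load_ndjson_sketch` to make clear that it is not TritonParse's `load_ndjson()`, which may detect compression differently (for example, by file extension).

```python
import gzip
import json
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def load_ndjson_sketch(path):
    """Load NDJSON from a plain or gzip-compressed file, detected by magic bytes."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == GZIP_MAGIC
    opener = gzip.open if is_gzip else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Hypothetical event used only to exercise both code paths.
events = [{"event_type": "launch", "name": "matmul_kernel"}]
payload = "\n".join(json.dumps(e) for e in events) + "\n"

with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "trace.ndjson")
    gz_path = os.path.join(tmp, "trace.ndjson.gz")
    with open(plain_path, "w", encoding="utf-8") as f:
        f.write(payload)
    with gzip.open(gz_path, "wt", encoding="utf-8") as f:
        f.write(payload)
    plain_events = load_ndjson_sketch(plain_path)
    gz_events = load_ndjson_sketch(gz_path)

print(plain_events == gz_events)
```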

📚 Documentation

  • Wiki pages (PR #223):
    • Added new wiki pages to documentation table
  • Dependency cleanup (PR #225):
    • Removed unnecessary npm overrides for prismjs and dompurify

Compatibility notes

  • Breaking Change: All launch indices are now 0-based. Website displays and CLI arguments have been updated. If you have scripts relying on 1-based line numbers from --line, update them to use 0-based indices.
  • New Features: The info subcommand and --kernel/--launch-id reproducer options are additive and don't break existing workflows.
  • Reproducer: Generated scripts now include autotune disabler and dependent functions automatically. Templates have been updated with proper logger initialization.

Upgrade guidance

  1. Update index references: Change any 1-based line number references to 0-based indices.
  2. Use info command: Replace manual trace file inspection with tritonparseoss info <trace.ndjson> to list kernels.
  3. Use kernel name lookup: Instead of --line N, use --kernel <name> --launch-id <id> for more intuitive reproducer generation.
  4. Extract IRs: Use new python -m tritonparse.tools.extract_irs <trace.ndjson> for IR extraction tasks.