Conversation
There was a problem hiding this comment.
Pull request overview
This PR modernizes the rocDSL test infrastructure by standardizing path setup across test files, improving GPU compilation pipelines, and adding a new LLVM build script. The changes remove hardcoded developer-specific paths in favor of dynamic project root detection, update MLIR transform APIs to use newer functions with folding support, and implement workarounds for MLIR Python binding issues. Additionally, a new build_llvm.sh script simplifies the LLVM/MLIR build process for new contributors.
Key changes include:
- Standardized path setup pattern across 14 Python test files for portability
- Enhanced GPU compilation pipelines with explicit lowering pass ordering and reconciliation
- New build_llvm.sh script for automated LLVM/MLIR building
- C++ API updates to use
applyPatternsAndFoldGreedilyin transform passes - Operation.create() workarounds for MLIR Python binding recursion bugs
- Polyfills for missing ROCDL operations in Python bindings
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/python/ir/test_rocir_coord_ops.py | Updated to use standardized project root detection for sys.path setup |
| tests/python/ir/test_product_divide.py | Updated to use standardized project root detection for sys.path setup |
| tests/python/ir/test_local_ops.py | Updated to use standardized project root detection for sys.path setup |
| tests/python/ir/test_basic_ops.py | Updated to use standardized project root detection for sys.path setup |
| tests/python/gpu/utils.py | Expanded GPU compilation pipeline with explicit control flow lowering passes before GPU-to-ROCDL conversion |
| tests/python/gpu/test_shared_working.py | Updated to use standardized project root detection for sys.path setup |
| tests/python/gpu/test_mfma_simple_working.py | Updated to use standardized project root detection for sys.path setup |
| tests/python/gpu/test_mfma_gemm_real_python_api.py | Added Operation.create() workaround for MFMA operations, enhanced compilation pipeline, added gpu.ModuleEndOp() terminator |
| tests/python/gpu/test_mfma_gemm_fp8_rocir.py | Added Operation.create() workaround for FP8 MFMA operations, enhanced compilation pipeline with additional lowering passes, added gpu.ModuleEndOp() terminator |
| tests/python/gpu/test_mfma_gemm_fp8_python_api.py | Added Operation.create() workaround for FP8 MFMA operations, enhanced compilation pipeline, added gpu.ModuleEndOp() terminator |
| tests/python/gpu/test_gpu_with_rocir_coords.py | Updated to use standardized project root detection for sys.path setup |
| tests/python/gpu/test_gpu_simple.py | Updated to use standardized project root detection for sys.path setup, added missing os import |
| tests/python/gpu/test_gpu_rocdsl.py | Updated to use standardized project root detection for sys.path setup |
| tests/python/gpu/test_gpu_layout.py | Updated to use standardized project root detection for sys.path setup |
| tests/python/examples/test_rocdl_ops.py | Updated to use standardized project root detection, added comments about potentially missing ROCDL operations |
| tests/python/conftest.py | Updated library paths to use dynamic project root detection instead of hardcoded absolute paths |
| python/rocdsl/rocir_opt_runner.py | Updated rocir-opt path to use dynamic project root detection |
| python/rocdsl/dialects/ext/rocir.py | Refactored operations to use Operation.create() API for crd2idx, size, cosize, and rank operations |
| python/rocdsl/dialects/ext/rocdl.py | Added polyfills for missing ROCDL operations (wavefrontsize, s_waitcnt, readlane, readfirstlane) |
| python/rocdsl/dialects/ext/gpu.py | Added automatic ModuleEndOp terminator to gpu.module region_op |
| python/rocdsl/compiler/pipeline.py | Updated rocir-opt path to use dynamic project root detection |
| lib/Transforms/RocirToStandard.cpp | Updated to use applyPatternsAndFoldGreedily instead of applyPatternsGreedily |
| lib/Transforms/RocirCoordLowering.cpp | Updated to use applyPatternsAndFoldGreedily instead of applyPatternsGreedily |
| build_llvm.sh | New script to automate LLVM/MLIR build process with release/19.x branch |
| README.md | Updated build instructions and test paths to reflect project structure changes |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| loc=loc | ||
| ) | ||
| # Insert at current insertion point (inside the loop) | ||
| # ir.InsertionPoint.current.insert(op) |
There was a problem hiding this comment.
The commented-out lines # ir.InsertionPoint.current.insert(op) are present but unnecessary. When using Operation.create() within an active InsertionPoint context, the operation is automatically inserted. These commented lines should be removed to clean up the code.
| # ir.InsertionPoint.current.insert(op) |
| def s_waitcnt(simm16): | ||
| if isinstance(simm16, int): | ||
| simm16 = arith.ConstantOp(IntegerType.get_signless(32), simm16).result | ||
| return Operation.create("rocdl.waitcnt", operands=[simm16]).results |
There was a problem hiding this comment.
The s_waitcnt polyfill returns .results (a list), while other polyfills like wavefrontsize, readlane, and readfirstlane return .results[0] (a single result). This inconsistency may cause issues if the function is expected to return a single result value. Consider returning .results[0] for consistency, or document why it returns a list.
| return Operation.create("rocdl.waitcnt", operands=[simm16]).results | |
| return Operation.create("rocdl.waitcnt", operands=[simm16]).results[0] |
| # Or manually: | ||
| cd rocdsl | ||
| mkdir -p build && cd build | ||
| cmake .. -DMLIR_DIR=$MLIR_PATH//lib/cmake/mlir |
There was a problem hiding this comment.
The path contains a double slash // which is incorrect. Should be -DMLIR_DIR=$MLIR_PATH/lib/cmake/mlir with only a single slash separator.
| cmake .. -DMLIR_DIR=$MLIR_PATH//lib/cmake/mlir | |
| cmake .. -DMLIR_DIR=$MLIR_PATH/lib/cmake/mlir |
|
|
||
| with ip or InsertionPoint.current: | ||
| op = rocir_ops.Crd2IdxOp(_unwrap_value(coord), _unwrap_value(layout), results=[result_type], loc=loc, ip=ip) | ||
| with ip or InsertionPoint.current as insertion_point: |
There was a problem hiding this comment.
The insertion_point variable captured from the with statement is unused. Consider either removing the variable name (use just with ip or InsertionPoint.current:), or document why it's captured if there's a specific reason.
| with ip or InsertionPoint.current as insertion_point: | |
| with ip or InsertionPoint.current: |
|
|
||
| with ip or InsertionPoint.current: | ||
| op = rocir_ops.CosizeOp(_unwrap_value(layout), results=[result_type], loc=loc, ip=ip) | ||
| with ip or InsertionPoint.current as insertion_point: |
There was a problem hiding this comment.
The insertion_point variable captured from the with statement is unused. Consider either removing the variable name (use just with ip or InsertionPoint.current:), or document why it's captured if there's a specific reason.
| operands=[_unwrap_value(coord), _unwrap_value(layout)], | ||
| loc=loc | ||
| ) | ||
| # insertion_point.insert(op) |
There was a problem hiding this comment.
The commented-out line insertion_point.insert(op) suggests uncertainty about whether manual insertion is needed. When using Operation.create() within an active InsertionPoint context (from the with statement), the operation is automatically inserted. The comment should be removed for clarity, or if there was a specific reason to keep it commented, that reason should be documented.
| # insertion_point.insert(op) |
|
|
||
| with ip or InsertionPoint.current: | ||
| op = rocir_ops.SizeOp(_unwrap_value(shape_or_layout), results=[result_type], loc=loc, ip=ip) | ||
| with ip or InsertionPoint.current as insertion_point: |
There was a problem hiding this comment.
The insertion_point variable captured from the with statement is unused. Consider either removing the variable name (use just with ip or InsertionPoint.current:), or document why it's captured if there's a specific reason.
|
|
||
| with ip or InsertionPoint.current: | ||
| op = rocir_ops.RankOp(_unwrap_value(shape_or_layout), results=[result_type], loc=loc, ip=ip) | ||
| with ip or InsertionPoint.current as insertion_point: |
There was a problem hiding this comment.
The insertion_point variable captured from the with statement is unused. Consider either removing the variable name (use just with ip or InsertionPoint.current:), or document why it's captured if there's a specific reason.
| loc=loc | ||
| ) | ||
| # Insert at current insertion point | ||
| # ir.InsertionPoint.current.insert(op) |
There was a problem hiding this comment.
The commented-out lines ir.InsertionPoint.current.insert(op) are present but unnecessary. When using Operation.create() within an active InsertionPoint context, the operation is automatically inserted. These commented lines should be removed to clean up the code.
| # ir.InsertionPoint.current.insert(op) |
| loc=loc | ||
| ) | ||
| # Insert at current insertion point (inside the loop) | ||
| # ir.InsertionPoint.current.insert(op) |
There was a problem hiding this comment.
The commented-out lines # ir.InsertionPoint.current.insert(op) are present but unnecessary. When using Operation.create() within an active InsertionPoint context, the operation is automatically inserted. These commented lines should be removed to clean up the code.
| # ir.InsertionPoint.current.insert(op) |
Motivation
Technical Details
Test Plan
Test Result
Submission Checklist