
[MLIR] Fix use-after-free in block walk, Shape canonicalization, and …#4

Merged
zhanghb97 merged 1 commit into RuyiAI-Stack:main from XYenChi:patch-3 on Apr 18, 2026


@XYenChi XYenChi commented Apr 18, 2026

…opsrun test

Fix three issues found on RISC-V:

  1. Fix use-after-free in block walk (Visitors.h)

When walking blocks with ReverseDominanceIterator, the Traversal object returned by Iterator::makeIterable(region) was passed as a temporary directly to llvm::make_early_inc_range(). The temporary was destroyed at the end of the full expression while iterators from make_early_inc_range still referenced it. Store the result in a local variable with auto&& to extend the lifetime of the temporary and bind lvalue references for ForwardIterator.
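The lifetime issue described above can be illustrated with a minimal, LLVM-free sketch. Everything here is a hypothetical stand-in rather than the actual Visitors.h code: `Traversal` plays the role of the by-value object returned by `Iterator::makeIterable(region)`, and `makeView` plays the role of a non-owning adaptor such as `llvm::make_early_inc_range`, which only stores iterators into the range it is given.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the Traversal object that owns the block order.
struct Traversal {
  std::vector<int> blocks;
  std::vector<int>::iterator begin() { return blocks.begin(); }
  std::vector<int>::iterator end() { return blocks.end(); }
};

// Hypothetical stand-in for Iterator::makeIterable(region): returns by value.
Traversal makeIterable() { return Traversal{{1, 2, 3}}; }

// Hypothetical non-owning view: it keeps iterators only, not the range itself.
template <typename It>
struct View {
  It first, last;
  It begin() const { return first; }
  It end() const { return last; }
};

template <typename Range>
auto makeView(Range &&range) {
  return View<decltype(range.begin())>{range.begin(), range.end()};
}

std::vector<int> walkFixed() {
  std::vector<int> visited;
  // Dangling pattern (not compiled here):
  //   for (int block : makeView(makeIterable())) ...
  // The Traversal temporary dies before the loop body runs, so the view's
  // iterators point into freed storage.

  // Fixed pattern: binding to auto&& extends the temporary's lifetime to the
  // end of this scope. If makeIterable returned an lvalue reference instead,
  // auto&& would simply bind to that lvalue.
  auto &&traversal = makeIterable();
  for (int block : makeView(traversal))
    visited.push_back(block);
  return visited;
}
```

In this sketch `walkFixed()` visits the three placeholder blocks in order, because `traversal` outlives the loop that iterates over the view.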

  2. Fix Shape dialect CstrBroadcastableOp fold and cast canonicalization

Extend getShapeVec to look through tensor.cast operations so that folds can resolve shapes behind casts inserted by earlier canonicalization passes (e.g., ShapeOfOpToConstShapeOp). Add a new fold check in CstrBroadcastableOp::fold that recognizes broadcasting is trivially valid when at most one operand is non-scalar (resolved via getShapeVec). Rewrite CanonicalizeCastExtentTensorOperandsPattern to use modifyOpInPlace instead of replaceOpWithNewOp to avoid issues with op builder template instantiation.
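As a rough sketch of the "at most one operand is non-scalar" condition, assuming that a scalar means a rank-0 shape (an empty extent list) and that an operand whose shape cannot be resolved is conservatively counted as non-scalar. The `Shape` alias and the `atMostOneNonScalar` helper are hypothetical names for illustration, not the code added to `CstrBroadcastableOp::fold`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical representation: a shape is its list of extents; rank 0 is a scalar.
using Shape = std::vector<int64_t>;

// Returns true when broadcasting is trivially valid because no two operands
// can conflict: at most one operand has (or might have) a non-zero rank.
// std::nullopt models a shape that could not be resolved statically.
bool atMostOneNonScalar(const std::vector<std::optional<Shape>> &shapes) {
  int nonScalar = 0;
  for (const auto &shape : shapes) {
    // An unresolved shape is conservatively counted as non-scalar.
    if (!shape || !shape->empty())
      ++nonScalar;
  }
  return nonScalar <= 1;
}
```

Under these assumptions, a scalar paired with a `2x3` shape passes the check, while two ranked shapes (or one ranked and one unresolved shape) do not and would need the regular broadcast compatibility check.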

  3. Fix opsrun.py test invocation pattern for multithreaded_tests.py

Convert direct test_foo() calls to run(test_foo) so that copy_and_update in multithreaded_tests.py properly strips them during import, preventing JIT execution at module load time which crashes on RISC-V due to R_RISCV_HI20 relocation range limits.

@zhanghb97 zhanghb97 merged commit d5f380f into RuyiAI-Stack:main Apr 18, 2026
zhanghb97 pushed a commit that referenced this pull request May 10, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 10, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 12, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 12, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 12, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 13, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 14, 2026
llvm#183506 revealed a pre-existing
use-after-scope in createInstrInfo (MSan bot:
https://lab.llvm.org/buildbot/#/builders/164/builds/21562 [*]).

This patch fixes the issue by changing the stack-allocated
AArch64Subtarget (which goes out of scope once createInstrInfo()
returns) into heap-allocated, allowing it to be safely stored in the
returned AArch64InstrInfo.
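
The ownership change described above can be sketched in isolation. This is a simplified illustration under assumed names, not the actual unittest code: `Subtarget`, `InstrInfo`, and `OwnedInstrInfo` are hypothetical stand-ins, and the bundle struct is just one way to keep a heap-allocated subtarget alive for as long as the object that refers to it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the subtarget object.
struct Subtarget {
  std::string triple;
};

// Hypothetical stand-in for an instruction-info object that keeps a
// non-owning reference to its subtarget.
struct InstrInfo {
  const Subtarget &subtarget;
  explicit InstrInfo(const Subtarget &st) : subtarget(st) {}
};

// Owns both objects. Members are destroyed in reverse declaration order,
// so instrInfo is destroyed before the subtarget it refers to.
struct OwnedInstrInfo {
  std::unique_ptr<Subtarget> subtarget;
  std::unique_ptr<InstrInfo> instrInfo;
};

OwnedInstrInfo createInstrInfo(const std::string &triple) {
  // A stack-allocated Subtarget here would go out of scope on return and
  // leave InstrInfo::subtarget dangling; allocating it on the heap and
  // returning the owner alongside avoids that.
  auto st = std::make_unique<Subtarget>(Subtarget{triple});
  auto ii = std::make_unique<InstrInfo>(*st);
  return OwnedInstrInfo{std::move(st), std::move(ii)};
}
```

Moving the `unique_ptr` does not move the `Subtarget` itself, so the reference held by `InstrInfo` stays valid after `createInstrInfo` returns.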

-----

[*] WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x55555666fabd in
llvm::AArch64InstrInfo::getInstSizeInBytes(llvm::MachineInstr const&)
const
/home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:247:5
...

/home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/unittests/Target/AArch64/InstSizes.cpp:85:3
#9 0x555556508559 in InstSizes_MOVaddrTagged_Test::TestBody()
/home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/unittests/Target/AArch64/InstSizes.cpp:301:3
...

  Member fields were destroyed
#0 0x555556498a1d in __sanitizer_dtor_callback_fields
/home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1074:5
#1 0x5555564fbda6 in ~Triple
/home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/TargetParser/Triple.h:348:12
#2 0x5555564fbda6 in ~Triple
/home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/include/llvm/TargetParser/Triple.h:47:7
#3 0x5555564fbda6 in llvm::AArch64Subtarget::~AArch64Subtarget()
/home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/lib/Target/AArch64/AArch64Subtarget.h:38:7
#4 0x555556503396 in (anonymous
namespace)::createInstrInfo(llvm::TargetMachine*)
/home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/unittests/Target/AArch64/InstSizes.cpp:38:1
#5 0x5555565084cb in InstSizes_MOVaddrTagged_Test::TestBody()
/home/b/sanitizer-x86_64-linux-bootstrap-msan/build/llvm-project/llvm/unittests/Target/AArch64/InstSizes.cpp:299:42
WuXintong123 pushed a commit that referenced this pull request May 14, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 15, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 16, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 17, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 18, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 19, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 20, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 21, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 22, 2026
…opsrun test (#4)

WuXintong123 pushed a commit that referenced this pull request May 23, 2026
…opsrun test (#4)
