DSE engine crashed when using +=1 to increment #48

Closed
chhzh123 opened this issue May 12, 2022 · 2 comments

@chhzh123 commented:
This is a somewhat weird problem. I wanted to generate HLS code for the following input program (a simple convolutional layer), but the DSE engine crashed.

#define bs 4
#define oc 16
#define ic 6
#define ih 8
#define iw 8
#define kh 3
#define kw 3
#define oh 6
#define ow 6

void test_conv2d(float A[bs][ic][ih][iw], float B[oc][ic][kh][kw], float C[bs][oc][oh][ow]) {
#pragma scop
  for (int n = 0; n < bs; n += 1) {
    for (int c = 0; c < oc; c += 1) {
      for (int y = 0; y < oh; y += 1) {
        for (int x = 0; x < ow; x += 1) {
          float sum = 0;
          for (int rc = 0; rc < ic; rc += 1) {
            for (int rh = 0; rh < kh; rh += 1) {
              for (int rw = 0; rw < kw; rw += 1) {
                sum += A[n][rc][y+rh][x+rw] * B[c][rc][rh][rw];
          }}}
          C[n][c][y][x] = sum;
}}}}
#pragma endscop
}
scalehls-opt: /scratch/users/hc676/github/scalehls/polygeist/llvm-project/llvm/include/llvm/ADT/SmallVector.h:273: llvm::SmallVectorTemplateCommon::reference llvm::SmallVectorTemplateCommon<mlir::scalehls::LoopDesignPoint>::operator[](llvm::SmallVectorTemplateCommon::size_type) [T = mlir::scalehls::LoopDesignPoint]: Assertion `idx < size()' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: scalehls-opt test_conv2d.mlir -debug-only=scalehls "-scalehls-dse-pipeline=top-func=test_conv2d target-spec=../polybench/config.json"
 #0 0x00000000043c90ca llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /scratch/users/hc676/github/scalehls/polygeist/llvm-project/llvm/lib/Support/Unix/Signals.inc:565:11
 #1 0x00000000043c927b PrintStackTraceSignalHandler(void*) /scratch/users/hc676/github/scalehls/polygeist/llvm-project/llvm/lib/Support/Unix/Signals.inc:632:1
 #2 0x00000000043c7926 llvm::sys::RunSignalHandlers() /scratch/users/hc676/github/scalehls/polygeist/llvm-project/llvm/lib/Support/Signals.cpp:103:5
 #3 0x00000000043c9985 SignalHandler(int) /scratch/users/hc676/github/scalehls/polygeist/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1
 #4 0x00007fec525ab630 __restore_rt sigaction.c:0:0
 #5 0x00007fec51198387 raise (/lib64/libc.so.6+0x36387)
 #6 0x00007fec51199a78 abort (/lib64/libc.so.6+0x37a78)
 #7 0x00007fec511911a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6)
 #8 0x00007fec51191252 (/lib64/libc.so.6+0x2f252)
 #9 0x00000000034910e9 llvm::SmallVectorTemplateCommon<mlir::scalehls::LoopDesignPoint, void>::operator[](unsigned long) /scratch/users/hc676/github/scalehls/polygeist/llvm-project/llvm/include/llvm/ADT/SmallVector.h:0:5
#10 0x0000000003489c65 void updateParetoPoints<mlir::scalehls::LoopDesignPoint>(llvm::SmallVector<mlir::scalehls::LoopDesignPoint, 16u>&) /scratch/users/hc676/github/scalehls/lib/Transforms/DesignSpaceExplore.cpp:37:22
#11 0x0000000003489c2c mlir::scalehls::LoopDesignSpace::initializeLoopDesignSpace(unsigned int) /scratch/users/hc676/github/scalehls/lib/Transforms/DesignSpaceExplore.cpp:282:1
#12 0x000000000348c7da mlir::scalehls::ScaleHLSExplorer::exploreDesignSpace(mlir::func::FuncOp, bool, llvm::StringRef, llvm::StringRef) /scratch/users/hc676/github/scalehls/lib/Transforms/DesignSpaceExplore.cpp:746:5
#13 0x000000000348d0a9 mlir::scalehls::ScaleHLSExplorer::applyDesignSpaceExplore(mlir::func::FuncOp, bool, llvm::StringRef, llvm::StringRef) /scratch/users/hc676/github/scalehls/lib/Transforms/DesignSpaceExplore.cpp:819:7
#14 0x000000000348feff (anonymous namespace)::DesignSpaceExplore::runOnOperation() /scratch/users/hc676/github/scalehls/lib/Transforms/DesignSpaceExplore.cpp:894:5
#15 0x0000000003fdad12 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /scratch/users/hc676/github/scalehls/polygeist/llvm-project/mlir/lib/Pass/Pass.cpp:402:21
#16 0x0000000003fdb2b6 mlir::detail::OpToOpPassAdaptor::runPipeline(llvm::iterator_range<llvm::pointee_iterator<std::unique_ptr<mlir::Pass, std::default_delete<mlir::Pass> >*, mlir::Pass> >, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /scratch/users/hc676/github/scalehls/polygeist/llvm-project/mlir/lib/Pass/Pass.cpp:463:16
#17 0x0000000003fdc886 mlir::PassManager::runPasses(mlir::Operation*, mlir::AnalysisManager) /scratch/users/hc676/github/scalehls/polygeist/llvm-project/mlir/lib/Pass/Pass.cpp:705:10
#18 0x0000000003fdc785 mlir::PassManager::run(mlir::Operation*) /scratch/users/hc676/github/scalehls/polygeist/llvm-project/mlir/lib/Pass/Pass.cpp:685:60
#19 0x0000000003432bad performActions(llvm::raw_ostream&, bool, bool, llvm::SourceMgr&, mlir::MLIRContext*, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>) /scratch/users/hc676/github/scalehls/polygeist/llvm-project/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:79:17
#20 0x0000000003431aa3 processBuffer(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, bool, bool, bool, bool, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>, mlir::DialectRegistry&, llvm::ThreadPool*) /scratch/users/hc676/github/scalehls/polygeist/llvm-project/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:119:12
#21 0x0000000003431865 mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, llvm::function_ref<mlir::LogicalResult (mlir::PassManager&)>, mlir::DialectRegistry&, bool, bool, bool, bool, bool) /scratch/users/hc676/github/scalehls/polygeist/llvm-project/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:169:10
#22 0x0000000003431c6a mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, mlir::PassPipelineCLParser const&, mlir::DialectRegistry&, bool, bool, bool, bool, bool) /scratch/users/hc676/github/scalehls/polygeist/llvm-project/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:189:10
#23 0x0000000003432851 mlir::MlirOptMain(int, char**, llvm::StringRef, mlir::DialectRegistry&, bool) /scratch/users/hc676/github/scalehls/polygeist/llvm-project/mlir/lib/Tools/mlir-opt/MlirOptMain.cpp:293:14
#24 0x0000000001934b8c main /scratch/users/hc676/github/scalehls/tools/scalehls-opt/scalehls-opt.cpp:16:23
#25 0x00007fec51184555 __libc_start_main (/lib64/libc.so.6+0x22555)
#26 0x0000000001934a69 _start (/scratch/users/hc676/github/scalehls/build/bin/scalehls-opt+0x1934a69)

However, if I use ++ to increment instead of +=1, the problem does not occur. For comparison, the MLIR assembly of the +=1 version is shown below.

module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<"dlti.endianness", "little">, #dlti.dl_entry<i64, dense<64> : vector<2xi32>>, #dlti.dl_entry<f80, dense<128> : vector<2xi32>>, #dlti.dl_entry<i1, dense<8> : vector<2xi32>>, #dlti.dl_entry<i8, dense<8> : vector<2xi32>>, #dlti.dl_entry<i16, dense<16> : vector<2xi32>>, #dlti.dl_entry<i32, dense<32> : vector<2xi32>>, #dlti.dl_entry<f16, dense<16> : vector<2xi32>>, #dlti.dl_entry<f64, dense<64> : vector<2xi32>>, #dlti.dl_entry<f128, dense<128> : vector<2xi32>>>, llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"} {
  func @test_conv2d(%arg0: memref<4x6x8x8xf32>, %arg1: memref<16x6x3x3xf32>, %arg2: memref<4x16x6x6xf32>) attributes {llvm.linkage = #llvm.linkage<external>} {
    %cst = arith.constant 0.000000e+00 : f32
    affine.for %arg3 = 0 to 4 {
      affine.for %arg4 = 0 to 16 {
        affine.for %arg5 = 0 to 6 {
          affine.for %arg6 = 0 to 6 {
            %0 = affine.for %arg7 = 0 to 6 iter_args(%arg8 = %cst) -> (f32) {
              %1 = affine.for %arg9 = 0 to 3 iter_args(%arg10 = %arg8) -> (f32) {
                %2 = affine.for %arg11 = 0 to 3 iter_args(%arg12 = %arg10) -> (f32) {
                  %3 = affine.load %arg0[%arg3, %arg7, %arg5 + %arg9, %arg6 + %arg11] : memref<4x6x8x8xf32>
                  %4 = affine.load %arg1[%arg4, %arg7, %arg9, %arg11] : memref<16x6x3x3xf32>
                  %5 = arith.mulf %3, %4 : f32
                  %6 = arith.addf %arg12, %5 : f32
                  affine.yield %6 : f32
                }
                affine.yield %2 : f32
              }
              affine.yield %1 : f32
            }
            affine.store %0, %arg2[%arg3, %arg4, %arg5, %arg6] : memref<4x16x6x6xf32>
          }
        }
      }
    }
    return
  }
}

The ++ version, by contrast, does not use iter_args. I don't know why this is the case.

module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<"dlti.endianness", "little">, #dlti.dl_entry<i64, dense<64> : vector<2xi32>>, #dlti.dl_entry<f80, dense<128> : vector<2xi32>>, #dlti.dl_entry<i1, dense<8> : vector<2xi32>>, #dlti.dl_entry<i8, dense<8> : vector<2xi32>>, #dlti.dl_entry<i16, dense<16> : vector<2xi32>>, #dlti.dl_entry<i32, dense<32> : vector<2xi32>>, #dlti.dl_entry<f16, dense<16> : vector<2xi32>>, #dlti.dl_entry<f64, dense<64> : vector<2xi32>>, #dlti.dl_entry<f128, dense<128> : vector<2xi32>>>, llvm.data_layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128", llvm.target_triple = "x86_64-unknown-linux-gnu"} {
  func @test_conv2d(%arg0: memref<4x6x8x8xf32>, %arg1: memref<16x6x3x3xf32>, %arg2: memref<4x16x6x6xf32>) attributes {llvm.linkage = #llvm.linkage<external>} {
    %cst = arith.constant 0.000000e+00 : f32
    %0 = memref.alloca() : memref<1xf32>
    %1 = llvm.mlir.undef : f32
    affine.store %1, %0[0] : memref<1xf32>
    affine.for %arg3 = 0 to 4 {
      affine.for %arg4 = 0 to 16 {
        affine.for %arg5 = 0 to 6 {
          affine.for %arg6 = 0 to 6 {
            affine.store %cst, %0[0] : memref<1xf32>
            affine.for %arg7 = 0 to 6 {
              affine.for %arg8 = 0 to 3 {
                affine.for %arg9 = 0 to 3 {
                  %3 = affine.load %arg0[%arg3, %arg7, %arg5 + %arg8, %arg6 + %arg9] : memref<4x6x8x8xf32>
                  %4 = affine.load %arg1[%arg4, %arg7, %arg8, %arg9] : memref<16x6x3x3xf32>
                  %5 = arith.mulf %3, %4 : f32
                  %6 = affine.load %0[0] : memref<1xf32>
                  %7 = arith.addf %6, %5 : f32
                  affine.store %7, %0[0] : memref<1xf32>
                }
              }
            }
            %2 = affine.load %0[0] : memref<1xf32>
            affine.store %2, %arg2[%arg3, %arg4, %arg5, %arg6] : memref<4x16x6x6xf32>
          }
        }
      }
    }
    return
  }
}
@hanchenye (Collaborator) commented:
Hi @chhzh123, I pushed a small patch to fix this issue - 0f9627a.

Basically, ScaleHLS DSE does not currently support iter_args, so we always apply materialize-reduction and buffer-loop-hoisting in the scalehls-dse-pipeline to remove iter_args and hoist the generated local buffers out of the loop nest. However, the materialize-reduction pass can generate some redundant memory loads and stores, which may decrease the performance the DSE can achieve. For now, we don't have an optimization pass to remove these redundancies.

As for the difference in parsing results between ++ and += 1 in mlir-clang, I'm not sure whether this is intended or actually an issue to resolve. Could you open an issue in the Polygeist repository for this if one does not already exist?

@chhzh123 (Author) commented:

Okay, thanks a lot! I'll file an issue in the Polygeist repository.
