Open
Labels
enhancement (New feature or request)
Description
Hi all :) While implementing the TOSA to Taskflow pipeline in #245, I found that the standard tosa-layerwise-constant-fold pass does not currently fold simple constant subexpressions whose operands are defined by arith.constant.
- Input TOSA IR:
func.func @const_fold_test() -> tensor<4xf32> {
  %cst1 = arith.constant dense<[1.0, 2.0, 3.0, 4.0]> : tensor<4xf32>
  %cst2 = arith.constant dense<[10.0, 20.0, 30.0, 40.0]> : tensor<4xf32>
  // This addition should be folded!
  %folded = tosa.add %cst1, %cst2 : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>
  return %folded : tensor<4xf32>
}
- Current Suboptimal Output:
// Actual result from current pipeline: runtime calculation
affine.for %arg1 = 0 to 4 {
  %2 = affine.load %0[%arg1] : memref<4xf32>
  %3 = affine.load %1[%arg1] : memref<4xf32>
  %4 = arith.addf %2, %3 : f32 // <--- Suboptimal: Runtime addition
  affine.store %4, %alloc[%arg1] : memref<4xf32>
}
- Expected Target Output:
// Desired Result: Pure constant propagation
memref.global "private" constant @__constant_sum : memref<4xf32> = dense<[11.0, 22.0, 33.0, 44.0]>
func.func @const_fold_test(%arg0: memref<4xf32>) {
  %0 = memref.get_global @__constant_sum : memref<4xf32>
  memref.copy %0, %arg0 : memref<4xf32> to memref<4xf32>
  return
}
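This indicates that the current pipeline is not optimal: the sum of two compile-time constants is still computed at runtime instead of being folded into a single constant.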
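For reference, here is a minimal sketch in plain Python of the element-wise computation a fold of tosa.add over two constant tensors would have to perform. It only illustrates the arithmetic for the test case in this issue; `fold_elementwise_add` is a hypothetical helper and not part of MLIR or of the tosa-layerwise-constant-fold pass.

```python
from typing import List, Sequence


def fold_elementwise_add(lhs: Sequence[float], rhs: Sequence[float]) -> List[float]:
    """Hypothetical helper: add two same-shaped 1-D constant tensors at 'compile time'."""
    if len(lhs) != len(rhs):
        # This sketch does not model broadcasting; shapes must match exactly.
        raise ValueError("operand shapes must match")
    return [a + b for a, b in zip(lhs, rhs)]


# Values of the two constants from the input IR in this issue.
cst1 = [1.0, 2.0, 3.0, 4.0]
cst2 = [10.0, 20.0, 30.0, 40.0]

folded = fold_elementwise_add(cst1, cst2)
print(folded)  # [11.0, 22.0, 33.0, 44.0]
```

A real fold would presumably operate on the dense attributes of the defining ops rather than on Python lists, so this is only a check of the values the folded constant should hold.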