
Conversation

@avik-pal
Collaborator

@avik-pal avik-pal commented May 22, 2025

Summarizing the transform pass:

  1. Pattern: a concat of insert-dim ops (all inserting at the same dim) whose parent ops are equivalent.
  2. For all the parent operands, we insert a dim at dim=0 and concat those.
  3. We insert a batch op and immediately resolve it into a call op using the batching utilities.
  4. We move the leading batched dim to the concat dim.
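
To make steps 1-4 concrete, here is a hand-written StableHLO sketch (illustrative only; the op, shapes, and callee name are made up, not taken from the pass or its tests) for the case where the concat dim is 0, so step 4 is a no-op:

// before: concat of insert-dim reshapes over equivalent ops
%a0 = stablehlo.exponential %x0 : tensor<4x8xf32>
%a1 = stablehlo.exponential %x1 : tensor<4x8xf32>
%r0 = stablehlo.reshape %a0 : (tensor<4x8xf32>) -> tensor<1x4x8xf32>
%r1 = stablehlo.reshape %a1 : (tensor<4x8xf32>) -> tensor<1x4x8xf32>
%c  = stablehlo.concatenate %r0, %r1, dim = 0 : (tensor<1x4x8xf32>, tensor<1x4x8xf32>) -> tensor<2x4x8xf32>

// after: stack the parent operands along a new leading dim and run the op once, batched
%s0 = stablehlo.reshape %x0 : (tensor<4x8xf32>) -> tensor<1x4x8xf32>
%s1 = stablehlo.reshape %x1 : (tensor<4x8xf32>) -> tensor<1x4x8xf32>
%xs = stablehlo.concatenate %s0, %s1, dim = 0 : (tensor<1x4x8xf32>, tensor<1x4x8xf32>) -> tensor<2x4x8xf32>
%c  = call @batched_exponential(%xs) : (tensor<2x4x8xf32>) -> tensor<2x4x8xf32>

// where the generated callee contains the batched op
func.func private @batched_exponential(%arg0: tensor<2x4x8xf32>) -> tensor<2x4x8xf32> {
  %0 = stablehlo.exponential %arg0 : tensor<2x4x8xf32>
  return %0 : tensor<2x4x8xf32>
}

If the concat dim is not 0, the rewrite additionally inserts a transpose to move the leading batch dim to the concat dim (step 4).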

Later we can use this same pass for ops like cholesky / triangular_solve / lu, where we need to check that the concat dim is not among the last 2 dims.

For lu it is a bit more complicated, since it has 4 returns and we need to check that all the other returns are either (1) unused or (2) also concatenated.

@avik-pal avik-pal marked this pull request as ready for review May 22, 2025 00:38
@avik-pal avik-pal requested a review from wsmoses May 22, 2025 00:42
@wsmoses
Member

wsmoses commented May 22, 2025

Rather than calling batch, why not just call the batch op interface if defined? Or another utility function from the implementation of batch lowering.

@wsmoses
Member

wsmoses commented May 22, 2025

e.g. we can make https://github.com/EnzymeAD/Enzyme/blob/db0181320d6e425ee963bd496ed0d8dbb615be18/enzyme/Enzyme/MLIR/Passes/EnzymeBatchPass.cpp#L130 into a batchOperation utility or something, which we call directly here instead of outlining into a new function

@avik-pal
Collaborator Author

yeah that sounds like a better thing to do

@avik-pal avik-pal marked this pull request as draft May 22, 2025 01:36
@avik-pal avik-pal force-pushed the ap/general_concat_push_up branch from 983ae0e to 0a3007c Compare July 18, 2025 02:56
@avik-pal avik-pal marked this pull request as ready for review July 19, 2025 13:48
@avik-pal
Collaborator Author

somehow I made it segfault only on Mac...

@wsmoses
Member

wsmoses commented Jul 19, 2025

That's probably an issue of f(a(), b()) running a() before b() on Linux and the opposite on Mac
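
For reference, a minimal self-contained C++ illustration of that evaluation-order pitfall (a made-up example, not the code from this PR):

#include <cstdio>

// C++ does not specify the order in which function arguments are evaluated,
// so in f(a(), b()) one toolchain may run a() first and another b() first.
static int counter = 0;
static int a() { return ++counter; }
static int b() { return ++counter; }

static void f(int x, int y) { std::printf("a() -> %d, b() -> %d\n", x, y); }

int main() {
  f(a(), b());  // may print "a() -> 1, b() -> 2" or "a() -> 2, b() -> 1"
  return 0;
}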

@avik-pal avik-pal force-pushed the ap/general_concat_push_up branch 2 times, most recently from ec069b9 to a39c55c Compare July 27, 2025 20:41
@avik-pal avik-pal force-pushed the ap/general_concat_push_up branch 4 times, most recently from 80e45aa to 6a6c099 Compare August 4, 2025 02:16
@avik-pal avik-pal force-pushed the ap/general_concat_push_up branch 3 times, most recently from 13cc867 to a24c05f Compare August 16, 2025 00:14
@avik-pal
Collaborator Author

dumping f before crash:

func.func private @"Z%\BF\E3=[\00\00\9ClB\1A\E1T\FD\0Bhed_concatReshapeOpToBatch_34"(%arg0: tensor<288xf32>, %arg1: tensor<768xf32>) -> tensor<288x768xf32> {
  %0 = stablehlo.dot_general %arg0, %arg1, contracting_dims = [] x [], precision = [DEFAULT, DEFAULT] : (tensor<288xf32>, tensor<768xf32>) -> tensor<288x768xf32>
  return %0 : tensor<288x768xf32>
}

@wsmoses
Member

wsmoses commented Aug 16, 2025

er..... what is going on with the linked failure?

@wsmoses
Member

wsmoses commented Aug 16, 2025

that seems like some sort of use-after-free / undefined-memory error

@wsmoses
Member

wsmoses commented Aug 16, 2025

we can always just change that to
std::string new_name = ("batched_" + fn.getName()).str();
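
For what it's worth, a standalone sketch of that kind of lifetime bug, using plain std::string / std::string_view as a stand-in for the LLVM string types (a hypothetical example, not the actual code in the pass):

#include <cstdio>
#include <string>
#include <string_view>

int main() {
  std::string fn_name = "concatReshapeOpToBatch_34";

  // A view into the temporary produced by the concatenation dangles as soon
  // as the full expression ends; reading it later is a use-after-free.
  std::string_view dangling = "batched_" + fn_name;

  // Materializing the result into an owned std::string keeps the bytes alive.
  std::string owned = "batched_" + fn_name;

  std::printf("%s\n", owned.c_str());
  (void)dangling;  // dereferencing this would be undefined behavior
  return 0;
}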

@avik-pal avik-pal force-pushed the ap/general_concat_push_up branch 4 times, most recently from e8a16d7 to 8be4e35 Compare August 19, 2025 22:03
@avik-pal avik-pal force-pushed the ap/general_concat_push_up branch from 4794337 to 02584da Compare August 23, 2025 04:15
@avik-pal avik-pal force-pushed the ap/general_concat_push_up branch from 02584da to b3b62ef Compare August 23, 2025 04:18
@avik-pal avik-pal merged commit 0b1f902 into main Aug 23, 2025
12 of 14 checks passed
@avik-pal avik-pal deleted the ap/general_concat_push_up branch August 23, 2025 23:08