[SYSTEMDS-THESIS] Add a DP optimization for matrix chains with transposes by Elmanjhg · Pull Request #2465 · apache/systemds

Elmanjhg · 2026-05-04T02:53:32Z

This adds a new HOP rewrite rule, RewriteMatrixMultChainWithTransOptimization.java, to find the optimal execution plan for matrix multiplication chains containing transposes. Previously, these chains were optimized with a simple heuristic that always pushes transposes down, t(A %*% B) -> t(B) %*% t(A), which is not the optimal plan in some cases, especially with large matrices.

An example is R = t(A %*% B) %*% C with dimensions A = [16, 23], B = [23, 22], C = [16, 34].
The old rewrite class would solve this as (t(B) %*% t(A)) %*% C, with costs: t(B) -> 23*22 + t(A) -> 16*23 + t(B) %*% t(A) -> 22*23*16 + [...] %*% C -> 22*16*34 = 20938 FLOPs.
The optimal plan is simply t(A %*% B) %*% C, with costs: A %*% B -> 16*23*22 + t(A %*% B) -> 16*22 + [...] %*% C -> 22*16*34 = 20416 FLOPs. The difference grows with larger matrix dimensions.
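The arithmetic above can be re-checked with a few lines of Python. This is only a sketch of the cost model as stated in this description (a matrix multiply of [m, k] by [k, n] costs m*k*n, a transpose of [r, c] costs r*c); the helper names `matmul_cost` and `transpose_cost` are made up for illustration and are not part of SystemDS.

```python
# Hypothetical helpers mirroring the cost model used in the example above.
def matmul_cost(m, k, n):
    """Cost of multiplying an [m, k] matrix by a [k, n] matrix."""
    return m * k * n


def transpose_cost(rows, cols):
    """Cost of materializing the transpose of a [rows, cols] matrix."""
    return rows * cols


A, B, C = (16, 23), (23, 22), (16, 34)

# Old heuristic: (t(B) %*% t(A)) %*% C
pushdown = (
    transpose_cost(*B)                  # t(B): 23*22
    + transpose_cost(*A)                # t(A): 16*23
    + matmul_cost(B[1], B[0], A[0])     # t(B) %*% t(A): 22*23*16
    + matmul_cost(B[1], A[0], C[1])     # [...] %*% C: 22*16*34
)

# Direct plan: t(A %*% B) %*% C
direct = (
    matmul_cost(A[0], A[1], B[1])       # A %*% B: 16*23*22
    + transpose_cost(A[0], B[1])        # t(A %*% B): 16*22
    + matmul_cost(B[1], A[0], C[1])     # [...] %*% C: 22*16*34
)

print(pushdown, direct, pushdown - direct)
```

Both plans share the multiply with C and the 22*23*16 product, so the gap comes only from transposing two inputs (506 + 368) versus transposing one intermediate (352).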

To solve this, we apply a DP algorithm with a memo table that holds plans without transposes and plans for transposed subchains, and decides for each subchain whether an algebraic transpose pushdown or a direct transpose operation is cheaper.
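A minimal sketch of such a two-table DP, written in Python for brevity, is shown here. It is not the Java implementation from this PR: the function name `chain_costs`, the table names `plain` and `trans`, and the cost model (m*k*n per multiply, r*c per transpose) are assumptions taken from the example in this description. `plain[i][j]` is the cheapest cost to compute the product of matrices i..j, and `trans[i][j]` is the cheapest cost to compute the transpose of that product, either by transposing the finished result or by pushing the transpose down via t(X %*% Y) = t(Y) %*% t(X).

```python
def chain_costs(dims):
    """dims: list of (rows, cols) per matrix in the chain.

    Returns (plain, trans) memo tables as described in the lead-in.
    Illustrative sketch only; the cost model is an assumption.
    """
    n = len(dims)
    plain = [[0] * n for _ in range(n)]
    trans = [[0] * n for _ in range(n)]

    # Base case: transposing a single input costs rows * cols.
    for i, (r, c) in enumerate(dims):
        trans[i][i] = r * c

    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            rows, cols = dims[i][0], dims[j][1]

            # Classic matrix-chain recurrence, no transposes involved.
            plain[i][j] = min(
                plain[i][k] + plain[k + 1][j] + rows * dims[k][1] * cols
                for k in range(i, j)
            )

            # Option 1: compute the product, then transpose the result.
            direct = plain[i][j] + rows * cols

            # Option 2: push the transpose down, t(X Y) = t(Y) t(X),
            # where t(Y) is [cols, k] and t(X) is [k, rows].
            pushdown = min(
                trans[k + 1][j] + trans[i][k] + cols * dims[k][1] * rows
                for k in range(i, j)
            )

            trans[i][j] = min(direct, pushdown)

    return plain, trans


# Example from this description: t(A %*% B) %*% C
plain, trans = chain_costs([(16, 23), (23, 22)])
total = trans[0][1] + 22 * 16 * 34  # add the final multiply with C
print(trans[0][1], total)
```

For the example chain, the table picks the direct transpose for t(A %*% B), and adding the final multiply with C reproduces the 20416 FLOPs given above. The sketch only handles a transpose over a contiguous subchain; the actual rewrite operates on HOP DAGs and may cover more cases.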

This also includes 24 automated DML test cases that assert intermediate HOP dimensions to validate optimal parenthesization and transpose placement.
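To illustrate why intermediate dimensions identify a plan, the sketch below walks an expression tree and records the shape of every intermediate result. It is a Python illustration only and does not reflect how the Java tests in this PR are written; the tuple encoding (`"mm"`, `"t"`) and the function `intermediate_shapes` are hypothetical.

```python
def intermediate_shapes(expr, dims, out):
    """Return the shape of expr and append each intermediate shape to out.

    expr is a leaf name (str), ("t", sub) or ("mm", left, right).
    Hypothetical encoding used only for this illustration.
    """
    if isinstance(expr, str):
        return dims[expr]
    if expr[0] == "t":
        r, c = intermediate_shapes(expr[1], dims, out)
        shape = (c, r)
    else:
        r, k1 = intermediate_shapes(expr[1], dims, out)
        k2, c = intermediate_shapes(expr[2], dims, out)
        if k1 != k2:
            raise ValueError("inner dimensions do not match")
        shape = (r, c)
    out.append(shape)
    return shape


dims = {"A": (16, 23), "B": (23, 22), "C": (16, 34)}

# Direct plan: t(A %*% B) %*% C
direct_shapes = []
intermediate_shapes(("mm", ("t", ("mm", "A", "B")), "C"), dims, direct_shapes)

# Pushdown plan: (t(B) %*% t(A)) %*% C
pushdown_shapes = []
intermediate_shapes(
    ("mm", ("mm", ("t", "B"), ("t", "A")), "C"), dims, pushdown_shapes
)

print(direct_shapes)
print(pushdown_shapes)
```

The two plans produce the same final [22, 34] result but different intermediates: only the direct plan contains a [16, 22] product, so a check on intermediate dimensions can tell which plan was chosen.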

mboehm7 · 2026-05-04T13:58:50Z

LGTM - thanks for the patch @Elmanjhg. During the merge I resolved the merge conflict of the pom.xml (and reverted the new dependency), disabled the new flags, added the licenses to java and dml test files, moved the dml test files to rewrites, and fixed the formatting (tabs vs spaces) in the java test.

Elmanjhg added 2 commits March 3, 2026 20:07

[DPSizeRewrite] - First Test Cases as DML Script

00fef4d

I added 5 DML scripts containing matrix multiplication chains with transposes, with comments that state the optimal rewrite plan.

github-project-automation Bot added this to SystemDS PR Queue May 4, 2026

github-project-automation Bot moved this to In Progress in SystemDS PR Queue May 4, 2026

mboehm7 closed this in b748091 May 4, 2026

github-project-automation Bot moved this from In Progress to Done in SystemDS PR Queue May 4, 2026

Elmanjhg wants to merge 2 commits into apache:main from Elmanjhg:DPSizeRewrite