[SYSTEMDS-3123] Rewrite cbind 0 Matrix Multiplication #1385
Baunsgaard wants to merge 4 commits into apache:master
LGTM!
dc6c2f7 to dbf7498
Thanks for starting the discussion, a few additional points to consider (besides separating it from the compression-related changes):
I merged in the compression changes when I was happy with the rewrite.
Yes, good point; I will include a check that the dims are known.
The comparison I currently make is based on the extra allocation, so it has to compare m (the number of rows of the left-hand side) with n (the common dimension, i.e. the number of rows of the right-hand side).
No, you misunderstood my main comment - we need to compare m vs n (and apparently that's what you wanted to do), but what the code does is compare m vs (k+1), because you look at the rows and columns of the output cbind (variable hi) - simply replace
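To make the m vs n point concrete, here is a minimal sketch of the guard being discussed. It is not the SystemDS implementation: the function name, the use of -1 for an unknown dimension, and the strict `>` comparison are all assumptions made for illustration. The 2x factor comes from the PR description.

```python
def should_push_cbind_into_matmult(m, n, factor=2):
    """Hypothetical guard for rewriting cbind(X %*% Y, 0) -> X %*% cbind(Y, 0).

    m: number of rows of X (left-hand side of the matrix multiplication)
    n: common dimension, i.e. number of rows of Y
    A negative value stands for "dimension unknown" (assumption of this sketch).
    """
    # Only decide when both dimensions are known at rewrite time.
    if m < 0 or n < 0:
        return False
    # Compare the row counts of the two possible cbind inputs:
    # the product X %*% Y has m rows, Y has n rows.
    return m > factor * n


# Tall X (m=1000) and short Y (n=10): cbind on Y is the cheaper allocation.
print(should_push_cbind_into_matmult(1000, 10))
# Short X (m=100) and tall Y (n=1000): keep the original plan.
print(should_push_cbind_into_matmult(100, 1000))
```

Comparing m against k+1 (the column count of the cbind output) instead would let the second case pass whenever k is small, which is the mismatch raised in the review comment.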
233ff13 to 3a42ae0
- Compressed matrix factory improvements
- Add decompression if the data is serialized and larger in compressed format
- Decompress on write to HDFS
- Abort compression after cocode if the compression sizes are bad
- Make c bind decompressing in workload tree (worst case)
- Add a minimum compression ratio argument to the CompressionSettings
- Reduce the sampling size in c bind compression and set high minimum compression ratio
- Fix order of operations in compressed append
- Add compressed output size to unary hops
- More utilization of the cached decompressed matrix if it fits in memory by looking for soft reference of uncompressed in certain cases
This commit contains a rewrite that changes the order of operations if the number of rows in X is 2x larger than in Y:
```
cbind((X %*% Y), matrix(0, nrow(X), 1)) -> X %*% (cbind(Y, matrix(0, nrow(Y), 1)))
```
This rewrite affects MLogReg in line 215, so that it does not force allocation of the large X twice.
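As a sanity check on why the two forms are interchangeable, the following sketch evaluates both sides on small list-of-lists matrices. The helper functions and the example values are illustrative only and are not part of SystemDS. The last lines count the cells the cbind has to allocate in each plan.

```python
def matmul(a, b):
    # Plain triple-loop matrix product on lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def cbind(a, b):
    # Column-wise concatenation of two matrices with the same row count.
    return [ra + rb for ra, rb in zip(a, b)]

def zeros(rows, cols):
    return [[0] * cols for _ in range(rows)]

X = [[1, 2], [3, 4], [5, 6], [7, 8]]   # m = 4 rows, n = 2 columns
Y = [[1, 0, 2], [0, 1, 3]]             # n = 2 rows, k = 3 columns

before = cbind(matmul(X, Y), zeros(len(X), 1))   # cbind((X %*% Y), 0)
after = matmul(X, cbind(Y, zeros(len(Y), 1)))    # X %*% cbind(Y, 0)

# Cells written by the cbind itself in each plan.
k = len(Y[0])
cbind_cells_before = len(X) * (k + 1)   # m * (k + 1)
cbind_cells_after = len(Y) * (k + 1)    # n * (k + 1)

print(before == after, cbind_cells_before, cbind_cells_after)
```

The appended zero column of Y contributes a zero column to the product, so both plans give the same m x (k+1) result, while the cbind output scales with n instead of m.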
c532220 to 7f28983
This commit moves the decompression instruction to the input hop of a decompressing instruction execution on the workload trees. In practice, this means that if a variable is used in a for loop and needs decompression, it is taken into account that the decompression happens only once, outside the for loop.
- Update to allow return of constant column groups in various cases
- Remove System.out.println from estim estimatorDensityMap
- Add Compression in loop right mult with decompression
- Fix Binary matrix matrix operation Compressed
- Add isDensifying boolean to compression cost, to allow compression to compare to dense allocation.
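The effect of decompressing once outside the loop can be illustrated with a toy model. The `ToyCompressedMatrix` class and both functions below are hypothetical stand-ins, not SystemDS classes. They only count how often decompression runs under each placement.

```python
class ToyCompressedMatrix:
    """Hypothetical stand-in for a compressed matrix that counts decompressions."""

    def __init__(self, rows):
        self._rows = rows
        self.decompress_calls = 0

    def decompress(self):
        self.decompress_calls += 1
        return [list(r) for r in self._rows]


def sum_in_loop_per_iteration(cm, iterations):
    # Decompression attached to the consuming operation: runs every iteration.
    total = 0
    for _ in range(iterations):
        dense = cm.decompress()
        total += sum(sum(r) for r in dense)
    return total


def sum_in_loop_hoisted(cm, iterations):
    # Decompression attached to the input: runs once, before the loop.
    dense = cm.decompress()
    total = 0
    for _ in range(iterations):
        total += sum(sum(r) for r in dense)
    return total


a = ToyCompressedMatrix([[1, 2], [3, 4]])
b = ToyCompressedMatrix([[1, 2], [3, 4]])
result_a = sum_in_loop_per_iteration(a, 5)
result_b = sum_in_loop_hoisted(b, 5)
print(result_a == result_b, a.decompress_calls, b.decompress_calls)
```

Both variants compute the same value; only the number of decompressions differs, which is the cost the workload tree is meant to account for.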
9b20fd6 to 2794ba1
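This PR contains a rewrite that changes the order of operations if the number of rows in X is 2x larger than in Y: `cbind((X %*% Y), matrix(0, nrow(X), 1)) -> X %*% (cbind(Y, matrix(0, nrow(Y), 1)))`
This rewrite affects MLogReg in line 215, so that it does not force allocation of the large X twice.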
Review appreciated!