@Baunsgaard
Contributor

This commit updates the matrix-to-frame conversion so that a MatrixBlock is converted to a frame more efficiently.
The previous implementation already had cache-friendly blocking and allocation for the direct double-to-double case; this PR adds the same support for conversion into other value types, such as boolean.
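
To illustrate the idea, here is a minimal sketch of a cache-blocked conversion from a row-major double array to column-major boolean arrays. It is not the SystemDS implementation: the class name, the block sizes, and the "nonzero means true" rule are assumptions made for the example.

```java
final class BlockedBooleanConvert {
    // Hypothetical block sizes; the real implementation may choose differently.
    static final int ROW_BLOCK = 64;
    static final int COL_BLOCK = 64;

    /** Converts a row-major dense matrix into per-column boolean arrays (nonzero -> true). */
    static boolean[][] toBooleanColumns(double[] values, int rows, int cols) {
        boolean[][] out = new boolean[cols][rows];
        // Walk the matrix tile by tile so reads and writes stay within a small working set.
        for (int rb = 0; rb < rows; rb += ROW_BLOCK) {
            int rEnd = Math.min(rb + ROW_BLOCK, rows);
            for (int cb = 0; cb < cols; cb += COL_BLOCK) {
                int cEnd = Math.min(cb + COL_BLOCK, cols);
                for (int r = rb; r < rEnd; r++) {
                    int off = r * cols; // start of row r in the row-major input
                    for (int c = cb; c < cEnd; c++)
                        out[c][r] = values[off + c] != 0.0;
                }
            }
        }
        return out;
    }
}
```

The tiling matters because the input is read row-wise while the output is written column-wise; without it, one of the two sides is accessed with a large stride.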

Converting a 64k x 2k matrix to a boolean frame (times in ms):

After:
22/12/21 19:56:11 ERROR frame.FrameFromMatrixBlockTest: 1055.994364
22/12/21 19:56:12 ERROR frame.FrameFromMatrixBlockTest: 1039.756463
22/12/21 19:56:13 ERROR frame.FrameFromMatrixBlockTest: 946.029085
22/12/21 19:56:14 ERROR frame.FrameFromMatrixBlockTest: 928.161053
22/12/21 19:56:15 ERROR frame.FrameFromMatrixBlockTest: 943.132151
22/12/21 19:56:16 ERROR frame.FrameFromMatrixBlockTest: 950.212744
22/12/21 19:56:17 ERROR frame.FrameFromMatrixBlockTest: 964.515222
22/12/21 19:56:17 ERROR frame.FrameFromMatrixBlockTest: 966.944032
22/12/21 19:56:18 ERROR frame.FrameFromMatrixBlockTest: 965.85695
22/12/21 19:56:19 ERROR frame.FrameFromMatrixBlockTest: 956.783357

Before:
22/12/21 19:59:56 ERROR frame.FrameFromMatrixBlockTest: 2199.846241
22/12/21 19:59:58 ERROR frame.FrameFromMatrixBlockTest: 2373.381971
22/12/21 20:00:01 ERROR frame.FrameFromMatrixBlockTest: 2270.362306
22/12/21 20:00:03 ERROR frame.FrameFromMatrixBlockTest: 2324.07255
22/12/21 20:00:05 ERROR frame.FrameFromMatrixBlockTest: 2294.39046
22/12/21 20:00:08 ERROR frame.FrameFromMatrixBlockTest: 2284.978142
22/12/21 20:00:10 ERROR frame.FrameFromMatrixBlockTest: 2295.71655
22/12/21 20:00:12 ERROR frame.FrameFromMatrixBlockTest: 2297.712022
22/12/21 20:00:14 ERROR frame.FrameFromMatrixBlockTest: 2311.518135
22/12/21 20:00:17 ERROR frame.FrameFromMatrixBlockTest: 2467.055097

@Baunsgaard
Contributor Author

And now with parallelization:

22/12/21 22:03:53 ERROR frame.FrameFromMatrixBlockTest: 197.328218
22/12/21 22:03:53 ERROR frame.FrameFromMatrixBlockTest: 136.362345
22/12/21 22:03:53 ERROR frame.FrameFromMatrixBlockTest: 130.854457
22/12/21 22:03:53 ERROR frame.FrameFromMatrixBlockTest: 148.331303
22/12/21 22:03:53 ERROR frame.FrameFromMatrixBlockTest: 129.145552
22/12/21 22:03:54 ERROR frame.FrameFromMatrixBlockTest: 129.617421
22/12/21 22:03:54 ERROR frame.FrameFromMatrixBlockTest: 130.632412
22/12/21 22:03:54 ERROR frame.FrameFromMatrixBlockTest: 133.174287
22/12/21 22:03:54 ERROR frame.FrameFromMatrixBlockTest: 195.967821
22/12/21 22:03:54 ERROR frame.FrameFromMatrixBlockTest: 132.401876
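
As a rough sketch of how such a conversion can be parallelized, the example below splits the columns into one contiguous range per thread, so each task writes to disjoint output arrays and no locking is needed. The class name, the parameter `k` (thread count), and the column-wise partitioning are assumptions for illustration; the actual commit may partition the work differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelBooleanConvert {
    /** Converts a row-major matrix to per-column boolean arrays using k threads. */
    static boolean[][] toBooleanColumns(double[] values, int rows, int cols, int k) {
        boolean[][] out = new boolean[cols][rows];
        ExecutorService pool = Executors.newFixedThreadPool(k);
        try {
            int blk = Math.max(1, (cols + k - 1) / k); // columns per task
            List<Future<?>> tasks = new ArrayList<>();
            for (int cb = 0; cb < cols; cb += blk) {
                final int cStart = cb;
                final int cEnd = Math.min(cb + blk, cols);
                // Each task owns columns [cStart, cEnd), so writes never overlap.
                tasks.add(pool.submit(() -> {
                    for (int r = 0; r < rows; r++) {
                        int off = r * cols;
                        for (int c = cStart; c < cEnd; c++)
                            out[c][r] = values[off + c] != 0.0;
                    }
                }));
            }
            for (Future<?> t : tasks)
                t.get(); // wait for completion and surface any task failure
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return out;
    }
}
```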

@Baunsgaard
Contributor Author

@phaniarnab @mboehm7

Any ideas on how (or where) to add thread counts to all methods/internals/instructions?
This is now, I don't know, the millionth time I have run into the issue of an instruction not including the number of threads it is allowed to use.
It should be set in one place for all instructions, so that every instruction carries a thread count no matter what, even if it does not use it.

@Baunsgaard
Contributor Author

Baunsgaard commented Dec 23, 2022

Reading time of boolean matrix (in sec):

blockSize (k),  before       , after
          0.5,  0.557+-0.025 ,   0.134+-0.005
          1.0,  0.296+-0.003 ,   0.111+-0.001
          2.0,  0.210+-0.010 ,   0.107+-0.006
          4.0,  0.159+-0.009 ,   0.106+-0.007
          8.0,  0.137+-0.003 ,   0.110+-0.006
         16.0,  0.139+-0.009 ,   0.108+-0.005
         32.0,  0.152+-0.023 ,   0.129+-0.006
         64.0,  0.159+-0.012 ,   0.130+-0.002
        128.0,  0.112+-0.004 ,   0.117+-0.004

This commit introduces various updates and refinements to the FrameBlock
infrastructure. Specifically, the conversion of MatrixBlock to
FrameBlock is optimized.

Parallelizing the instructions is critical in this process, so this
commit also contains a larger change to Unary instructions: all of them
now contain a thread count in the instruction string. This change also
affects instructions that do not necessarily need a thread count, such
as broadcast, but it gave the opportunity to make applySchema, toMatrix,
toFrame, and other instructions parallel.

Example: converting a 64k x 2k matrix to a boolean frame:

before 2.2 sec single-threaded, after 0.9 sec single-threaded, and
0.13 sec in parallel.

The reading time of frames is also improved. Before, the time varied
drastically depending on the block size used when saving; it is now
improved from 0.56 sec to 0.13 sec at a block size of 500.

A final update, which also improves overall execution, concerns compile
time. I observed that the compile time increases to 0.6 sec if we
include IO operations, while it is 0.3 sec without IO operations. This
is due to the Hadoop IO we are using taking up to 70% of the compile time
in cases where we have simple scripts with only a read and a single operation.
It is a constant overhead on the first IO operation that does not affect
subsequent IO operations. To improve this, I have moved it to a parallel
operation when we construct the JobConfiguration. This improves the
compile time of SystemDS in general from ~0.6 sec when using IO to ~0.2 sec.
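
The general pattern of hiding a constant first-use cost behind other work can be sketched as follows. This is only an illustration under stated assumptions: `AsyncWarmup`, `expensiveInit`, and `compile` are hypothetical names, and the sleep stands in for the Hadoop IO setup; it is not how SystemDS constructs its configuration.

```java
import java.util.concurrent.CompletableFuture;

final class AsyncWarmup {
    // Hypothetical stand-in for a costly one-time setup such as IO configuration.
    static String expensiveInit() {
        try {
            Thread.sleep(50); // simulate the constant first-use overhead
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "io-ready";
    }

    // Starts the setup on a background thread and returns immediately.
    static CompletableFuture<String> startWarmup() {
        return CompletableFuture.supplyAsync(AsyncWarmup::expensiveInit);
    }

    // Hypothetical "compile" step that only needs the IO state at the very end.
    static String compile(String script, CompletableFuture<String> io) {
        String plan = "plan(" + script + ")"; // work that overlaps with the warm-up
        return plan + "@" + io.join();        // block only when the result is needed
    }
}
```

The benefit depends on how much independent work exists between starting the warm-up and first needing its result; if the result is needed immediately, the caller still waits for the full setup.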
@Baunsgaard Baunsgaard closed this in 785bb95 Jan 4, 2023
@Baunsgaard Baunsgaard deleted the MatrixBlockToFrame branch January 4, 2023 22:37