
[SYSTEMDS-3482] Parallel Hadoop IO Startup #1757

Closed · wants to merge 1 commit

Conversation

@Baunsgaard (Contributor)

I observed that compile time increases to ~0.6 sec when a script includes IO operations, compared to ~0.2 sec without them. This is because the Hadoop IO we are using takes up to 70% of the compile time for simple scripts with only a read and a single operation. It is a constant overhead on the first IO operation that does not affect subsequent IO operations. To remove it, I moved this initialization to a parallel operation when we construct the JobConfiguration. This improves the compile time of SystemDS in general from ~0.6 sec when using IO to ~0.2 sec.

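The overlap described above can be sketched as follows. This is a minimal, hypothetical analog (class and method names are mine, not SystemDS code): an expensive one-time setup, standing in for the Hadoop JobConfiguration/FileSystem initialization, is launched on a background thread as early as possible, so compilation proceeds in parallel and the first IO instruction only blocks for whatever setup time remains.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the parallel-startup idea: begin the costly
// one-time IO initialization in the background while compilation runs.
public class ParallelIOStartup {
    // kicked off as soon as the class is first referenced
    private static final CompletableFuture<String> IO_SETUP =
        CompletableFuture.supplyAsync(ParallelIOStartup::expensiveInit);

    private static String expensiveInit() {
        // stand-in for the constant Hadoop startup cost (~0.4 sec in the PR)
        try { TimeUnit.MILLISECONDS.sleep(100); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return "configured";
    }

    // the first IO instruction waits only for the remaining setup time
    public static String getConfig() {
        return IO_SETUP.join();
    }

    public static void main(String[] args) {
        // compilation work overlaps with expensiveInit() on the other thread
        try { TimeUnit.MILLISECONDS.sleep(100); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        System.out.println(getConfig()); // prints "configured"
    }
}
```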
@Baunsgaard (Contributor, Author)

I chose to also start the threadpool here, since this improves the startup time of the first instruction that uses the pool. If I used a plain thread it would be slightly faster, but any subsequent parallel operation would still have to pay the pool startup cost, and the extra thread would either be kept around unused or have to be shut down again.
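The trade-off above can be sketched roughly as follows (a hypothetical illustration, not the SystemDS pool implementation): a shared pool is created eagerly at startup, so the first parallel instruction does not pay the pool-construction cost, and later operations reuse the same threads instead of a one-off thread that would sit idle or need a second teardown.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: eagerly created shared pool, reused by all
// subsequent parallel operations.
public class EagerPool {
    // constructed at class load time, not lazily on first use
    private static final ExecutorService SHARED =
        Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    // submit a task and wait for its result
    public static int run(Callable<Integer> task) {
        try {
            return SHARED.submit(task).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void shutdown() {
        SHARED.shutdown();
    }

    public static void main(String[] args) {
        System.out.println(run(() -> 21 * 2)); // prints 42
        shutdown();
    }
}
```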

@Baunsgaard (Contributor, Author)

Before:

SystemDS Statistics:
Total elapsed time:		0.732 sec.
Total compilation time:		0.635 sec.
Total execution time:		0.097 sec.
Cache hits (Mem/Li/WB/FS/HDFS):	1/0/0/0/1.
Cache writes (Li/WB/FS/HDFS):	0/0/0/0.
Cache times (ACQr/m, RLS, EXP):	0.060/0.000/0.000/0.000 sec.
HOP DAGs recompiled (PRED, SB):	0/0.
HOP DAGs recompile time:	0.000 sec.
Total JIT compile time:		0.641 sec.
Total JVM GC count:		0.
Total JVM GC time:		0.0 sec.
Heavy hitter instructions:
 #  Instruction  Time(s)  Count
 1  rightIndex     0.067      1
 2  createvar      0.015      2
 3  toString       0.014      1
 4  print          0.000      1
 5  rmvar          0.000      2

After:

SystemDS Statistics:
Total elapsed time:		0.279 sec.
Total compilation time:		0.222 sec.
Total execution time:		0.056 sec.
Cache hits (Mem/Li/WB/FS/HDFS):	1/0/0/0/1.
Cache writes (Li/WB/FS/HDFS):	0/0/0/0.
Cache times (ACQr/m, RLS, EXP):	0.036/0.000/0.000/0.000 sec.
HOP DAGs recompiled (PRED, SB):	0/0.
HOP DAGs recompile time:	0.000 sec.
Total JIT compile time:		0.213 sec.
Total JVM GC count:		0.
Total JVM GC time:		0.0 sec.
Heavy hitter instructions:
 #  Instruction  Time(s)  Count
 1  rightIndex     0.040      1
 2  createvar      0.009      2
 3  toString       0.007      1
 4  print          0.000      1
 5  rmvar          0.000      2

@mboehm7 (Contributor)

mboehm7 commented Jan 3, 2023

Well, good. Just two comments: (1) if the thread pool is not the JVM-internal pool, the shutdown would essentially wait for successful completion of the task, and (2) please double check that the method you are calling is internally properly synchronized.

@Baunsgaard (Contributor, Author)

1: Verified. The JVM correctly allocates and uses the intended pool; there is no difference to normal execution. However, it seems we should analyze whether other locations are allocating threads that are not needed, since the pool does not allocate threads 1 and 2, possibly indicating that something unintended is happening somewhere.

[Screenshot from 2023-01-03 17-26-30]

2: The filesystem getter is documented and coded such that the instance is cached when created by a request. Adding a synchronization block seems unnecessary: if two threads ask concurrently, they will both create a FileSystem object and simply overwrite the cached instance.

[Screenshot from 2023-01-03 18-27-10]
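The benign race in point 2 can be illustrated with the following minimal analog (not Hadoop's actual code; names are mine): two racing threads may each construct a filesystem object, and the later write simply replaces the cached one. Both instances are valid, so the cache stays correct without a synchronization block.

```java
// Illustrative analog of a benign cache race: racy overwrite of the
// cached instance is acceptable because every created instance is valid.
public class BenignRaceCache {
    // stand-in for the cached FileSystem instance
    private static volatile Object cached;

    public static Object get() {
        Object fs = cached;
        if (fs == null) {
            fs = new Object(); // stand-in for new FileSystem(uri, conf)
            cached = fs;       // racy overwrite: last writer wins, harmlessly
        }
        return fs;
    }

    public static void main(String[] args) {
        // the second call returns the instance cached by the first
        System.out.println(get() == get()); // prints true
    }
}
```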

@Baunsgaard Baunsgaard closed this in 6a759ce Jan 4, 2023
@Baunsgaard Baunsgaard deleted the StartUpParallel branch January 4, 2023 22:37