-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-46205][CORE] Improve PersistenceEngine
performance with KryoSerializer
#44113
Conversation
|
if (out != null) { | ||
out.close() | ||
} | ||
fileOut.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this reordering, KryoSerializer complains.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @sadikovi
Could you review this when you have a chance, @yaooqinn , @beliefer , @LuciferYang ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except some comments.
val e = master.invokePrivate(_persistenceEngine()).asInstanceOf[FileSystemPersistenceEngine] | ||
assert(e.serializer.isInstanceOf[KryoSerializer]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
val e = master.invokePrivate(_persistenceEngine()).asInstanceOf[FileSystemPersistenceEngine] | |
assert(e.serializer.isInstanceOf[KryoSerializer]) | |
val persistenceEngine = master.invokePrivate(_persistenceEngine()).asInstanceOf[FileSystemPersistenceEngine] | |
assert(persistenceEngine.serializer.isInstanceOf[KryoSerializer]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -60,15 +61,23 @@ object PersistenceEngineBenchmark extends BenchmarkBase { | |||
runBenchmark("PersistenceEngineBenchmark") { | |||
val benchmark = new Benchmark(s"$numWorkers Workers", numWorkers, output = output) | |||
|
|||
benchmark.addCase("ZooKeeperPersistenceEngine", numIters) { _ => | |||
benchmark.addCase("ZooKeeperPersistenceEngine with JavaSerializer", numIters) { _ => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
Seq(serializerJava, serializerKryo).foreach { serializer =>
benchmark.addCase("ZooKeeperPersistenceEngine with ${serializer.getClass.getSimpleName}", numIters) { _ =>
val engine = new ZooKeeperPersistenceEngine(conf, serializer)
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, but this is also intentional for the ordering like the following.
OpenJDK 64-Bit Server VM 17.0.9+9-LTS on Linux 5.15.0-1051-azure
AMD EPYC 7763 64-Core Processor
1000 Workers: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------------------------------------------
ZooKeeperPersistenceEngine with JavaSerializer 1202 1298 138 0.0 1201614.2 1.0X
ZooKeeperPersistenceEngine with KryoSerializer 951 1004 48 0.0 950559.0 1.3X
FileSystemPersistenceEngine with JavaSerializer 212 217 6 0.0 211623.2 5.7X
FileSystemPersistenceEngine with KryoSerializer 79 81 2 0.0 79132.5 15.2X
BlackHolePersistenceEngine 0 0 0 30.9 32.4 37109.8X
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, for this one, it seems that I misunderstood your previous comment. Let me address it at the next PR (or a follow-up), @beliefer . Sorry for my mistake.
import org.apache.spark.util.Utils | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please revert this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Sure.
Thank you so much, @beliefer . Merged to master for Apache Spark 4.0.0. |
### What changes were proposed in this pull request? This is a follow-up of #44113 to address a comment about simplifying the benchmark. ### Why are the changes needed? To simplify. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually. ``` $ build/sbt "core/Test/runMain org.apache.spark.deploy.master.PersistenceEngineBenchmark" ... [info] OpenJDK 64-Bit Server VM 17.0.9+9-LTS on Mac OS X 14.2 [info] Apple M1 Max [info] 1000 Workers: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------------- [info] ZooKeeperPersistenceEngine with JavaSerializer 12602 12847 253 0.0 12601717.5 1.0X [info] ZooKeeperPersistenceEngine with KryoSerializer 12116 12130 13 0.0 12116373.8 1.0X [info] FileSystemPersistenceEngine with JavaSerializer 429 435 6 0.0 429374.9 29.3X [info] FileSystemPersistenceEngine with KryoSerializer 179 180 2 0.0 178795.7 70.5X [info] BlackHolePersistenceEngine 0 0 0 46.6 21.5 587273.6X [success] Total time: 126 s (02:06), completed Dec 1, 2023, 7:59:25 PM ``` ### Was this patch authored or co-authored using generative AI tooling? Closes #44118 from dongjoon-hyun/SPARK-46205-2. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
…oSerializer` ### What changes were proposed in this pull request? This PR aims to improve `PersistenceEngine` performance with `KryoSerializer` via introducing a new configuration, `spark.deploy.recoverySerializer`. ### Why are the changes needed? Allow users to choose a better serializer to get a better performance in their environment. Especially, `KryoSerializer` is about **3x faster** than `JavaSerializer` with `FileSystemPersistenceEngine`. ``` ================================================================================================ PersistenceEngineBenchmark ================================================================================================ OpenJDK 64-Bit Server VM 17.0.9+9-LTS on Linux 5.15.0-1051-azure AMD EPYC 7763 64-Core Processor 1000 Workers: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------------- ZooKeeperPersistenceEngine with JavaSerializer 1202 1298 138 0.0 1201614.2 1.0X ZooKeeperPersistenceEngine with KryoSerializer 951 1004 48 0.0 950559.0 1.3X FileSystemPersistenceEngine with JavaSerializer 212 217 6 0.0 211623.2 5.7X FileSystemPersistenceEngine with KryoSerializer 79 81 2 0.0 79132.5 15.2X BlackHolePersistenceEngine 0 0 0 30.9 32.4 37109.8X ``` ### Does this PR introduce _any_ user-facing change? No. The default behavior is the same. ### How was this patch tested? Pass the CIs with the new added test cases. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#44113 from dongjoon-hyun/SPARK-46205. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request? This is a follow-up of apache#44113 to address a comment about simplifying the benchmark. ### Why are the changes needed? To simplify. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually. ``` $ build/sbt "core/Test/runMain org.apache.spark.deploy.master.PersistenceEngineBenchmark" ... [info] OpenJDK 64-Bit Server VM 17.0.9+9-LTS on Mac OS X 14.2 [info] Apple M1 Max [info] 1000 Workers: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------------- [info] ZooKeeperPersistenceEngine with JavaSerializer 12602 12847 253 0.0 12601717.5 1.0X [info] ZooKeeperPersistenceEngine with KryoSerializer 12116 12130 13 0.0 12116373.8 1.0X [info] FileSystemPersistenceEngine with JavaSerializer 429 435 6 0.0 429374.9 29.3X [info] FileSystemPersistenceEngine with KryoSerializer 179 180 2 0.0 178795.7 70.5X [info] BlackHolePersistenceEngine 0 0 0 46.6 21.5 587273.6X [success] Total time: 126 s (02:06), completed Dec 1, 2023, 7:59:25 PM ``` ### Was this patch authored or co-authored using generative AI tooling? Closes apache#44118 from dongjoon-hyun/SPARK-46205-2. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request? This PR aims to update `Spark Standalone` cluster recovery configurations. ### Why are the changes needed? We need to document - #44173 - #44129 - #44113 ![Screenshot 2023-12-06 at 7 15 24 PM](https://github.com/apache/spark/assets/9700541/04f0be6f-cdfb-4d87-b1b5-c4bf131f460a) ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manual review. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #44227 from dongjoon-hyun/SPARK-46299. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
…oSerializer` ### What changes were proposed in this pull request? This PR aims to improve `PersistenceEngine` performance with `KryoSerializer` via introducing a new configuration, `spark.deploy.recoverySerializer`. ### Why are the changes needed? Allow users to choose a better serializer to get a better performance in their environment. Especially, `KryoSerializer` is about **3x faster** than `JavaSerializer` with `FileSystemPersistenceEngine`. ``` ================================================================================================ PersistenceEngineBenchmark ================================================================================================ OpenJDK 64-Bit Server VM 17.0.9+9-LTS on Linux 5.15.0-1051-azure AMD EPYC 7763 64-Core Processor 1000 Workers: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------------- ZooKeeperPersistenceEngine with JavaSerializer 1202 1298 138 0.0 1201614.2 1.0X ZooKeeperPersistenceEngine with KryoSerializer 951 1004 48 0.0 950559.0 1.3X FileSystemPersistenceEngine with JavaSerializer 212 217 6 0.0 211623.2 5.7X FileSystemPersistenceEngine with KryoSerializer 79 81 2 0.0 79132.5 15.2X BlackHolePersistenceEngine 0 0 0 30.9 32.4 37109.8X ``` ### Does this PR introduce _any_ user-facing change? No. The default behavior is the same. ### How was this patch tested? Pass the CIs with the new added test cases. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#44113 from dongjoon-hyun/SPARK-46205. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request? This is a follow-up of apache#44113 to address a comment about simplifying the benchmark. ### Why are the changes needed? To simplify. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually. ``` $ build/sbt "core/Test/runMain org.apache.spark.deploy.master.PersistenceEngineBenchmark" ... [info] OpenJDK 64-Bit Server VM 17.0.9+9-LTS on Mac OS X 14.2 [info] Apple M1 Max [info] 1000 Workers: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------------- [info] ZooKeeperPersistenceEngine with JavaSerializer 12602 12847 253 0.0 12601717.5 1.0X [info] ZooKeeperPersistenceEngine with KryoSerializer 12116 12130 13 0.0 12116373.8 1.0X [info] FileSystemPersistenceEngine with JavaSerializer 429 435 6 0.0 429374.9 29.3X [info] FileSystemPersistenceEngine with KryoSerializer 179 180 2 0.0 178795.7 70.5X [info] BlackHolePersistenceEngine 0 0 0 46.6 21.5 587273.6X [success] Total time: 126 s (02:06), completed Dec 1, 2023, 7:59:25 PM ``` ### Was this patch authored or co-authored using generative AI tooling? Closes apache#44118 from dongjoon-hyun/SPARK-46205-2. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request? This PR aims to update `Spark Standalone` cluster recovery configurations. ### Why are the changes needed? We need to document - apache#44173 - apache#44129 - apache#44113 ![Screenshot 2023-12-06 at 7 15 24 PM](https://github.com/apache/spark/assets/9700541/04f0be6f-cdfb-4d87-b1b5-c4bf131f460a) ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manual review. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#44227 from dongjoon-hyun/SPARK-46299. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
…oSerializer` ### What changes were proposed in this pull request? This PR aims to improve `PersistenceEngine` performance with `KryoSerializer` via introducing a new configuration, `spark.deploy.recoverySerializer`. ### Why are the changes needed? Allow users to choose a better serializer to get a better performance in their environment. Especially, `KryoSerializer` is about **3x faster** than `JavaSerializer` with `FileSystemPersistenceEngine`. ``` ================================================================================================ PersistenceEngineBenchmark ================================================================================================ OpenJDK 64-Bit Server VM 17.0.9+9-LTS on Linux 5.15.0-1051-azure AMD EPYC 7763 64-Core Processor 1000 Workers: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative ------------------------------------------------------------------------------------------------------------------------------- ZooKeeperPersistenceEngine with JavaSerializer 1202 1298 138 0.0 1201614.2 1.0X ZooKeeperPersistenceEngine with KryoSerializer 951 1004 48 0.0 950559.0 1.3X FileSystemPersistenceEngine with JavaSerializer 212 217 6 0.0 211623.2 5.7X FileSystemPersistenceEngine with KryoSerializer 79 81 2 0.0 79132.5 15.2X BlackHolePersistenceEngine 0 0 0 30.9 32.4 37109.8X ``` ### Does this PR introduce _any_ user-facing change? No. The default behavior is the same. ### How was this patch tested? Pass the CIs with the new added test cases. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#44113 from dongjoon-hyun/SPARK-46205. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request? This is a follow-up of apache#44113 to address a comment about simplifying the benchmark. ### Why are the changes needed? To simplify. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually. ``` $ build/sbt "core/Test/runMain org.apache.spark.deploy.master.PersistenceEngineBenchmark" ... [info] OpenJDK 64-Bit Server VM 17.0.9+9-LTS on Mac OS X 14.2 [info] Apple M1 Max [info] 1000 Workers: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------------- [info] ZooKeeperPersistenceEngine with JavaSerializer 12602 12847 253 0.0 12601717.5 1.0X [info] ZooKeeperPersistenceEngine with KryoSerializer 12116 12130 13 0.0 12116373.8 1.0X [info] FileSystemPersistenceEngine with JavaSerializer 429 435 6 0.0 429374.9 29.3X [info] FileSystemPersistenceEngine with KryoSerializer 179 180 2 0.0 178795.7 70.5X [info] BlackHolePersistenceEngine 0 0 0 46.6 21.5 587273.6X [success] Total time: 126 s (02:06), completed Dec 1, 2023, 7:59:25 PM ``` ### Was this patch authored or co-authored using generative AI tooling? Closes apache#44118 from dongjoon-hyun/SPARK-46205-2. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request? This PR aims to update `Spark Standalone` cluster recovery configurations. ### Why are the changes needed? We need to document - apache#44173 - apache#44129 - apache#44113 ![Screenshot 2023-12-06 at 7 15 24 PM](https://github.com/apache/spark/assets/9700541/04f0be6f-cdfb-4d87-b1b5-c4bf131f460a) ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manual review. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#44227 from dongjoon-hyun/SPARK-46299. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request? This is a logical revert of SPARK-46205 - #44113 - #44118 ### Why are the changes needed? The initial implementation didn't handle the class initialization logic properly. Until we have a fix, I'd like to revert this from `master` branch. ### Does this PR introduce _any_ user-facing change? No, this is not released yet. ### How was this patch tested? Pass the CIs. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #46087 from dongjoon-hyun/SPARK-47875. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
What changes were proposed in this pull request?
This PR aims to improve
PersistenceEngine
performance withKryoSerializer
via introducing a new configuration,spark.deploy.recoverySerializer
.Why are the changes needed?
Allow users to choose a better serializer to get a better performance in their environment. Especially,
KryoSerializer
is about 3x faster thanJavaSerializer
withFileSystemPersistenceEngine
.Does this PR introduce any user-facing change?
No. The default behavior is the same.
How was this patch tested?
Pass the CIs with the new added test cases.
Was this patch authored or co-authored using generative AI tooling?
No.