I find blaze has no acceleration effect? why #426

Closed · bigmancomeon opened this issue Mar 29, 2024 · 11 comments

@bigmancomeon

Spark version: 3.3.3

This is the Spark conf with Blaze:
spark.executor.memory 5g
spark.executor.memoryOverhead 3072
spark.blaze.memoryFraction 0.7
spark.blaze.enable.caseconvert.functions true
spark.blaze.enable.smjInequalityJoin false
spark.blaze.enable.bhjFallbacksToSmj false

This is the Spark conf without Blaze:
spark.executor.memory 6g
spark.executor.memoryOverhead 2048

driver-memory 4G
num-executors 6
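
For reference, a rough sketch of how these confs could be combined with Blaze's extension and shuffle-manager settings on the command line (the two class names are the ones quoted later in this thread; the master URL and layout are placeholders, not the exact command used):

spark-sql --master yarn \
  --driver-memory 4G --num-executors 6 \
  --conf spark.executor.memory=5g \
  --conf spark.executor.memoryOverhead=3072 \
  --conf spark.blaze.memoryFraction=0.7 \
  --conf spark.sql.extensions=org.apache.spark.sql.blaze.BlazeSparkSessionExtension \
  --conf spark.shuffle.manager=org.apache.spark.sql.execution.blaze.shuffle.BlazeShuffleManager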

I find there is not much difference between a Spark SQL job with and without Blaze. Using 100 GB of TPC-DS Parquet data and running spark-sql queries such as query4, query5, query16, and query17 (shown in the screenshot below), the speed difference between running with and without Blaze is not significant. Even the same spark-sql job, with or without Blaze, takes a different amount of time on each of three runs.
[screenshot: query runtimes with and without Blaze]

@richox
Collaborator

richox commented Mar 29, 2024

Did you run the benchmark on an isolated cluster? It looks like the SQL time varies greatly; maybe it's a resource issue?

@bigmancomeon
Author

Thanks, that may be the reason. On the other hand, there is a LIMIT 100 at the end of each query's SQL. Could that also be a reason for the unstable running time? LIMIT 100 only takes 100 rows from the final result, so the 100 rows returned might be different on every run.

@richox
Collaborator

richox commented Apr 1, 2024

I suggest testing with a larger dataset on an isolated cluster to get a stable benchmark result. When the dataset is too small, most of the time is spent on the driver side and the performance is not stable.

@bigmancomeon
Author

Thanks, maybe you are right. Right now I use the TPC-DS tool to generate 100 GB of text data and then convert it to Parquet format. The Spark resources are 3 executors × 8 GB plus a 4 GB driver. As shown in the screenshot below, query4 was run three times; the longest run took 993 s and the shortest 802 s, so the timing is not stable. Next I will do as you suggested.
[screenshot: query4 runtimes across three runs]
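
For reference, a minimal sketch of the text-to-Parquet conversion step (assuming the generated text files are already registered as external tables in a database named tpcds_text; all database and table names here are placeholders):

-- Hypothetical example for one table; repeat per TPC-DS table.
CREATE DATABASE IF NOT EXISTS tpcds_parquet;
CREATE TABLE tpcds_parquet.store_sales
USING parquet
AS SELECT * FROM tpcds_text.store_sales;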

@bigmancomeon
Author

By the way, what were the num-executors and executor-cores values for the Spark jobs in the following official benchmark?

https://github.com/kwai/blaze/blob/master/benchmark-results/20240202.md

@richox
Collaborator

richox commented Apr 1, 2024

We use spark.executor.cores 5, the same setting as our production conf.

@bigmancomeon
Author

spark.executor.cores is 5, but what is the number of executors?

@bigmancomeon
Author

I use 15 executors × 2 cores + 1 driver core = 31 CPUs, and 15 × 8 GB executor memory + 2 GB driver = 122 GB of memory, to run the Spark jobs. The resources are sufficient. Each job is run three times, but the running time is still different each time, and the speed with and without Blaze is almost the same. Does this plugin really work?

@MrFireChow

I built an environment with Spark 3.3.3 and Blaze 2.0.8 and ran some tests on 100 GB of TPC-DS data; however, I did not see any benefit compared to not using Blaze either. This is my launch command:
spark-sql --master spark://xxxx:xxxx \
  --conf spark.sql.extensions=org.apache.spark.sql.blaze.BlazeSparkSessionExtension \
  --conf spark.shuffle.manager=org.apache.spark.sql.execution.blaze.shuffle.BlazeShuffleManager \
  --conf spark.blaze.enable.smjInequalityJoin=true

@MrFireChow

By the way, the plan tree shows the plugin is indeed taking effect: the plans are converted to native plans, but the query time does not decrease.
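
For reference, one way to check that conversion from the spark-sql prompt (a sketch; the table name is a placeholder and the exact labels Blaze prints for its native operators depend on the Blaze version):

-- Hypothetical check: print the physical plan and look for Blaze's native
-- operators in place of the usual FileScan / Exchange / SortMergeJoin nodes.
EXPLAIN FORMATTED
SELECT ss_item_sk, count(*) FROM tpcds_parquet.store_sales GROUP BY ss_item_sk LIMIT 100;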

@richox
Collaborator

richox commented Jun 19, 2024

It's likely related to some hard-coded configurations. For example, shuffle compression is fixed to zstd in Blaze, while Spark uses lz4 by default, so in a low-IO-latency environment Blaze spends more time on compression and the overall performance drops.
We are working on a new version that should work with Spark's default compression. You can benchmark TPC-H on this branch: https://github.com/kwai/blaze/tree/3.0.0-preview1 (not complete yet; it still has some bugs on TPC-DS).
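
For reference, the Spark-side half of that comparison is the standard codec setting below (a sketch of vanilla Spark behavior; per the comment above, the 2.x Blaze native shuffle does not honor it):

# Vanilla Spark compresses shuffle/spill data with lz4 unless this is overridden
# (accepted values include lz4, snappy, zstd).
spark.io.compression.codec  lz4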

richox closed this as completed on Jun 19, 2024