Spark 3.5: Support executor cache locality #9563
@@ -331,4 +331,24 @@ private long driverMaxResultSize() {
     SparkConf sparkConf = spark.sparkContext().conf();
     return sparkConf.getSizeAsBytes(DRIVER_MAX_RESULT_SIZE, DRIVER_MAX_RESULT_SIZE_DEFAULT);
   }
+
+  public boolean executorCacheLocalityEnabled() {
+    return executorCacheEnabled() && executorCacheLocalityEnabledInternal();
+  }
+
+  private boolean executorCacheEnabled() {
+    return confParser
+        .booleanConf()
+        .sessionConf(SparkSQLProperties.EXECUTOR_CACHE_ENABLED)
+        .defaultValue(SparkSQLProperties.EXECUTOR_CACHE_ENABLED_DEFAULT)
+        .parse();
+  }
+
+  private boolean executorCacheLocalityEnabledInternal() {
+    return confParser
+        .booleanConf()
+        .sessionConf(SparkSQLProperties.EXECUTOR_CACHE_LOCALITY_ENABLED)
+        .defaultValue(SparkSQLProperties.EXECUTOR_CACHE_LOCALITY_ENABLED_DEFAULT)
+        .parse();
+  }
 }

Review thread (on executorCacheLocalityEnabledInternal):

Reviewer: I'm a little concerned that this doesn't play well with Spark's dynamic allocation. Did you test how this would work with dynamic allocation enabled?

Author: My original approach was to enable executor cache locality by default only if dynamic allocation is disabled. After thinking more about it, I decided to simply disable it by default, no matter whether static or dynamic allocation is used. As of right now, folks have to opt in explicitly to enable executor cache locality. That way, we ensure there are no extra waits added on our end, as we can't guarantee the locality would be beneficial.

Reviewer: That was my first thought too. Then I realized users may want to enable this anyway; it should be up to them to decide. What about logging a warning when both dynamic allocation and executor cache locality are enabled?

Author: My worry is that we don't really know if enabling this with dynamic allocation is going to hurt. For instance, it may still make sense if the minimum number of executors is big enough or if the cluster is hot. Given that we would also have to add logic to parse the dynamic allocation config, I'd rather not log it and trust the person setting this for now.
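The warning discussed in the review was ultimately not adopted, but a minimal sketch of what it could have looked like is below. The class name, method names, and plain-boolean inputs are all illustrative assumptions; a real check would read `spark.dynamicAllocation.enabled` from the `SparkConf` rather than take a boolean.

```java
import java.util.logging.Logger;

// Illustrative sketch only: the warning suggested in the review was not
// added to the PR. Inputs are plain booleans; a real check would read
// "spark.dynamicAllocation.enabled" from the SparkConf.
public class LocalityWarning {
  private static final Logger LOG = Logger.getLogger(LocalityWarning.class.getName());

  // true when the combination the reviewers were worried about is active
  static boolean shouldWarn(boolean dynamicAllocationEnabled, boolean cacheLocalityEnabled) {
    return dynamicAllocationEnabled && cacheLocalityEnabled;
  }

  static void checkLocalityConfig(boolean dynamicAllocationEnabled, boolean cacheLocalityEnabled) {
    if (shouldWarn(dynamicAllocationEnabled, cacheLocalityEnabled)) {
      LOG.warning(
          "Executor cache locality is enabled together with dynamic allocation; "
              + "locality preferences may add scheduling waits if executors are removed");
    }
  }

  public static void main(String[] args) {
    checkLocalityConfig(true, true);  // logs a warning
    checkLocalityConfig(false, true); // silent: static allocation, opt-in locality
  }
}
```

As the author notes, such a warning could be noisy for users who deliberately combine the two settings, which is why the PR trusts the person enabling the flag instead.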
Review thread (on the test changes):

Reviewer: It seems this only checks for record equality but doesn't check the executor cache locality. Should we check the Spark RDD's getPreferredLocations instead?

Author: I'm afraid we can't really test this, as SparkUtil$executorLocations would return an empty list in our local testing env. This test simply ensures nothing breaks if run on the driver.

Reviewer: Ah, I see. Iceberg's Spark tests only support local mode. There's a local-cluster mode which requires extra setup. I think it's fine to leave it as it is.
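For reference, below is a rough, self-contained sketch of the kind of preference a getPreferredLocations-style hook would report. The class name, the hashing scheme, and the executor list are assumptions for illustration, not the PR's actual implementation; as noted in the thread, in local mode the executor list is empty, so no preference can be produced.

```java
import java.util.Collections;
import java.util.List;

// Rough sketch for illustration only: pin each cached file to a consistent
// executor so repeated scans of the same file land on the same cache entry.
// The hash-based assignment is an assumption, not the PR's implementation.
public class PreferredLocations {
  static List<String> preferredLocations(String filePath, List<String> executorLocations) {
    if (executorLocations.isEmpty()) {
      // local mode: executor discovery finds nothing, so no preference is reported
      return Collections.emptyList();
    }
    // same file always hashes to the same executor
    int index = Math.floorMod(filePath.hashCode(), executorLocations.size());
    return Collections.singletonList(executorLocations.get(index));
  }

  public static void main(String[] args) {
    List<String> executors = java.util.Arrays.asList("host-a:1", "host-b:1");
    System.out.println(preferredLocations("s3://bucket/data/file-0.parquet", executors));
    System.out.println(preferredLocations("s3://bucket/data/file-0.parquet", Collections.emptyList())); // []
  }
}
```

This also shows why the empty-list case is what a local-mode test exercises: the only behavior observable on the driver is that the no-executor path returns cleanly.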