[SPARK-23563] config codegen compile cache size#20712
[SPARK-23563] config codegen compile cache size#20712passionke wants to merge 6 commits intoapache:masterfrom
Conversation
|
Can one of the admins verify this patch? |
|
@cloud-fan @gatorsmile if I am not wrong, this is one of the place which is used not only by the driver but also by the executors, so the introduced parameter won't be evaluated properly, am I right? |
|
@mgaido91 Yeah, you are right. This is pretty pointless unless you either set this as a static spark conf or pass it to the executor using local properties. Both aren't very attractive IMO. @passionke why do we need this? |
|
+1, I would like to know the case where we increased cache hit ratio. |
|
@passionke can you elaborate on why we need this? A problem with increasing the cache is that we keep more classes around and those classes can blow out the JVM's internal code caches, which will have a far more negative influence on performance than any increase in the number of cached plans will. |
|
Thank's your attention, My application is a long term spark sql engine server, hundreds of spark sql run on it in spark local model. And these sql will not change until biz logic update. The sql depends on some tables which will be update before compute. So, In this special case, if I can make a larger CodeGenerator.cache size, I can reuse GeneratedClass instance, and In my view it will reduce meta space increase and decrease the full gc times |
|
it sounds reasonable to make the cache configurable, but we should figure out the best way to propagate codegen configs to the executor side. |
|
@passionke what you did doesn't solve the issue, I think. Your UTs are working because you are checking the value only on the driver. |
|
oops, so there has no good solution right now? shall I close this PR? |
|
@passionke yes, currently there is no good solution. But if you can find one, you're welcome |
What changes were proposed in this pull request?
config codegen compile class cache size
How was this patch tested?
spark sql integration tests