[KYUUBI #7407] STGroup free to avoid OOM Kill#7408
[KYUUBI #7407] STGroup free to avoid OOM Kill#7408oh0873 wants to merge 5 commits intoapache:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses a heap growth/OOM issue in LDAP authentication by ensuring StringTemplate’s cached compiled templates/tokens don’t accumulate indefinitely.
Changes:
- Instantiate
STwith a per-querySTGroupinstead of the default group. - Unload the
STGroupafter rendering the LDAP search filter to release cached template state.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…ication/ldap/Query.scala Spelling Fix Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| this.filterTemplate = new ST(filterTemplate) | ||
| val group = new STGroup() | ||
| this.filterTemplateGroup = Some(group) | ||
| this.filterTemplate = new ST(group, filterTemplate) |
There was a problem hiding this comment.
does cache only happen in filterTemplate.render?
what will happen if we call def filter multi times?
There was a problem hiding this comment.
Cache is created in def filter, but it won't be needed after createFilter that's why I added unload right after createFilter. I'm not sure if this is the best place to unload.
At least in our use case it seems to unload most if not all STToken used by LDAP auth process.
There was a problem hiding this comment.
Cache is created in
def filter
so cache still leaks if we call def filter multi times?
There was a problem hiding this comment.
Yes. IF def filter is called but build never called those STToken will stay in heap forever.
There was a problem hiding this comment.
I mean
builder
.filter("exp1")
.filter("exp2") // this replaces exp1
.build()
will this case cause a leak?
There was a problem hiding this comment.
Yes, I think this would cause a leak.
There was a problem hiding this comment.
okay, as this is only used internally, we don't have such anti-patterns yet, let's somehow document those dangerous behaviors
BTW, this part of the code is ported from Apache Hive, it should have the same issue
|
just out of curiosity, how much heap memory do you configure for kyuubi server when observing this issue? we have some customers who have used LDAP for many years, but have not received such reports. |
We set 2G for kyuubi heap memory. Kyuubi server pod would get OOM kill in 2-7 days. |
|
ah, it's relatively rare to set such a small heap for a big data component 🤣 |
|
Oh okay. After this STToken fixes, we generally see less than 25% of our heap getting used. Should we need to increase our heap size? (2G is in our testing environment) |
2g is fine if it works well on your workload. it's intended to shift heavy workloads from server to engine so that server should be light and stable. |
…ication/ldap/Query.scala Co-authored-by: Cheng Pan <pan3793@gmail.com>
### Why are the changes needed? When LDAP Authentication is used ST Token is created and saved in the cache, but it is never freed up. This is causing continuous increase in heap usage, eventually causing out-of-memory for kyuubi server pods. This changes is added to clear ST Tokens. Also ST Group is added to avoid any race condition during clean up. ### How was this patch tested? This patch was tested in our customer environment. We observed there's no more continuous heap increase after the fix. ### Was this patch authored or co-authored using generative AI tooling? Cursor auto-complete feature was used. Closes #7408 from oh0873/hoonoh/STTokenCleanups. Closes #7407 72740e0 [Hoon Oh] Update kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/Query.scala 18ee9bb [Hoon Oh] Added () to createFilter and render dd0f910 [Hoon Oh] Explicit group definition 321c430 [Hoon Oh] Update kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/Query.scala 28cadbc [Hoon Oh] STGroup free to avoid OOM Kill Lead-authored-by: Hoon Oh <hoonoh@geico.com> Co-authored-by: Hoon Oh <92890928+oh0873@users.noreply.github.com> Signed-off-by: Cheng Pan <chengpan@apache.org> (cherry picked from commit 73a1af1) Signed-off-by: Cheng Pan <chengpan@apache.org>
### Why are the changes needed? When LDAP Authentication is used ST Token is created and saved in the cache, but it is never freed up. This is causing continuous increase in heap usage, eventually causing out-of-memory for kyuubi server pods. This changes is added to clear ST Tokens. Also ST Group is added to avoid any race condition during clean up. ### How was this patch tested? This patch was tested in our customer environment. We observed there's no more continuous heap increase after the fix. ### Was this patch authored or co-authored using generative AI tooling? Cursor auto-complete feature was used. Closes #7408 from oh0873/hoonoh/STTokenCleanups. Closes #7407 72740e0 [Hoon Oh] Update kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/Query.scala 18ee9bb [Hoon Oh] Added () to createFilter and render dd0f910 [Hoon Oh] Explicit group definition 321c430 [Hoon Oh] Update kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/Query.scala 28cadbc [Hoon Oh] STGroup free to avoid OOM Kill Lead-authored-by: Hoon Oh <hoonoh@geico.com> Co-authored-by: Hoon Oh <92890928+oh0873@users.noreply.github.com> Signed-off-by: Cheng Pan <chengpan@apache.org> (cherry picked from commit 73a1af1) Signed-off-by: Cheng Pan <chengpan@apache.org>
|
thanks, merged to master/1.11.2/1.10.4 |
Why are the changes needed?
When LDAP Authentication is used ST Token is created and saved in the cache, but it is never freed up.
This is causing continuous increase in heap usage, eventually causing out-of-memory for kyuubi server pods.
This changes is added to clear ST Tokens. Also ST Group is added to avoid any race condition during clean up.
How was this patch tested?
This patch was tested in our customer environment. We observed there's no more continuous heap increase after the fix.
Was this patch authored or co-authored using generative AI tooling?
Cursor auto-complete feature was used.