[KYUUBI #7407] STGroup free to avoid OOM Kill #7408

Closed
oh0873 wants to merge 5 commits into apache:master from oh0873:hoonoh/STTokenCleanups

oh0873 commented Apr 16, 2026

Why are the changes needed?

When LDAP authentication is used, an ST token is created and saved in the cache, but it is never freed.
This causes heap usage to grow continuously and eventually leads to out-of-memory kills of the Kyuubi server pods.

This change clears the ST tokens. An STGroup is also added to avoid any race condition during cleanup.

How was this patch tested?

This patch was tested in our customer environment. We observed there's no more continuous heap increase after the fix.

Was this patch authored or co-authored using generative AI tooling?

Cursor auto-complete feature was used.

Copilot AI left a comment

Pull request overview

This PR addresses a heap growth/OOM issue in LDAP authentication by ensuring StringTemplate’s cached compiled templates/tokens don’t accumulate indefinitely.

Changes:

  • Instantiate ST with a per-query STGroup instead of the default group.
  • Unload the STGroup after rendering the LDAP search filter to release cached template state.

oh0873 and others added 2 commits April 20, 2026 10:39
…ication/ldap/Query.scala


Spelling Fix

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
-this.filterTemplate = new ST(filterTemplate)
+val group = new STGroup()
+this.filterTemplateGroup = Some(group)
+this.filterTemplate = new ST(group, filterTemplate)
Member

Does caching only happen in filterTemplate.render?

What happens if we call def filter multiple times?

Contributor Author

Cache is created in def filter, but it isn't needed after createFilter, which is why I added unload right after createFilter. I'm not sure whether this is the best place to unload.

At least in our use case, it seems to unload most if not all of the STToken instances used by the LDAP auth process.

Member

pan3793 Apr 21, 2026

> Cache is created in def filter

So the cache still leaks if we call def filter multiple times?

Contributor Author

Yes. If def filter is called but build is never called, those STToken instances will stay on the heap forever.

Member

I mean

builder
  .filter("exp1")
  .filter("exp2") // this replaces exp1
  .build()

will this case cause a leak?

Contributor Author

Yes, I think this would cause a leak.

Member

Okay. Since this is only used internally and we don't have such anti-patterns yet, let's document those dangerous behaviors somehow.

BTW, this part of the code is ported from Apache Hive, so it should have the same issue:

https://github.com/apache/hive/blob/branch-4.2/service/src/java/org/apache/hive/service/auth/ldap/Query.java#L90
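
The replace-then-build case raised in this thread can be modeled with a small dependency-free sketch. FakeGroup and QueryBuilder are hypothetical stand-ins, not the StringTemplate or Kyuubi classes, and the unloadOnReplace guard is an assumption added for illustration; the thread only states that the patch unloads right after createFilter.

```scala
object LeakSketch {
  // Number of cached entries held by groups that have not been unloaded yet.
  var liveEntries: Int = 0

  // Hypothetical stand-in for a template group that caches compiled state.
  final class FakeGroup {
    private var cached = 0

    def compile(template: String): String = {
      cached += 1
      liveEntries += 1
      template
    }

    // Drop everything this group cached.
    def unload(): Unit = {
      liveEntries -= cached
      cached = 0
    }
  }

  // Hypothetical builder mirroring the filter(...).build() shape from the thread.
  final class QueryBuilder(unloadOnReplace: Boolean) {
    private var group: Option[FakeGroup] = None
    private var template: Option[String] = None

    def filter(expr: String): QueryBuilder = {
      // Assumed guard: release the group that is about to be replaced.
      if (unloadOnReplace) group.foreach(_.unload())
      val g = new FakeGroup
      group = Some(g)
      template = Some(g.compile(expr))
      this
    }

    def build(): String = {
      val rendered = template.getOrElse("")
      // Only the most recent group is reachable here, so only it is unloaded.
      group.foreach(_.unload())
      group = None
      template = None
      rendered
    }
  }
}
```

In this model, calling filter twice with unloadOnReplace = false leaves the first group's entry counted after build(), which corresponds to the leak confirmed above; with unloadOnReplace = true nothing remains.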

Member

pan3793 commented Apr 21, 2026

Just out of curiosity, how much heap memory had you configured for the Kyuubi server when you observed this issue? We have some customers who have used LDAP for many years, but we have not received such reports.

Contributor Author

oh0873 commented Apr 21, 2026

> Just out of curiosity, how much heap memory had you configured for the Kyuubi server when you observed this issue? We have some customers who have used LDAP for many years, but we have not received such reports.

We set 2G of heap memory for Kyuubi. The Kyuubi server pod would get OOM-killed within 2-7 days.

Member

pan3793 commented Apr 21, 2026

ah, it's relatively rare to set such a small heap for a big data component 🤣

Contributor Author

oh0873 commented Apr 21, 2026

Oh okay. After this STToken fix, we generally see less than 25% of our heap in use. Do we need to increase our heap size? (2G is in our testing environment.)

Member

pan3793 commented Apr 22, 2026

> Do we need to increase our heap size? (2G is in our testing environment.)

2G is fine if it works well for your workload. The intent is to shift heavy workloads from the server to the engine, so the server should be light and stable.

…ication/ldap/Query.scala

Co-authored-by: Cheng Pan <pan3793@gmail.com>
pan3793 added this to the v1.10.4 milestone Apr 28, 2026
pan3793 closed this in 73a1af1 Apr 28, 2026
pan3793 pushed a commit that referenced this pull request Apr 28, 2026
### Why are the changes needed?

When LDAP authentication is used, an ST token is created and saved in the cache, but it is never freed.
This causes heap usage to grow continuously and eventually leads to out-of-memory kills of the Kyuubi server pods.

This change clears the ST tokens. An STGroup is also added to avoid any race condition during cleanup.

### How was this patch tested?

This patch was tested in our customer environment. We observed there's no more continuous heap increase after the fix.

### Was this patch authored or co-authored using generative AI tooling?

Cursor auto-complete feature was used.

Closes #7408 from oh0873/hoonoh/STTokenCleanups.

Closes #7407

72740e0 [Hoon Oh] Update kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/Query.scala
18ee9bb [Hoon Oh] Added () to createFilter and render
dd0f910 [Hoon Oh] Explicit group definition
321c430 [Hoon Oh] Update kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/Query.scala
28cadbc [Hoon Oh] STGroup free to avoid OOM Kill

Lead-authored-by: Hoon Oh <hoonoh@geico.com>
Co-authored-by: Hoon Oh <92890928+oh0873@users.noreply.github.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
(cherry picked from commit 73a1af1)
Signed-off-by: Cheng Pan <chengpan@apache.org>
Member

pan3793 commented Apr 28, 2026

thanks, merged to master/1.11.2/1.10.4

oh0873 deleted the hoonoh/STTokenCleanups branch April 28, 2026 20:02