[bug] v0.29.1 seems to have a file descriptor leak #4296
Comments
Thanks for reporting this @mrparkers. I believe I tracked down the root cause and am not seeing a file descriptor leak anymore. Do you mind trying this snapshot version and confirming that it fixes the leak on your side as well? KARPENTER_VERSION=v0-8d82ffce1f13161df94bc9959bcefbbfcdcd0a3c
We had some OOMKills for karpenter with v0.29.1. Our requests/limits are set to 300Mi. The memory usage increased rapidly until the containers/pods were OOMKilled. This might be related to this issue, since using …
Hi @jonathan-innis, thanks for the quick response, I appreciate the turnaround on this. I tested the new version and I no longer see an unusual amount of open file descriptors. It looks like the issue has been fixed.
We're planning to release a …
Description
Observed Behavior:
The number of open file descriptors held by karpenter (v0.29.1) pods climbs steadily to very high numbers. This was caught by a DataDog monitor that tracks open file descriptors held by containers, via the metric container.pid.open_files. We automatically deploy new releases of karpenter to our staging environment. Here is a graph of this metric over a ~6 hour time period for v0.29.1:
Here is a graph of the same time period for all of our other clusters, which are running v0.29.0:
For another test, I used SSM to get a shell into one of the EKS nodes running a karpenter pod, and got a count of the open file descriptors:
Then I deleted the karpenter pod running on that node, and tried again:
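For reference, the fd counts above can be reproduced from a node shell (obtained via SSM) with something like the following sketch. The process name "karpenter" is an assumption; adjust the pgrep pattern to match however the controller process appears on your nodes:

```shell
# Find the karpenter controller PID on the node, then count its open
# file descriptors by listing /proc/<pid>/fd.
pid=$(pgrep -f karpenter | head -n 1)
ls "/proc/${pid}/fd" | wc -l
```

Running this before and after deleting the pod shows the count reset and then start climbing again.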
You can see the results of deleting one of the pods using the same metric I referenced above. The new pod already climbed to ~250 open file descriptors within 30 minutes of the test:
Expected Behavior:
Karpenter pods should not have a file descriptor leak.
Reproduction Steps (Please include YAML):
I can provide more details about our exact configuration if necessary.
Versions:
- Kubernetes Version (kubectl version):