
Resource Limits #395

Closed
PsychoSid opened this issue Mar 31, 2020 · 5 comments
Labels: feature request (A request for a specific feature to be added to Kuberhealthy)

Comments

@PsychoSid

Hi,

The resource limits and requests in the deployment check are hard-coded, and in my environment the check's nginx pods are currently failing because they are being OOM killed.

Containers:
  deployment-container:
    Container ID:   docker://1877dbf02a13e62f65410980aae380461218175dab76623f879a8b2fcacbd504
    Image:          nginx:1.17.8
    Image ID:       docker-pullable://nginx@sha256:4a50ed86d8c86e35f530d4a168173677a192177eed14146fbb5728b1b3a2d4de
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Tue, 31 Mar 2020 11:26:23 +0100
      Finished:     Tue, 31 Mar 2020 11:26:29 +0100
    Ready:          False
    Restart Count:  2
    Limits:
      cpu:     75m
      memory:  75Mi
    Requests:
      cpu:        15m
      memory:     20Mi

I am not sure why this is happening. I tried setting a LimitRange on the namespace, but deployment.go hard-codes these values, and they are not configurable as far as I can tell.

Is there a way to amend these values in configuration, or to change them at all?

Thanks

PsychoSid added the feature request label on Mar 31, 2020
@joshulyne (Collaborator)

Hi @PsychoSid!
That's interesting; I don't think we've run into that issue before. It's possible, since we set default resource limits here in the code: https://github.com/Comcast/kuberhealthy/blob/master/cmd/deployment-check/deployment.go#L45 and we could replace those with env vars so they're more configurable via deployment-check.yaml. @integrii @jonnydawg thoughts?
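As a rough illustration of the env-var approach mentioned above, here is a minimal sketch of reading override values from the environment and falling back to defaults. The env var names and helper names are assumptions (not Kuberhealthy's actual implementation); the fallback values mirror the limits shown in the pod description earlier in this issue.

```go
package main

import (
	"fmt"
	"os"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// quantityFromEnv returns the quantity parsed from the named env var,
// falling back to the supplied default when it is unset or unparseable.
func quantityFromEnv(key, fallback string) resource.Quantity {
	if v := os.Getenv(key); v != "" {
		if q, err := resource.ParseQuantity(v); err == nil {
			return q
		}
	}
	return resource.MustParse(fallback)
}

// checkContainerResources builds the ResourceRequirements for the check's
// nginx container, preferring env var overrides over the old hard-coded defaults.
func checkContainerResources() corev1.ResourceRequirements {
	return corev1.ResourceRequirements{
		Requests: corev1.ResourceList{
			corev1.ResourceCPU:    quantityFromEnv("CHECK_POD_CPU_REQUEST", "15m"),
			corev1.ResourceMemory: quantityFromEnv("CHECK_POD_MEM_REQUEST", "20Mi"),
		},
		Limits: corev1.ResourceList{
			corev1.ResourceCPU:    quantityFromEnv("CHECK_POD_CPU_LIMIT", "75m"),
			corev1.ResourceMemory: quantityFromEnv("CHECK_POD_MEM_LIMIT", "75Mi"),
		},
	}
}

func main() {
	// Print the effective requests/limits for a quick sanity check.
	fmt.Printf("%+v\n", checkContainerResources())
}
```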

@jonnydawg (Collaborator)

Interesting -- the resource limits and requests are indeed hard-coded. @joshulyne I think you're right: the best way to handle this is probably to parse some env vars and pass them through to the nginx container's limits and requests.

I can look into making this change later today.
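For completeness, a small fragment showing how the parsed values might be wired into the check's container spec. It assumes the hypothetical checkContainerResources helper from the sketch above; the container name, image, and port are taken from the pod description earlier in this issue.

```go
// Attach the env-var-driven requests/limits to the check's nginx container.
container := corev1.Container{
	Name:      "deployment-container",
	Image:     "nginx:1.17.8",
	Ports:     []corev1.ContainerPort{{ContainerPort: 80}},
	Resources: checkContainerResources(),
}
```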

@mikeinton (Collaborator)

@PsychoSid - Could it be that the OOM killer is acting aggressively when these deployment pods are scheduled on nodes that are already under heavy memory pressure?

@PsychoSid (Author)

Thanks all for looking into this. Yes, it's likely that heavier usage on these clusters is causing it. They are small clusters that we're using while we do some testing for Anthos.

@jonnydawg (Collaborator)

This has been implemented!
