Not releasing old ENIs #123

Closed
vpm-bradleyhession opened this issue Jul 3, 2018 · 51 comments
Labels
priority/P1 Must be staffed and worked currently or soon. Is a candidate for next release
Comments

@vpm-bradleyhession

vpm-bradleyhession commented Jul 3, 2018

It looks like old ENIs are not being deleted by IPAM. It keeps creating ENIs until it hits the account limit.

New pods then cannot start, with the error "Failed create pod sandbox." Manually deleting the "Available" ENIs allows the controller to create a new ENI and schedule the pod as normal.

The number of nodes is 25, which is not the highest we have had.

Potentially related to #18

I am happy to help with debugging/finding the root cause.
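For reference, the stale "Available" ENIs can be listed with the AWS CLI before deleting them by hand. A rough sketch, assuming default credentials/region and that you filter further (e.g. by tag or subnet) before actually deleting anything:

$ aws ec2 describe-network-interfaces \
    --filters Name=status,Values=available \
    --query 'NetworkInterfaces[].NetworkInterfaceId' --output text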

@vpm-bradleyhession
Author

vpm-bradleyhession commented Jul 3, 2018

Additionally, it looks like the aws-node controller is hitting API request rate limits.

Error code: Client.RequestLimitExceeded

@vpm-bradleyhession
Author

Deleting the aws-node pod on this node resolved the issue. Something is definitely wonky.

@tomfotherby

I'm seeing this issue, sporadically, with some nodes. I agree deleting aws-node resolves the issue. Related: #59

@liwenwu-amazon
Contributor

liwenwu-amazon commented Jul 5, 2018

@vpm-bradleyhession, the troubleshooting guide provides guidance on how to troubleshoot CNI issues at the cluster level.

ipamD should release old ENIs if the number of Pods running on the node goes below the threshold.

You can use cni-metrics-helper to view aggregated ENI and IP information at the cluster level:

kubectl apply -f cni_metrics_helper.yaml

You can then view the cluster-level ipamD statistics:

kubectl logs cni-metrics-helper-xxxxxx -n kube-system
...
I0705 15:52:30.539478       7 metrics.go:250] Processing metric: total_ip_addresses
I0705 15:52:30.539482       7 metrics.go:350] Produce GAUGE metrics: assignIPAddresses, value: 23.000000
I0705 15:52:30.539491       7 metrics.go:350] Produce GAUGE metrics: eniAllocated, value: 8.000000
I0705 15:52:30.539496       7 metrics.go:350] Produce GAUGE metrics: eniMaxAvailable, value: 15.000000
I0705 15:52:30.539500       7 metrics.go:350] Produce GAUGE metrics: maxIPAddresses, value: 735.000000
I0705 15:52:30.539506       7 metrics.go:350] Produce GAUGE metrics: totalIPAddresses, value: 387.000000
I0705 15:52:30.539510       7 metrics.go:340] Produce COUNTER metrics: ipamdErr, value: 0.000000
I0705 15:52:30.539516       7 metrics.go:350] Produce GAUGE metrics: ipamdActionInProgress, value: 0.000000
I0705 15:52:30.539521       7 metrics.go:340] Produce COUNTER metrics: reconcileCount, value: 0.000000

@liwenwu-amazon
Contributor

@vpm-bradleyhession, @tomfotherby if you see issue #18 or #59 on some nodes, can you run /opt/cni/bin/aws-cni-support.sh and collect node-level debugging information? You can send the node-level debugging info directly to me (liwenwu@amazon.com).

@vpm-bradleyhession
Author

The problem is that 90% of the requests to the AWS API are rate limited. The number of ENIs that are hanging around far outweighs the number that actually get deleted.

@liwenwu-amazon
Contributor

@vpm-bradleyhession is it possible that your "DeleteNetworkInterface" API throttling is caused by some other tool, or by manually deleting ENIs at the same time as ipamD?

If ipamD is being throttled, it currently uses the AWS SDK's exponential backoff, and you should see ipamdActionInProgress not being 0:

kubectl logs cni-metrics-helper-xxxxxx -n kube-system
...
I0705 15:52:30.539482       7 metrics.go:350] Produce GAUGE metrics: assignIPAddresses, value: 23.000000
I0705 15:52:30.539491       7 metrics.go:350] Produce GAUGE metrics: eniAllocated, value: 8.000000
I0705 15:52:30.539496       7 metrics.go:350] Produce GAUGE metrics: eniMaxAvailable, value: 15.000000
I0705 15:52:30.539500       7 metrics.go:350] Produce GAUGE metrics: maxIPAddresses, value: 735.000000
I0705 15:52:30.539506       7 metrics.go:350] Produce GAUGE metrics: totalIPAddresses, value: 387.000000
I0705 15:52:30.539510       7 metrics.go:340] Produce COUNTER metrics: ipamdErr, value: 0.000000
I0705 15:52:30.539516       7 metrics.go:350] Produce GAUGE metrics: ipamdActionInProgress, value: 0.000000 <---- this would be non-zero if throttled
I0705 15:52:30.539521       7 metrics.go:340] Produce COUNTER metrics: reconcileCount, value: 0.000000

@vpm-bradleyhession
Author

Hm, I don't think so. They were being deleted normally up until a few days ago (no changes that would affect it). Then one day it hit the ENI limit for our account.

I'm going to monitor it to catch when this happens again. Although, currently I'm getting

cni_metrics.go:102] Failed to grab CNI endpoint: the server is currently unable to handle the request (get pods aws-node-h99r5:61678) E0705 16:08:07.082577 6 metrics.go:407] Failed to getMetricsFromTarget: the server is currently unable to handle the request (get pods aws-node-h99r5:61678)

when using the metrics helper?

@liwenwu-amazon
Contributor

@vpm-bradleyhession is it possible you can deploy cni_metrics_helper and share the output of kubectl logs cni-metrics-helper-xxxxx -n kube-system?

@liwenwu-amazon
Contributor

@vpm-bradleyhession, what version of the CNI are you running? It needs to be 1.0.0:

kubectl get ds aws-node -n kube-system -o yaml | grep image
        image: 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:1.0.0

@vpm-bradleyhession
Author

Ah, this might be it. The image is currently:
602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:0.1.1

I should mention this is a kops cluster, so the CNI was provisioned there. Is it safe to upgrade?

@vpm-bradleyhession
Author

We should open an issue w/ Kops to get the version bumped upstream also. Good spot on the version.

@liwenwu-amazon
Contributor

@vpm-bradleyhession Please open an issue w/ kops. Compared to 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:0.1.1, the 1.0.0 image has fixed a few issues and made improvements in how it uses the AWS APIs.

@vpm-bradleyhession
Author

So is upgrading from 0.1.1 -> 1.0.0 safe to do?

@liwenwu-amazon
Contributor

@vpm-bradleyhession Yes, it is safe to upgrade from 0.1.1 -> 1.0.0. The only thing you need to make sure of is that your worker nodes can reach the K8S apiServer on port 443. In 1.0.0, ipamD needs to communicate with the K8S apiServer. Some deployments have an HTTP proxy enabled that blocks aws-node from communicating with the K8S apiServer (e.g. #104).

@liwenwu-amazon
Contributor

You can use telnet to confirm whether a worker node can reach its apiServer:

telnet 10.100.0.1 443  <-- kubernetes SVC IP, port 443
Trying 10.100.0.1...
Connected to 10.100.0.1.
Escape character is '^]'.
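
If telnet is not installed on the worker, the same check can be sketched with kubectl and nc; look up the actual Service IP first rather than assuming 10.100.0.1:

$ kubectl get svc kubernetes -o jsonpath='{.spec.clusterIP}'   # prints the kubernetes SVC IP
$ nc -zv <svc-ip> 443                                          # substitute the IP printed above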

@vpm-bradleyhession
Author

vpm-bradleyhession commented Jul 5, 2018

@liwenwu-amazon thanks so much for your help! Yeah this seems to work, I will bump all nodes over to the CNI 1.0.0 version and see if this fixes our issue with rate limiting. I will update this when I have more information for you.

Thanks!

@tomfotherby

For reference, I'm seeing the issue using EKS with CNI 1.0.0:

$ kubectl get ds aws-node -n kube-system -o yaml | grep image
image: 602401143452.dkr.ecr.us-east-1.amazonaws.com/amazon-k8s-cni:1.0.0

(I'll provide more info when it next happens)

@jlogsdon

jlogsdon commented Jul 5, 2018

We're experiencing this issue on a cluster using m4.xlarge instances (currently running 2 instances; we've only seen the issue on one of them so far). Cycling the aws-node pods as mentioned in another issue had no effect.

We use Helm and the failed upgrade was rolled back, but the pod count on either instance shouldn't have gone above 20 during the process. Currently the node we saw the issue on is running 17 pods. While I've been investigating the issue it appears that one ENI on each of the instances was finally released, but they both still have far too many IPs associated for the number of pods running (17 pods across all namespaces on one instance, with 3 ENIs, each with 15 addresses).

$ kubectl get ds aws-node -n kube-system -o yaml | grep image
image: 602401143452.dkr.ecr.us-east-1.amazonaws.com/amazon-k8s-cni:1.0.0
$ kubectl logs cni-metrics-helper-xxxxxxxxx -n kube-system
...
I0705 22:41:49.173918       7 metrics.go:101] Label: [name:"error" value:"free ENI: no ENI can be deleted at this time"  name:"fn" value:"decreaseIPPoolFreeENIFailed" ], Value: 167
I0705 22:41:49.173955       7 metrics.go:350] Produce GAUGE metrics: eniMaxAvailable, value: 8.000000
I0705 22:41:49.173969       7 metrics.go:350] Produce GAUGE metrics: maxIPAddresses, value: 112.000000
I0705 22:41:49.173977       7 metrics.go:340] Produce COUNTER metrics: ipamdErr, value: 167.000000
I0705 22:41:49.173983       7 metrics.go:350] Produce GAUGE metrics: assignIPAddresses, value: 18.000000
I0705 22:41:49.173996       7 metrics.go:350] Produce GAUGE metrics: totalIPAddresses, value: 70.000000
I0705 22:41:49.174000       7 metrics.go:350] Produce GAUGE metrics: eniAllocated, value: 5.000000
I0705 22:41:49.174004       7 metrics.go:350] Produce GAUGE metrics: ipamdActionInProgress, value: 0.000000

@liwenwu-amazon
Contributor

@jlogsdon, today ipamD will NOT free an ENI if there is any running Pod on top of this ENI. In other words, if any Pod is using a secondary IP address from this ENI, ipamD will not free that ENI, and it will increment this particular error count:

I0705 22:41:49.173918       7 metrics.go:101] Label: [name:"error" value:"free ENI: no ENI can be deleted at this time"  name:"fn" value:"decreaseIPPoolFreeENIFailed" ], Value: 167

@jlogsdon

jlogsdon commented Jul 5, 2018

Would the IP addresses still assigned to that ENI be unavailable for new pods? It seems like there should have been plenty of capacity for more pods given our instance size and running pod count.

@liwenwu-amazon
Contributor

You can find out the pod IP allocation with:

curl http://localhost:61678/v1/enis | python -m json.tool

@jlogsdon

jlogsdon commented Jul 5, 2018

Thanks. Next time we see this I'll look at that output.

@liwenwu-amazon
Contributor

The IP addresses on an ENI that are NOT assigned to other Pods are available for new Pods. In other words, all IP addresses with "Assigned": false are available for new Pods. Here is one example from my setup:

curl http://localhost:61678/v1/enis | python -m json.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   941  100   941    0     0   165k      0 --:--:-- --:--:-- --:--:--  183k
{
    "AssignedIPs": 8,
    "ENIIPPools": {
        "eni-00d33be72e3648649": {
            "AssignedIPv4Addresses": 5,
            "DeviceNumber": 2,
            "ID": "eni-00d33be72e3648649",
            "IPv4Addresses": {
                "10.0.101.97": {
                    "Assigned": true
                },
                "10.0.106.145": {
                    "Assigned": true
                },
                "10.0.107.149": {
                    "Assigned": true
                },
                "10.0.126.201": {
                    "Assigned": true
                },
                "10.0.127.191": {
                    "Assigned": true
                }
            },
            "IsPrimary": false
        },
        "eni-0353b1dce00ab1b51": {
            "AssignedIPv4Addresses": 3,
            "DeviceNumber": 0,
            "ID": "eni-0353b1dce00ab1b51",
            "IPv4Addresses": {
                "10.0.110.85": {
                    "Assigned": false  <---- this one is available for new pods
                },
                "10.0.118.236": {
                    "Assigned": false
                },

@tomfotherby

For us, because we're running pods with small resource requests, the limiting factor for a node is the number of IP addresses. I did some calculations and the best 3 types are t2.medium, t2.large and c5.large. I put the price per IP into a blog post.
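As a back-of-envelope check, per-node pod capacity with this CNI follows ENIs x (IPv4 addresses per ENI - 1) + 2. A sketch using the m4.xlarge limits as I understand them (verify the numbers for your instance type in the EC2 docs):

$ ENIS=4; IPS_PER_ENI=15                    # m4.xlarge ENI/IP limits (example values)
$ echo $(( ENIS * (IPS_PER_ENI - 1) + 2 ))  # each ENI's primary IP is not used for pods
58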

@garo

garo commented Nov 16, 2018

I'm seeing a possibly related problem in our cluster: in some cases the worker nodes don't release pod IP addresses, which ends up filling the worker node with unusable slots, and eventually the worker node is not able to start new pods.

When this problem appears kubernetes thinks that the node can accept pods, so they are scheduled but stuck with this error: "Failed create pod sandbox: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "xxxxxx" network: add cmd: failed to assign an IP address to container"

Currently the only workaround I know is to SSH into the worker nodes and delete all docker containers which are in the Exited state, for example with this one-liner: "docker ps -a | grep Exited | awk '{print $1}' | xargs -n 1 docker rm"
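A slightly more robust form of the same workaround, filtering on Docker's own status field instead of grepping the listing (still just a sketch of the cleanup, not a fix):

$ docker ps -aq --filter status=exited | xargs -r docker rm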

@tomfotherby

tomfotherby commented Dec 11, 2018

Today we were seeing Pods fail to start in EKS:

k describe pod my-pod-fb756

Failed create pod sandbox: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "my-pod-fb756_default" network: add cmd: failed to assign an IP address to container"

It turned out our subnet had only 256 IPs and it had run out of spare IPs.

For reference, to debug it, on Amazon Linux 2, official AMI:

  1. ssh to the EKS worker with the issue and check the ipamd logs.
  2. Use ls -ltrh /var/log/aws-routed-eni/ to see what ipamd logs have been written.
  3. tail the latest log:
$ tail /var/log/aws-routed-eni/ipamd.log.2018-12-11-14
2018-12-11T14:53:32Z [DEBUG] Found eni eni-06443b854 that have less IP address allocated: cur=0, max=14
2018-12-11T14:53:32Z [DEBUG] Attempt again to allocate IP address for eni :eni-06e104f443b854
2018-12-11T14:53:32Z [INFO] Trying to allocate 14 IP address on eni eni-06e10243b854
2018-12-11T14:53:32Z [ERROR] Failed to allocate a private IP address InsufficientFreeAddressesInSubnet: The specified subnet does not have enough free addresses to satisfy the request.
	status code: 400, request id: 236be01a9-82d9-dc815adf8c7e
2018-12-11T14:53:32Z [WARN] During eni repair: error encountered on allocate IP addressallocate ip address: failed to allocate a private IP address: InsufficientFreeAddressesInSubnet: The specified subnet does not have enough free addresses to satisfy the request.
	status code: 400, request id: 236b77dd-41a9-82d9-dc815adf8c7e
2018-12-11T14:53:32Z [DEBUG] IP pool stats: total = 0, used = 0, c.currentMaxAddrsPerENI = 14, c.maxAddrsPerENI = 14
2018-12-11T14:53:32Z [DEBUG] Start increasing IP Pool size
2018-12-11T14:53:32Z [INFO] createENI: use primary interface's config, [0xc428700 0xc4204d8730], subnet-d358a5
2018-12-11T14:53:32Z [ERROR] Failed to CreateNetworkInterface InsufficientFreeAddressesInSubnet: The specified subnet does not have enough free addresses to satisfy the request.
	status code: 400, request id: 59abf6c-b54a9a-0ea02639904a
2018-12-11T14:53:32Z [ERROR] Failed to increase pool size due to not able to allocate ENI allocate eni: failed to create eni: failed to create network interface: InsufficientFreeAddressesInSubnet: The specified subnet does not have enough free addresses to satisfy the request.
	status code: 400, request id: 59abfa6c-b1-8a9a-0ea02639904a

Aha: InsufficientFreeAddressesInSubnet.
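To check how close a subnet is to exhaustion, the AWS CLI exposes the free-address count; the subnet ID below is a placeholder:

$ aws ec2 describe-subnets --subnet-ids subnet-0123456789abcdef0 \
    --query 'Subnets[0].AvailableIpAddressCount'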

We run a few CronJobs every minute - this burns through IPs :(

@stafot

stafot commented Dec 11, 2018

Following up on @tomfotherby's comment, it seems that we had a significant number of ENIs that were not in use but kept IPs attached, so our EKS workers' subnets were starved.

@vpm-bradleyhession
Author

vpm-bradleyhession commented Dec 11, 2018

Some more information for context - the Kubernetes scheduler is not aware when a node has hit its ENI / address limits. For example:

Instance with 18 Addresses -> 18 Pods assigned -> 0 Free addresses -> 75% CPU Requested ->
75% Memory requested

In this scenario we get pods that have < 25% CPU OR Memory trying to schedule on this node when there are no addresses left.

The Kubernetes scheduler will always try to assign a pod to this node (since the scheduler is not aware IP Limits are a thing).

@hjimene

hjimene commented Dec 12, 2018

@vpm-bradleyhession Hello, I was wondering if you ever got the cni-metrics-helper to push metrics to CloudWatch? It seems that we are both using kops for our Kubernetes cluster, so we might have a similar setup. I deployed the metrics helper to the cluster and created the IAM policy for CloudWatch; I also attached that policy to the master/worker roles for my cluster. However, I am still seeing this error 'Unable to publish cloudwatch metrics: NoCredentialProviders: no valid providers in chain. Deprecated.' in the cni-metrics-helper pod.

@harshal-shah

Does anyone here have a reliable solution for these issues? We run into them repeatedly, and our workaround so far is to find nodes with more than X ENIs and cordon them. Eventually the cluster autoscaler takes care of removing the cordoned instances.

@RTodorov

This is causing some real issues for us. All of our clusters are affected by this as the scheduler gets busier throughout the day. Can we acknowledge this as a bug?

@liwenwu-amazon do you have any suggestions on how to mitigate the problem? It might not affect the majority of use cases, but when you're relying 100% on k8s + aws-cni and it happens, the disturbance to the cluster's stability is quite big. For us, it's happening literally every day now.

@garo

garo commented Jan 11, 2019

Our solution is to clean up terminated pods; in our case, every terminated pod older than one hour is deleted. This seems to prevent the problem from manifesting.

@vpm-bradleyhession
Author

Yeah, we have set successfulJobsHistoryLimit (and failedJobsHistoryLimit, too) to 1 on most of our jobs and it seems to mitigate the issue for us; however, this still needs to be acknowledged with an upstream fix.
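For example, a minimal patch on an existing CronJob (the CronJob name here is hypothetical):

$ kubectl patch cronjob my-cron \
    -p '{"spec":{"successfulJobsHistoryLimit":1,"failedJobsHistoryLimit":1}}'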

@harshal-shah

@garo Could you please explain "clean terminated pods"? This could be a good solution for us as well.

@harshal-shah

To be clear again, our problem is that the node does not have many pods if you do a describe node or even run ifconfig on the host:

Non-terminated Pods:         (7 in total)
  Namespace                  Name                                                      CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                  ----                                                      ------------  ----------  ---------------  -------------  ---
  default                    some-service-http-59dbc5678-t7rpl                  50m (1%)      0 (0%)      100Mi (0%)       0 (0%)         21d
  kube-system                aws-node-b6xpw                                            10m (0%)      0 (0%)      0 (0%)           0 (0%)         29d
  kube-system                kube-proxy-ip-172-23-113-91.eu-west-1.compute.internal    100m (2%)     0 (0%)      0 (0%)           0 (0%)         29d
  logging                    kube-gelf-d4ztv                                           0 (0%)        0 (0%)      0 (0%)           0 (0%)         29d
  monitoring                 kube-prometheus-prometheus-node-exporter-v8kqz            0 (0%)        0 (0%)      0 (0%)           0 (0%)         29d
  tracing                    jaeger-agent-daemonset-zmsqb                              0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d23h
  utility                    kiam-agent-l88bs                                          0 (0%)        0 (0%)      0 (0%)           0 (0%)         29d

But looking at the node from the AWS console:

[AWS console screenshot: the node's ENIs still have many secondary private IP addresses attached]

So how can we clean up these secondary IPs?

@garo

garo commented Jan 11, 2019

@harshal-shah I run https://github.com/hjacobs/kube-job-cleaner as a slightly modified version which just does the cleanup again in an infinite loop every five minutes.

If you SSH into your affected worker node and type "docker ps -a", you should see a large number of Exited containers, and these seem to be what is holding the CNI plugin back from releasing the IPs. It seems that deleting terminated pods will eventually clean these away and thus provide a workaround.

@harshal-shah

I checked on the node mentioned above. There were a few dead containers, but even after deleting those containers, the IPs are not released.

@vpm-bradleyhession
Author

vpm-bradleyhession commented Jan 16, 2019

Update: We're now setting --max-pods on the kubelet so that it matches the number of ENIs (and addresses) available. We are setting this per node group (using kops).

For kops specifically -

spec:
  kubelet:
    maxPods: <number>
https://gist.github.com/vpm-bradleyhession/b43a61a94e724f040dbf822a1a206f0d

This seems like the only mitigation that is holding strong for us at the moment.

@RTodorov

Hello @mogren, could you comment on this issue please? We're affected on a daily basis; it would be nice to have an official recommendation from AWS here. Thank you.

@mogren
Contributor

mogren commented Jan 16, 2019

Hi @RTodorov!

What version of the plugin are you using? Is this cluster running on EKS? We have a known issue with the CNI that happens if ipamd is killed while it is in the middle of adding an ENI (e.g. after having created a new ENI but before attaching it); in that case the ENI might not be cleaned up.

Could you run /opt/cni/bin/aws-cni-support.sh (comment out line 34 if you get the error reported in #285) and send it to mogren@amazon.com?

@harshal-shah

Hello @mogren
We are using the following image 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:1.3.0

The cluster is created via kops (version 1.11.0)

I have sent you the output of /opt/cni/bin/aws-cni-support.sh over email.
The node being checked here has 31 non-terminated pods.
We have also set our max-pods limit to 55.

@mogren
Contributor

mogren commented Jan 18, 2019

Thanks @harshal-shah, I discussed this with Liwen and basically it comes down to an optimization trade-off in the CNI plugin itself.

The way the CNI works is that it allocates IPs in bulk and adds new ENIs when needed. It un-assigns an IP when the pod gets deleted, but keeps it allocated to the ENI as long as there are pods active on that ENI. (Check the eni.output file for details.) When a new pod gets scheduled to the worker, it will randomly pick any existing IP address that is not assigned and allocate it to the new pod.

As soon as an ENI has no assigned IPs, meaning all pods using that ENI have been deleted, the ENI will be released and all the IPs with it. The reason for this behavior is both to make scheduling new pods a lot faster (no need to call another service) and to prevent throttling from calling EC2 too much. (Throttling can still happen when scheduling a lot of pods quickly on new workers without configuring WARM_IP_TARGET.)

If you run out of IP addresses in your Subnet, one solution is to use CNI custom network config and create separate subnets for the pods. Would that work for you?
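For reference, a rough sketch of what custom networking looks like, assuming the ENIConfig CRD that ships with the CNI is installed; the AZ name, subnet and security group IDs below are placeholders (see the custom network config docs for the full setup):

$ kubectl set env ds aws-node -n kube-system AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true
$ cat <<EOF | kubectl apply -f -
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: us-west-2a                    # one ENIConfig per availability zone
spec:
  subnet: subnet-0123456789abcdef0    # dedicated pod subnet (placeholder)
  securityGroups:
    - sg-0123456789abcdef0            # placeholder
EOF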

@harshal-shah

Hi @mogren
We ran into this issue today as well.
I'm sending you further details over email.

One thing that we see in ipamd logs is

2019-01-21T09:33:19Z [DEBUG] UnAssignIPv4Address: IP address pool stats: total:56, assigned 27, pod(Name: masked-service-projector-6ffc8b489c-pcm2d, Namespace: default, Container 403ceda1c2b1665fa89745f36755d0621c2813584e409036389488420f1860e3)
2019-01-21T09:33:19Z [WARN] UnassignIPv4Address: Failed to find pod masked-service-projector-6ffc8b489c-pcm2d namespace default Container 403ceda1c2b1665fa89745f36755d0621c2813584e409036389488420f1860e3
2019-01-21T09:33:19Z [DEBUG] UnAssignIPv4Address: IP address pool stats: total:56, assigned 27, pod(Name: masked-service-projector-6ffc8b489c-pcm2d, Namespace: default, Container )
2019-01-21T09:33:19Z [WARN] UnassignIPv4Address: Failed to find pod masked-service-projector-6ffc8b489c-pcm2d namespace default Container
2019-01-21T09:33:19Z [INFO] Send DelNetworkReply: IPv4Addr , DeviceNumber: 0, err: datastore: unknown pod
2019-01-21T09:33:19Z [INFO] Received DelNetwork for IP <nil>, Pod masked2-fragment-9f98b6ddf-92vsj, Namespace default, Container 1a159e21d3e7063f477f05d6cd6837c08edd6f7f772f3ded635d97cfe1c7d5f2
2019-01-21T09:33:19Z [DEBUG] UnAssignIPv4Address: IP address pool stats: total:56, assigned 27, pod(Name: masked2-fragment-9f98b6ddf-92vsj, Namespace: default, Container 1a159e21d3e7063f477f05d6cd6837c08edd6f7f772f3ded635d97cfe1c7d5f2)

@harshal-shah

Hi @mogren, we faced this issue again today on a very new node of our cluster. It had fewer than the prescribed maximum of 55 pods. Following is the event trail for a pod that was not getting an IP address for a few minutes before it started working fine.

Events:
  Type     Reason                  Age                From                                                  Message
  ----     ------                  ----               ----                                                  -------
  Normal   Scheduled               11m                default-scheduler                                     Successfully assigned default/masked-68c4cf5845-62ngd to ip-172-23-69-148.eu-west-1.compute.internal
  Warning  FailedCreatePodSandBox  10m                kubelet, ip-172-23-69-148.eu-west-1.compute.internal  Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "160c971186b055c2cf3dacc3d76765123f9b97c5e91998d1e086306146888583" network for pod "masked-68c4cf5845-62ngd": NetworkPlugin cni failed to set up pod "masked-68c4cf5845-62ngd_default" network: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused", failed to clean up sandbox container "160c971186b055c2cf3dacc3d76765123f9b97c5e91998d1e086306146888583" network for pod "masked-68c4cf5845-62ngd": NetworkPlugin cni failed to teardown pod "masked-68c4cf5845-62ngd_default" network: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"]
  Warning  FailedCreatePodSandBox  10m                kubelet, ip-172-23-69-148.eu-west-1.compute.internal  Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "40bb89f933f46ac5a08267bc091e3572e137adeaf721d221b4bac55c86b98c66" network for pod "masked-68c4cf5845-62ngd": NetworkPlugin cni failed to set up pod "masked-68c4cf5845-62ngd_default" network: add cmd: failed to assign an IP address to container
  Warning  FailedCreatePodSandBox  10m                kubelet, ip-172-23-69-148.eu-west-1.compute.internal  Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "ace11f03d5f908014d45b75efd2fe4105f208f42ef3d84e6cd97d7648b62990a" network for pod "masked-68c4cf5845-62ngd": NetworkPlugin cni failed to set up pod "masked-68c4cf5845-62ngd_default" network: add cmd: failed to assign an IP address to container
  Normal   SandboxChanged          10m (x3 over 10m)  kubelet, ip-172-23-69-148.eu-west-1.compute.internal  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulling                 10m                kubelet, ip-172-23-69-148.eu-west-1.compute.internal  pulling image "quay.io/hellofresh/state-machine:0.86.0"
  Normal   Pulled                  8m16s              kubelet, ip-172-23-69-148.eu-west-1.compute.internal  Successfully pulled image "quay.io/hellofresh/state-machine:0.86.0"
  Normal   Created                 8m15s              kubelet, ip-172-23-69-148.eu-west-1.compute.internal  Created container
  Normal   Started                 8m14s              kubelet, ip-172-23-69-148.eu-west-1.compute.internal  Started container

I am also sending the tarball with logs to you separately. Hope this helps.

@mogren
Contributor

mogren commented Jan 31, 2019

@harshal-shah Thanks for reporting this. We have gone through the logs and found some minor issues, but have not managed to determine the root cause yet. The previous logs from the 22nd had a lot more data, and I'll keep going through them to see if we can figure out why ipamd gets restarted.

@paksv

paksv commented Feb 8, 2019

Facing the same issue. In my case I just followed the tutorial and created a VPC, a cluster, and 1 worker node. Now I cannot create any pods :(

Here's what I get from kubectl get events:

7s 7s 1 my-podd-1.158172f02248703c Pod Normal Scheduled default-scheduler Successfully assigned default/my-podd-1 to ip-192-168-145-72.eu-west-1.compute.internal

6s 6s 1 my-podd-1.158172f04749da90 Pod Warning FailedCreatePodSandBox kubelet, ip-192-168-145-72.eu-west-1.compute.internal Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "4e42f734b8575d5c2554f55eef1cc215727809913ace8073564a3a4fc34cfd14" network for pod "my-podd-1": NetworkPlugin cni failed to set up pod "my-podd-1_default" network: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused", failed to clean up sandbox container "4e42f734b8575d5c2554f55eef1cc215727809913ace8073564a3a4fc34cfd14" network for pod "my-podd-1": NetworkPlugin cni failed to teardown pod "my-podd-1_default" network: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"]

6s 6s 1 my-podd-1.158172f078d9e780 Pod Normal SandboxChanged kubelet, ip-192-168-145-72.eu-west-1.compute.internal Pod sandbox changed, it will be killed and re-created.

@nithu0115
Contributor

nithu0115 commented Feb 11, 2019

@paksv - I encountered the same issue.

It looks like my pods (including the aws-node CNI pod) aren't able to communicate with the API server via the cross-account ENIs.

Upon checking the ingress rules of the security groups associated with the cluster, I found
the rule allowing 443 from the worker nodes' security group was missing. I manually added an ingress rule to allow 443 from the worker nodes on my cluster security group, which solved my issue.
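A sketch of that change with the AWS CLI; both security group IDs below are placeholders (the cluster security group and the worker nodes' security group):

# sg-0aaa... = cluster security group, sg-0bbb... = worker node security group
$ aws ec2 authorize-security-group-ingress \
    --group-id sg-0aaaaaaaaaaaaaaaa \
    --protocol tcp --port 443 \
    --source-group sg-0bbbbbbbbbbbbbbbb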

@tabern added this to the v1.4 milestone on Mar 1, 2019
@tabern added the priority/P1 ("Must be staffed and worked currently or soon. Is a candidate for next release") label on Mar 1, 2019
@jw-maynard

Running into the same issue here in a deployment. If I delete the pod the next one created usually starts up fine.

@mogren
Contributor

mogren commented Jun 5, 2019

Resolving since v1.5.0 is released. This version contains a lot of changes to how we allocate ENIs, and we have stopped force-detaching them.

If using too many ENIs is an issue, consider setting the WARM_IP_TARGET environment variable.
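For example, a minimal way to set it on the daemonset (the value 5 is only an illustration; tune it to your pod churn):

$ kubectl set env ds aws-node -n kube-system WARM_IP_TARGET=5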
