Kubernetes (K3S) pod gets "ENOTFOUND" after 5-20 hours of running time #107866

JanJannos · 2022-01-30T15:08:31Z

What happened?

I'm running my backend on Kubernetes, on around 250 pods across 15 deployments. The backend is written in Node.js.

Sometimes, after X hours (5 < X < 30), I get ENOTFOUND in one of the pods, as follows:

{
  "name": "main",
  "hostname": "entrypoint-sdk-54c8788caa-aa3cj",
  "pid": 19,
  "level": 50,
  "error": {
    "errno": -3008,
    "code": "ENOTFOUND",
    "syscall": "getaddrinfo",
    "hostname": "employees-service"
  },
  "msg": "Failed calling getEmployee",
  "time": "2022-01-28T13:44:36.549Z",
  "v": 0
}

I'm running a stress test against the backend at YY users per second. I keep this load steady and don't change it, and then the error appears out of nowhere, for no apparent reason.

Kubernetes is K3S Server Version: v1.22.3+k3s1

Any idea what might cause this ENOTFOUND?
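The log above shows the failure coming from the getaddrinfo path. One way to narrow it down might be to resolve the same name through both of Node's DNS paths from inside the affected pod and compare the outcomes. The following is only a sketch: `probeDns` is a hypothetical helper, not part of the backend. `dns.promises.resolve4` queries the configured nameservers directly and, as far as I know, does not apply resolv.conf search domains, so it may need a fully qualified service name where `lookup` accepts the short one.

```javascript
const dns = require('node:dns');

// Hypothetical diagnostic helper: resolve one hostname through both of
// Node's DNS paths and report each outcome instead of throwing.
// `resolvers` is injectable so the function can be exercised without a network.
async function probeDns(hostname, resolvers = dns.promises) {
  const attempt = async (path, fn) => {
    try {
      return { path, ok: true, result: await fn() };
    } catch (err) {
      // Keep only the error code, e.g. "ENOTFOUND" or "ETIMEOUT".
      return { path, ok: false, code: err.code };
    }
  };

  return Promise.all([
    // Same path as the failing call in the log (syscall: getaddrinfo).
    attempt('lookup (getaddrinfo)', () => resolvers.lookup(hostname)),
    // Direct query to the nameservers, bypassing getaddrinfo.
    attempt('resolve4 (direct query)', () => resolvers.resolve4(hostname)),
  ]);
}

// Example (run inside the affected pod):
//   probeDns('employees-service').then((r) => console.log(JSON.stringify(r)));
```

If only the `lookup` path fails while the direct query succeeds, that would point toward the resolver configuration inside the pod rather than the cluster DNS service itself, though this is a heuristic and not a definitive diagnosis.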

What did you expect to happen?

That the services would talk to each other without any ENOTFOUND.
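Until the root cause is known, a stopgap could be to retry calls that fail with ENOTFOUND. This is a minimal sketch under assumptions: `retryOnEnotfound`, its `retries` and `delayMs` options, and the `getEmployee` call in the usage comment are illustrative names, not the actual backend code. It masks the symptom and does not fix the underlying DNS problem.

```javascript
// Small promise-based delay helper.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: re-run `fn` when it rejects with code "ENOTFOUND",
// waiting delayMs, 2*delayMs, 4*delayMs, ... between attempts.
// Any other error, or ENOTFOUND after `retries` retries, is rethrown.
async function retryOnEnotfound(fn, { retries = 3, delayMs = 200 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (err.code !== 'ENOTFOUND' || attempt >= retries) throw err;
      console.warn(`ENOTFOUND for ${err.hostname}, retry ${attempt + 1}/${retries}`);
      await sleep(delayMs * 2 ** attempt);
    }
  }
}

// Example (getEmployee is a placeholder for the real service call):
//   const employee = await retryOnEnotfound(() => getEmployee(id));
```

Logging each retry keeps the failures visible, so the frequency of ENOTFOUND can still be tracked while the services keep talking to each other.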

How can we reproduce it (as minimally and precisely as possible)?

Run K3S on Bare Metal

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: v1.22.3+k3s1
Server Version: v1.22.3+k3s1

Cloud provider

Bare Metal

OS version

# On Linux:
$ cat /etc/os-release

ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"

$ uname -a
Linux dev1 5.4.0-77-generic #86-Ubuntu SMP Thu Jun 17 02:35:03 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux


Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)


k8s-ci-robot · 2022-01-30T15:08:37Z

@JanJannos: There are no sig labels on this issue. Please add an appropriate label by using one of the following commands:

/sig <group-name>
/wg <group-name>
/committee <group-name>

Please see the group list for a listing of the SIGs, working groups, and committees available.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot · 2022-01-30T15:08:38Z

@JanJannos: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.


aojea · 2022-01-30T17:12:52Z

This seems related to the K3S distribution; see external-secrets/kubernetes-external-secrets#860 (comment).
Please open the issue in their repository.
/close

k8s-ci-robot · 2022-01-30T17:13:12Z

@aojea: Closing this issue.

In response to this:

This seems related to the K3S distribution external-secrets/kubernetes-external-secrets#860 (comment)
Please , open the issue in their repository
/close


JanJannos added the kind/bug label (categorizes issue or PR as related to a bug) Jan 30, 2022

k8s-ci-robot added the needs-sig label (indicates an issue or PR lacks a `sig/foo` label and requires one) Jan 30, 2022

k8s-ci-robot added the needs-triage label (indicates an issue or PR lacks a `triage/foo` label and requires one) Jan 30, 2022

k8s-ci-robot closed this as completed Jan 30, 2022
