
K3S returning 504 to some pods and not others #7135

Closed
HoloPanio opened this issue Mar 22, 2023 · 0 comments

Comments

@HoloPanio

Environmental Info:
K3s Version: v1.25.7+k3s1

Node(s) CPU architecture, OS, and Version:

  • Linux us-mky01 5.14.0-162.6.1.el9_1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov 18 02:06:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • Linux us-mky02 5.14.0-162.6.1.el9_1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov 18 02:06:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration: One cluster with one external node. Both servers run Netmaker, but the Netmaker instance itself is hosted on a separate machine, not on either of these servers.

Describe the bug:
When I start up my k3s cluster, everything works normally for the rest of the day, but by the next day I begin getting gateway timeout errors. Sometimes it only times out for pods running on the external node and returns a broken webpage from the node on the cluster server; other times it returns a 504 Gateway Timeout for everything.

I can work around it each day by restarting the k3s service once (systemctl restart k3s).

Steps To Reproduce:

  • Installed K3s: $ curl -sfL https://get.k3s.io | sh -
  • Added the K3s external node using the Netmaker IP schema (roughly as sketched below)
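
For reference, the external node was joined roughly like this (a sketch; the server address, token, and Netmaker interface/IP values below are placeholders, not the exact command I ran):

$ curl -sfL https://get.k3s.io | \
    K3S_URL=https://<server-netmaker-ip>:6443 \
    K3S_TOKEN=<contents of /var/lib/rancher/k3s/server/node-token> \
    sh -s - agent \
      --node-ip <this-node-netmaker-ip> \
      --flannel-iface <netmaker-interface>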

Cluster (Redacted public ip address):
[screenshot]

External Node:
[screenshot]

Ingress config (the same for every service, just with a different service name):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: duxcore-api-ingress
  annotations:
    ingress.kubernetes.io/ssl-redirect: "false"
spec:
  tls:
    - secretName: traefik-cert
  rules:
    - host: api.duxcore.co
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: duxcore-api
                port:
                  number: 3000

API deployment YAML:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: duxcore-api
spec:
  selector:
    matchLabels:
      app: duxcore-api
  replicas: 8
  template:
    metadata:
      labels:
        app: duxcore-api
    spec:
      containers:
        - name: duxcore-api
          image: ghcr.io/duxcore/api:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 3000
      imagePullSecrets:
        - name: github-container-registry

Website deployment YAML (fronted by the same ingress config as the API, just with a different service name):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: duxcore-website
spec:
  selector:
    matchLabels:
      app: duxcore-website
  replicas: 4
  template:
    metadata:
      labels:
        app: duxcore-website
    spec:
      containers:
        - name: duxcore-website
          image: ghcr.io/duxcore/website:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 3000
      imagePullSecrets:
        - name: github-container-registry
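
Since the timeouts seem tied to pods scheduled on the external node, these are the rough checks I plan to run the next time it happens (the pod IP below is a placeholder):

$ kubectl get pods -o wide -l app=duxcore-api        # see which node each replica landed on
$ kubectl get endpoints duxcore-api duxcore-website  # confirm the services still list endpoints
$ curl -m 5 http://<pod-ip-on-external-node>:3000/   # from the cluster server, hit a pod on the external node directly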

Expected behavior:

I expected the k3s ingress to keep resolving and load balancing requests to the backend pods correctly.

Actual behavior:

It works as expected for a period of time, then at some point stops routing correctly; a single restart of the k3s service fixes it again.

Additional context / logs:

My first suspicion was DNS, because I was getting a warning about having too many DNS servers configured, but after resolving that the problem came back. I'm not sure what the root of the issue is, so I don't know where to pull logs from (see the commands below for what I can collect); the web server's own logs look normal.
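
A sketch of where I could pull logs from once it reoccurs (the Traefik deployment name assumes the default K3s packaging; corrections welcome if these are the wrong places to look):

$ journalctl -u k3s --since "1 hour ago"                      # k3s service log on the server
$ kubectl -n kube-system logs deploy/traefik --tail=200       # bundled Traefik ingress controller
$ kubectl -n kube-system logs -l k8s-app=kube-dns --tail=200  # CoreDNS, given the earlier DNS warning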

@HoloPanio HoloPanio changed the title K3S returning 504 to some pods and not others (sometimes) K3S returning 504 to some pods and not others Mar 22, 2023
@k3s-io k3s-io locked and limited conversation to collaborators Mar 22, 2023
@brandond brandond converted this issue into discussion #7139 Mar 22, 2023
