
K3S returning 504 to some pods and not others #7135

Closed
HoloPanio opened this issue Mar 22, 2023 · 0 comments

Comments

@HoloPanio

Environmental Info:
K3s Version: v1.25.7+k3s1

Node(s) CPU architecture, OS, and Version:

  • Linux us-mky01 5.14.0-162.6.1.el9_1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov 18 02:06:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • Linux us-mky02 5.14.0-162.6.1.el9_1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov 18 02:06:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration: One cluster with one external node. Both servers run Netmaker, but the Netmaker instance itself is hosted on a separate machine, not on either of these servers.

Describe the bug:
When I start up my k3s cluster, everything works normally for the rest of the day, but by the next day I begin getting gateway timeout errors. Sometimes it only times out for pods running on the external node and returns a broken webpage from the node on the cluster server; other times it returns a 504 Gateway Timeout for everything.

I can work around it each day by restarting the k3s service once (systemctl restart k3s).

Steps To Reproduce:

  • Installed K3s: $ curl -sfL https://get.k3s.io | sh -
  • Added the K3s external node using the Netmaker IP schema (roughly as sketched below)
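
For reference, the external node was joined roughly like this (a sketch; the server address, token, and Netmaker interface/IP values below are placeholders, not the exact command I ran):

$ curl -sfL https://get.k3s.io | \
    K3S_URL=https://<server-netmaker-ip>:6443 \
    K3S_TOKEN=<contents of /var/lib/rancher/k3s/server/node-token> \
    sh -s - agent \
      --node-ip <this-node-netmaker-ip> \
      --flannel-iface <netmaker-interface>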

Cluster (Redacted public ip address):
[screenshot]

External Node:
[screenshot]

Ingress config (the same for every service, just with a different service name):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: duxcore-api-ingress
  annotations:
    ingress.kubernetes.io/ssl-redirect: "false"
spec:
  tls:
    - secretName: traefik-cert
  rules:
    - host: api.duxcore.co
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: duxcore-api
                port:
                  number: 3000

API deployment YAML:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: duxcore-api
spec:
  selector:
    matchLabels:
      app: duxcore-api
  replicas: 8
  template:
    metadata:
      labels:
        app: duxcore-api
    spec:
      containers:
        - name: duxcore-api
          image: ghcr.io/duxcore/api:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 3000
      imagePullSecrets:
        - name: github-container-registry

Website deployment YAML (fronted by the same ingress config as the API, just with a different service name):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: duxcore-website
spec:
  selector:
    matchLabels:
      app: duxcore-website
  replicas: 4
  template:
    metadata:
      labels:
        app: duxcore-website
    spec:
      containers:
        - name: duxcore-website
          image: ghcr.io/duxcore/website:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 3000
      imagePullSecrets:
        - name: github-container-registry
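
Since the timeouts seem tied to pods scheduled on the external node, these are the rough checks I plan to run the next time it happens (the pod IP below is a placeholder):

$ kubectl get pods -o wide -l app=duxcore-api        # see which node each replica landed on
$ kubectl get endpoints duxcore-api duxcore-website  # confirm the services still list endpoints
$ curl -m 5 http://<pod-ip-on-external-node>:3000/   # from the cluster server, hit a pod on the external node directly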

Expected behavior:

I expected the k3s ingress to keep resolving and load balancing requests to the backend pods correctly.

Actual behavior:

It works as expected for a period of time, then at some point stops routing correctly; a single restart of the k3s service fixes it again.

Additional context / logs:

My first suspicion was DNS, because I was getting a warning about having too many DNS servers configured, but after resolving that the problem came back. I'm not sure what the root of the issue is, so I don't know where to pull logs from (see the commands below for what I can collect); the web server's own logs look normal.
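
A sketch of where I could pull logs from once it reoccurs (the Traefik deployment name assumes the default K3s packaging; corrections welcome if these are the wrong places to look):

$ journalctl -u k3s --since "1 hour ago"                      # k3s service log on the server
$ kubectl -n kube-system logs deploy/traefik --tail=200       # bundled Traefik ingress controller
$ kubectl -n kube-system logs -l k8s-app=kube-dns --tail=200  # CoreDNS, given the earlier DNS warning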

@HoloPanio HoloPanio changed the title K3S returning 504 to some pods and not others (sometimes) K3S returning 504 to some pods and not others Mar 22, 2023
@k3s-io k3s-io locked and limited conversation to collaborators Mar 22, 2023
@brandond brandond converted this issue into discussion #7139 Mar 22, 2023
