Restarted workers in kong pod use stale config if database is unreachable #14373

Open
@genev450

Is there an existing issue for this?

  • I have searched the existing issues

Kong version ($ kong version)

Kong 3.9.0

Current Behavior

After changing the Kong configuration (services or routes), everything works fine. However, if the workers are re-created while the database is unavailable, the new workers start with an outdated configuration.
This may be related to #9090.

Expected Behavior

The cache should be refreshed on every configuration change, so that new workers can load the current settings even when the database is unavailable.

Steps To Reproduce

Prepare:

  • Start a fresh Kong with Docker Compose
    KONG_DATABASE=postgres docker compose --profile database up -d

  • Configure a service and a route

For example, I have this service:

{
  "data": [
    {
      "created_at": 1742394714,
      "updated_at": 1742394714,
      "protocol": "https",
      "host": "httpbin.org",
      "id": "a8450bb5-9547-454d-b5fc-6afa21a7051f",
      "write_timeout": 60000,
      "retries": 5,
      "name": "http_service",
      "tags": null,
      "ca_certificates": null,
      "tls_verify_depth": null,
      "read_timeout": 60000,
      "client_certificate": null,
      "enabled": true,
      "port": 443,
      "connect_timeout": 60000,
      "tls_verify": null,
      "path": "/get"
    }
  ],
  "next": null
}

and route:

{
  "data": [
    {
      "created_at": 1742394778,
      "updated_at": 1742394778,
      "protocols": [
        "http",
        "https"
      ],
      "methods": null,
      "paths": [
        "/old-route"
      ],
      "hosts": null,
      "id": "bceadacc-5fa1-4ea9-b3e7-46ee4ecdc080",
      "https_redirect_status_code": 426,
      "snis": null,
      "name": "get",
      "strip_path": true,
      "tags": [],
      "service": {
        "id": "a8450bb5-9547-454d-b5fc-6afa21a7051f"
      },
      "path_handling": "v0",
      "regex_priority": 0,
      "headers": null,
      "response_buffering": true,
      "sources": null,
      "preserve_host": false,
      "request_buffering": true,
      "destinations": null
    }
  ],
  "next": null
}

  • Fully stop kong
    docker-compose down
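
The issue does not show how the service and route were created. A minimal sketch of equivalent Admin API calls follows, assuming the Admin API is reachable on its default port 8001 on localhost (the `ADMIN_URL` variable and the function names are hypothetical helpers, not part of the original report). These calls would be run while Kong is still up, before the `docker-compose down` step.

```shell
# Assumption: the Admin API listens on its default port 8001 on localhost.
ADMIN_URL="${ADMIN_URL:-http://localhost:8001}"

# Create the service from the listing above. The "url" shorthand sets
# protocol, host, port and path in one go (https, httpbin.org, 443, /get).
create_service() {
  curl -s -X POST "$ADMIN_URL/services" \
    --data "name=http_service" \
    --data "url=https://httpbin.org/get"
}

# Attach a route named "get" with the path /old-route to that service.
create_route() {
  curl -s -X POST "$ADMIN_URL/services/http_service/routes" \
    --data "name=get" \
    --data "paths[]=/old-route"
}
```

Calling `create_service` and then `create_route` should yield entities similar to the JSON shown above; the remaining fields are left at their defaults.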

Reproducing:

  1. Start Kong again. This time Kong starts with a non-empty configuration.

It works, everything is fine:

curl localhost:8000/old-route

{
  "args": {},
  "headers": {
    "Accept": "*/*",
    "Host": "httpbin.org",
    "User-Agent": "curl/7.81.0",
    "X-Amzn-Trace-Id": "Root=1-67dad5cb-6e2de9f104d8a7f31616c199",
    "X-Forwarded-Host": "localhost",
    "X-Forwarded-Path": "/old-route",
    "X-Forwarded-Prefix": "/old-route",
    "X-Kong-Request-Id": "25a0364788fda37edb21870c84f6da0a"
  },
  "url": "https://localhost/old-route"
}

  2. Change the service or route. For example, change the route path from /old-route to /new-route.

curl -s localhost:8000/old-route

{
  "message": "no Route matched with those values",
  "request_id": "a571a878d06f01710d1c7d2b69d00d7d"
}

curl -s localhost:8000/new-route

{
  "args": {},
  "headers": {
    "Accept": "*/*",
    "Host": "httpbin.org",
    "User-Agent": "curl/7.81.0",
    "X-Amzn-Trace-Id": "Root=1-67dada99-13bb8a4e72ac63990d33cb7e",
    "X-Forwarded-Host": "localhost",
    "X-Forwarded-Path": "/new-route",
    "X-Forwarded-Prefix": "/new-route",
    "X-Kong-Request-Id": "674e7064b795d9590d464b5db4ee0d47"
  },
  "url": "https://localhost/old-route"
}
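
The route change in step 2 is described but not shown. One way it could be done is a PATCH against the Admin API, sketched below under the assumption that the Admin API is on localhost:8001 (`ADMIN_URL` and `set_route_path` are hypothetical names introduced for illustration).

```shell
# Assumption: the Admin API listens on its default port 8001 on localhost.
ADMIN_URL="${ADMIN_URL:-http://localhost:8001}"

# Replace the paths of an existing route.
#   $1: route name or id
#   $2: new path
set_route_path() {
  curl -s -X PATCH "$ADMIN_URL/routes/$1" --data "paths[]=$2"
}
```

For the route in this report, the call would be `set_route_path get /new-route`.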

  3. Stop the database.

docker-compose stop db

  4. Kill the workers.

ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
kong           1  0.0  0.2 383456 45132 ?        Ss   14:30   0:00 nginx: master process /usr/local/openresty/nginx/sbin/nginx -p /var/run/kong -c nginx.conf
kong        1407  0.4  0.4 421232 78088 ?        S    14:30   0:03 nginx: worker process
kong        1408  0.4  0.4 418124 72376 ?        S    14:30   0:03 nginx: worker process

kill 1407 1408

ps aux
kong           1  0.0  0.2 383456 45132 ?        Ss   14:30   0:00 nginx: master process /usr/local/openresty/nginx/sbin/nginx -p /var/run/kong -c nginx.conf
kong        1482  6.3  0.4 417080 70408 ?        S    14:54   0:00 nginx: worker process
kong        1483  4.7  0.4 417144 70152 ?        S    14:54   0:00 nginx: worker process
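
The worker PIDs above were copied by hand from the `ps aux` output. A small sketch that extracts them instead is shown below; `worker_pids` is a hypothetical helper, and it assumes the `ps aux` column layout shown above, with the PID in the second column.

```shell
# Print the PIDs of nginx worker processes from `ps aux` output on stdin.
# The [s] bracket keeps the regex from matching awk's own command line,
# which would otherwise show up in the ps listing.
worker_pids() {
  awk '/nginx: worker proces[s]/ { print $2 }'
}
```

Inside the Kong container this could be used as `ps aux | worker_pids | xargs kill`, after which the master process spawns new workers as in the listing above.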

  5. The changes from step 2 are now lost: the new route no longer matches, while the old route and service work again.

Trying new route
curl -s localhost:8000/new-route

{
  "message": "no Route matched with those values",
  "request_id": "427d1166a8fcab2f445e6b03e8771758"
}

Trying old route

curl -s localhost:8000/old-route

{
  "args": {},
  "headers": {
    "Accept": "*/*",
    "Host": "httpbin.org",
    "User-Agent": "curl/7.81.0",
    "X-Amzn-Trace-Id": "Root=1-67daddbf-4311ea5f1645e78d439efdb2",
    "X-Forwarded-Host": "localhost",
    "X-Forwarded-Path": "/old-route",
    "X-Forwarded-Prefix": "/old-route",
    "X-Kong-Request-Id": "654cfc35eea9aa7bd96e2a76aae86c6c"
  },
  "url": "https://localhost/old-route"
}

kong logs:
2025/03/19 15:07:43 [alert] 1482#0: *3505 [lua] init.lua:1152: rewrite(): unsafe request processing due to earlier initialization errors; this node must be restarted (failed to build the router: could not load routes: [postgres] temporary failure in name resolution)
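
To spot an affected node without sending test requests, the logs can be searched for the alert above. A minimal sketch follows; `has_init_error` is a hypothetical helper, and the message text is taken verbatim from the log line in this report.

```shell
# Succeed if the log text on stdin contains the alert that Kong emits when
# a worker serves requests after a failed initialization.
has_init_error() {
  grep -q 'unsafe request processing due to earlier initialization errors'
}
```

Assuming the Compose service is named `kong` (not confirmed in this report), a check could look like `docker compose logs kong 2>&1 | has_init_error && echo "node needs a restart"`.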

  6. Even after the database becomes available again, nothing changes until the workers are killed again.

It seems that when Kong starts, it saves the configuration from the database to the cache and never updates it afterwards. If the database is unavailable, new workers then load the stale cached configuration.

Anything else?

No response
