Retry Loop never resolving Race Condition #2112

Open
jnovack opened this issue Apr 22, 2025 · 0 comments

jnovack commented Apr 22, 2025

Describe the bug
If the server container starts before the redis container (read: before the DNS entry for the redis container is registered with the DNS server), the server container never resolves the IP address of the redis container. It retries the lookup in an infinite loop, and the DNS entry is never resolved.

To Reproduce
Steps to reproduce the behavior:

  1. Run docker stack deploy -c checkmate.yml checkmate within Docker Swarm.
  2. See the error in the server container logs.

This occurs naturally because Docker Swarm ignores depends_on from docker-compose, so containers do not start in the order you intend.

Expected behavior
I expect the server container to shut itself down after a predefined number of failed attempts to contact either the redis or mongo endpoints.

If 5 backed-off retries (immediately, 0.5 sec, 1 sec, 2 sec, 4 sec) all fail, the container should exit with an error. It is then the responsibility of the orchestration engine (swarm, kubernetes, etc.) to respawn the container, which gives the other containers plenty of time to become functional and registered.
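
The retry schedule described above could be sketched roughly as follows. This is only an illustration, not the project's actual code: `connectWithRetry`, `startOrExit`, and `BACKOFF_MS` are hypothetical names, and `connect` stands in for whatever async call contacts redis or mongo. It also assumes the listed delays are waited before each of the five attempts.

```javascript
// Delay before each attempt, in milliseconds (assumed reading of the
// proposed schedule: immediately, 0.5 s, 1 s, 2 s, 4 s).
const BACKOFF_MS = [0, 500, 1000, 2000, 4000];

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: `connect` is any async function that throws on
// failure, e.g. a DNS lookup or a redis/mongo connection attempt.
// Resolves with the first successful result, or rejects with the last error
// once every attempt in `delays` has failed.
async function connectWithRetry(connect, delays = BACKOFF_MS, wait = sleep) {
  let lastError;
  for (const [attempt, delay] of delays.entries()) {
    await wait(delay);
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      console.error(
        `attempt ${attempt + 1}/${delays.length} failed: ${err.message}`
      );
    }
  }
  throw lastError;
}

// Hypothetical startup wrapper: after the final failure, exit non-zero so
// the orchestrator (swarm, kubernetes, etc.) can respawn the container.
async function startOrExit(connect, exit = (code) => process.exit(code)) {
  try {
    return await connectWithRetry(connect);
  } catch (err) {
    console.error(`giving up: ${err.message}`);
    exit(1);
  }
}
```

With the default schedule, a container that cannot reach its dependency would exit after roughly 7.5 seconds of waiting plus the time spent in the failed attempts, rather than looping forever. The `wait` and `exit` parameters exist only so the behavior can be exercised without real delays or a real process exit.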

Screenshots

checkmate_server.1.fa0rb3gwhxyj@swarm-master-04    | Error: getaddrinfo ENOTFOUND redis
checkmate_server.1.fa0rb3gwhxyj@swarm-master-04    |     at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:109:26) {
checkmate_server.1.fa0rb3gwhxyj@swarm-master-04    |   errno: -3008,
checkmate_server.1.fa0rb3gwhxyj@swarm-master-04    |   code: 'ENOTFOUND',
checkmate_server.1.fa0rb3gwhxyj@swarm-master-04    |   syscall: 'getaddrinfo',
checkmate_server.1.fa0rb3gwhxyj@swarm-master-04    |   hostname: 'redis'
checkmate_server.1.fa0rb3gwhxyj@swarm-master-04    | }

Workaround

  1. docker service scale checkmate_server=0 && docker service scale checkmate_server=1
ajhollid added a commit that referenced this issue May 6, 2025