Reboot and network race conditions - "Status: Exited Exited (255)" #3236

@MrTengil

Description

To Reproduce

  1. Spin up Dokploy on a server.
  2. Create a Docker Compose deployment containing several services.
  3. Attach at least one of the services to the external dokploy-network.
  4. Deploy it and make sure it is up and running.
  5. Force a shutdown of the server (not a graceful one).
  6. Boot the server back up.
  7. Check for any applications that failed to start: docker ps -a -f "status=exited"
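To narrow step 7 down to the containers that match this report, the listing can be filtered by exit code as well. This is only a sketch: the function name list_exit_255 is made up for illustration, and it assumes the docker CLI is on the PATH.

```shell
# Hypothetical helper: list containers that exited with code 255.
# Prints one line per container: ID, name, status.
list_exit_255() {
  docker ps -a \
    --filter "status=exited" \
    --filter "exited=255" \
    --format '{{.ID}} {{.Names}} {{.Status}}'
}
```

Calling list_exit_255 right after boot should show only the containers affected by the failure described here, if there are any.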

Current vs. Expected behavior

Expectations: All of my deployed containers spin up again without issue after the server is forcefully shut down and booted back up.
Reality: Some of my containers (the ones connected to dokploy-network) sometimes exit with status code 255.
Even Dokploy's own services fail and exit with 255, but new ones are spun up.

Exited:

  • Exited (255) 23 minutes ago 6379/tcp dokploy-redis.1.<id_redis>
  • Exited (255) 23 minutes ago 0.0.0.0:3000->3000/tcp, [::]:3000->3000/tcp dokploy.1.<id_dokploy>
  • Exited (255) 23 minutes ago 0.0.0.0:3000->3000/tcp, [::]:3000->3000/tcp dokploy-postgres.1.<id_postgres>

Up:

  • Up 23 minutes 6379/tcp dokploy-redis.1.<id_redis_up>
  • Up 23 minutes 0.0.0.0:3000->3000/tcp, [::]:3000->3000/tcp dokploy.1.<id_dokploy_up>
  • Up 23 minutes 5432/tcp dokploy-postgres.1.<id_postgres_up>

If I SSH into the server and restart the exited containers with docker restart $(docker ps -q -a -f "status=exited") they spin up just fine.
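As a stopgap, the manual restart could be scripted so that it waits for the network first. The sketch below is not a fix for the underlying race and is not something Dokploy provides: the function name, the retry count and the delay are all assumptions chosen for illustration. It also restarts every container that exited with 255, just like the manual command, so it may need narrowing on hosts where old Swarm task containers should stay stopped.

```shell
# Hypothetical helper: wait for a Docker network, then restart the
# containers that exited with code 255.
# Arguments (all optional, illustrative defaults):
#   $1 network name, $2 number of attempts, $3 seconds between attempts
restart_after_network() {
  network="${1:-dokploy-network}"
  tries="${2:-30}"
  delay="${3:-2}"

  # Poll until the daemon knows about the network, or give up.
  while ! docker network inspect "$network" >/dev/null 2>&1; do
    tries=$((tries - 1))
    if [ "$tries" -le 0 ]; then
      echo "network $network not found, giving up" >&2
      return 1
    fi
    sleep "$delay"
  done

  # Same selection as the manual command, narrowed to exit code 255.
  ids="$(docker ps -q -a --filter "status=exited" --filter "exited=255")"
  if [ -z "$ids" ]; then
    echo "no exited containers to restart"
    return 0
  fi

  # Word splitting on $ids is intentional: one container ID per argument.
  docker restart $ids
}
```

One possible way to use it would be to call restart_after_network from a boot-time job that runs after the Docker service has started, which would at least remove the need to SSH in after every forced restart.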

Looking at the logs, sudo journalctl -u docker -n 100 --no-pager (which I chose not to show for privacy reasons) reports some fatal errors. I'm not sure what to make of them, but it looks like the network does not exist yet when Docker tries to start those services.

There are no logs (docker logs <container_id>) in any of the containers that fail to start.
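To check the suspicion about the missing network without sharing the full logs, something along these lines might help. Both function names are hypothetical, and the grep pattern is only a guess at what the relevant daemon messages contain.

```shell
# Hypothetical helper: report whether a Docker network currently exists.
network_status() {
  network="${1:-dokploy-network}"
  if docker network inspect "$network" >/dev/null 2>&1; then
    echo "$network: present"
  else
    echo "$network: missing"
  fi
}

# Hypothetical helper: show Docker daemon log lines from the current
# boot that mention networks (case-insensitive match).
docker_network_log_lines() {
  journalctl -u docker -b --no-pager | grep -i "network"
}
```

Running network_status right after boot and comparing the timestamps from docker_network_log_lines with the container exit times could confirm or rule out the ordering problem.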

This never happens if I stop the service and redeploy inside of docker.

I have tried the following:

  • restart: always
  • depends_on
  • Health checks

I really wish I didn't have to SSH into the server and restart the containers manually every time the server restarts. It does a forceful restart every time we reconfigure the resources on the server.

Provide environment information

Operating System:
  OS: Ubuntu 22.04
  Arch: AMD64
Dokploy version: 0.26.1
VPS Provider: Glesys
Deploying: Compose with some services.

Which area(s) are affected? (Select all that apply)

Docker, Traefik, Docker Compose, Swarm

Are you deploying the applications where Dokploy is installed or on a remote server?

Remote server

Additional context

For security reasons I cannot share any Docker Compose files or logs for now.

Will you send a PR to fix it?

Maybe, need help
