
[Bug]: The fleet idle_duration property is not respected on run apply #3106

@peterschmidt85

Description

Steps to reproduce

  1. Define an elastic fleet (e.g. nodes set to 0..2) with idle_duration set to a non-default value, e.g. 1h
  2. Create the fleet
  3. Run something on this fleet and see that an instance is provisioned
  4. Wait for 5 min after the instance becomes idle

Actual behaviour

  1. The fleet instance is terminated after 5 min of idle time

Expected behaviour

  1. The instance should be terminated only after the specified idle_duration has elapsed (e.g. the 1h that was set)

dstack version

0.19.29

Server logs

[11:12:43] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.009657s. Status: 200
           DEBUG    dstack._internal.server.services.runner.client:437 shim version: 5648 (latest) (API v2)
           DEBUG    dstack._internal.server.services.runner.client:437 shim version: 5648 (latest) (API v2)
           DEBUG    dstack._internal.server.services.runner.client:437 shim version: 5648 (latest) (API v2)
           DEBUG    dstack._internal.server.background.tasks.process_instances:753 Instance default-fleet-3 check: reachable=True health_status=HEALTHY message=None
           DEBUG    dstack._internal.server.background.tasks.process_instances:753 Instance default-fleet-0 check: reachable=True health_status=HEALTHY message=None
           DEBUG    dstack._internal.server.background.tasks.process_instances:753 Instance default-fleet-2 check: reachable=True health_status=HEALTHY message=None
[11:12:45] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.007610s. Status: 200
[11:12:47] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.008801s. Status: 200
[11:12:49] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.010839s. Status: 200
           DEBUG    dstack._internal.server.services.runner.client:437 shim version: 5648 (latest) (API v2)
           DEBUG    dstack._internal.server.background.tasks.process_instances:753 Instance default-fleet-1 check: reachable=True health_status=HEALTHY message=None
[11:12:51] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.005884s. Status: 200
[11:12:53] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.005520s. Status: 200
[11:12:55] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.007050s. Status: 200
[11:12:57] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.007643s. Status: 200
           INFO     dstack._internal.server.background.tasks.process_instances:244 Instance default-fleet-3 idle duration expired: idle time 304s. Terminating
           INFO     dstack._internal.server.background.tasks.process_instances:244 Instance default-fleet-0 idle duration expired: idle time 303s. Terminating
[11:12:58] DEBUG    dstack._internal.server.services.runner.client:437 shim version: 5648 (latest) (API v2)
           DEBUG    dstack._internal.server.background.tasks.process_instances:753 Instance default-fleet-2 check: reachable=True health_status=HEALTHY message=None
[11:12:59] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.005763s. Status: 200
[11:13:01] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.006631s. Status: 200
           INFO     dstack._internal.server.background.tasks.process_instances:244 Instance default-fleet-1 idle duration expired: idle time 308s. Terminating
[11:13:03] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.008135s. Status: 200
[11:13:05] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.008310s. Status: 200
[11:13:07] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.005859s. Status: 200
[11:13:09] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.005901s. Status: 200
[11:13:11] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.008436s. Status: 200
[11:13:12] DEBUG    dstack._internal.server.background.tasks.process_instances:931 Terminating runner instance 66.201.7.150
           INFO     dstack._internal.server.background.tasks.process_instances:244 Instance default-fleet-2 idle duration expired: idle time 313s. Terminating
           DEBUG    dstack._internal.server.background.tasks.process_instances:931 Terminating runner instance 89.169.113.5
           DEBUG    dstack._internal.server.background.tasks.process_instances:931 Terminating runner instance 89.169.103.248
[11:13:13] DEBUG    dstack._internal.server.background.tasks.process_instances:945 Instance default-fleet-1 termination in progress: Requested instance deletion. Will wait for deletion before deleting
                    the boot disk. Instance state was: RUNNING
           DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.007218s. Status: 200
           DEBUG    dstack._internal.server.background.tasks.process_instances:945 Instance default-fleet-0 termination in progress: Requested instance deletion. Will wait for deletion before deleting
                    the boot disk. Instance state was: RUNNING
           DEBUG    dstack._internal.server.background.tasks.process_instances:945 Instance default-fleet-3 termination in progress: Requested instance deletion. Will wait for deletion before deleting
                    the boot disk. Instance state was: RUNNING
[11:13:15] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.004047s. Status: 200
[11:13:17] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.006713s. Status: 200
[11:13:19] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.009223s. Status: 200
[11:13:21] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.005080s. Status: 200
[11:13:23] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.005599s. Status: 200
[11:13:25] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.006019s. Status: 200
[11:13:27] DEBUG    dstack._internal.server.background.tasks.process_instances:931 Terminating runner instance 195.242.10.107
           DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.006627s. Status: 200
           DEBUG    dstack._internal.server.background.tasks.process_instances:945 Instance default-fleet-2 termination in progress: Requested instance deletion. Will wait for deletion before deleting
                    the boot disk. Instance state was: RUNNING
[11:13:29] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.006346s. Status: 200
[11:13:31] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.007675s. Status: 200
[11:13:33] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.005842s. Status: 200
[11:13:35] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.007353s. Status: 200
[11:13:37] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.008929s. Status: 200
[11:13:39] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.009682s. Status: 200
[11:13:41] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.007881s. Status: 200
[11:13:43] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.006099s. Status: 200
[11:13:45] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.006654s. Status: 200
[11:13:47] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.008366s. Status: 200
[11:13:49] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.005361s. Status: 200
[11:13:51] DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.003927s. Status: 200
[11:13:53] DEBUG    dstack._internal.server.background.tasks.process_instances:931 Terminating runner instance 89.169.103.248
           DEBUG    dstack._internal.server.background.tasks.process_instances:931 Terminating runner instance 89.169.113.5
           DEBUG    dstack._internal.server.background.tasks.process_instances:931 Terminating runner instance 66.201.7.150
           DEBUG    dstack._internal.server.background.tasks.process_instances:945 Instance default-fleet-1 termination in progress: Waiting for instance deletion before deleting the boot disk.
                    Instance state: DELETING
           DEBUG    dstack._internal.server.app:259 Processed request POST http://127.0.0.1:3000/api/project/main/fleets/list in 0.005925s. Status: 200
           INFO     dstack._internal.server.background.tasks.process_instances:969 Instance default-fleet-0 terminated
           INFO     dstack._internal.server.background.tasks.process_instances:969 Instance default-fleet-3 terminated
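To make the mismatch in the logs above easier to see, here is a minimal sketch that pulls the idle times out of the "idle duration expired" messages and compares them with the configured 1h. The regex, the `idle_times` helper, and the `CONFIGURED_IDLE_DURATION` constant are illustrative and not part of dstack; the log lines are copied from the excerpt above with the timestamp column dropped.

```python
import re

# Matches the termination message printed by process_instances in the excerpt above.
IDLE_EXPIRED = re.compile(
    r"Instance (?P<name>\S+) idle duration expired: idle time (?P<seconds>\d+)s"
)

# Lines copied from the server logs (timestamp column omitted).
LOG_EXCERPT = """\
INFO     dstack._internal.server.background.tasks.process_instances:244 Instance default-fleet-3 idle duration expired: idle time 304s. Terminating
INFO     dstack._internal.server.background.tasks.process_instances:244 Instance default-fleet-0 idle duration expired: idle time 303s. Terminating
INFO     dstack._internal.server.background.tasks.process_instances:244 Instance default-fleet-1 idle duration expired: idle time 308s. Terminating
INFO     dstack._internal.server.background.tasks.process_instances:244 Instance default-fleet-2 idle duration expired: idle time 313s. Terminating
"""

# The 1h from the fleet configuration, in seconds (illustrative constant).
CONFIGURED_IDLE_DURATION = 3600


def idle_times(log_text: str) -> dict[str, int]:
    """Map instance name to the idle seconds reported when termination started."""
    return {
        m.group("name"): int(m.group("seconds"))
        for m in IDLE_EXPIRED.finditer(log_text)
    }


times = idle_times(LOG_EXCERPT)
for name, seconds in sorted(times.items()):
    print(f"{name}: terminated after {seconds}s idle, configured {CONFIGURED_IDLE_DURATION}s")
```

All four instances are terminated a few seconds past the 300s mark, far short of the configured 3600s.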

Additional information

In the database, the idle_duration is set to 3600:

select * from fleets where name = "my-idle-cluster";
                   id = 1df5c786461e465aa41d0fc0522d5838
                 name = my-idle-cluster
           project_id = b37112f856a84887b0df193462f98dea
           created_at = 2025-09-16 09:31:05.210792
    last_processed_at = 2025-09-16 09:31:05.210796
              deleted = 0
           deleted_at = 
               status = ACTIVE
       status_message = 
                 spec = {"configuration":{"type":"fleet","name":"my-idle-cluster","env":{},"ssh_config":null,"nodes":{"min":0,"max":2},"placement":null,"reservation":null,"resources":{"cpu":{"min":2,"max":null},"memory":{"min":8.0,"max":null},"shm_size":null,"gpu":null,"disk":{"size":{"min":100.0,"max":null}}},"blocks":1,"backends":["nebius"],"regions":null,"availability_zones":null,"instance_types":null,"spot_policy":null,"retry":null,"max_price":null,"idle_duration":3600,"tags":null},"configuration_path":".local/fleet.dstack.yml","profile":{"backends":null,"regions":null,"availability_zones":null,"instance_types":null,"reservation":null,"spot_policy":"auto","retry":null,"max_duration":null,"stop_duration":null,"max_price":1.0,"creation_policy":null,"idle_duration":null,"utilization_policy":null,"startup_order":null,"stop_criteria":null,"schedule":null,"fleets":null,"tags":null,"name":"my-profile","default":true},"autocreated":false}
consolidation_attempt = 0
 last_consolidated_at = 
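The spec column above carries two idle_duration values: 3600 under configuration and null under profile. The following is a rough sketch of the resolution order I would expect, not dstack's actual code. `effective_idle_duration` is a hypothetical helper, the precedence (fleet configuration, then profile, then a default) is an assumption, and the 300s fallback is only inferred from the ~5 min seen in the logs. The JSON is a trimmed copy of the spec with unrelated fields removed.

```python
import json
from typing import Optional

# Trimmed copy of the `spec` column; only the fields relevant to idle_duration are kept.
SPEC_JSON = """
{
  "configuration": {
    "type": "fleet",
    "name": "my-idle-cluster",
    "nodes": {"min": 0, "max": 2},
    "idle_duration": 3600
  },
  "profile": {"name": "my-profile", "idle_duration": null}
}
"""

# Assumption: fallback inferred from the ~300s idle times in the server logs.
ASSUMED_DEFAULT_IDLE_SECONDS = 300


def effective_idle_duration(spec: dict, default: int = ASSUMED_DEFAULT_IDLE_SECONDS) -> int:
    """Hypothetical resolver: fleet configuration first, then profile, then the default."""
    for section in ("configuration", "profile"):
        value: Optional[int] = spec.get(section, {}).get("idle_duration")
        if value is not None:
            return value
    return default


spec = json.loads(SPEC_JSON)
print(effective_idle_duration(spec))  # 3600
```

With this order the fleet's 3600 would win. The observed ~300s terminations would be consistent with only the profile value (null here) being consulted before falling back to a default.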

Apparently, the idle_duration property is not used. See


Labels

bug, fleets
