Closed
Labels
bug: Something isn't working
Description
Steps to reproduce
Start a run with a short utilization_policy.time_window.
> cat .dstack.yml
type: dev-environment
ide: vscode
utilization_policy:
  min_gpu_utilization: 50
  time_window: 10s
resources:
  gpu: 1..
> dstack apply -y
Wait until time_window elapses.
Actual behaviour
The run keeps running.
Expected behaviour
Either the run is terminated, or dstack rejects a time_window that is too short or too long to work properly.
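The second option could look roughly like the sketch below. It is only an illustration: `validate_time_window`, `ASSUMED_COLLECTION_INTERVAL`, and `ASSUMED_METRICS_RETENTION` are hypothetical names, and both numeric values are placeholders rather than dstack's real settings. The lower bound assumes the check needs at least two samples inside the window.

```python
from datetime import timedelta

# Placeholder values for illustration only; NOT taken from dstack.
ASSUMED_COLLECTION_INTERVAL = timedelta(seconds=10)
ASSUMED_METRICS_RETENTION = timedelta(hours=1)


def validate_time_window(
    window: timedelta,
    interval: timedelta = ASSUMED_COLLECTION_INTERVAL,
    retention: timedelta = ASSUMED_METRICS_RETENTION,
) -> timedelta:
    """Reject windows that cannot hold two samples or outlive stored metrics."""
    min_window = 2 * interval
    if window < min_window:
        raise ValueError(
            f"time_window {window} is too short; need at least {min_window}"
        )
    if window > retention:
        raise ValueError(
            f"time_window {window} is too long; metrics are kept for {retention}"
        )
    return window
```

With these placeholder bounds, the 10s window from the reproduction would be rejected when the configuration is applied instead of being silently ignored at run time.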
dstack version
master
Server logs
[23:27:01] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
INFO dstack._internal.server.background.tasks.process_runs:336 run(10058a)warm-badger-2: run status has changed PROVISIONING -> RUNNING
[23:27:04] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
[23:27:09] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
[23:27:14] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
[23:27:19] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
[23:27:24] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
[23:27:28] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
[23:27:33] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
[23:27:38] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
[23:27:44] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
[23:27:49] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
[23:27:55] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
[23:28:01] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
[23:28:06] DEBUG dstack._internal.server.background.tasks.process_running_jobs:694 job(0c5465)warm-badger-2-0-0: GPU utilization check: not enough samples
Additional information
If time_window is comparable to the metrics collection interval, there will never be 2 metric points within the window, so the check keeps reporting "not enough samples" and the run is never terminated.
A short time_window does not make sense for real workloads, but it is likely to be used when testing utilization_policy.
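The effect can be reproduced with a small simulation. This is a sketch under assumptions, not dstack's actual code: it assumes metrics are collected every 10 seconds and that a sample counts when its timestamp lies in the half-open range (now - window, now]. The helper names `samples_in_window` and `max_samples_seen` are made up for the example.

```python
from datetime import datetime, timedelta

# Assumed collection interval; the real value in dstack may differ.
COLLECTION_INTERVAL = timedelta(seconds=10)
T0 = datetime(2025, 1, 1, 23, 27, 0)

# Metric samples at 0 s, 10 s, ..., 60 s after T0.
SAMPLES = [T0 + i * COLLECTION_INTERVAL for i in range(7)]


def samples_in_window(sample_times, now, window):
    """Return the samples whose timestamps fall in (now - window, now]."""
    start = now - window
    return [t for t in sample_times if start < t <= now]


def max_samples_seen(window):
    """Run a check once per second from 20 s to 59 s after T0 and return
    the largest number of samples any single check finds in its window."""
    checks = [T0 + timedelta(seconds=20 + i) for i in range(40)]
    return max(len(samples_in_window(SAMPLES, now, window)) for now in checks)


print(max_samples_seen(timedelta(seconds=10)))  # 1
print(max_samples_seen(timedelta(seconds=30)))  # 3
```

Under these assumptions a 10s window never holds more than one sample, no matter when the check runs, while a 30s window always holds enough to compare.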