Closed
Labels
bug: Something isn't working
Description
Steps to reproduce
- Provision 2x MI300X 26x Xeon Platinum 8470 on Hot Aisle.
- Terminate it.
Actual behaviour
The instance is stuck in terminating for 15 minutes.
> dstack fleet
FLEET        INSTANCE  BACKEND                   RESOURCES                                      PRICE  STATUS       CREATED
funny-gecko  0         hotaisle (us-michigan-1)  cpu=26 mem=448GB disk=12288GB MI300X:192GB:2   $3.98  terminating  20 mins ago
The instance is then marked as terminated, although it actually keeps running on Hot Aisle.
The reason is that Hot Aisle VMs now have a minimum reservation period during which they cannot be deleted.
Expected behaviour
Keep the instance in terminating until we are able to delete it.
Also do our best to communicate this particularity to the user. Some possibilities include describing it in the docs, adding offer notes to the run plan, or asking for an additional confirmation in the CLI before terminating instances whose reservation period has not yet elapsed.
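The expected behaviour above could be sketched roughly as follows. This is only an illustration under assumptions, not dstack's actual code: the function names, the status strings, and `RETRY_INTERVAL` are hypothetical, and it assumes the minimum reservation length is known to the server (the issue does not say whether the Hot Aisle API exposes it).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# Assumed retry cadence once the reservation has elapsed (hypothetical value).
RETRY_INTERVAL = timedelta(minutes=1)


def reservation_ends_at(created_at: datetime, min_reservation: timedelta) -> datetime:
    """Earliest time at which the provider is expected to accept a delete."""
    return created_at + min_reservation


def needs_confirmation(
    created_at: datetime, min_reservation: timedelta, now: datetime
) -> bool:
    """True if the CLI should ask for an extra confirmation before terminating."""
    return now < reservation_ends_at(created_at, min_reservation)


def next_step(
    created_at: datetime,
    min_reservation: timedelta,
    now: datetime,
    delete_succeeded: bool,
) -> Tuple[str, Optional[datetime]]:
    """Return (status, retry_at) after one delete attempt.

    The instance is only marked "terminated" after a confirmed delete;
    otherwise it stays in "terminating" and a retry is scheduled.
    """
    if delete_succeeded:
        return "terminated", None
    ends_at = reservation_ends_at(created_at, min_reservation)
    # Do not retry before the reservation ends; afterwards, retry periodically.
    retry_at = max(ends_at, now + RETRY_INTERVAL)
    return "terminating", retry_at
```

The point of the sketch is that a failed delete never transitions the instance to terminated, so dstack's view cannot diverge from what is still running (and billed) on the provider.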
dstack version
0.19.34
Server logs
[17:17:41] ERROR dstack._internal.server.background.tasks.process_instances:962 Failed all attempts to terminate instance funny-gecko-0. Please
terminate the instance manually to avoid unexpected charges. Error: HTTPError('400 Client Error: Bad Request for url:
https://admin.hotaisle.app/api/teams/team-name/virtual_machines/enc1-gpuvm005/')
Traceback (most recent call last):
File "/dstack/src/dstack/_internal/server/background/tasks/process_instances.py", line 941, in _terminate
await run_async(
File "/dstack/src/dstack/_internal/utils/common.py", line 21, in run_async
return await asyncio.get_running_loop().run_in_executor(None, func_with_args)
File "/usr/lib64/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/dstack/src/dstack/_internal/core/backends/hotaisle/compute.py", line 126, in terminate_instance
self.api_client.terminate_virtual_machine(vm_name)
File "/dstack/src/dstack/_internal/core/backends/hotaisle/api_client.py", line 83, in terminate_virtual_machine
response.raise_for_status()
File "/dstack/venv/lib64/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url:
https://admin.hotaisle.app/api/teams/team-name/virtual_machines/enc1-gpuvm005/
Additional information
We expressed concerns to Hot Aisle regarding the limitations of minimum reservation periods. They will discuss it internally.