Fix Verda spot offers marked unavailable due to on-demand-only availability check #3928
Merged
…bility check
VerdaCompute._get_offers_with_availability queried instance availability without the is_spot parameter (returning on-demand inventory only) and keyed the availability map by (instance_name, region), ignoring the spot dimension. As a result, spot offers (e.g. B200 spot) inherited on-demand availability and were marked NOT_AVAILABLE whenever the on-demand variant was unavailable. Such offers are then dropped during provisioning, which requests offers with exclude_not_available=True, so they were never provisioned.
Query both spot and on-demand availability and key the map by (instance_name, region, is_spot) so each offer is matched against the correct inventory.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Collaborator
It will be released with 0.20.23, likely later this week.
What this does
Fixes spot offers on the verda backend being marked NOT_AVAILABLE and never provisioned.
VerdaCompute._get_offers_with_availability queried instance availability without the is_spot parameter (so the Verda API returned on-demand inventory only) and keyed the availability map by (instance_name, region), ignoring the spot dimension. As a result, spot offers (e.g. B200 spot) inherited on-demand availability and were marked NOT_AVAILABLE whenever the on-demand variant was unavailable. Provisioning requests offers with exclude_not_available=True, so these offers were dropped and never provisioned.
The fix queries both spot and on-demand availability and keys the map by (instance_name, region, is_spot), so each offer is matched against the correct inventory.
Reproduction
See linked issue. With B200 available as spot but not on-demand, dstack offer --backend verda --gpu B200 --spot showed B200 spot as "not available", and provisioning found no offers.
Tests
Added TestGetOffersWithAvailability in tests/.../verda/test_compute.py covering: is_spot=True and is_spot=False availability are queried.
These tests fail against the previous code and pass with the fix.
ruff check / ruff format clean.
Closes #3927
This PR was written primarily by Claude Code (with maintainer review of the diagnosis and fix).
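For readers who want to see the keying change from "What this does" in isolation, here is a minimal, self-contained sketch. It is not the code from this PR: `Offer`, `build_available_keys`, `mark_offers`, the `fetch` callable, the `"AVAILABLE"` status string, and the instance and region names are all hypothetical stand-ins. The only details taken from the description are the `(instance_name, region, is_spot)` key, the `NOT_AVAILABLE` status, and the fact that availability must be queried once per `is_spot` value.

```python
from typing import Callable, Iterable, List, NamedTuple, Set, Tuple


class Offer(NamedTuple):
    # Hypothetical stand-in for a backend offer; not the real dstack model.
    instance_name: str
    region: str
    is_spot: bool


# The key includes is_spot so spot and on-demand inventory never collide.
AvailabilityKey = Tuple[str, str, bool]


def build_available_keys(
    fetch: Callable[[bool], Iterable[Tuple[str, str]]],
) -> Set[AvailabilityKey]:
    """Query availability once per is_spot value and record 3-part keys.

    `fetch(is_spot)` is an assumed callable that yields
    (instance_name, region) pairs for the requested pricing mode.
    """
    available: Set[AvailabilityKey] = set()
    for is_spot in (False, True):
        for instance_name, region in fetch(is_spot):
            available.add((instance_name, region, is_spot))
    return available


def mark_offers(
    offers: Iterable[Offer], available: Set[AvailabilityKey]
) -> List[Tuple[Offer, str]]:
    """Pair each offer with a status looked up in its own inventory."""
    result = []
    for offer in offers:
        key = (offer.instance_name, offer.region, offer.is_spot)
        # Plain strings are used here for illustration only.
        status = "AVAILABLE" if key in available else "NOT_AVAILABLE"
        result.append((offer, status))
    return result


# Illustrative inventory: the instance exists as spot only.
inventory = {True: [("b200-example", "region-a")], False: []}
available = build_available_keys(lambda is_spot: inventory[is_spot])
offers = [
    Offer("b200-example", "region-a", is_spot=True),
    Offer("b200-example", "region-a", is_spot=False),
]
marked = mark_offers(offers, available)
```

With a two-part `(instance_name, region)` key built from on-demand data only, both offers above would resolve to the same entry and the spot offer would be reported as `NOT_AVAILABLE`. With the three-part key, the spot offer is matched against spot inventory and the on-demand offer against on-demand inventory.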