-
Notifications
You must be signed in to change notification settings - Fork 356
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: check non-multiples of slots per pod for kubernetes rm [MD-403] #9393
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #9393 +/- ##
==========================================
+ Coverage 48.99% 49.02% +0.02%
==========================================
Files 1235 1235
Lines 160191 160199 +8
Branches 2780 2780
==========================================
+ Hits 78482 78533 +51
+ Misses 81534 81491 -43
Partials 175 175
Flags with carried forward coverage won't be shown. Click here to find out more.
|
✅ Deploy Preview for determined-ui canceled.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm modulo style feedback
@@ -736,6 +736,55 @@ func TestROCmPodsService(t *testing.T) { | |||
} | |||
} | |||
|
|||
func TestValidateResources(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
@@ -269,6 +269,10 @@ func (k *kubernetesResourcePool) ValidateResources( | |||
k.reschedule = true | |||
|
|||
fulfillable := k.maxSlotsPerPod >= msg.Slots | |||
|
|||
if !msg.IsSingleNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it'd be nice to have this return specific errors rather than the caller trying to describe what could possibly go wrong in this function.
and then assignResources could reuse the error defs, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i thought about this but didn't want to get too carried away with refactors. anyway, good call, done.
12e5668
to
835a281
Compare
835a281
to
969377b
Compare
e0ec4e5
to
3280d1c
Compare
Ticket
Error out earlier for Kubernetes clusters when slots requested are not a multiple of
max_slots_per_pod
Description
Test Plan
On a kubernetes cluster with
max_slots_per_pod
configured as > 1, submit an experiment withslots_per_trial
as a non-multiple ofmax_slots_per_pod
(i.e.slots_per_trial=2
,max_slots_per_pod=3
). experiment should error out upon creation.Checklist
docs/release-notes/
.See Release Note for details.