Preemption fixes and improvements #2264
… severinson/improved-preemption
… severinson/gang-preemption
… severinson/preemption-fixes
@@ -189,59 +188,6 @@ func TestDeleteQueuedJob(t *testing.T) {
	})
}

func TestDeleteJobShouldSetJobObjectToExpire(t *testing.T) {
Do we rely on being able to return leases for deleted jobs?
Not conceptually, but possibly inadvertently: I can think of edge cases where it could happen. However, as long as the server treats it as a no-op, it should be fine.
Done
Codecov Report
Patch coverage:
Additional details and impacted files
@@            Coverage Diff            @@
##           master    #2264    +/-   ##
==========================================
+ Coverage   57.41%   57.44%   +0.02%
==========================================
  Files         223      223
  Lines       27913    27946     +33
==========================================
+ Hits        16027    16053     +26
- Misses      10642    10649      +7
  Partials     1244     1244
@@ -259,24 +205,6 @@ func TestDeleteWithSomeMissingJobs(t *testing.T) {
	})
}

func TestReturnLeaseForDeletedJobShouldKeepJobDeleted(t *testing.T) {
Does this test still pass with your changes?
I imagine it was added because that wasn't the case previously.
Now that returning a lease for a deleted job is a no-op, this passes.
internal/armada/server/lease.go
Outdated
}
}
}
if jobs, err := q.jobRepository.GetExistingJobsByIds(jobIdsToDelete); err != nil {
Is this section actually needed?
No. I've removed it.
This PR changes how resources are accounted for in lease.go. We currently send the resources allocated per queue separately. With this PR, the server instead aggregates resource usage per queue from the jobs the executor reports as existing in the cluster. This is necessary to keep resource accounting correct now that we send all jobs to the executor with the same priority class; without this change, all resource usage would be reported at the armada-default priority class (the class all jobs are now marked with when sent to the executor).
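To illustrate the accounting change described above, here is a minimal sketch of per-queue aggregation done on the server side. It is not the code in lease.go: `ReportedJob`, `Usage`, and `AggregateUsageByQueue` are hypothetical names, and the sketch assumes the server can look up each reported job's queue, real priority class, and resource requests.

```go
package main

import "fmt"

// ReportedJob is a hypothetical, simplified view of a job that the executor
// reports as existing in its cluster, joined with what the server knows about it.
type ReportedJob struct {
	Queue         string
	PriorityClass string           // priority class known to the server (assumption)
	Resources     map[string]int64 // resource name -> quantity, units assumed
}

// Usage maps priority class -> resource name -> total quantity.
type Usage map[string]map[string]int64

// AggregateUsageByQueue sums resources per queue and priority class, so usage
// is not all attributed to a single default class.
func AggregateUsageByQueue(jobs []ReportedJob) map[string]Usage {
	usageByQueue := make(map[string]Usage)
	for _, job := range jobs {
		usage, ok := usageByQueue[job.Queue]
		if !ok {
			usage = make(Usage)
			usageByQueue[job.Queue] = usage
		}
		byResource, ok := usage[job.PriorityClass]
		if !ok {
			byResource = make(map[string]int64)
			usage[job.PriorityClass] = byResource
		}
		for name, quantity := range job.Resources {
			byResource[name] += quantity
		}
	}
	return usageByQueue
}

func main() {
	jobs := []ReportedJob{
		{Queue: "A", PriorityClass: "high", Resources: map[string]int64{"cpu": 1000}},
		{Queue: "A", PriorityClass: "high", Resources: map[string]int64{"cpu": 500}},
		{Queue: "B", PriorityClass: "low", Resources: map[string]int64{"cpu": 250}},
	}
	usage := AggregateUsageByQueue(jobs)
	fmt.Println(usage["A"]["high"]["cpu"], usage["B"]["low"]["cpu"])
}
```

Keying the totals by the priority class the server holds, rather than the one the executor sees, is what keeps the per-class totals meaningful in this sketch.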
Removes the job deletion grace period. Jobs are now deleted immediately when delete is called, instead of having an expiry set on the key. This is necessary to ensure preempted jobs don't factor into the resources used by a queue.
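The behaviour discussed in the review threads (immediate deletion, and returning a lease for a deleted job being a no-op) can be sketched with an in-memory store. This is only an illustration under stated assumptions: `jobStore` and its methods are hypothetical and do not reflect the actual job repository API.

```go
package main

import "fmt"

// jobStore is a hypothetical in-memory stand-in for the job repository.
type jobStore struct {
	queueByJob map[string]string // job id -> queue
	leased     map[string]bool   // job id -> whether the job is currently leased
}

func newJobStore() *jobStore {
	return &jobStore{
		queueByJob: make(map[string]string),
		leased:     make(map[string]bool),
	}
}

// AddLeased stores a job and marks it as leased to an executor.
func (s *jobStore) AddLeased(id, queue string) {
	s.queueByJob[id] = queue
	s.leased[id] = true
}

// Delete removes the job immediately; no expiry or grace period is kept.
func (s *jobStore) Delete(id string) {
	delete(s.queueByJob, id)
	delete(s.leased, id)
}

// ReturnLease releases the lease on a job. For a job that no longer exists it
// is a no-op and reports false, so a deleted job is never recreated.
func (s *jobStore) ReturnLease(id string) bool {
	if _, ok := s.queueByJob[id]; !ok {
		return false
	}
	s.leased[id] = false
	return true
}

// Exists reports whether the job is still stored.
func (s *jobStore) Exists(id string) bool {
	_, ok := s.queueByJob[id]
	return ok
}

func main() {
	store := newJobStore()
	store.AddLeased("job-1", "A")
	store.Delete("job-1")
	returned := store.ReturnLease("job-1")
	fmt.Println(returned, store.Exists("job-1"))
}
```

Because a deleted job disappears from the store at once in this sketch, it can no longer be counted towards its queue's usage, which is the motivation given for dropping the grace period.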