csi: running allocations for the same job should block claims
When the scheduler checks feasibility for CSI volumes, the check is
fairly loose: earlier versions of the same job are not counted as
active claims. This allows the scheduler to place new allocations
for the new version of a job, under the assumption that we'll replace
the existing allocations and their volume claims.

But when the alloc runner claims the volume, we need to enforce the
active claims even if they're for allocations of an earlier version of
the job. Otherwise we'll try to mount a volume that's currently being
unmounted, and this will cause replacement allocations to frequently
fail.

This commit correctly enforces maximum volume claims for writers.
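
The difference between the old and new claim-time checks can be sketched in isolation. This is a minimal illustration, not Nomad's actual code: the `alloc` and `volume` types, the `maxWriters` field, `errMaxClaims`, and the `claimWriteLoose`/`claimWriteStrict` names are hypothetical stand-ins for `structs.Allocation`, `structs.CSIVolume`, `ErrCSIVolumeMaxClaims`, and `claimWrite`. It also assumes a single-writer volume, whereas the real `HasFreeWriteClaims` depends on the volume's access mode.

```go
package main

import (
	"errors"
	"fmt"
)

// errMaxClaims is a hypothetical stand-in for ErrCSIVolumeMaxClaims.
var errMaxClaims = errors.New("volume max claims reached")

// alloc is a simplified, hypothetical stand-in for structs.Allocation.
type alloc struct {
	ID, Namespace, JobID string
}

// volume is a simplified, hypothetical stand-in for structs.CSIVolume.
// maxWriters is an assumption made for this sketch only.
type volume struct {
	maxWriters  int
	writeAllocs map[string]*alloc
}

func (v *volume) hasFreeWriteClaims() bool {
	return len(v.writeAllocs) < v.maxWriters
}

// claimWriteLoose mirrors the removed behavior: when the volume is full,
// only writers from a different namespace or job block the claim.
func (v *volume) claimWriteLoose(a *alloc) error {
	if !v.hasFreeWriteClaims() {
		for _, w := range v.writeAllocs {
			if w != nil && (w.Namespace != a.Namespace || w.JobID != a.JobID) {
				return errMaxClaims
			}
		}
	}
	v.writeAllocs[a.ID] = a
	return nil
}

// claimWriteStrict mirrors the new behavior: a full volume always
// rejects the claim, regardless of which job holds the existing claims.
func (v *volume) claimWriteStrict(a *alloc) error {
	if !v.hasFreeWriteClaims() {
		return errMaxClaims
	}
	v.writeAllocs[a.ID] = a
	return nil
}

// newVolume returns a single-writer volume already claimed by an
// allocation from an earlier version of job "web".
func newVolume() *volume {
	old := &alloc{ID: "alloc-v1", Namespace: "default", JobID: "web"}
	return &volume{maxWriters: 1, writeAllocs: map[string]*alloc{old.ID: old}}
}

func main() {
	replacement := &alloc{ID: "alloc-v2", Namespace: "default", JobID: "web"}
	fmt.Println("loose: ", newVolume().claimWriteLoose(replacement))
	fmt.Println("strict:", newVolume().claimWriteStrict(replacement))
}
```

In this sketch the loose variant accepts the replacement allocation while the earlier allocation still holds its write claim. The strict variant returns the max-claims error until that claim is released.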
tgross committed Feb 23, 2022
1 parent 3aedd9b commit 3128710
Showing 2 changed files with 6 additions and 9 deletions.
8 changes: 5 additions & 3 deletions nomad/csi_endpoint_test.go
@@ -243,9 +243,11 @@ func TestCSIVolumeEndpoint_Claim(t *testing.T) {
     // Create an initial volume claim request; we expect it to fail
     // because there's no such volume yet.
     claimReq := &structs.CSIVolumeClaimRequest{
-        VolumeID:     id0,
-        AllocationID: alloc.ID,
-        Claim:        structs.CSIVolumeClaimWrite,
+        VolumeID:       id0,
+        AllocationID:   alloc.ID,
+        Claim:          structs.CSIVolumeClaimWrite,
+        AccessMode:     structs.CSIVolumeAccessModeMultiNodeSingleWriter,
+        AttachmentMode: structs.CSIVolumeAttachmentModeFilesystem,
         WriteRequest: structs.WriteRequest{
             Region:    "global",
             Namespace: structs.DefaultNamespace,
7 changes: 1 addition & 6 deletions nomad/structs/csi.go
@@ -554,12 +554,7 @@ func (v *CSIVolume) claimWrite(claim *CSIVolumeClaim, alloc *Allocation) error {
     }

     if !v.HasFreeWriteClaims() {
-        // Check the blocking allocations to see if they belong to this job
-        for _, a := range v.WriteAllocs {
-            if a != nil && (a.Namespace != alloc.Namespace || a.JobID != alloc.JobID) {
-                return ErrCSIVolumeMaxClaims
-            }
-        }
+        return ErrCSIVolumeMaxClaims
     }

     // Allocations are copy on write, so we want to keep the id but don't need the