e24e66d to 0c3914e
register("volume", runVolume, `
usage: flynn volume
       flynn volume show [--json] <id>
       flynn volume decommission <id>
Should this command be scoped to the app by default?
controller/scheduler/host.go
Outdated
Type: VolumeEventTypeDestroy,
}
}
ch <- e
Should this select on h.stop as well?
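For illustration, a minimal sketch of what selecting on a stop channel during the send could look like. The `VolumeEvent` type and the `sendEvent` helper are hypothetical stand-ins rather than the code in this PR; the `stop` parameter plays the role of `h.stop`.

```go
package main

import "fmt"

// VolumeEvent is a hypothetical stand-in for the scheduler's volume event type.
type VolumeEvent struct {
	Type     string
	VolumeID string
}

// sendEvent delivers e on ch unless stop is closed first.
// It reports whether the event was actually delivered.
func sendEvent(ch chan<- *VolumeEvent, stop <-chan struct{}, e *VolumeEvent) bool {
	select {
	case ch <- e:
		return true
	case <-stop:
		// The receiver may be gone; give up instead of blocking forever.
		return false
	}
}

func main() {
	ch := make(chan *VolumeEvent) // unbuffered, with no receiver
	stop := make(chan struct{})
	close(stop) // simulate the host being stopped

	delivered := sendEvent(ch, stop, &VolumeEvent{Type: "destroy", VolumeID: "vol1"})
	fmt.Println("delivered:", delivered)
}
```

A bare `ch <- e` would block indefinitely here because nothing is receiving; the `select` lets the goroutine exit once the stop channel is closed.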
// and this volume doesn't exist on that host
if job.HostID != "" && vol.HostID != job.HostID {
continue
}
Is exclusivity enforced in flynn-host too? There can technically be two schedulers scheduling jobs at once under failure conditions.
@@ -1035,6 +1274,10 @@ func (s *Scheduler) HandleInternalStateRequest(req *InternalStateRequest) {
req.State.Formations[key.String()] = &f
}

for id, vol := range s.volumes {
req.State.Volumes[id] = &(*vol)
What is the reason for &(*vol)?
We need to copy it to create a "snapshot" of the scheduler's state to pass back to the caller.
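One thing worth checking against that intent: in Go, `&(*vol)` evaluates to the same pointer as `vol`, so it does not copy the struct by itself. The sketch below contrasts it with an explicit shallow copy. The `Volume` type and the `aliasOf` and `copyOf` helpers are hypothetical, simplified stand-ins for illustration, not the scheduler's actual types.

```go
package main

import "fmt"

// Volume is a hypothetical, simplified stand-in for the scheduler's volume type.
type Volume struct {
	ID     string
	HostID string
}

// aliasOf returns &(*v), which is the original pointer, not a copy.
func aliasOf(v *Volume) *Volume {
	return &(*v)
}

// copyOf returns a pointer to a shallow copy of *v.
func copyOf(v *Volume) *Volume {
	c := *v // the assignment copies the struct value
	return &c
}

func main() {
	vol := &Volume{ID: "vol1", HostID: "host1"}
	alias, snapshot := aliasOf(vol), copyOf(vol)

	// Mutate the original after taking both references.
	vol.HostID = "host2"

	fmt.Println(alias == vol, alias.HostID)       // same pointer, sees the mutation
	fmt.Println(snapshot == vol, snapshot.HostID) // distinct pointer, keeps the old value
}
```

If a snapshot is the goal, assigning the dereferenced value to a local variable and taking that variable's address is one way to get it. Note that this is still a shallow copy, so any maps, slices, or pointers inside the struct remain shared.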
5fc5d22 to e9fedf0
@titanous comments addressed.
e9fedf0 to db239f9
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Useful for the scheduler creating volumes which it needs to track. Signed-off-by: Lewis Marshall <lewis@lmars.net>
db239f9 to d714bca
This pull request is a continuation of the work already done on the persistent-volumes branch to implement persistent volumes as per #2438.
Summary:
- volumes and job_volumes tables in the controller database
- … Volume.JobID)
- flynn volume decommission ID, which causes the volume to not be attached to any new jobs
- … blocked state, waiting for either the host to come back up or the volume to be decommissioned. This means that if a host holding data for a process type is rebooted, the job stays down until the host comes back, rather than being restarted on a different host with an empty volume. If the host has really gone away, it is up to the operator to decommission the volume and unblock the job so that it can be scheduled on a different host.

Things which need to be considered but are not included in this PR:
- flynn volume backup and flynn volume restore, so that if a host is lost and volumes need to be decommissioned in order to move jobs to other hosts, operators can first restore volumes so that the unblocked job doesn't have to start with a completely empty volume
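
The blocked-state behaviour described in the summary can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Volume` struct, the `Placement` values, and the `place` function are hypothetical names, and the real scheduler logic in this PR is more involved.

```go
package main

import "fmt"

// Volume is a hypothetical, simplified view of a persistent volume.
type Volume struct {
	HostID         string
	Decommissioned bool
}

// Placement is a hypothetical result of a scheduling decision.
type Placement string

const (
	PlaceOnVolumeHost Placement = "volume-host" // run where the data lives
	PlaceAnywhere     Placement = "any-host"    // start with a new, empty volume
	Blocked           Placement = "blocked"     // wait for the host or the operator
)

// place decides where a job that needs vol may run, given which hosts are up.
func place(vol *Volume, hostUp map[string]bool) Placement {
	switch {
	case vol == nil || vol.Decommissioned:
		// No usable existing volume: the job may go to any host.
		return PlaceAnywhere
	case hostUp[vol.HostID]:
		// The host holding the data is up: keep the job with its volume.
		return PlaceOnVolumeHost
	default:
		// The host is down and the volume is still active: do not move the job.
		return Blocked
	}
}

func main() {
	hostUp := map[string]bool{"host1": true, "host2": false}

	fmt.Println(place(&Volume{HostID: "host1"}, hostUp))
	fmt.Println(place(&Volume{HostID: "host2"}, hostUp))
	fmt.Println(place(&Volume{HostID: "host2", Decommissioned: true}, hostUp))
}
```

In this sketch, decommissioning is the only way a job leaves the blocked state while its host stays down, which matches the operator workflow described above.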