Skip to content

Commit

Permalink
Design doc for RestoreItemAction wait for AdditionalItems to be ready
Browse files Browse the repository at this point in the history
Signed-off-by: Scott Seago <sseago@redhat.com>
  • Loading branch information
sseago committed Aug 26, 2020
1 parent 718a94a commit b330fa9
Showing 1 changed file with 169 additions and 0 deletions.
169 changes: 169 additions & 0 deletions design/wait-for-additional-items.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,169 @@
# Wait for AdditionalItems to be ready on Restore

When a velero `RestoreItemAction` plugin returns a list of resources
via `AdditionalItems`, velero restores these resources before
restoring the current resource. There is a race condition here, as it
is possible that after running the restore on these returned items,
the current item's restore might execute before the additional items
are available. Depending on the nature of the dependency between the
current item and the additional items, this could cause the restore of
the current item to fail.

## Goals

- Enable Velero to ensure that Additional items returned by a restore
plugin's `Execute` func are ready before restoring the current item
- Augment the RestoreItemAction plugin interface to allow the plugins
to determine when an additional item is ready, since doing so
requires knowledge specific to the resource type.

## Background

Because Velero does not wait after restoring additoonal items to
restore the current item, in some cases the current item restore will
fail if the additional items are not yet ready. Velero (and the
`RestoreItemAction` plugins) need to implement this "wait until ready"
functionality.

## High-Level Design

After each RestoreItemAction execute call (and following restore of
any returned addional items) , we need to wait for these returned
Additional Items to be ready before restoring the current item. In
order to do this, we also need to extend the `RestoreItemAction`
interface to allow the plugin which returned additional items to
determine whether they are ready.

## Detailed Design

### `restoreItem` Changes

When each `RestoreItemAction` `Execute()` call returns, the
`RestoreItemActionExecuteOutput` struct contains a slice of
`AdditionalItems` which must be restored before this item can be
restored. After restoring these items, Velero needs to wait for them
to be ready before moving on to the next item. Right after looping
over the additional items at
https://github.com/vmware-tanzu/velero/blob/main/pkg/restore/restore.go#L960-L991

we still have a reference to the additional items (`GroupResource` and
namespaced name), as well as a reference to the `RestoreItemAction`
plugin which required it.

At this point we need to call a func similar to `crdAvailable` which
we will call `itemsAvailable`
https://github.com/vmware-tanzu/velero/blob/main/pkg/restore/restore.go#L623

Instead of the one minute CRD timeout, we'll use
`resourceTerminatingTimeout`, and instead of the
`IsUnstructuredCRDReady` call, we'll call `AreAdditionalItemsReady` on
the same `RestoreItemAction` plugin which returned the item, passing
in the same `AdditionalItems` slice as an argument (with items which
failed to restore filtered out). If `AreAdditionalItemsReady` returns
an error, then `itemsAvailable` will propagate the error, and
`restoreItem` will handle it the same way it handles an error return
on restoring an additional item. If the timeout is reached without
ready returning true, velero will continue on to attempt restore of
the current item.

### `RestoreItemActionExecuteOutput` changes

A new boolean field will be added to
`RestoreItemActionExecuteOutput`. If `WaitForAdditionalItems` is true,
then `restoreItem` will call `itemsAvailable` which will invoke the
plugin func `AreAdditionalItemsReady` and wait until the func returns
true or the timeout is reached. If `WaitForAdditionalItems` is false
(the default case), then current velero behavior will be
followed. Existing plugins which do not need to signal to wait for
`AdditionalItems` won't need to change their `Execute()` functions.

In addition, a new func, `WithItemsWait()` will be added to
`RestoreItemActionExecuteOutput` similar to `WithoutRestore()` which
will set the `WaitForAdditionalItems` bool to `true`.

```
// RestoreItemActionExecuteOutput contains the output variables for the ItemAction's Execution function.
type RestoreItemActionExecuteOutput struct {
// UpdatedItem is the item being restored mutated by ItemAction.
UpdatedItem runtime.Unstructured
// AdditionalItems is a list of additional related items that should
// be restored.
AdditionalItems []ResourceIdentifier
// SkipRestore tells velero to stop executing further actions
// on this item, and skip the restore step. When this field's
// value is true, AdditionalItems will be ignored.
SkipRestore bool
// WaitForAdditionalItems determines whether velero will wait
// until AreAdditionalItemsReady returns true before restoring
// this item. If this field's value is true, then after restoring
// the returned AdditionalItems, velero will not restore this item
// until AreAdditionalItemsReady returns true or the timeout is
// reached. Otherwise, AreAdditionalItemsReady is not called.
WaitForAdditionalItems bool
}
```
One change from the initial issue discussion is that there are no
proposed changes to `RestoreItemActionExecuteOutput`. Rather than adding
a new `WaitForAdditionalItems` boolean similar to

### `RestoreItemAction` plugin interface changes

In order to implement the `AreAdditionalItemsReady` plugin func, there
are two different approaches we could take.

The first would be to simply add another entry to the
`RestoreItemAction` interface:
```
type RestoreItemAction interface {
// AppliesTo returns information about which resources this action should be invoked for.
// A RestoreItemAction's Execute function will only be invoked on items that match the returned
// selector. A zero-valued ResourceSelector matches all resources.
AppliesTo() (ResourceSelector, error)
// Execute allows the ItemAction to perform arbitrary logic with the item being restored,
// including mutating the item itself prior to restore. The item (unmodified or modified)
// should be returned, along with an optional slice of ResourceIdentifiers specifying additional
// related items that should be restored, a warning (which will be logged but will not prevent
// the item from being restored) or error (which will be logged and will prevent the item
// from being restored) if applicable.
Execute(input *RestoreItemActionExecuteInput) (*RestoreItemActionExecuteOutput, error)
// AreAdditionalItemsReady allows the ItemAction to communicate whether the passed-in
// slice of AdditionalItems (previously returned by Execute())
// are ready. Returns true if all items are ready, and false otherwise
AreAdditionalItemsReady(AdditionalItems []ResourceIdentifier) (bool, error)
}
```

The downside of this approach is that it is not backwards compatible,
and every `RestoreItemAction` plugin will have to implement the new
func, simply to return `true` in most cases, since the plugin will
either never return `AdditionalItems` from Execute or not have any
special readiness requirements.

The alternative to this would be to define an additional interface for
the optional func, leaving the `RestoreItemAction` interface alone.
```
type RestoreItemActionReadyCheck interface {
// AreAdditionalItemsReady allows the ItemAction to communicate whether the passed-in
// slice of AdditionalItems (previously returned by Execute())
// are ready. Returns true if all items are ready, and false otherwise
AreAdditionalItemsReady(restore *api.Restore, AdditionalItems []ResourceIdentifier) (bool, error)
}
```
In this case, existing plugins which do not need this functionality
can remain as-is, while plugins which want to make use of this
functionality will just need to implement the optional func. With the
optional interface approach, `itemsAvailable` will only wait if the
plugin can be type-asserted to the new interface:
```
if actionWithReadyCheck, ok := action.(RestoreItemActionReadyCheck); ok {
// wait for ready/timeout
} else {
return true, nil
}
```

0 comments on commit b330fa9

Please sign in to comment.