Skip to content

Use or remove "managed" datastore driver interface #12398

@brandond

Description

@brandond

We have a "managed" cluster datastore driver interface, but it isn't well used used. The interface is defined here:

type Driver interface {
SetControlConfig(config *config.Control) error
IsInitialized() (bool, error)
Register(handler http.Handler) (http.Handler, error)
Reset(ctx context.Context, reboostrap func() error) error
IsReset() (bool, error)
ResetFile() string
Start(ctx context.Context, clientAccessInfo *clientaccess.Info) error
Restore(ctx context.Context) error
EndpointName() string
Snapshot(ctx context.Context) (*SnapshotResult, error)
ReconcileSnapshotData(ctx context.Context) error
GetMembersClientURLs(ctx context.Context) ([]string, error)
RemoveSelf(ctx context.Context) error
}

There are many issues with this interface:

  • We have only one implementation of this interface: embedded etcd. At one point dqlite was planned to provide another implementation, but this never went anywhere.
  • The interface does not cover all the things we need to do with the cluster datastore, and a large number of things just call around it into the etcd package directly.
  • Cluster bootstrap code assumes that the presence of a managed driver indicates managed etcd, and tries to either communicate with another etcd cluster member, or start up a temporary etcd instance to extract the bootstrap data. This should all be handled by the driver interface.

Examples of the leakiness of the abstraction abound in the cluster startup code, for example:

if c.managedDB != nil {
if c.config.DisableETCD {
// secondary server with etcd disabled, start the etcd proxy so that we can attempt to use it
// when reconciling.
if err := c.startEtcdProxy(ctx); err != nil {
return pkgerrors.WithMessage(err, "failed to start etcd proxy")
}
} else if isInitialized && !clusterReset {
// For secondary servers with etcd, first attempt to connect and reconcile using the join URL.
// This saves on having to start up a temporary etcd just to extract bootstrap data.

We should either use this interface, or get rid of it

From a future-proofing perspective, I think that the best path forward would be to enforce its use:

  • Add a mock driver implementation that can be used in tests to validate that we are not bypassing the interface anywhere
  • Rework kine to function as a managed driver implementation, returning NotImplemented errors as necessary.
  • Rework managed etcd (including snapshot save/reset/restore) to function only through the interface, and remove anything that pokes directly at etcd

Metadata

Metadata

Assignees

Labels

area/etcdpriority/important-longtermImportant over the long term, but may not be staffed and/or may need multiple releases to complete.

Type

Projects

Status

Accepted

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions