Container Storage Interface (CSI)

Authors:

Notational Conventions

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" are to be interpreted as described in RFC 2119 (Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997).

The key words "unspecified", "undefined", and "implementation-defined" are to be interpreted as described in the rationale for the C99 standard.

An implementation is not compliant if it fails to satisfy one or more of the MUST, REQUIRED, or SHALL requirements for the protocols it implements. An implementation is compliant if it satisfies all the MUST, REQUIRED, and SHALL requirements for the protocols it implements.

Terminology

| Term | Definition |
|------|------------|
| Volume | A unit of storage that will be made available inside of a CO-managed container, via the CSI. |
| Block Volume | A volume that will appear as a block device inside the container. |
| Mounted Volume | A volume that will be mounted using the specified file system and appear as a directory inside the container. |
| CO | Container Orchestration system, communicates with Plugins using CSI service RPCs. |
| SP | Storage Provider, the vendor of a CSI plugin implementation. |
| RPC | Remote Procedure Call. |
| Node | A host where the user workload will be running, uniquely identifiable from the perspective of a Plugin by a node ID. |
| Plugin | Aka “plugin implementation”, a gRPC endpoint that implements the CSI Services. |
| Plugin Supervisor | Process that governs the lifecycle of a Plugin, MAY be the CO. |
| Workload | The atomic unit of "work" scheduled by a CO. This MAY be a container or a collection of containers. |

Objective

To define an industry standard “Container Storage Interface” (CSI) that will enable storage vendors (SP) to develop a plugin once and have it work across a number of container orchestration (CO) systems.

Goals in MVP

The Container Storage Interface (CSI) will

  • Enable SP authors to write one CSI compliant Plugin that “just works” across all COs that implement CSI.
  • Define API (RPCs) that enable:
    • Dynamic provisioning and deprovisioning of a volume.
    • Attaching or detaching a volume from a node.
    • Mounting/unmounting a volume from a node.
    • Consumption of both block and mountable volumes.
    • Local storage providers (e.g., device mapper, lvm).
    • Creating and deleting a snapshot (source of the snapshot is a volume).
    • Provisioning a new volume from a snapshot (reverting snapshot, where data in the original volume is erased and replaced with data in the snapshot, is out of scope).
  • Define plugin protocol RECOMMENDATIONS.
    • Describe a process by which a Supervisor configures a Plugin.
    • Container deployment considerations (CAP_SYS_ADMIN, mount namespace, etc.).

Non-Goals in MVP

The Container Storage Interface (CSI) explicitly will not define, provide, or dictate:

  • Specific mechanisms by which a Plugin Supervisor manages the lifecycle of a Plugin, including:
    • How to maintain state (e.g. what is attached, mounted, etc.).
    • How to deploy, install, upgrade, uninstall, monitor, or respawn (in case of unexpected termination) Plugins.
  • A first class message structure/field to represent "grades of storage" (aka "storage class").
  • Protocol-level authentication and authorization.
  • Packaging of a Plugin.
  • POSIX compliance: CSI provides no guarantee that volumes provided are POSIX compliant filesystems. Compliance is determined by the Plugin implementation (and any backend storage system(s) upon which it depends). CSI SHALL NOT obstruct a Plugin Supervisor or CO from interacting with Plugin-managed volumes in a POSIX-compliant manner.

Solution Overview

This specification defines an interface along with the minimum operational and packaging recommendations for a storage provider (SP) to implement a CSI compatible plugin. The interface declares the RPCs that a plugin MUST expose: this is the primary focus of the CSI specification. Any operational and packaging recommendations offer additional guidance to promote cross-CO compatibility.

Architecture

The primary focus of this specification is on the protocol between a CO and a Plugin. It SHOULD be possible to ship cross-CO compatible Plugins for a variety of deployment architectures. A CO SHOULD be equipped to handle both centralized and headless plugins, as well as split-component and unified plugins. Several of these possibilities are illustrated in the following figures.

                             CO "Master" Host
+-------------------------------------------+
|                                           |
|  +------------+           +------------+  |
|  |     CO     |   gRPC    | Controller |  |
|  |            +----------->   Plugin   |  |
|  +------------+           +------------+  |
|                                           |
+-------------------------------------------+

                            CO "Node" Host(s)
+-------------------------------------------+
|                                           |
|  +------------+           +------------+  |
|  |     CO     |   gRPC    |    Node    |  |
|  |            +----------->   Plugin   |  |
|  +------------+           +------------+  |
|                                           |
+-------------------------------------------+

Figure 1: The Plugin runs on all nodes in the cluster: a centralized
Controller Plugin is available on the CO master host and the Node
Plugin is available on all of the CO Nodes.

                            CO "Node" Host(s)
+-------------------------------------------+
|                                           |
|  +------------+           +------------+  |
|  |     CO     |   gRPC    | Controller |  |
|  |            +--+-------->   Plugin   |  |
|  +------------+  |        +------------+  |
|                  |                        |
|                  |                        |
|                  |        +------------+  |
|                  |        |    Node    |  |
|                  +-------->   Plugin   |  |
|                           +------------+  |
|                                           |
+-------------------------------------------+

Figure 2: Headless Plugin deployment, only the CO Node hosts run
Plugins. Separate, split-component Plugins supply the Controller
Service and the Node Service respectively.

                            CO "Node" Host(s)
+-------------------------------------------+
|                                           |
|  +------------+           +------------+  |
|  |     CO     |   gRPC    | Controller |  |
|  |            +----------->    Node    |  |
|  +------------+           |   Plugin   |  |
|                           +------------+  |
|                                           |
+-------------------------------------------+

Figure 3: Headless Plugin deployment, only the CO Node hosts run
Plugins. A unified Plugin component supplies both the Controller
Service and Node Service.

                            CO "Node" Host(s)
+-------------------------------------------+
|                                           |
|  +------------+           +------------+  |
|  |     CO     |   gRPC    |    Node    |  |
|  |            +----------->   Plugin   |  |
|  +------------+           +------------+  |
|                                           |
+-------------------------------------------+

Figure 4: Headless Plugin deployment, only the CO Node hosts run
Plugins. A Node-only Plugin component supplies only the Node Service.
Its GetPluginCapabilities RPC does not report the CONTROLLER_SERVICE
capability.

Volume Lifecycle

   CreateVolume +------------+ DeleteVolume
 +------------->|  CREATED   +--------------+
 |              +---+----^---+              |
 |       Controller |    | Controller       v
+++         Publish |    | Unpublish       +++
|X|          Volume |    | Volume          | |
+-+             +---v----+---+             +-+
                | NODE_READY |
                +---+----^---+
               Node |    | Node
            Publish |    | Unpublish
             Volume |    | Volume
                +---v----+---+
                | PUBLISHED  |
                +------------+

Figure 5: The lifecycle of a dynamically provisioned volume, from
creation to destruction.

   CreateVolume +------------+ DeleteVolume
 +------------->|  CREATED   +--------------+
 |              +---+----^---+              |
 |       Controller |    | Controller       v
+++         Publish |    | Unpublish       +++
|X|          Volume |    | Volume          | |
+-+             +---v----+---+             +-+
                | NODE_READY |
                +---+----^---+
               Node |    | Node
              Stage |    | Unstage
             Volume |    | Volume
                +---v----+---+
                |  VOL_READY |
                +---+----^---+
               Node |    | Node
            Publish |    | Unpublish
             Volume |    | Volume
                +---v----+---+
                | PUBLISHED  |
                +------------+

Figure 6: The lifecycle of a dynamically provisioned volume, from
creation to destruction, when the Node Plugin advertises the
STAGE_UNSTAGE_VOLUME capability.

    Controller                  Controller
       Publish                  Unpublish
        Volume  +------------+  Volume
 +------------->+ NODE_READY +--------------+
 |              +---+----^---+              |
 |             Node |    | Node             v
+++         Publish |    | Unpublish       +++
|X| <-+      Volume |    | Volume          | |
+++   |         +---v----+---+             +-+
 |    |         | PUBLISHED  |
 |    |         +------------+
 +----+
   Validate
   Volume
   Capabilities

Figure 7: The lifecycle of a pre-provisioned volume that requires the
controller to publish to a node (`ControllerPublishVolume`) prior to
publishing on the node (`NodePublishVolume`).

       +-+  +-+
       |X|  | |
       +++  +^+
        |    |
   Node |    | Node
Publish |    | Unpublish
 Volume |    | Volume
    +---v----+---+
    | PUBLISHED  |
    +------------+

Figure 8: Plugins MAY forgo other lifecycle steps by contraindicating
them via the capabilities API. Interactions with the volumes of such
plugins are reduced to `NodePublishVolume` and `NodeUnpublishVolume`
calls.

The above diagrams illustrate a general expectation with respect to how a CO MAY manage the lifecycle of a volume via the API presented in this specification. Plugins SHOULD expose all RPCs for an interface: Controller plugins SHOULD implement all RPCs for the Controller service. Unsupported RPCs SHOULD return an appropriate error code that indicates such (e.g. UNIMPLEMENTED). The full list of plugin capabilities is documented in the ControllerGetCapabilities and NodeGetCapabilities RPCs.
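The transitions in Figure 6 can be read as a small state machine. The following is a minimal sketch for illustration only: the `TRANSITIONS` table and the `next_state` and `run` helpers are hypothetical names that do not appear in this specification, and `DELETED` is an assumed label for the unnamed terminal node in the figure.

```python
# Illustrative model of Figure 6 (Node Plugin advertises STAGE_UNSTAGE_VOLUME).
# State names follow the figure; "DELETED" is an assumed label for the
# unnamed terminal node reached via DeleteVolume.
TRANSITIONS = {
    ("CREATED", "ControllerPublishVolume"): "NODE_READY",
    ("NODE_READY", "ControllerUnpublishVolume"): "CREATED",
    ("NODE_READY", "NodeStageVolume"): "VOL_READY",
    ("VOL_READY", "NodeUnstageVolume"): "NODE_READY",
    ("VOL_READY", "NodePublishVolume"): "PUBLISHED",
    ("PUBLISHED", "NodeUnpublishVolume"): "VOL_READY",
    ("CREATED", "DeleteVolume"): "DELETED",
}


def next_state(state, rpc):
    """Return the state after `rpc`, or raise if the figure has no such edge."""
    try:
        return TRANSITIONS[(state, rpc)]
    except KeyError:
        raise ValueError(f"{rpc} is not expected in state {state}") from None


def run(rpcs, state="CREATED"):
    """Apply a sequence of RPC names, starting from the state after CreateVolume."""
    for rpc in rpcs:
        state = next_state(state, rpc)
    return state
```

A CO-side test harness could use such a table to check that it never issues, say, `NodePublishVolume` before `NodeStageVolume` for a plugin that advertises staging.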

Container Storage Interface

This section describes the interface between COs and Plugins.

RPC Interface

A CO interacts with a Plugin through RPCs. Each SP MUST provide:

  • Node Plugin: A gRPC endpoint serving CSI RPCs that MUST be run on the Node on which an SP-provisioned volume will be published.
  • Controller Plugin: A gRPC endpoint serving CSI RPCs that MAY be run anywhere.
  • In some circumstances a single gRPC endpoint MAY serve all CSI RPCs (see Figure 3 in Architecture).

syntax = "proto3";
package csi.v1;

import "google/protobuf/descriptor.proto";
import "google/protobuf/timestamp.proto";
import "google/protobuf/wrappers.proto";

option go_package = 
  "github.com/container-storage-interface/spec/lib/go/csi";

extend google.protobuf.EnumOptions {
  // Indicates that this enum is OPTIONAL and part of an experimental
  // API that may be deprecated and eventually removed between minor
  // releases.
  bool alpha_enum = 1060;
}
extend google.protobuf.EnumValueOptions {
  // Indicates that this enum value is OPTIONAL and part of an
  // experimental API that may be deprecated and eventually removed
  // between minor releases.
  bool alpha_enum_value = 1060;
}
extend google.protobuf.FieldOptions {
  // Indicates that a field MAY contain information that is sensitive
  // and MUST be treated as such (e.g. not logged).
  bool csi_secret = 1059;

  // Indicates that this field is OPTIONAL and part of an experimental
  // API that may be deprecated and eventually removed between minor
  // releases.
  bool alpha_field = 1060;
}
extend google.protobuf.MessageOptions {
  // Indicates that this message is OPTIONAL and part of an experimental
  // API that may be deprecated and eventually removed between minor
  // releases.
  bool alpha_message = 1060;
}
extend google.protobuf.MethodOptions {
  // Indicates that this method is OPTIONAL and part of an experimental
  // API that may be deprecated and eventually removed between minor
  // releases.
  bool alpha_method = 1060;
}
extend google.protobuf.ServiceOptions {
  // Indicates that this service is OPTIONAL and part of an experimental
  // API that may be deprecated and eventually removed between minor
  // releases.
  bool alpha_service = 1060;
}

There are three sets of RPCs:

  • Identity Service: Both the Node Plugin and the Controller Plugin MUST implement this set of RPCs.
  • Controller Service: The Controller Plugin MUST implement this set of RPCs.
  • Node Service: The Node Plugin MUST implement this set of RPCs.

service Identity {
  rpc GetPluginInfo(GetPluginInfoRequest)
    returns (GetPluginInfoResponse) {}

  rpc GetPluginCapabilities(GetPluginCapabilitiesRequest)
    returns (GetPluginCapabilitiesResponse) {}

  rpc Probe (ProbeRequest)
    returns (ProbeResponse) {}
}

service Controller {
  rpc CreateVolume (CreateVolumeRequest)
    returns (CreateVolumeResponse) {}

  rpc DeleteVolume (DeleteVolumeRequest)
    returns (DeleteVolumeResponse) {}

  rpc ControllerPublishVolume (ControllerPublishVolumeRequest)
    returns (ControllerPublishVolumeResponse) {}

  rpc ControllerUnpublishVolume (ControllerUnpublishVolumeRequest)
    returns (ControllerUnpublishVolumeResponse) {}

  rpc ValidateVolumeCapabilities (ValidateVolumeCapabilitiesRequest)
    returns (ValidateVolumeCapabilitiesResponse) {}

  rpc ListVolumes (ListVolumesRequest)
    returns (ListVolumesResponse) {}

  rpc GetCapacity (GetCapacityRequest)
    returns (GetCapacityResponse) {}

  rpc ControllerGetCapabilities (ControllerGetCapabilitiesRequest)
    returns (ControllerGetCapabilitiesResponse) {}

  rpc CreateSnapshot (CreateSnapshotRequest)
    returns (CreateSnapshotResponse) {}

  rpc DeleteSnapshot (DeleteSnapshotRequest)
    returns (DeleteSnapshotResponse) {}

  rpc ListSnapshots (ListSnapshotsRequest)
    returns (ListSnapshotsResponse) {}

  rpc ControllerExpandVolume (ControllerExpandVolumeRequest)
    returns (ControllerExpandVolumeResponse) {}

  rpc ControllerGetVolume (ControllerGetVolumeRequest)
    returns (ControllerGetVolumeResponse) {
        option (alpha_method) = true;
    }

  rpc ControllerModifyVolume (ControllerModifyVolumeRequest)
    returns (ControllerModifyVolumeResponse) {
        option (alpha_method) = true;
    }
}

service GroupController {
  rpc GroupControllerGetCapabilities (
        GroupControllerGetCapabilitiesRequest)
    returns (GroupControllerGetCapabilitiesResponse) {}

  rpc CreateVolumeGroupSnapshot(CreateVolumeGroupSnapshotRequest)
    returns (CreateVolumeGroupSnapshotResponse) {
    }

  rpc DeleteVolumeGroupSnapshot(DeleteVolumeGroupSnapshotRequest)
    returns (DeleteVolumeGroupSnapshotResponse) {
    }

  rpc GetVolumeGroupSnapshot(
        GetVolumeGroupSnapshotRequest)
    returns (GetVolumeGroupSnapshotResponse) {
    }
}

service SnapshotMetadata {
  option (alpha_service) = true;

  rpc GetMetadataAllocated(GetMetadataAllocatedRequest)
    returns (stream GetMetadataAllocatedResponse) {}

  rpc GetMetadataDelta(GetMetadataDeltaRequest)
    returns (stream GetMetadataDeltaResponse) {}
}

service Node {
  rpc NodeStageVolume (NodeStageVolumeRequest)
    returns (NodeStageVolumeResponse) {}

  rpc NodeUnstageVolume (NodeUnstageVolumeRequest)
    returns (NodeUnstageVolumeResponse) {}

  rpc NodePublishVolume (NodePublishVolumeRequest)
    returns (NodePublishVolumeResponse) {}

  rpc NodeUnpublishVolume (NodeUnpublishVolumeRequest)
    returns (NodeUnpublishVolumeResponse) {}

  rpc NodeGetVolumeStats (NodeGetVolumeStatsRequest)
    returns (NodeGetVolumeStatsResponse) {}


  rpc NodeExpandVolume(NodeExpandVolumeRequest)
    returns (NodeExpandVolumeResponse) {}


  rpc NodeGetCapabilities (NodeGetCapabilitiesRequest)
    returns (NodeGetCapabilitiesResponse) {}

  rpc NodeGetInfo (NodeGetInfoRequest)
    returns (NodeGetInfoResponse) {}
}

Concurrency

In general the CO is responsible for ensuring that there is no more than one call “in-flight” per volume at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same volume. The plugin SHOULD handle this as gracefully as possible. The error code ABORTED MAY be returned by the plugin in this case (see the Error Scheme section for details).
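One way a plugin might reject overlapping calls is to track the volume IDs that currently have an operation in flight. The sketch below is an assumption-laden illustration, not a prescribed design: `VolumeLocks` and `handle` are hypothetical names, and gRPC codes are represented as plain integers (0 for OK, 10 for ABORTED).

```python
import threading

ABORTED = 10  # gRPC code listed in the Error Scheme section


class VolumeLocks:
    """Tracks which volume IDs currently have an operation in flight."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._in_flight = set()

    def try_acquire(self, volume_id):
        """Mark `volume_id` busy; return False if it already is."""
        with self._mutex:
            if volume_id in self._in_flight:
                return False
            self._in_flight.add(volume_id)
            return True

    def release(self, volume_id):
        with self._mutex:
            self._in_flight.discard(volume_id)


def handle(locks, volume_id, operation):
    """Run `operation` unless another call for `volume_id` is pending.

    Returns (grpc_code, result); 0 stands for OK.
    """
    if not locks.try_acquire(volume_id):
        return ABORTED, None
    try:
        return 0, operation()
    finally:
        locks.release(volume_id)
```

Calls for different volumes proceed independently; only a second call for the same volume is turned away while the first is still running.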

Field Requirements

The requirements documented herein apply equally and without exception, unless otherwise noted, for the fields of all protobuf message types defined by this specification. Violation of these requirements MAY result in RPC message data that is not compatible with all CO, Plugin, and/or CSI middleware implementations.

Size Limits

CSI defines general size limits for fields of various types (see table below). The general size limit for a particular field MAY be overridden by specifying a different size limit in said field's description. Unless otherwise specified, fields SHALL NOT exceed the limits documented here. These limits apply for messages generated by both COs and plugins.

| Size | Field Type |
|------|------------|
| 128 bytes | string |
| 4 KiB | map<string, string> |

REQUIRED vs. OPTIONAL

  • A field noted as REQUIRED MUST be specified, subject to any per-RPC caveats; caveats SHOULD be rare.
  • A repeated or map field listed as REQUIRED MUST contain at least 1 element.
  • A field noted as OPTIONAL MAY be specified and the specification SHALL clearly define expected behavior for the default, zero-value of such fields.

Scalar fields, even REQUIRED ones, will be defaulted if not specified and any field set to the default value will not be serialized over the wire as per proto3.
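The general size limits and the REQUIRED rules can be checked mechanically before a message is sent. The following is a rough sketch under stated assumptions: the helper names are hypothetical, string size is measured as UTF-8 bytes, and map size is assumed to be the total UTF-8 length of all keys and values (this section does not spell out how map size is measured).

```python
MAX_STRING_BYTES = 128     # general limit for string fields
MAX_MAP_BYTES = 4 * 1024   # general limit for map<string, string> fields


def string_within_limit(value, limit=MAX_STRING_BYTES):
    """True if the UTF-8 encoding of `value` fits the limit."""
    return len(value.encode("utf-8")) <= limit


def map_within_limit(mapping, limit=MAX_MAP_BYTES):
    """True if the map fits the limit.

    Assumption: size is the summed UTF-8 length of every key and value.
    """
    total = sum(len(k.encode("utf-8")) + len(v.encode("utf-8"))
                for k, v in mapping.items())
    return total <= limit


def required_is_set(value):
    """Check a REQUIRED string, repeated, or map field.

    proto3 does not serialize default values, so an empty string is
    indistinguishable from an unset one, and a REQUIRED repeated or map
    field needs at least one element.
    """
    return len(value) > 0
```

A field whose description overrides the general limit would pass its own `limit` argument instead of relying on the defaults.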

Timeouts

Any of the RPCs defined in this spec MAY time out and MAY be retried. The CO MAY choose the maximum time it is willing to wait for a call, how long it waits between retries, and how many times it retries (these values are not negotiated between plugin and CO).

Idempotency requirements ensure that a retried call with the same fields continues where it left off. The only way to cancel a call is to issue a "negation" call if one exists. For example, issue a ControllerUnpublishVolume call to cancel a pending ControllerPublishVolume operation. In some cases, a CO might not be able to cancel a pending operation because it depends on the result of the pending operation in order to execute the "negation" call. For example, if a CreateVolume call never completes then a CO might not have the volume_id to call DeleteVolume with.
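Because retried calls carry the same fields, a CO-side retry loop can simply re-issue the identical request. The sketch below assumes a hypothetical `rpc` callable that raises Python's built-in `TimeoutError` when a call times out; the attempt count, base delay, and doubling schedule are illustrative choices by the caller, since these values are not negotiated with the plugin.

```python
def call_with_retries(rpc, request, max_attempts=5, base_delay=1.0,
                      sleep=lambda seconds: None):
    """Re-issue `rpc(request)` with unchanged fields until it succeeds.

    `rpc` is a hypothetical callable that raises TimeoutError on a timeout.
    `sleep` is injectable so the sketch can run without actually waiting;
    pass `time.sleep` to wait for real.
    """
    for attempt in range(max_attempts):
        try:
            return rpc(request)
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

The request object is never modified between attempts, which is what lets the plugin's idempotency guarantee apply.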

Error Scheme

All CSI API calls defined in this spec MUST return a standard gRPC status. Most gRPC libraries provide helper methods to set and read the status fields.

The status code MUST contain a canonical error code. COs MUST handle all valid error codes. Each RPC defines a set of gRPC error codes that MUST be returned by the plugin when specified conditions are encountered. In addition to those, if the conditions defined below are encountered, the plugin MUST return the associated gRPC error code.

| Condition | gRPC Code | Description | Recovery Behavior |
|-----------|-----------|-------------|-------------------|
| Missing required field | 3 INVALID_ARGUMENT | Indicates that a required field is missing from the request. More human-readable information MAY be provided in the status.message field. | Caller MUST fix the request by adding the missing required field before retrying. |
| Invalid or unsupported field in the request | 3 INVALID_ARGUMENT | Indicates that one or more fields in the request are either not allowed by the Plugin or have an invalid value. More human-readable information MAY be provided in the gRPC status.message field. | Caller MUST fix the field before retrying. |
| Permission denied | 7 PERMISSION_DENIED | The Plugin is able to derive or otherwise infer an identity from the secrets present within an RPC, but that identity does not have permission to invoke the RPC. | System administrator SHOULD ensure that requisite permissions are granted, after which point the caller MAY retry the attempted RPC. |
| Operation pending for volume | 10 ABORTED | Indicates that there is already an operation pending for the specified volume. In general the CO is responsible for ensuring that there is no more than one call "in-flight" per volume at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same volume. The Plugin SHOULD handle this as gracefully as possible, and MAY return this error code to reject secondary calls. | Caller SHOULD ensure that there are no other calls pending for the specified volume, and then retry with exponential back off. |
| Call not implemented | 12 UNIMPLEMENTED | The invoked RPC is not implemented by the Plugin or disabled in the Plugin's current mode of operation. | Caller MUST NOT retry. Caller MAY call GetPluginCapabilities, ControllerGetCapabilities, or NodeGetCapabilities to discover Plugin capabilities. |
| Not authenticated | 16 UNAUTHENTICATED | The invoked RPC does not carry secrets that are valid for authentication. | Caller SHALL either fix the secrets provided in the RPC, or otherwise regalvanize said secrets such that they will pass authentication by the Plugin for the attempted RPC, after which point the caller MAY retry the attempted RPC. |

The status message MUST contain a human readable description of the error, if the status code is not OK. This string MAY be surfaced by the CO to end users.

The status details MUST be empty. In the future, this spec MAY require details to return a machine-parsable protobuf message if the status code is not OK to enable COs to implement smarter error handling and fault resolution.
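A CO might encode the general recovery rules from the table above as a lookup keyed by gRPC code. The sketch below is illustrative only: the action labels (`fix_request`, `retry_with_backoff`, and so on) and the function names are hypothetical, and per-RPC error tables would need their own entries.

```python
# Codes and names are taken from the general error table; the action
# labels are hypothetical shorthand for the documented recovery behavior.
RECOVERY = {
    3: ("INVALID_ARGUMENT", "fix_request"),
    7: ("PERMISSION_DENIED", "wait_for_admin"),
    10: ("ABORTED", "retry_with_backoff"),
    12: ("UNIMPLEMENTED", "do_not_retry"),
    16: ("UNAUTHENTICATED", "fix_secrets"),
}


def recovery_for(code):
    """Return (code_name, action) for a gRPC status code."""
    if code == 0:
        return ("OK", "none")
    # Codes outside the general table are defined per RPC.
    return RECOVERY.get(code, ("OTHER", "consult_rpc_table"))


def may_retry_unchanged(code):
    """True only when the same request may be re-sent as-is."""
    return recovery_for(code)[1] == "retry_with_backoff"
```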

Secrets Requirements

Secrets MAY be required by a plugin to complete an RPC request. A secret is a string to string map where the key identifies the name of the secret (e.g. "username" or "password"), and the value contains the secret data (e.g. "bob" or "abc123"). Each key MUST consist of alphanumeric characters, '-', '_' or '.'. Each value MUST contain a valid string. An SP MAY choose to accept binary (non-string) data by using a binary-to-text encoding scheme, like base64. An SP SHALL advertise the requirements for required secret keys and values in documentation. The CO SHALL permit passing through the required secrets. A CO MAY pass the same secrets to all RPCs, therefore the keys for all unique secrets that an SP expects MUST be unique across all CSI operations. This information is sensitive and MUST be treated as such (not logged, etc.) by the CO.
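The key rules and the handling of sensitive values can be illustrated with a few helpers. This is a minimal sketch with hypothetical function names; it assumes keys are non-empty and ASCII, and it uses base64 as the binary-to-text encoding mentioned above.

```python
import base64
import re

# Alphanumerics, '-', '_' and '.'; assumes at least one character.
_SECRET_KEY = re.compile(r"[A-Za-z0-9._-]+")


def invalid_secret_keys(secrets):
    """Return the keys that violate the allowed character set."""
    return [key for key in secrets if not _SECRET_KEY.fullmatch(key)]


def redact(secrets):
    """Return a copy that is safe to log: every value is masked."""
    return {key: "***" for key in secrets}


def encode_binary(data):
    """Encode binary secret data as a base64 string value."""
    return base64.b64encode(data).decode("ascii")
```

Logging `redact(secrets)` rather than `secrets` keeps key names visible for debugging while withholding the secret data itself.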

Identity Service RPC

Identity service RPCs allow a CO to query a plugin for capabilities, health, and other metadata. The general flow of the success case MAY be as follows (protos illustrated in YAML for brevity):

  1. CO queries metadata via Identity RPC.
   # CO --(GetPluginInfo)--> Plugin
   request:
   response:
      name: org.foo.whizbang.super-plugin
      vendor_version: blue-green
      manifest:
        baz: qaz
  2. CO queries available capabilities of the plugin.
   # CO --(GetPluginCapabilities)--> Plugin
   request:
   response:
     capabilities:
       - service:
           type: CONTROLLER_SERVICE
  3. CO queries the readiness of the plugin.
   # CO --(Probe)--> Plugin
   request:
   response: {}

GetPluginInfo

message GetPluginInfoRequest {
  // Intentionally empty.
}

message GetPluginInfoResponse {
  // The name MUST follow domain name notation format
  // (https://tools.ietf.org/html/rfc1035#section-2.3.1). It SHOULD
  // include the plugin's host company name and the plugin name,
  // to minimize the possibility of collisions. It MUST be 63
  // characters or less, beginning and ending with an alphanumeric
  // character ([a-z0-9A-Z]) with dashes (-), dots (.), and
  // alphanumerics between. This field is REQUIRED.
  string name = 1;

  // This field is REQUIRED. Value of this field is opaque to the CO.
  string vendor_version = 2;

  // This field is OPTIONAL. Values are opaque to the CO.
  map<string, string> manifest = 3;
}

GetPluginInfo Errors

If the plugin is unable to complete the GetPluginInfo call successfully, it MUST return a non-ok gRPC code in the gRPC status.
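The constraints on the `name` field of GetPluginInfoResponse (at most 63 characters, alphanumeric at both ends, with dashes, dots, and alphanumerics in between) can be expressed as a regular expression. The helper below is a hypothetical sketch; it checks only the character rules stated in the field comment, not full RFC 1035 label structure.

```python
import re

# Starts and ends with an alphanumeric; dashes, dots, and alphanumerics between.
_PLUGIN_NAME = re.compile(r"[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?")


def is_valid_plugin_name(name):
    """Check the length and character rules from the `name` field comment."""
    return len(name) <= 63 and _PLUGIN_NAME.fullmatch(name) is not None
```

For example, `org.foo.whizbang.super-plugin` from the flow above passes, while a name that begins with a dash or ends with a dot does not.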

GetPluginCapabilities

This REQUIRED RPC allows the CO to query the supported capabilities of the Plugin "as a whole": it is the grand sum of all capabilities of all instances of the Plugin software, as it is intended to be deployed. All instances of the same version (see vendor_version of GetPluginInfoResponse) of the Plugin SHALL return the same set of capabilities, regardless of both (a) where instances are deployed on the cluster and (b) which RPCs an instance is serving.

message GetPluginCapabilitiesRequest {
  // Intentionally empty.
}

message GetPluginCapabilitiesResponse {
  // All the capabilities that the plugin supports. This field is
  // OPTIONAL.
  repeated PluginCapability capabilities = 1;
}

// Specifies a capability of the plugin.
message PluginCapability {
  message Service {
    enum Type {
      UNKNOWN = 0;
      // CONTROLLER_SERVICE indicates that the Plugin provides RPCs for
      // the ControllerService. Plugins SHOULD provide this capability.
      // In rare cases certain plugins MAY wish to omit the
      // ControllerService entirely from their implementation, but such
      // SHOULD NOT be the common case.
      // The presence of this capability determines whether the CO will
      // attempt to invoke the REQUIRED ControllerService RPCs, as well
      // as specific RPCs as indicated by ControllerGetCapabilities.
      CONTROLLER_SERVICE = 1;

      // VOLUME_ACCESSIBILITY_CONSTRAINTS indicates that the volumes for
      // this plugin MAY NOT be equally accessible by all nodes in the
      // cluster. The CO MUST use the topology information returned by
      // CreateVolumeResponse along with the topology information
      // returned by NodeGetInfo to ensure that a given volume is
      // accessible from a given node when scheduling workloads.
      VOLUME_ACCESSIBILITY_CONSTRAINTS = 2;

      // GROUP_CONTROLLER_SERVICE indicates that the Plugin provides
      // RPCs for operating on groups of volumes. Plugins MAY provide
      // this capability.
      // The presence of this capability determines whether the CO will
      // attempt to invoke the REQUIRED GroupController service RPCs, as
      // well as specific RPCs as indicated by
      // GroupControllerGetCapabilities.
      GROUP_CONTROLLER_SERVICE = 3;

      // SNAPSHOT_METADATA_SERVICE indicates that the Plugin provides
      // RPCs to retrieve metadata on the allocated blocks of a single
      // snapshot, or the changed blocks between a pair of snapshots of
      // the same block volume.
      // The presence of this capability determines whether the CO will
      // attempt to invoke the OPTIONAL SnapshotMetadata service RPCs.
      SNAPSHOT_METADATA_SERVICE = 4 [(alpha_enum_value) = true];
    }
    Type type = 1;
  }

  message VolumeExpansion {
    enum Type {
      UNKNOWN = 0;

      // ONLINE indicates that volumes may be expanded when published to
      // a node. When a Plugin implements this capability it MUST
      // implement either the EXPAND_VOLUME controller capability or the
      // EXPAND_VOLUME node capability or both. When a plugin supports
      // ONLINE volume expansion and also has the EXPAND_VOLUME
      // controller capability then the plugin MUST support expansion of
      // volumes currently published and available on a node. When a
      // plugin supports ONLINE volume expansion and also has the
      // EXPAND_VOLUME node capability then the plugin MAY support
      // expansion of node-published volume via NodeExpandVolume.
      //
      // Example 1: Given a shared filesystem volume (e.g. GlusterFs),
      //   the Plugin may set the ONLINE volume expansion capability and
      //   implement ControllerExpandVolume but not NodeExpandVolume.
      //
      // Example 2: Given a block storage volume type (e.g. EBS), the
      //   Plugin may set the ONLINE volume expansion capability and
      //   implement both ControllerExpandVolume and NodeExpandVolume.
      //
      // Example 3: Given a Plugin that supports volume expansion only
      //   upon a node, the Plugin may set the ONLINE volume
      //   expansion capability and implement NodeExpandVolume but not
      //   ControllerExpandVolume.
      ONLINE = 1;

      // OFFLINE indicates that volumes currently published and
      // available on a node SHALL NOT be expanded via
      // ControllerExpandVolume. When a plugin supports OFFLINE volume
      // expansion it MUST implement either the EXPAND_VOLUME controller
      // capability or both the EXPAND_VOLUME controller capability and
      // the EXPAND_VOLUME node capability.
      //
      // Example 1: Given a block storage volume type (e.g. Azure Disk)
      //   that does not support expansion of "node-attached" (i.e.
      //   controller-published) volumes, the Plugin may indicate
      //   OFFLINE volume expansion support and implement both
      //   ControllerExpandVolume and NodeExpandVolume.
      OFFLINE = 2;
    }
    Type type = 1;
  }

  oneof type {
    // Service that the plugin supports.
    Service service = 1;
    VolumeExpansion volume_expansion = 2;
  }
}

GetPluginCapabilities Errors

If the plugin is unable to complete the GetPluginCapabilities call successfully, it MUST return a non-ok gRPC code in the gRPC status.
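A CO can use the Service entries in a GetPluginCapabilitiesResponse to decide which optional services it will attempt to call. The sketch below assumes the response has already been decoded into plain dictionaries shaped like the YAML illustration in the Identity Service RPC section; the `advertised_services` helper and its mapping table are hypothetical.

```python
# Maps PluginCapability.Service.Type values to the gRPC service they gate.
SERVICE_FOR_CAPABILITY = {
    "CONTROLLER_SERVICE": "Controller",
    "GROUP_CONTROLLER_SERVICE": "GroupController",
    "SNAPSHOT_METADATA_SERVICE": "SnapshotMetadata",
}


def advertised_services(capabilities):
    """Return the set of optional services the plugin reports.

    `capabilities` is a list of dicts such as
    {"service": {"type": "CONTROLLER_SERVICE"}}; entries of other kinds
    (for example volume_expansion) are ignored here.
    """
    found = set()
    for capability in capabilities:
        service_type = capability.get("service", {}).get("type")
        if service_type in SERVICE_FOR_CAPABILITY:
            found.add(SERVICE_FOR_CAPABILITY[service_type])
    return found
```

A Node-only plugin as in Figure 4 would yield an empty set, so the CO would not attempt Controller RPCs against it.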

Probe

A Plugin MUST implement this RPC call. The primary utility of the Probe RPC is to verify that the plugin is in a healthy and ready state. If an unhealthy state is reported, via a non-success response, a CO MAY take action with the intent to bring the plugin to a healthy state. Such actions MAY include, but SHALL NOT be limited to, the following:

  • Restarting the plugin container, or
  • Notifying the plugin supervisor.

The Plugin MAY verify that it has the right configurations, devices, dependencies and drivers in order to run and return a success if the validation succeeds. The CO MAY invoke this RPC at any time. A CO MAY invoke this call multiple times with the understanding that a plugin's implementation might not be trivial and there MAY be overhead incurred by such repeated calls. The SP SHALL document guidance and known limitations regarding a particular Plugin's implementation of this RPC. For example, the SP MAY document the maximum frequency at which its Probe implementation SHOULD be called.

message ProbeRequest {
  // Intentionally empty.
}

message ProbeResponse {
  // Readiness allows a plugin to report its initialization status back
  // to the CO. Initialization for some plugins MAY be time consuming
  // and it is important for a CO to distinguish between the following
  // cases:
  //
  // 1) The plugin is in an unhealthy state and MAY need restarting. In
  //    this case a gRPC error code SHALL be returned.
  // 2) The plugin is still initializing, but is otherwise perfectly
  //    healthy. In this case a successful response SHALL be returned
  //    with a readiness value of `false`. Calls to the plugin's
  //    Controller and/or Node services MAY fail due to an incomplete
  //    initialization state.
  // 3) The plugin has finished initializing and is ready to service
  //    calls to its Controller and/or Node services. A successful
  //    response is returned with a readiness value of `true`.
  //
  // This field is OPTIONAL. If not present, the caller SHALL assume
  // that the plugin is in a ready state and is accepting calls to its
  // Controller and/or Node services (according to the plugin's reported
  // capabilities).
  .google.protobuf.BoolValue ready = 1;
}

Probe Errors

If the plugin is unable to complete the Probe call successfully, it MUST return a non-ok gRPC code in the gRPC status. If the conditions defined below are encountered, the plugin MUST return the specified gRPC error code. The CO MUST implement the specified error recovery behavior when it encounters the gRPC error code.

| Condition | gRPC Code | Description | Recovery Behavior |
|-----------|-----------|-------------|-------------------|
| Plugin not healthy | 9 FAILED_PRECONDITION | Indicates that the plugin is not in a healthy/ready state. | Caller SHOULD assume the plugin is not healthy and that future RPCs MAY fail because of this condition. |
| Missing required dependency | 9 FAILED_PRECONDITION | Indicates that the plugin is missing one or more required dependencies. | Caller MUST assume the plugin is not healthy. |
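The three cases distinguished in the ProbeResponse comments can be summarized as a small decision function. This is a sketch with hypothetical names: `grpc_code` is the integer status of the call (0 for OK) and `ready` is the decoded BoolValue, with `None` standing for an absent field.

```python
def plugin_state(grpc_code, ready=None):
    """Classify a Probe result as 'unhealthy', 'initializing', or 'ready'."""
    if grpc_code != 0:
        # Case 1: any gRPC error means the plugin is unhealthy.
        return "unhealthy"
    if ready is False:
        # Case 2: healthy but still initializing.
        return "initializing"
    # Case 3, or the field is absent: the caller assumes a ready state.
    return "ready"
```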

Controller Service RPC

CreateVolume

A Controller Plugin MUST implement this RPC call if it has the CREATE_DELETE_VOLUME controller capability. This RPC will be called by the CO to provision a new volume on behalf of a user (to be consumed as either a block device or a mounted filesystem).

This operation MUST be idempotent. If a volume corresponding to the specified volume name already exists, is accessible from accessibility_requirements, and is compatible with the specified capacity_range, volume_capabilities, parameters and mutable_parameters in the CreateVolumeRequest, the Plugin MUST reply 0 OK with the corresponding CreateVolumeResponse.
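The idempotency rule can be illustrated with an in-memory controller that keys volumes by name. The sketch below makes several assumptions: `InMemoryController` is a hypothetical class, compatibility is reduced to an exact match on capacity and parameters, and an incompatible request for an existing name is answered with gRPC code 6 (ALREADY_EXISTS), a choice made for this sketch because the paragraph above covers only the compatible case.

```python
import uuid

OK = 0
ALREADY_EXISTS = 6  # assumed response for an incompatible existing volume


class InMemoryController:
    """Toy stand-in for a Controller Plugin's CreateVolume handling."""

    def __init__(self):
        self._by_name = {}

    def create_volume(self, name, capacity_bytes, parameters=None):
        """Return (grpc_code, volume_id) for a create request."""
        requested = {"capacity_bytes": capacity_bytes,
                     "parameters": dict(parameters or {})}
        existing = self._by_name.get(name)
        if existing is not None:
            if existing["spec"] == requested:
                # Same name, compatible request: reply OK with the same volume.
                return OK, existing["volume_id"]
            return ALREADY_EXISTS, None
        volume_id = str(uuid.uuid4())
        self._by_name[name] = {"volume_id": volume_id, "spec": requested}
        return OK, volume_id
```

Repeating a create request after a timeout therefore yields the original volume rather than a duplicate.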

The parameters field SHALL contain opaque volume attributes to be specified at creation time. The mutable_parameters field SHALL contain opaque volume attributes that are defined at creation time but MAY also be changed during the lifetime of the volume via a subsequent ControllerModifyVolume RPC. Values specified in mutable_parameters MUST take precedence over the values from parameters.
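The precedence rule amounts to a map merge in which `mutable_parameters` overrides `parameters`. A minimal sketch follows; the function name and the example keys are hypothetical.

```python
def effective_parameters(parameters, mutable_parameters):
    """Merge the two maps; mutable_parameters wins on conflicting keys."""
    merged = dict(parameters)
    merged.update(mutable_parameters)
    return merged
```

For instance, with hypothetical keys, `effective_parameters({"tier": "standard", "iops": "100"}, {"iops": "500"})` keeps `tier` from `parameters` and takes `iops` from `mutable_parameters`.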

Plugins MAY create 3 types of volumes:

  • Empty volumes, when the plugin supports the OPTIONAL CREATE_DELETE_VOLUME capability.
  • From an existing snapshot, when the plugin supports the OPTIONAL CREATE_DELETE_VOLUME and CREATE_DELETE_SNAPSHOT capabilities.
  • From an existing volume, when the plugin supports cloning and reports the OPTIONAL CREATE_DELETE_VOLUME and CLONE_VOLUME capabilities.

If the CO requests a volume to be created from an existing snapshot or volume and the requested size of the volume is larger than the original snapshotted (or cloned) volume, the Plugin MUST either refuse the call with an OUT_OF_RANGE error or provide a volume that, when presented to a workload by a NodePublish call, both has the requested (larger) size and contains data from the snapshot (or original volume). Explicitly, it is the responsibility of the Plugin to resize the filesystem of the newly created volume at (or before) the NodePublish call, if the volume has VolumeCapability access type MountVolume and the filesystem resize is required in order to provision the requested capacity.