BucketClass CRD

The BucketClass CRD represents a structure that defines bucket policies relating to data placement, namespace properties, replication policies and more.

Note that placement-bucketclass and namespace-bucketclass both use the same CR, and the difference lies inside the bucket class' spec section, more specifically the presence of either the placementPolicy or namespacePolicy key.

Definitions

Placement Policy

A placement bucket class defines a policy for standard buckets - i.e. NooBaa buckets that are backed by backingstores. The data placement capabilities are built as a multi-layer structure. The layers, from the bottom up, are:

  • Spread Layer - list of backing-stores, aggregates the storage of multiple stores.
  • Mirroring Layer - list of spread-layers, async-mirroring to all mirrors, with locality optimization (data is allocated in the region closest to the source endpoint). Mirroring requires at least two backing-stores.
  • Tiering Layer - list of mirroring-layers, push cold data to next tier.
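
The layering can be pictured as nested data. The following is a minimal sketch, assuming the manifest field names used in the Examples section (tiers, backingStores, placement); the helpers make_tier and make_placement_policy are hypothetical and are not part of any NooBaa API.

```python
import json

def make_tier(backing_stores, placement="Spread"):
    """Build one tier entry (hypothetical helper, not a NooBaa API)."""
    if placement not in ("Spread", "Mirror"):
        raise ValueError("placement must be Spread or Mirror")
    if placement == "Mirror" and len(backing_stores) < 2:
        # The mirroring layer needs at least two backing stores.
        raise ValueError("mirroring requires at least two backing stores")
    return {"backingStores": list(backing_stores), "placement": placement}

def make_placement_policy(tiers):
    """Wrap an ordered list of tiers; cold data moves from a tier to the next."""
    return {"placementPolicy": {"tiers": list(tiers)}}

# Two tiers: a spread over bs1/bs2, then a mirror over bs3/bs4.
spec = make_placement_policy([
    make_tier(["bs1", "bs2"], "Spread"),
    make_tier(["bs3", "bs4"], "Mirror"),
])
print(json.dumps(spec, indent=2))
```

The printed structure has the same shape as the spec section of the two-tier manifests shown in the Examples section.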

Namespace Policy

A namespace bucket class defines a policy for namespace buckets - i.e. NooBaa buckets that are backed by namespacestores. There are several types of namespace policies:

  • Single - a single namespace store is used for both read and write operations on the target bucket
  • Multi - a single namespace store is used for write operations, and a list of namespace stores can be used for read operations
  • Cache - functions similarly to Single, except with an additional TTL key, which dictates the time-to-live of the cached data
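
To make the difference between the three types concrete, here is a small sketch that derives the write target and the read targets from a namespacePolicy section. It assumes the field names that appear in the manifests of the Examples section (single.resource, multi.writeResource, multi.readResources, cache.hubResource); the function resolve_resources is hypothetical and only illustrates the semantics described above.

```python
def resolve_resources(namespace_policy):
    """Return (write_resource, read_resources) for a namespacePolicy dict.

    Hypothetical helper; it mirrors the documented semantics, not NooBaa code.
    """
    ns_type = namespace_policy["type"]
    if ns_type == "Single":
        # One namespace store serves both reads and writes.
        resource = namespace_policy["single"]["resource"]
        return resource, [resource]
    if ns_type == "Multi":
        # One store for writes, a list of stores for reads.
        multi = namespace_policy["multi"]
        return multi["writeResource"], list(multi["readResources"])
    if ns_type == "Cache":
        # Behaves like Single towards the hub; caching is handled separately.
        hub = namespace_policy["cache"]["hubResource"]
        return hub, [hub]
    raise ValueError(f"unknown namespace policy type: {ns_type!r}")

write_to, read_from = resolve_resources({
    "type": "Multi",
    "multi": {"writeResource": "aws-s3-ns",
              "readResources": ["aws-s3-ns", "azure-blob-ns"]},
})
```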

Time-to-live (TTL)

Cache bucketclasses work by saving read objects in a chosen backingstore, which leads to faster access times in the future. In order to make sure that the cached object is not out of sync with the one in the remote target, an ETag comparison might be run upon read, depending on the TTL that the user chooses. The TTL can fall in one of three categories:

  • Negative (e.g. -1) - when the user knows there are no out of band writes, they can use a negative TTL, which means no revalidations are done; if the object is in the cache - it is returned without an ETag comparison. This is the most performant option.
  • Zero (0) - the cache will always compare the object's ETag before returning it. This option has a performance cost of getting the ETag from the remote target on each object read. This is the least performant option.
  • Positive (denoted in milliseconds, e.g. 3600000 equals one hour) - once an object has been read and saved in the cache, the chosen amount of time has to pass before the object's ETag is compared again.
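
The three categories boil down to a small decision rule. The sketch below is illustrative only and is not NooBaa's implementation: the function needs_revalidation and its parameters are hypothetical, and treating "exactly TTL milliseconds elapsed" as expired is an assumption.

```python
def needs_revalidation(ttl_ms, cached_at_ms, now_ms):
    """Decide whether a cached object's ETag should be compared on read.

    Hypothetical helper mirroring the three TTL categories described above.
    """
    if ttl_ms < 0:
        # Negative TTL: never revalidate, serve straight from the cache.
        return False
    if ttl_ms == 0:
        # Zero TTL: compare the ETag on every read.
        return True
    # Positive TTL: revalidate only once the TTL has elapsed since the
    # object was cached (boundary handling is an assumption).
    return now_ms - cached_at_ms >= ttl_ms

ONE_HOUR_MS = 3600000
print(needs_revalidation(ONE_HOUR_MS, cached_at_ms=0, now_ms=1800000))
```

With a one-hour TTL, a read half an hour after caching is served without an ETag comparison.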

Replication Policy

A bucketclass-wide replication policy can be set; it is inherited and used by all future buckets created under that bucketclass.

A replication policy is a JSON-compliant string that defines an array of rules (examples are provided at the end of this section). When a rule contains a filter with a prefix, only objects whose keys match the prefix are replicated.

  • Each rule is an object that contains the following keys:
    • rule_id - which identifies the rule
    • destination_bucket - which dictates the target NooBaa bucket that the objects will be copied to
    • (optional) {"filter": {"prefix": <>}} - if the user wishes to filter the objects that are replicated, the value of this field can be set to a prefix string
    • (optional, log-based optimization, see below) sync_deletions - can be set to a boolean value to indicate whether deletions should be replicated
    • (optional, log-based optimization, see below) sync_versions - can be set to a boolean value to indicate whether object versions should be replicated
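
A short sketch of how such rules could be checked and applied. It assumes the policy is wrapped in a "rules" key, as in the examples in this section, and that "matching" a prefix means the object key starts with it; parse_rules and rule_applies are hypothetical helpers, not NooBaa functions.

```python
import json

REQUIRED_KEYS = {"rule_id", "destination_bucket"}
OPTIONAL_KEYS = {"filter", "sync_deletions", "sync_versions"}

def parse_rules(policy_json):
    """Parse a policy string and check each rule's keys (hypothetical helper)."""
    rules = json.loads(policy_json)["rules"]
    for rule in rules:
        missing = REQUIRED_KEYS - set(rule)
        if missing:
            raise ValueError(f"rule is missing keys: {sorted(missing)}")
        unknown = set(rule) - REQUIRED_KEYS - OPTIONAL_KEYS
        if unknown:
            raise ValueError(f"rule has unknown keys: {sorted(unknown)}")
    return rules

def rule_applies(rule, object_key):
    """A rule without a filter applies to every object key."""
    prefix = rule.get("filter", {}).get("prefix", "")
    return object_key.startswith(prefix)

rules = parse_rules(
    '{"rules":[{"rule_id":"rule-1", "destination_bucket":"first.bucket",'
    ' "filter": {"prefix": "a."}}]}'
)
print([rule_applies(rules[0], key) for key in ("a.txt", "b.txt")])
```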

In addition, when the bucketclass is backed by namespacestores, each policy can be set to optimize replication by utilizing logs (configured and supplied by the user, currently only supports AWS S3 and Azure Blob):

  • (optional) log_replication_info - an object that contains data related to log-based replication optimization -
    • (optional on AWS) endpoint_type - this field can be set to an appropriate endpoint type (currently, only AZURE is supported)
    • (necessary on AWS) {"logs_location": {"logs_bucket": <>}} - this field should be set to the location of the AWS S3 server access logs

An example of an AWS replication policy with log optimization:

'{"rules":[{"rule_id":"aws-rule-1", "destination_bucket":"first.bucket", "filter": {"prefix": "a."}}], "log_replication_info": {"logs_location": {"logs_bucket": "logsarehere"}}}'

An example of an Azure replication policy with log optimization:

'{"rules":[{"rule_id":"azure-rule-1", "sync_deletions": true, "sync_versions": false, "destination_bucket":"first.bucket"}], "log_replication_info": {"endpoint_type": "AZURE"}}'

These policies can also be saved as files and passed to the NooBaa CLI. In that case, please note it's necessary to omit the outer single quotes.
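
One way to produce such a file is to serialize the policy from code, which guarantees valid JSON and no surrounding quotes. This is a minimal sketch; the file name replication-policy.json and the helper write_policy_file are arbitrary choices made for illustration.

```python
import json
import os
import tempfile

# Same content as the AWS example above, expressed as a Python dict.
policy = {
    "rules": [
        {
            "rule_id": "aws-rule-1",
            "destination_bucket": "first.bucket",
            "filter": {"prefix": "a."},
        }
    ],
    "log_replication_info": {"logs_location": {"logs_bucket": "logsarehere"}},
}

def write_policy_file(policy, path):
    """Write the policy as plain JSON (no outer single quotes)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(policy, f, indent=2)

policy_path = os.path.join(tempfile.mkdtemp(), "replication-policy.json")
write_policy_file(policy, policy_path)
print(policy_path)
```

The resulting path can then be passed to the CLI through the --replication-policy flag shown in the Examples section.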

Constraints:

  • A backing store name may appear in more than one bucket class but may not appear more than once in a certain bucket class.
  • The operator CLI currently only supports a single tier placement policy for a bucket class.
  • Thus, YAML must be used to create a bucket class with a placement policy that has multiple tiers.
  • Before creating standard buckets, the user must first create a placement bucketclass that contains a placement policy.
  • Before creating namespace buckets, the user must first create a namespace bucketclass that contains a namespace policy.
  • A namespace bucket class of type cache must contain both a placement and a namespace policy.
  • A namespace bucket class of type single/multi must contain a namespace policy.
  • Placement policy is case sensitive and should be of value Mirror or Spread when more than one backingstore is provided.
  • Namespace policy is case sensitive and should be of values Single, Multi or Cache.
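
Several of these constraints can be checked mechanically before a manifest is applied. The sketch below is a hypothetical pre-flight check written for illustration, not the operator's validation logic; it works on the spec section of a bucket class after it has been loaded into a Python dict.

```python
def validate_bucketclass_spec(spec):
    """Return a list of constraint violations (empty when none are found)."""
    errors = []
    placement = spec.get("placementPolicy")
    namespace = spec.get("namespacePolicy")

    if placement is None and namespace is None:
        errors.append("spec needs a placementPolicy or a namespacePolicy")

    if placement is not None:
        seen = set()
        for tier in placement.get("tiers", []):
            stores = tier.get("backingStores", [])
            for store in stores:
                # A backing store may appear only once per bucket class.
                if store in seen:
                    errors.append(f"backing store {store!r} appears more than once")
                seen.add(store)
            # Case-sensitive check, required when a tier has several stores.
            if len(stores) > 1 and tier.get("placement") not in ("Mirror", "Spread"):
                errors.append("placement must be Mirror or Spread")

    if namespace is not None:
        ns_type = namespace.get("type")
        if ns_type not in ("Single", "Multi", "Cache"):
            errors.append("namespace policy type must be Single, Multi or Cache")
        elif ns_type == "Cache" and placement is None:
            errors.append("a Cache bucket class needs a placement policy too")

    return errors

print(validate_bucketclass_spec({
    "placementPolicy": {"tiers": [
        {"backingStores": ["bs1", "bs2"], "placement": "mirror"},
    ]},
}))
```

The example reports one violation, because the lowercase value mirror fails the case-sensitive placement check.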

Reconciliation

  • The operator will verify that the bucket class is valid - i.e. that the backingstores and namespacestores exist and can be accessed and used.
  • Changes to a bucket class spec will be propagated to buckets that were instantiated from it.
  • Beyond that, the bucket class is passive: it simply remains available for new buckets to use.

Resource Status

It is possible to check a resource's status in several ways, including:

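  • kubectl get bucketclass -A -o yaml (will retrieve bucketclasses from all cluster namespaces; to retrieve a single bucket class, use -n <NAMESPACE> <NAME> instead of -A, because kubectl cannot look up a resource by name across all namespaces)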
  • kubectl describe bucketclass <NAME>
  • noobaa bucketclass status <NAME>

Below is an example of a healthy bucket class' status, as retrieved with the first command:

apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: noobaa-default-class
  namespace: app-namespace
spec:
  ...
status:
  conditions:
  - lastHeartbeatTime: "2019-11-05T13:50:50Z"
    lastTransitionTime: "2019-11-07T07:03:58Z"
    message: noobaa operator completed reconcile - bucket class is ready
    reason: BucketClassPhaseReady
    status: "True"
    type: Available
  - lastHeartbeatTime: "2019-11-05T13:50:50Z"
    lastTransitionTime: "2019-11-07T07:03:58Z"
    message: noobaa operator completed reconcile - bucket class is ready
    reason: BucketClassPhaseReady
    status: "False"
    type: Progressing
  - lastHeartbeatTime: "2019-11-05T13:50:50Z"
    lastTransitionTime: "2019-11-05T13:50:50Z"
    message: noobaa operator completed reconcile - bucket class is ready
    reason: BucketClassPhaseReady
    status: "False"
    type: Degraded
  - lastHeartbeatTime: "2019-11-05T13:50:50Z"
    lastTransitionTime: "2019-11-07T07:03:58Z"
    message: noobaa operator completed reconcile - bucket class is ready
    reason: BucketClassPhaseReady
    status: "True"
    type: Upgradeable
  phase: Ready
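
When the status is consumed by a script, the conditions can be reduced to a single health verdict. The sketch below assumes that "healthy" means exactly what the example above shows (phase Ready, Available True, Progressing and Degraded False); that definition and the function is_healthy are assumptions made for illustration, not an official rule.

```python
def is_healthy(status):
    """Check a bucket class status dict against the healthy example above."""
    # Index the condition list by type for easy lookup.
    conditions = {c["type"]: c["status"] for c in status.get("conditions", [])}
    return (
        status.get("phase") == "Ready"
        and conditions.get("Available") == "True"
        and conditions.get("Progressing") == "False"
        and conditions.get("Degraded") == "False"
    )

# Status section of the example above, reduced to the fields that are checked.
example_status = {
    "phase": "Ready",
    "conditions": [
        {"type": "Available", "status": "True"},
        {"type": "Progressing", "status": "False"},
        {"type": "Degraded", "status": "False"},
        {"type": "Upgradeable", "status": "True"},
    ],
}
print(is_healthy(example_status))
```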

Examples

Please note that the CLI (noobaa) examples need NooBaa to run under app-namespace, even though bucketclasses are supported in all namespaces.

Single tier, single backing store, Spread placement:

noobaa -n app-namespace bucketclass create placement-bucketclass bc --backingstores bs --placement Spread

apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - bs
      placement: Spread

Single tier, two backing stores, Spread placement:

noobaa -n app-namespace bucketclass create placement-bucketclass bc --backingstores bs1,bs2 --placement Spread

apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - bs1
      - bs2
      placement: Spread

Single tier, two backing stores, Mirror placement:

noobaa -n app-namespace bucketclass create placement-bucketclass bc --backingstores bs1,bs2 --placement Mirror

apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - bs1
      - bs2
      placement: Mirror

Two tiers (only achievable by applying a YAML at the moment) - single backing stores per tier, Spread placement in tiers:

apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - bs1
      placement: Spread
    - backingStores:
      - bs2
      placement: Spread

Two tiers (only achievable by applying a YAML at the moment) - two backing stores per tier, Spread placement in first tier and Mirror in second tier:

apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - bs1
      - bs2
      placement: Spread
    - backingStores:
      - bs3
      - bs4
      placement: Mirror

Namespace bucketclass, a single read and write resource in Azure:

noobaa -n app-namespace bucketclass create namespace-bucketclass single bc --resource azure-blob-ns

apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  namespacePolicy:
    type: Single
    single: 
      resource: azure-blob-ns

Namespace bucketclass, a single write resource in AWS, multiple read resources in AWS and Azure:

noobaa -n app-namespace bucketclass create namespace-bucketclass multi bc --write-resource aws-s3-ns --read-resources aws-s3-ns,azure-blob-ns

apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  namespacePolicy:
    type: Multi
    multi:
      writeResource: aws-s3-ns 
      readResources:
      - aws-s3-ns
      - azure-blob-ns

Namespace bucketclass, cache stored in noobaa-default-backing-store, objects are read from and written to IBM COS:

noobaa -n app-namespace bucketclass create namespace-bucketclass cache bc --hub-resource ibm-cos-ns --ttl 36000 --backingstores noobaa-default-backing-store

apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  namespacePolicy:
    type: Cache
    cache:
      caching:
        ttl: 36000
      hubResource: ibm-cos-ns 
  placementPolicy:
    tiers:
    - backingStores:
      - noobaa-default-backing-store

Namespace bucketclass with replication to first.bucket:

/path/to/json-file.json is the path to a JSON file which defines the replication policy

noobaa -n app-namespace bucketclass create namespace-bucketclass single bc --resource azure-blob-ns --replication-policy=/path/to/json-file.json

apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  name: bc
  namespace: app-namespace
spec:
  namespacePolicy:
    type: Single
    single: 
      resource: azure-blob-ns
  replicationPolicy: '{"rules":[{"rule_id":"rule-1", "destination_bucket":"first.bucket", "filter": {"prefix": "ba"}}]}'

Bucket class in a namespace other than the NooBaa system namespace. <TARGET-NOOBAA-SYSTEM-NAMESPACE> is the namespace where the NooBaa system is deployed:

apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  labels:
    noobaa-operator: <TARGET-NOOBAA-SYSTEM-NAMESPACE>
    app: noobaa
  name: bc
  namespace: app-namespace
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - noobaa-test-backing-store