Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 63: Added support for ephemeral storage #215

Merged
merged 26 commits into from
Aug 14, 2020

Conversation

Prabhaker24
Copy link
Contributor

@Prabhaker24 Prabhaker24 commented Jul 8, 2020

Signed-off-by: prabhaker24 prabhaker.saxena@dell.com

Change log description

This pr is to enable zookeeper cluster to be able to use ephemeral storage, At any point, only one storage can be there either ephemeral or persistent and if none is specified zookeeper cluster will use persistent storage.

Purpose of the change

fixes #63

What the code does

This PR allows users to create zookeepers with ephemeral storage instead of PVs.

To enable ephemeral storage add this to the manifest,

spec:
    storageType: ephemeral   

How to verify it

  1. Create a ZK cluster with storageType as Ephemeral, in this case, Ephemeral storage should get created
  2. Create a ZK cluster with storageType as persistence, in this case, persistence storage should get created
  3. Create a ZK cluster with no storageType, in this case, Persistence storage should get created.
  4. Try upgrading the operator keeping the ZK cluster same operator should be able to reflect the correct persistence configuration post-upgrade.
    Note:- In case of ephemeral storage, the cluster may not be able to come back up if more than quorum number of nodes are restarted simultaneously.
    I have verified all the above 4 scenarios

prabhaker24 added 4 commits July 8, 2020 16:58
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
@codecov-commenter
Copy link

codecov-commenter commented Jul 8, 2020

Codecov Report

Merging #215 into master will increase coverage by 0.23%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #215      +/-   ##
==========================================
+ Coverage   83.37%   83.60%   +0.23%     
==========================================
  Files          11       11              
  Lines        1203     1220      +17     
==========================================
+ Hits         1003     1020      +17     
  Misses        137      137              
  Partials       63       63              
Impacted Files Coverage Δ
...g/apis/zookeeper/v1beta1/zookeepercluster_types.go 98.30% <100.00%> (+0.08%) ⬆️
...kg/apis/zookeeper/v1beta1/zz_generated.deepcopy.go 100.00% <100.00%> (ø)
...er/zookeepercluster/zookeepercluster_controller.go 60.73% <100.00%> (ø)
pkg/zk/generators.go 98.31% <100.00%> (+0.07%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 410c7e8...1db7f35. Read the comment docs.

prabhaker24 added 2 commits July 9, 2020 16:03
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
@Prabhaker24 Prabhaker24 requested a review from anishakj July 9, 2020 10:41
@Prabhaker24 Prabhaker24 marked this pull request as ready for review July 9, 2020 10:41
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
@@ -43,10 +43,18 @@ spec:
syncLimit: {{ .Values.config.syncLimit }}
quorumListenOnAllIPs: {{ .Values.config.quorumListenOnAllIPs }}
{{- end }}
{{- if .Values.ehemeral.enabled }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be ephemeral
Also this check should be removed since the value of ephemeral.enabled needs to be set in all cases

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

{{- if .Values.ehemeral.enabled }}
ephemeral:
enabled: {{ .Values.ephemeral.enabled }}
emptydirvolumesource:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the check before this.
add the following if block before setting value for emptydirvolumesource

ephemeral:
   enabled: {{ .Values.ephemeral.enabled }}
   {{- if .Values.ephemeral.emptydirvolumesource }}
   emptydirvolumesource:
      medium: {{ .Values.ephemeral.emptydirvolumesource.medium }}
      sizeLimit: {{ .Values.ephemeral.emptydirvolumesource.sizeLimit }}
   {{- end }}

Also the check in front of persistence can be changed to be created only when ephemeral.enabled is set to false

@@ -26,6 +26,12 @@ config: {}
# syncLimit: 2
# quorumListenOnAllIPs: false

ephemeral:
enabled: false
#emptydirvolumesource:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't comment emptydirvolumesource
Instead set it to

   emptydirvolumesource: {}
      #medium: ""/Memory
      #sizeLimit: 15Gi

If user wishes to provide a value he can remove {} and uncomment the options

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -95,7 +89,7 @@ spec:
fieldPath: metadata.name
- name: OPERATOR_NAME
value: zookeeper-operator
image: pravega/zookeeper-operator:latest
image: testzkop/zookeeper-operator-testimages:v0.2.8-rc0-4-dirty
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be using test images here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed it.

@@ -67,6 +67,10 @@ type ZookeeperClusterSpec struct {
// PersistentVolumeClaimSpec and VolumeReclaimPolicy can be specified in here.
Persistence *Persistence `json:"persistence,omitempty"`

//Ephemeral is the configuration which helps create ephemeral storage
//At anypoint only one of Persistence or Ephemeral should be present in the manifest
Ephemeral *Ephemeral `json:"ephemeral,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this under storage, we can provide multiple storage options under strorage like ephemeral and persistent storage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the structure and added storage struct and moved persistence and ephemeral in it

prabhaker24 added 6 commits July 15, 2020 20:14
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
…into Issue-63-enabling-ephemeral-storage
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
sizeLimit: {{ .Values.ephemeral.emptydirvolumesource.sizeLimit }}
{{- end }}
{{- end }}
{{- if eq .Values.storageType.type "persistence" }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be part of {{- else }} and not as part of a separate if block. Only then would this be treated as the default even when no value is provided for storageType.type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -25,13 +25,24 @@ config: {}
# tickTime: 2000
# syncLimit: 2
# quorumListenOnAllIPs: false
storageType:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename this strorage and make persistence and ephemeral as part of this block

storage:
   type: persistence
   persistence:
      storageClassName: standard
      reclaimPolicy: Delete
      volumeSize: 20Gi
    ephemeral:
       emptydirvolumesource: {}
          medium: Memory
          sizeLimit: 15Gi  

volumeSize: 20Gi

ephemeral:
enabled: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove the enabled field from here. Whether ephemeral will be enabled/not will be determined by the value of storage.type field

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

prabhaker24 added 2 commits July 17, 2020 09:35
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Copy link
Contributor

@HoustonPutman HoustonPutman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I love the new structure of the storage section! Thanks so much for taking care of this @Prabhaker24

reclaimPolicy: Delete
volumeSize: 20Gi

emptydirvolumesource: {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be ephemeral.emptydirvolumesource?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you are correct I have changed it.

| `persistence.reclaimPolicy` | Reclaim policy for persistent volumes | `Delete` |
| `persistence.storageClassName` | Storage class for persistent volumes | `standard` |
| `persistence.volumeSize` | Size of the volume requested for persistent volumes | `20Gi` |
| `ephemeral.enabled` | whether ephemeral storage is enabled or not | `false` |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is no longer needed, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed it

Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
@Prabhaker24
Copy link
Contributor Author

Overall I love the new structure of the storage section! Thanks so much for taking care of this @Prabhaker24

Thanks @HoustonPutman

Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
| `persistence.reclaimPolicy` | Reclaim policy for persistent volumes | `Delete` |
| `persistence.storageClassName` | Storage class for persistent volumes | `standard` |
| `persistence.volumeSize` | Size of the volume requested for persistent volumes | `20Gi` |
| `ephemeral.emptydirvolumesource.medium` | What type of storage medium should back the directory. | `""` |
| `ephemeral.emptydirvolumesource.sizeLimit` | Total amount of local storage required for the EmptyDir volume. | `nil` |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nil is not a valid default value in helm. Please leave it empty or give an empty string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 35 to 42
persistence:
storageClassName: standard
## specifying reclaim policy for PersistentVolumes
## accepted values - Delete / Retain
reclaimPolicy: Delete
volumeSize: 20Gi
storageClassName: standard
# specifying reclaim policy for PersistentVolumes
# accepted values - Delete / Retain
reclaimPolicy: Delete
volumeSize: 20Gi

ephemeral:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add one more level of indentation so that the blocks persistence and ephemeral come within the storage block

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


ephemeral:
emptydirvolumesource: {}
#medium: ""/Memory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of mentioning Memory as an accepted value for medium as /Memory, please add a comment in the previous line which lists the accepted values, and keep just the default value here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -62,6 +62,7 @@ func testZookeeperCluster(t *testing.T) {
"testUpgradeCluster": testUpgradeCluster,
"testCreateRecreateCluster": testCreateRecreateCluster,
"testScaleCluster": testScaleCluster,
"testephemeralstorage": testephemeralstorage,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change to testEphemeralStorage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Copy link
Contributor

@SrishT SrishT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@anishakj anishakj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please change the file zookeeper_v1beta1_zookeepercluster_cr.yaml to reflect spec changes

README.md Outdated
spec:
replicas: 3
storage:
storagetype: ephemeral
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to storageType

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

prabhaker24 added 2 commits July 24, 2020 15:27
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
@Prabhaker24
Copy link
Contributor Author

Could you please change the file zookeeper_v1beta1_zookeepercluster_cr.yaml to reflect spec changes

Thanks for reminding this, I have additionally changed the tag to 0.2.8 as well.

README.md Show resolved Hide resolved
@@ -610,7 +610,7 @@ func (r *ReconcileZookeeperCluster) yamlConfigMap(instance *zookeeperv1beta1.Zoo
}

func (r *ReconcileZookeeperCluster) reconcileFinalizers(instance *zookeeperv1beta1.ZookeeperCluster) (err error) {
if instance.Spec.Persistence.VolumeReclaimPolicy != zookeeperv1beta1.VolumeReclaimPolicyDelete {
if instance.Spec.Storage.Persistence != nil && instance.Spec.Storage.Persistence.VolumeReclaimPolicy != zookeeperv1beta1.VolumeReclaimPolicyDelete {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the check for " instance.Spec.Storage.Persistence != nil"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in case if the storage is of type Ephemeral, "instance.Spec.Storage.Persistence" will be nil and in that case, this statement will panic

pkg/zk/generators.go Outdated Show resolved Hide resolved
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
@Prabhaker24 Prabhaker24 changed the title added support for ephemeral storage Issue 63: Added support for ephemeral storage Jul 28, 2020
prabhaker24 added 2 commits August 3, 2020 10:46
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Copy link
Contributor

@anishakj anishakj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

README.md Outdated
NAME REPLICAS READY REPLICAS VERSION DESIRED VERSION INTERNAL ENDPOINT EXTERNAL ENDPOINT AGE
example 3 3 0.2.7 0.2.7 10.100.200.18:2181 N/A 94s
```
>Note: User should only provide either persistence or ephemeral in the spec, if none of the values is specified default is persistence
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be "persistent" and not "persistence" storage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have rephrased this statement.

README.md Outdated
```
>Note: User should only provide either persistence or ephemeral in the spec, if none of the values is specified default is persistence

>Note: We don't guarantee Data Recovery in case of Ephemeral Storage, so the users should keep in mind that in the case of zookeeper pod restarts user might lose the data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not about data recovery but about cluster recovery. In case of ephemeral storage, the cluster may not be able to come back up if >quorum number of nodes are restarted simultaneously and that is exactly what we should state here.
Did we test if restarts on < quorum number of nodes works with ephemeral storage. In theory it should work, but we should test it and clarify accordingly in the doc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, @pbelgundi you are right I have changed the documentation and I have tested that in case we are restarting less than quorum number of nodes with ephemeral storage we are able to get ensemble.

persistence:
reclaimPolicy: Delete
reclaimPolicy: Retain
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we changing this? This has implications for Pravega and we need it to be Delete for Pravega to work on reinstall without reinstalling Zookeeper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have created two sample cr for both pravega and ecs and mentioned in the cr folder

s.Ephemeral.EmptyDirVolumeSource = v1.EmptyDirVolumeSource{}
changed = true
}
if !strings.EqualFold(s.StorageType, "ephemeral") && s.Persistence != nil && s.Persistence.withDefaults() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please replace this "if" with "else" for the earlier "if"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have restructured this part of the code for readability.

prabhaker24 added 2 commits August 13, 2020 23:48
Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
…into Issue-63-enabling-ephemeral-storage
```
>Note: User should only provide value for either the field persistence or ephemeral in the spec if none of the values is specified default is persistence

>Note: In case of ephemeral storage, the cluster may not be able to come back up if more than quorum number of nodes are restarted simultaneously.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also mention data loss on node restart.

Copy link
Contributor

@pbelgundi pbelgundi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with just 1 minor comment.

Signed-off-by: prabhaker24 <prabhaker.saxena@dell.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow Zookeepers to be run with ephemeral storage.
6 participants