[BACKPORT][v1.5.3][BUG] Failing to mount encrypted volumes v1.5.2 #7048

github-actions · 2023-11-06T06:15:21Z

backport #7045

longhorn-io-github-bot · 2023-11-06T10:47:59Z

Pre Ready-For-Testing Checklist

Where are the reproduce steps/test steps documented?
The reproduce steps/test steps are at:

#7045 (comment)

Is there a workaround for the issue? If so, where is it documented?
The workaround is at:
Does the PR include the explanation for the fix or the feature?
Has the backend code been merged (Manager, Engine, Instance Manager, BackupStore etc) (including backport-needed/*)?
The PR is at

longhorn/longhorn-manager#2282

Which areas/issues might this PR have potential impacts on?
Area: encrypted rwx volume
Issues

roger-ryao · 2023-11-13T03:54:11Z

Verified on v1.5.x-head 20231113

longhorn v1.5.x-head a5807b8
longhorn-manager v1.5.x-head longhorn/longhorn-manager@b53422e

The test steps

#7045 (comment)

Add the storage class:

kubectl apply -f https://raw.githubusercontent.com/clemenko/k8s_yaml/master/longhorn_encryption.yml

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-crypto-per-volume
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "80" # in minutes (2880 would be 48 hours)
  fromBackup: ""
  encrypted: "true"
  # per volume secret which utilizes the `pvc.name` and `pvc.namespace` template parameters
  csi.storage.k8s.io/provisioner-secret-name: ${pvc.name}
  csi.storage.k8s.io/provisioner-secret-namespace: ${pvc.namespace}
  csi.storage.k8s.io/node-publish-secret-name: ${pvc.name}
  csi.storage.k8s.io/node-publish-secret-namespace: ${pvc.namespace}
  csi.storage.k8s.io/node-stage-secret-name: ${pvc.name}
  csi.storage.k8s.io/node-stage-secret-namespace: ${pvc.namespace}

Deploy an app that requires encryption:

kubectl apply -f https://raw.githubusercontent.com/clemenko/fleet/main/flask/flask.yaml

# clemenko
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask
  labels:
    app: flask
spec:
  replicas: 8
  strategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate
  selector:
    matchLabels:
      app: flask
  template:
    metadata:
      labels:
        app: flask
    spec:
      containers:
      - name: flask
        securityContext:
          allowPrivilegeEscalation: false
        image: clemenko/flask_simple
        #command: [ "/bin/sh", "-c", "sleep 3003003240204242" ]
        ports:
        - containerPort: 5000
        imagePullPolicy: Always
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  labels:
    app: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis
        args: ["--appendonly", "yes"]
        securityContext:
          allowPrivilegeEscalation: false
          seLinuxOptions:
            level: "s0:c123,c456"
        ports:
        - containerPort: 6379
        volumeMounts:
        - name: redis-data
          mountPath: /data
          subPath: 
      volumes:
      - name: redis-data
        persistentVolumeClaim:
          claimName: redis
---

apiVersion: v1
kind: Secret
metadata:
  name: redis
stringData:
  CRYPTO_KEY_VALUE: "flaskisthebestdemoapplication"
  CRYPTO_KEY_PROVIDER: "secret"

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: redis
  labels:
    app: redis
spec:
  storageClassName: "longhorn-crypto-per-volume"
  accessModes: 
    - ReadWriteMany
  resources:
    requests:
      storage: 500Mi
---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: flask
    kubernetes.io/name: "flask"
  name: flask
spec:
  selector:
    app: flask
  ports:
  - name: flask
    protocol: TCP
    port: 5000
    targetPort: 5000
---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: redis
    kubernetes.io/name: "redis"
  name: redis
spec:
  selector:
    app: redis
  ports:
  - name: redis
    protocol: TCP
    port: 6379
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: flask
spec:
  rules:
  - host: flask.rfed.io
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: flask
            port:
              number: 5000
# ---
#apiVersion: ui.cattle.io/v1
#kind: NavLink
#metadata:
#  name: flask
#spec:
#  label: Flask
#  target: _blank
#  toService:
#    name: flask
#    namespace: flask
#    port: '5000'
#    scheme: http

Result: Passed

The deployments and services are running successfully.
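For anyone repeating this check, the "running successfully" result could be confirmed with something like the sketch below. It is not the exact procedure used in this verification: the helper name `check_demo_ready`, the `default` namespace, and the 120s timeout are assumptions. The deployment and PVC names come from the manifest above.

```shell
# Hypothetical helper (not part of Longhorn or the original test steps).
# Succeeds only if both demo deployments finish rolling out and the
# encrypted PVC "redis" reports the Bound phase.
check_demo_ready() {
  ns="${1:-default}"   # assumed namespace; pass another one as $1 if needed

  # Wait for each deployment from the manifest to become available.
  kubectl -n "$ns" rollout status deployment/flask --timeout=120s || return 1
  kubectl -n "$ns" rollout status deployment/redis --timeout=120s || return 1

  # The PVC must be bound to a Longhorn volume before redis can mount it.
  phase="$(kubectl -n "$ns" get pvc redis -o jsonpath='{.status.phase}')"
  [ "$phase" = "Bound" ]
}
```

Usage: `check_demo_ready default && echo "demo is up"`.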

khushboo-rancher · 2023-11-13T20:04:21Z

@roger-ryao Could you please test a few more cases?

Expansion of an encrypted volume.
Attaching/detaching of an encrypted volume, e.g. killing the IM pod or rebooting the attached node.
Restoring encrypted volumes and crashing the IM pod while the restore is in progress.

Note: Leverage the automation scripts with encrypted volumes if possible.

roger-ryao · 2023-11-14T08:32:45Z

> @roger-ryao Could you please test a few more cases?
>
> Expansion of an encrypted volume.
>
> Attaching/detaching of an encrypted volume, e.g. killing the IM pod or rebooting the attached node.
>
> Restoring encrypted volumes and crashing the IM pod while the restore is in progress.
>
> Note: Leverage the automation scripts with encrypted volumes if possible.

Reopening the ticket to test a few more cases.

derekbit · 2023-11-14T08:48:17Z

@roger-ryao
Can you create e2e tests for your manual tests? We can implement them later. Thank you.

roger-ryao · 2023-11-14T09:12:16Z

Verified on v1.5.3-rc1 20231114

longhorn v1.5.3-rc1 464f60c
longhorn-manager v1.5.3-rc1 longhorn/longhorn-manager@0a679f6

Result: Passed

1. Expansion of encrypted volume.
2. Attaching/detaching of encrypted volume, e.g. rebooting the attached node 5 times.
3. Restoring of encrypted volumes and crashing the IM pod when the restore is in progress.
   a. Restoring encrypted volumes and deleting all instance manager pods while the restore is in progress: in v1.5.x, after deleting all instance manager pods with the following command during a restore, the restored volume's replicas enter a "Failed" state, with robustness marked as "faulted" and the state as "Detached".
      kubectl -n longhorn-system delete pods -l longhorn.io/component=instance-manager --wait
   b. Restoring encrypted volumes and deleting one instance manager pod: one of the restored volume's replicas transitions to a "Failed" state, with robustness marked as "Unknown" and the state as "Detached". Despite this, pods can still mount the restored volume. The failed replica is deleted automatically and a new replica is created; the data remains consistent.
      kubectl get pod -n longhorn-system -l longhorn.io/component -o wide | grep instance-manager | grep -E 'w1' | awk '{print $1}' | xargs kubectl -n longhorn-system delete pod
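The state and robustness values mentioned above can also be read from the Longhorn volume resource instead of the UI. The following is a minimal sketch, assuming the volume is named after the PV bound to the PVC. The helper names `pvc_volume_name` and `volume_status` and the `default` namespace are hypothetical and not part of the original test steps.

```shell
# Hypothetical helper: print the PV (and Longhorn volume) name bound to a PVC.
# $1 = PVC name, $2 = namespace (assumed to be "default" when omitted).
pvc_volume_name() {
  kubectl -n "${2:-default}" get pvc "$1" -o jsonpath='{.spec.volumeName}'
}

# Hypothetical helper: print "<state><TAB><robustness>" for a Longhorn volume,
# e.g. "detached<TAB>faulted" after all instance manager pods were deleted.
volume_status() {
  kubectl -n longhorn-system get volumes.longhorn.io "$1" \
    -o jsonpath='{.status.state}{"\t"}{.status.robustness}{"\n"}'
}
```

Usage: `volume_status "$(pvc_volume_name redis)"`, run before and after deleting the instance manager pods to compare the two states.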

> @roger-ryao Can you create e2e tests for your manual tests? We can implement them later. Thank you.

Hi @derekbit
We can track e2e tests at #7055.

derekbit · 2023-11-14T09:34:55Z

> Verified on v1.5.3-rc1 20231114
>
> longhorn v1.5.3-rc1 464f60c
>
> longhorn-manager v1.5.3-rc1 longhorn/longhorn-manager@0a679f6
>
> Result: Passed
>
> 1. Expansion of encrypted volume.
>
> 2. Attaching/detaching of encrypted volume, e.g. rebooting the attached node 5 times.
>
> 3. Restoring of encrypted volumes and crashing the IM pod when the restore is in progress.
>
> > @roger-ryao Can you create e2e tests for your manual tests? We can implement them later. Thank you.
>
> Hi @derekbit We can track e2e tests at #7055.

Thank you. Can you help add encrypted volume restore?

roger-ryao · 2023-11-14T09:56:12Z

> Thank you. Can you help add encrypted volume restore?

Hi @derekbit
Created ticket #7097.

github-actions bot added this to the v1.5.3 milestone Nov 6, 2023

github-actions bot assigned derekbit Nov 6, 2023

innobead added the regression/1.5.2 label Nov 6, 2023

innobead assigned roger-ryao Nov 6, 2023

roger-ryao closed this as completed Nov 13, 2023

roger-ryao reopened this Nov 14, 2023

roger-ryao mentioned this issue Nov 14, 2023

[TEST] Implement/backport volume encryption tests for 1.5.x branch #7055

Open

3 tasks

roger-ryao closed this as completed Nov 14, 2023

roger-ryao mentioned this issue Nov 14, 2023

[TEST] Implement encryption volume backup and restore test #7097

Open

2 tasks
