Clarification on Multiple Deployments notes #210

Closed
reefland opened this issue Jun 15, 2022 · 17 comments

Comments

@reefland

The main README.md has the following comments:

Multiple Deployments

You may install multiple deployments of each/any driver. It requires the following:

    Use a new helm release name for each deployment
    Make sure you have a unique csiDriver.name in the values file
    Use unique names for your storage classes (per cluster)
    Use a unique parent dataset (ie: don't try to use the same parent across deployments or clusters)
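
As a rough sketch of what that could look like for two deployments in one cluster (the release names, driver names, class names, and datasets below are hypothetical), the values might differ like this:

# helm release "zfs-nfs-a" (hypothetical)
csiDriver:
  name: org.democratic-csi.nfs-a
storageClasses:
- name: freenas-nfs-a-csi
driver:
  config:
    zfs:
      datasetParentName: main/k8s/nfs-a/v

# helm release "zfs-nfs-b" (hypothetical)
csiDriver:
  name: org.democratic-csi.nfs-b
storageClasses:
- name: freenas-nfs-b-csi
driver:
  config:
    zfs:
      datasetParentName: main/k8s/nfs-b/v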

I have two independent K3s clusters with nothing shared but the TrueNAS backend for iSCSI and NFS. Two independent ArgoCD repositories are used to build each; I'll call them DEV and TEST. Essentially the deployment files are identical between them other than ingress route names, IP addresses, etc., mostly minor items. Ansible is used to build each of them.

Regarding Democratic-CSI, each cluster points to a different ZFS parent dataset but otherwise they are the same deployment. There is only ONE deployment of the iSCSI provider and the NFS provider per cluster. All other TrueNAS connectivity between the two is the same: same user ID for SSH, same iSCSI connection configuration.

What I observed on a deployment of Kube Prometheus Stack to DEV: flawless, everything worked as expected, no issues. It created 3 iSCSI PVs in the correct parent dataset.

I then went to deploy it to TEST, and it got a little weird:

  • It created three ZVOLs (one for each PV) in the correct (different) parent dataset.
  • Each of the three PVCs in TEST has status Bound and points to a volume name in the correct parent dataset.

Based on that I assumed all was OK. However, when I log into the TEST Grafana (which worked), the data is weird. I see my work-in-progress dashboards from the DEV environment that have yet to be deployed to TEST. These dashboards were not part of the ArgoCD deployment for TEST; it's somehow picking them up from the database.

I checked on TrueNAS the status of the 3 ZVOLs created for TEST, which are in Bound status... and each of them is still just 88KiB in size and not growing. They are clearly not being used.

Both clusters generated the same claim names (expected), with the same StorageClass, Reclaim Policy, Access Mode, and Capacity. 6 PVs got created, each with a different volume name, each of the volume names lines up with a ZVOL in the correct parent dataset, and all six report "Bound" status. It seems like everything got created correctly, but at some layer the CSI is confused.

I'm wondering about "Make sure you have a unique csiDriver.name in the values file": is that unique per cluster, or does it need to be unique across all clusters?

@reefland
Author

These are the TEST PVCs:

$ k get pvc -n monitoring
NAME                                                                                                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
alertmanager-kube-prometheus-stack-alertmanager-db-alertmanager-kube-prometheus-stack-alertmanager-0   Bound    pvc-b356d1d3-79bb-4737-bd83-ceac4dca3466   3Gi        RWO            freenas-iscsi-csi   3h28m
kube-prometheus-stack-grafana                                                                          Bound    pvc-7bb3ad17-c24f-478c-8e1c-56de92181748   5Gi        RWO            freenas-iscsi-csi   3h28m
prometheus-kube-prometheus-stack-prometheus-db-prometheus-kube-prometheus-stack-prometheus-0           Bound    pvc-13f299a8-b547-49ca-b663-a91b9aa82c43   50Gi       RWO            freenas-iscsi-csi   3h28m

Respective TEST PVs:

$ k get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                                                                             STORAGECLASS        REASON   AGE
pvc-13f299a8-b547-49ca-b663-a91b9aa82c43   50Gi       RWO            Delete           Bound    monitoring/prometheus-kube-prometheus-stack-prometheus-db-prometheus-kube-prometheus-stack-prometheus-0           freenas-iscsi-csi            3h28m
pvc-7bb3ad17-c24f-478c-8e1c-56de92181748   5Gi        RWO            Delete           Bound    monitoring/kube-prometheus-stack-grafana                                                                          freenas-iscsi-csi            3h28m
pvc-b356d1d3-79bb-4737-bd83-ceac4dca3466   3Gi        RWO            Delete           Bound    monitoring/alertmanager-kube-prometheus-stack-alertmanager-db-alertmanager-kube-prometheus-stack-alertmanager-0   freenas-iscsi-csi            3h28m

Example of the TEST PVC from Prometheus:

Volumes:
  prometheus-kube-prometheus-stack-prometheus-db:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  prometheus-kube-prometheus-stack-prometheus-db-prometheus-kube-prometheus-stack-prometheus-0
    ReadOnly:   false

DEV PVCs:

$ k get pvc -n monitoring
NAME                                                                                                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
alertmanager-kube-prometheus-stack-alertmanager-db-alertmanager-kube-prometheus-stack-alertmanager-0   Bound    pvc-ee4f9764-d189-41b8-b269-4ad332d47019   3Gi        RWO            freenas-iscsi-csi   3d21h
prometheus-kube-prometheus-stack-prometheus-db-prometheus-kube-prometheus-stack-prometheus-0           Bound    pvc-1e9354df-2fe7-42af-afb5-5373c60d3d30   50Gi       RWO            freenas-iscsi-csi   2d6h
kube-prometheus-stack-grafana                                                                          Bound    pvc-d3036ecd-7b1d-4547-a8e8-0bfc543c30af   5Gi        RWO            freenas-iscsi-csi   28h

Hours later, still no signs of disk usage on TEST:

root@truenas[~]# zfs list -r  main/k8s/iscsi/v
NAME                                                        USED  AVAIL     REFER  MOUNTPOINT
main/k8s/iscsi/v                                            464K  24.4T      200K  /mnt/main/k8s/iscsi/v
main/k8s/iscsi/v/pvc-13f299a8-b547-49ca-b663-a91b9aa82c43    88K  24.4T       88K  -
main/k8s/iscsi/v/pvc-7bb3ad17-c24f-478c-8e1c-56de92181748    88K  24.4T       88K  -
main/k8s/iscsi/v/pvc-b356d1d3-79bb-4737-bd83-ceac4dca3466    88K  24.4T       88K  -

DEV space is being consumed:

root@truenas[~]# zfs list -r  main/kts/iscsi/v
NAME                                                        USED  AVAIL     REFER  MOUNTPOINT
main/kts/iscsi/v                                           2.72G  24.4T      200K  /mnt/main/kts/iscsi/v
main/kts/iscsi/v/pvc-1e9354df-2fe7-42af-afb5-5373c60d3d30  2.71G  24.4T     2.71G  -
main/kts/iscsi/v/pvc-d3036ecd-7b1d-4547-a8e8-0bfc543c30af  4.09M  24.4T     4.09M  -
main/kts/iscsi/v/pvc-ee4f9764-d189-41b8-b269-4ad332d47019  1.94M  24.4T     1.94M  -

Also, the two clusters are fighting over the same PVs. I have observed multiple times where the dashboard in TEST stops working, even showing no dashboards available... but then DEV starts working for a while. And they flip-flop now and then.

So when it's not winning, I see restarts in the iscsi csi-driver logs:

{"host":"k3s03","level":"info","message":"new request - driver: FreeNASSshDriver method: NodeGetVolumeStats call: {\"_events\":{},\"_eventsCount\":0,\"call\":{\"_events\":{},\"_eventsCount\":5,\"stream\":{\"_readableState\":{\"objectMode\":false,\"highWaterMark\":16384,\"buffer\":{\"head\":null,\"tail\":null,\"length\":0},\"length\":0,\"pipes\":[],\"flowing\":true,\"ended\":true,\"endEmitted\":true,\"reading\":false,\"constructed\":true,\"sync\":false,\"needReadable\":false,\"emittedReadable\":false,\"readableListening\":false,\"resumeScheduled\":false,\"errorEmitted\":false,\"emitClose\":true,\"autoDestroy\":false,\"destroyed\":false,\"errored\":null,\"closed\":false,\"closeEmitted\":false,\"defaultEncoding\":\"utf8\",\"awaitDrainWriters\":null,\"multiAwaitDrain\":false,\"readingMore\":false,\"dataEmitted\":true,\"decoder\":null,\"encoding\":null},\"_events\":{},\"_eventsCount\":8,\"_writableState\":{\"objectMode\":false,\"highWaterMark\":16384,\"finalCalled\":false,\"needDrain\":false,\"ending\":false,\"ended\":false,\"finished\":false,\"destroyed\":false,\"decodeStrings\":false,\"defaultEncoding\":\"utf8\",\"length\":0,\"writing\":false,\"corked\":0,\"sync\":true,\"bufferProcessing\":false,\"writecb\":null,\"writelen\":0,\"afterWriteTickInfo\":null,\"buffered\":[],\"bufferedIndex\":0,\"allBuffers\":true,\"allNoop\":true,\"pendingcb\":0,\"constructed\":true,\"prefinished\":false,\"errorEmitted\":false,\"emitClose\":true,\"autoDestroy\":false,\"errored\":null,\"closed\":false,\"closeEmitted\":false},\"allowHalfOpen\":true},\"handler\":{\"type\":\"unary\",\"path\":\"/csi.v1.Node/NodeGetVolumeStats\"},\"options\":{},\"cancelled\":false,\"deadlineTimer\":{\"_idleTimeout\":119987,\"_idlePrev\":{\"expiry\":187150,\"id\":-9007199254740985,\"msecs\":119987,\"priorityQueuePosition\":2},\"_idleStart\":67163,\"_timerArgs\":[null],\"_repeat\":null,\"_destroyed\":false},\"deadline\":1655334671751,\"wantTrailers\":false,\"metadataSent\":false,\"canPush\":false,\"isPushPending\":false,\"bufferedMessages\":[],\"messagesToPush\":[],\"maxSendMessageSize\":-1,\"maxReceiveMessageSize\":4194304},\"metadata\":{\"user-agent\":[\"grpc-go/1.40.0\"],\"x-forwarded-host\":[\"/var/lib/kubelet/plugins/org.democratic-csi.iscsi/csi.sock\"]},\"request\":{\"volume_id\":\"pvc-13f299a8-b547-49ca-b663-a91b9aa82c43\",\"volume_path\":\"/var/lib/kubelet/pods/87b6f2eb-917a-4bf2-861b-507723cf6545/volumes/kubernetes.io~csi/pvc-13f299a8-b547-49ca-b663-a91b9aa82c43/mount\",\"staging_target_path\":\"\"},\"cancelled\":false}","service":"democratic-csi","timestamp":"2022-06-15T23:09:11.766Z"}
executing mount command: findmnt --mountpoint /var/lib/kubelet/pods/87b6f2eb-917a-4bf2-861b-507723cf6545/volumes/kubernetes.io~csi/pvc-13f299a8-b547-49ca-b663-a91b9aa82c43/mount --output source,target,fstype,label,options -b -J --nofsroot
executing mount command: findmnt --mountpoint /var/lib/kubelet/pods/87b6f2eb-917a-4bf2-861b-507723cf6545/volumes/kubernetes.io~csi/pvc-13f299a8-b547-49ca-b663-a91b9aa82c43/mount --output source,target,fstype,label,options -b -J --nofsroot
executing filesystem command: realpath /var/lib/kubelet/pods/87b6f2eb-917a-4bf2-861b-507723cf6545/volumes/kubernetes.io~csi/pvc-13f299a8-b547-49ca-b663-a91b9aa82c43/mount
executing mount command: findmnt --mountpoint /var/lib/kubelet/pods/87b6f2eb-917a-4bf2-861b-507723cf6545/volumes/kubernetes.io~csi/pvc-13f299a8-b547-49ca-b663-a91b9aa82c43/mount/block_device --output source,target,fstype,label,options -b -J --nofsroot
failed to execute filesystem command: realpath /var/lib/kubelet/pods/87b6f2eb-917a-4bf2-861b-507723cf6545/volumes/kubernetes.io~csi/pvc-13f299a8-b547-49ca-b663-a91b9aa82c43/mount, response: {"code":1,"stdout":"","stderr":"realpath: /var/lib/kubelet/pods/87b6f2eb-917a-4bf2-861b-507723cf6545/volumes/kubernetes.io~csi/pvc-13f299a8-b547-49ca-b663-a91b9aa82c43/mount: Input/output error\n","timeout":false}
{"date":"Wed Jun 15 2022 23:09:11 GMT+0000 (Coordinated Universal Time)","error":{"code":"ERR_UNHANDLED_REJECTION"},"exception":true,"host":"k3s03","level":"error","message":"uncaughtException: This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason \"#<Object>\".\nUnhandledPromiseRejection: This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason \"#<Object>\".","os":{"loadavg":[0.14,0.66,0.86],"uptime":75003.5},"process":{"argv":["/usr/local/bin/node","/home/csi/app/bin/democratic-csi","--csi-version=1.5.0","--csi-name=org.democratic-csi.iscsi","--driver-config-file=/config/driver-config-file.yaml","--log-level=info","--csi-mode=node","--server-socket=/csi-data/csi.sock.internal"],"cwd":"/home/csi/app","execPath":"/usr/local/bin/node","gid":0,"memoryUsage":{"arrayBuffers":93473,"external":17906038,"heapTotal":19595264,"heapUsed":17430936,"rss":49528832},"pid":1,"uid":0,"version":"v16.14.2"},"service":"democratic-csi","stack":"UnhandledPromiseRejection: This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason \"#<Object>\".","timestamp":"2022-06-15T23:09:11.811Z","trace":[]}
running server shutdown, exit code: UnhandledPromiseRejection: This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason "#<Object>".
executing mount command: findmnt --mountpoint /var/lib/kubelet/pods/87b6f2eb-917a-4bf2-861b-507723cf6545/volumes/kubernetes.io~csi/pvc-13f299a8-b547-49ca-b663-a91b9aa82c43/mount --output source,target,fstype,label,options -b -J --nofsroot
executing mount command: findmnt --mountpoint /var/lib/kubelet/pods/87b6f2eb-917a-4bf2-861b-507723cf6545/volumes/kubernetes.io~csi/pvc-13f299a8-b547-49ca-b663-a91b9aa82c43/mount --output source,target,fstype,label,options,avail,size,used -b -J --nofsroot
{"host":"k3s03","level":"info","message":"new response - driver: FreeNASSshDriver method: NodeGetVolumeStats response: {\"usage\":[{\"available\":48720650240,\"total\":53660876800,\"used\":4940226560,\"unit\":\"BYTES\"}]}","service":"democratic-csi","timestamp":"2022-06-15T23:09:11.832Z"}
grpc server gracefully closed all connections
server fully shutdown, exiting

@travisghansen
Member

Welcome! Your understanding is firm on how things work and what the ramifications are. I’m guessing nfs is working fine and so the issue is isolated only to iscsi.

The issue with iscsi is that the target/extent names are in a ‘global’ namespace because, strictly speaking, you’re not actually using zfs semantics at that layer. To alleviate this issue see lines 72-74 around here: https://github.com/democratic-csi/democratic-csi/blob/master/examples/freenas-iscsi.yaml#L73

Review the notes there and be cognizant of the limits, and then I would set the prefix or suffix to be your cluster identifier (ie: dev or test) and you shouldn’t see fighting any longer.
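
A minimal sketch of that, assuming the example file's namePrefix/nameSuffix options and hypothetical cluster identifiers:

iscsi:
  # hypothetical per-cluster identifiers; keeps target/extent names unique on the shared TrueNAS
  namePrefix: csi-
  nameSuffix: "-dev"   # use "-test" in the other cluster's config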

Be sure to delete all the current assets and start clean before starting your test as well. I would delete in k8s and then also manually ensure on the storage system that assets look sane as you may have some weirdness based on what has transpired to this point.

@travisghansen
Member

Actually, unless the PVs have the same name, that would be pretty unlikely to collide as well. Can you send over the PV yaml from both clusters (remove any sensitive info)? That may help me pinpoint what could be going on.

@reefland
Author

I was already uninstalling and cleaning up, and have completed a full cleanup on both clusters. The CSI cleanup did not need any manual work; everything deleted on its own:

root@truenas[~]# zfs list -r  main/k8s/iscsi/v
NAME               USED  AVAIL     REFER  MOUNTPOINT
main/k8s/iscsi/v   200K  24.4T      200K  /mnt/main/k8s/iscsi/v
root@truenas[~]# zfs list -r  main/kts/iscsi/v
NAME               USED  AVAIL     REFER  MOUNTPOINT
main/kts/iscsi/v   200K  24.4T      200K  /mnt/main/kts/iscsi/v

Uninstalled the CSI iSCSI Helm release as well.

Now I have Ansible render the nameSuffix from the name of the parent dataset, with any "/" converted to "-". Such as:

  namePrefix: "csi-"
  nameSuffix: "-main-kts"
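
For what it's worth, the Ansible side is just a Jinja2 expression along these lines (the variable name here is hypothetical):

  namePrefix: "csi-"
  nameSuffix: "-{{ cluster_dataset | replace('/', '-') }}"   # e.g. "main/kts" renders as "-main-kts"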

Now the iSCSI target and extent names look different:

csi-monitoring-kube-prometheus-stack-grafana-main-kts
csi-monitoring-kube-prometheus-stack-grafana-main-k8s

I let ArgoCD do its magic to redeploy the deltas, and all ZVOLs started to show growth:

root@truenas[~]# zfs list -r  main/k8s/iscsi/v
NAME                                                        USED  AVAIL     REFER  MOUNTPOINT
main/k8s/iscsi/v                                           34.9M  24.4T      200K  /mnt/main/k8s/iscsi/v
main/k8s/iscsi/v/pvc-16534375-4566-46bf-9512-88611ee38b69  32.0M  24.4T     32.0M  -
main/k8s/iscsi/v/pvc-81520914-a593-433d-b215-4ca56317694e   720K  24.4T      720K  -
main/k8s/iscsi/v/pvc-c168a209-ff09-41c0-997b-5457307e7b62  1.99M  24.4T     1.99M  -
root@truenas[~]# zfs list -r  main/kts/iscsi/v
NAME                                                        USED  AVAIL     REFER  MOUNTPOINT
main/kts/iscsi/v                                           30.6M  24.4T      200K  /mnt/main/kts/iscsi/v
main/kts/iscsi/v/pvc-df3f31df-1154-40e7-b1d4-c6796ffce202  2.00M  24.4T     2.00M  -
main/kts/iscsi/v/pvc-e48fa856-92df-425c-94bc-4305df7951e2  27.7M  24.4T     27.7M  -
main/kts/iscsi/v/pvc-fddf0e2e-79f5-4eee-97f7-0ddc3fca7931   720K  24.4T      720K  -

I'll let things run overnight and see if any driver pods got restarted.

@travisghansen
Member

Do you have nameTemplate set? If so put here what it is/was set to.

@reefland
Author

I do not; it's whatever the default is.

No restarts on the driver pods overnight. Everything seems stable.

Do you still want the pv yaml?

@travisghansen
Member

Yeah and send over a fully rendered config too. I think the template must be set because normally the iscsi assets have the pvc id in there. Using namespace/name is helpful but will cause issues if the lengths of those grow (allowed lengths in k8s will exceed the allowed lengths of the iscsi names).

@reefland
Author

Prometheus DB PV:

$ k get pv pvc-16534375-4566-46bf-9512-88611ee38b69 -oyaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: org.democratic-csi.iscsi
  creationTimestamp: "2022-06-16T03:16:28Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-16534375-4566-46bf-9512-88611ee38b69
  resourceVersion: "51346854"
  uid: 9cdd119e-1a66-4056-b334-e7524173db5b
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 50Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: prometheus-kube-prometheus-stack-prometheus-db-prometheus-kube-prometheus-stack-prometheus-0
    namespace: monitoring
    resourceVersion: "51346804"
    uid: 16534375-4566-46bf-9512-88611ee38b69
  csi:
    controllerExpandSecretRef:
      name: controller-expand-secret-freenas-iscsi-csi-democratic-csi-iscsi
      namespace: democratic-csi
    controllerPublishSecretRef:
      name: controller-publish-secret-freenas-iscsi-csi-democratic-csi-iscs
      namespace: democratic-csi
    driver: org.democratic-csi.iscsi
    fsType: xfs
    nodePublishSecretRef:
      name: node-publish-secret-freenas-iscsi-csi-democratic-csi-iscsi
      namespace: democratic-csi
    nodeStageSecretRef:
      name: node-stage-secret-freenas-iscsi-csi-democratic-csi-iscsi
      namespace: democratic-csi
    volumeAttributes:
      interface: ""
      iqn: iqn.2005-10.org.freenas.ctl:csi-monitoring-prometheus-kube-prometheus-stack-prometheus-db-prometheus-kube-prometheus-stack-prometheus-0-main-k8s
      lun: "0"
      node_attach_driver: iscsi
      portal: truenas.[DOMAIN_REDACTED]:3260
      portals: ""
      provisioner_driver: freenas-iscsi
      storage.kubernetes.io/csiProvisionerIdentity: 1655349034999-8081-org.democratic-csi.iscsi
    volumeHandle: pvc-16534375-4566-46bf-9512-88611ee38b69
  persistentVolumeReclaimPolicy: Retain
  storageClassName: freenas-iscsi-csi
  volumeMode: Filesystem
status:
  phase: Bound

Prometheus AlertManager PV:

$ k get pv pvc-81520914-a593-433d-b215-4ca56317694e -oyaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: org.democratic-csi.iscsi
  creationTimestamp: "2022-06-16T03:16:24Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-81520914-a593-433d-b215-4ca56317694e
  resourceVersion: "51346783"
  uid: 203c9a12-52e2-4924-ae15-54235f2c57e8
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 3Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: alertmanager-kube-prometheus-stack-alertmanager-db-alertmanager-kube-prometheus-stack-alertmanager-0
    namespace: monitoring
    resourceVersion: "51346662"
    uid: 81520914-a593-433d-b215-4ca56317694e
  csi:
    controllerExpandSecretRef:
      name: controller-expand-secret-freenas-iscsi-csi-democratic-csi-iscsi
      namespace: democratic-csi
    controllerPublishSecretRef:
      name: controller-publish-secret-freenas-iscsi-csi-democratic-csi-iscs
      namespace: democratic-csi
    driver: org.democratic-csi.iscsi
    fsType: xfs
    nodePublishSecretRef:
      name: node-publish-secret-freenas-iscsi-csi-democratic-csi-iscsi
      namespace: democratic-csi
    nodeStageSecretRef:
      name: node-stage-secret-freenas-iscsi-csi-democratic-csi-iscsi
      namespace: democratic-csi
    volumeAttributes:
      interface: ""
      iqn: iqn.2005-10.org.freenas.ctl:csi-monitoring-alertmanager-kube-prometheus-stack-alertmanager-db-alertmanager-kube-prometheus-stack-alertmanager-0-main-k8s
      lun: "0"
      node_attach_driver: iscsi
      portal: truenas.[DOMAIN_REDACTED]:3260
      portals: ""
      provisioner_driver: freenas-iscsi
      storage.kubernetes.io/csiProvisionerIdentity: 1655349034999-8081-org.democratic-csi.iscsi
    volumeHandle: pvc-81520914-a593-433d-b215-4ca56317694e
  persistentVolumeReclaimPolicy: Retain
  storageClassName: freenas-iscsi-csi
  volumeMode: Filesystem
status:
  phase: Bound

Grafana PV:

$ k get pv pvc-c168a209-ff09-41c0-997b-5457307e7b62 -oyaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: org.democratic-csi.iscsi
  creationTimestamp: "2022-06-16T03:16:21Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-c168a209-ff09-41c0-997b-5457307e7b62
  resourceVersion: "51346700"
  uid: 5a7c4f28-506b-4615-a5a5-d55e0c18c6d5
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 5Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: kube-prometheus-stack-grafana
    namespace: monitoring
    resourceVersion: "51346484"
    uid: c168a209-ff09-41c0-997b-5457307e7b62
  csi:
    controllerExpandSecretRef:
      name: controller-expand-secret-freenas-iscsi-csi-democratic-csi-iscsi
      namespace: democratic-csi
    controllerPublishSecretRef:
      name: controller-publish-secret-freenas-iscsi-csi-democratic-csi-iscs
      namespace: democratic-csi
    driver: org.democratic-csi.iscsi
    fsType: xfs
    nodePublishSecretRef:
      name: node-publish-secret-freenas-iscsi-csi-democratic-csi-iscsi
      namespace: democratic-csi
    nodeStageSecretRef:
      name: node-stage-secret-freenas-iscsi-csi-democratic-csi-iscsi
      namespace: democratic-csi
    volumeAttributes:
      interface: ""
      iqn: iqn.2005-10.org.freenas.ctl:csi-monitoring-kube-prometheus-stack-grafana-main-k8s
      lun: "0"
      node_attach_driver: iscsi
      portal: truenas.[DOMAIN_REDACTED]:3260
      portals: ""
      provisioner_driver: freenas-iscsi
      storage.kubernetes.io/csiProvisionerIdentity: 1655349034999-8081-org.democratic-csi.iscsi
    volumeHandle: pvc-c168a209-ff09-41c0-997b-5457307e7b62
  persistentVolumeReclaimPolicy: Retain
  storageClassName: freenas-iscsi-csi
  volumeMode: Filesystem
status:
  phase: Bound

@reefland
Author

I pulled this from ArgoCD showing the values being applied to the Helm Chart.

project: default
source:
  repoURL: 'https://democratic-csi.github.io/charts/'
  targetRevision: 0.13.1
  helm:
    values: |
      csiDriver:
        # should be globally unique for a given cluster
        name: "org.democratic-csi.iscsi"

      controller:
        driver:
          image: docker.io/democraticcsi/democratic-csi:v1.6.3
          #########################################################################################
          # The following will pull from the developer's "next" branch, for testing new features.
          # image: democraticcsi/democratic-csi:next
          # imagePullPolicy: Always
          # logLevel: debug

      node:
        driver:
          image: docker.io/democraticcsi/democratic-csi:v1.6.3
          #########################################################################################
          # The following will pull from the developer's "next" branch, for testing new features.
          # image: democraticcsi/democratic-csi:next
          # imagePullPolicy: Always
          # logLevel: debug
          #########################################################################################
          # To confirm if the "next" image is being used by the containers:
          # kubectl get pods -n democratic-csi -o jsonpath="{.items[*].spec.containers[*].image}" | tr -s '[[:space:]]' '\n' | grep next | uniq
          #########################################################################################

      storageClasses:
      - name: freenas-iscsi-csi
        defaultClass: False
        reclaimPolicy: Retain
        volumeBindingMode: Immediate
        allowVolumeExpansion: True
        parameters:
          fsType: xfs
        mountOptions: []
        secrets:
          provisioner-secret:
          controller-publish-secret:
          node-stage-secret:
          node-publish-secret:
          controller-expand-secret:

      volumeSnapshotClasses: []

      driver:
        existingConfigSecret: democratic-csi-iscsi-driver-config
        config:
          driver: freenas-iscsi
  chart: democratic-csi
destination:
  server: 'https://kubernetes.default.svc'
  namespace: democratic-csi
syncPolicy:
  automated:
    prune: true
    selfHeal: true
  syncOptions:
    - Validate=true

@reefland
Author

The nameTemplate seems to be the default value in the secret:

$ k get secret democratic-csi-iscsi-driver-config -n democratic-csi -o yaml -o jsonpath='{.data}' | cut -d'"' -f 4 |base64 --decode

driver: freenas-iscsi
instance_id:
httpConnection:
  protocol: https
  host: truenas.[REDACTED]
  port: 443
  apiKey: [REDACTED]
  allowInsecure: False
  #apiVersion: 2
sshConnection:
  host: truenas.[REDACTED]
  port: 22
  username: [REDACTED]
  privateKey: |
    -----BEGIN OPENSSH PRIVATE KEY-----
    [REDACTED]
    -----END OPENSSH PRIVATE KEY-----

zfs:
  # Set a comment on the zvol to identify what it belongs to
  datasetProperties:
    "org.freenas:description": "{{ parameters.[csi.storage.k8s.io/pvc/namespace] }}/{{ parameters.[csi.storage.k8s.io/pvc/name] }}"
  datasetParentName: main/k8s/iscsi/v
  detachedSnapshotsDatasetParentName: main/k8s/iscsi/s
  zvolCompression: lz4
  zvolDedup:
  zvolEnableReservation: False
  zvolBlocksize: 
iscsi:
  targetPortal: "truenas.[REDACTED]:3260"
  targetPortals: [] # [ "server[:port]", "server[:port]", ... ]
  interface:

  namePrefix: csi-
  nameSuffix: "-main-k8s"

  # Set a comment on the target and extent to identify what it belongs to
  extentCommentTemplate: "{{ parameters.[csi.storage.k8s.io/pvc/namespace] }}-{{ parameters.[csi.storage.k8s.io/pvc/name] }}"
  nameTemplate: "{{ parameters.[csi.storage.k8s.io/pvc/namespace] }}-{{ parameters.[csi.storage.k8s.io/pvc/name] }}"
  # add as many as needed
  targetGroups:
    # get the correct ID from the "portal" section in the UI
    - targetGroupPortalGroup: 1
      # get the correct ID from the "initiators" section in the UI
      targetGroupInitiatorGroup: 1
      # None, CHAP, or CHAP Mutual
      targetGroupAuthType: None
      # get the correct ID from the "Authorized Access" section of the UI
      # only required if using Chap
      targetGroupAuthGroup: 
  extentInsecureTpc: true
  extentXenCompat: false
  extentDisablePhysicalBlocksize: true
  extentBlocksize: 4096
  extentRpm: "5400"
  extentAvailThreshold: 0

@travisghansen
Member

Yeah, that is an example, not the default. Just comment that line out altogether for anything even remotely production.

@travisghansen
Member

Delete all PVs again and start over, FYI.

@reefland
Author

Done. The names of the Targets and Extents are much smaller now. Each looks something like:

csi-pvc-69940283-6e0a-4ff3-b2ff-98d08931c3f0-main-k8s

Instead of:

csi-monitoring-alertmanager-kube-prometheus-stack-alertmanager-db-alertmanager-kube-prometheus-stack-alertmanager-0-main-k8s

So where I think I went wrong originally was to uncomment the nameTemplate (assuming it was the default value) while leaving the namePrefix and nameSuffix unchanged. That combination results in a name that can easily be duplicated across clusters, with the clusters then stepping on each other on the TrueNAS backend.

My fault for the assumption about nameTemplate, as it clearly says above it that the default is "{{ name }}".
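
For reference, the naming portion of the driver config now effectively looks like this (a sketch condensed from the config posted above; the suffix differs per cluster):

iscsi:
  namePrefix: csi-
  nameSuffix: "-main-k8s"   # "-main-kts" on the other cluster
  # nameTemplate left commented out so the default "{{ name }}" (the pvc-<uuid> volume name) is used
  #nameTemplate: "{{ parameters.[csi.storage.k8s.io/pvc/namespace] }}-{{ parameters.[csi.storage.k8s.io/pvc/name] }}"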

I would recommend you include in the Multiple Deployments notes that the namePrefix and nameSuffix should probably be unique per cluster as well.

I assume this is good now?

@travisghansen
Member

Yeah agreed on the additional note. Thanks for bringing it up and working through it!

Let’s leave it open until the doc is updated.

@travisghansen
Member

Added some extra notes in the README and merged.

@reefland
Author

Typo? The reference to this issue doesn't seem to render as a link to #210: (See [#210][i210]).

@travisghansen
Member

Yeah I’ll see if I can get the syntax correct.
