Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PWX-30455] feat: volume-mount override via custom mount #1168

Closed
wants to merge 8 commits into from

Conversation

Glitchfix
Copy link
Contributor

@Glitchfix Glitchfix commented Jul 26, 2023

What this PR does / why we need it:
override volume mounts in stc to mount it different host path

Which issue(s) this PR fixes (optional)
Closes #PWX-30455

Signed-off-by: Shivanjan Chakravorty <schakravorty@purestorage.com>
Signed-off-by: Shivanjan Chakravorty <schakravorty@purestorage.com>
Signed-off-by: Shivanjan Chakravorty <schakravorty@purestorage.com>
Copy link
Contributor

@nikita-bhatia nikita-bhatia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@codecov
Copy link

codecov bot commented Jul 27, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.02% 🎉

Comparison is base (8078b73) 75.82% compared to head (cdc561f) 75.85%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1168      +/-   ##
==========================================
+ Coverage   75.82%   75.85%   +0.02%     
==========================================
  Files          64       64              
  Lines       18100    18122      +22     
==========================================
+ Hits        13724    13746      +22     
  Misses       3403     3403              
  Partials      973      973              
Files Changed Coverage Δ
drivers/storage/portworx/deployment.go 96.02% <100.00%> (+0.06%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Shivanjan Chakravorty <schakravorty@purestorage.com>
Signed-off-by: Shivanjan Chakravorty <schakravorty@purestorage.com>
@github-actions
Copy link

github-actions bot commented Aug 1, 2023

This PR is stale because it has been in review for 3 days with no activity.

Copy link
Collaborator

@zoxpx zoxpx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we please add a "manual test" to prove that the code-change is working?

For example:

  1. add the following into the STC:
kind: StorageCluster
spec:
  volumes:
  - hostPath:
      path: /apps/px/cores
    mountPath: /var/cores
    name: px-cores
  1. get the POD-yaml from the resulting deployment, and
  2. show that the POD is using the overridden mount (either as the only mounting to /var/cores destination, or as the last one mounting to this destination).

return t.mountsFromVolInfo(volumeInfoList)
}

func (t *template) overrideVolumeMount(volInfos []volumeInfo) []volumeInfo {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this called overrideVolumeMount?

  • we are not overriding the internal mounts-list
  • instead, we are filtering (and extending) the list that was provided as input

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are not just filtering but also modifying the host path property inside the function

volPaths[volInfos[i].mountPath] = i
}

for i := range t.cluster.Spec.Volumes {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we'll take the volInfos first, then we'll append all the previous cluster.Spec.Volumes

This seems to be clashing with how the px-runc is working on the "lower level":

  • "default" mounts "go in" first
  • "custom mounts" will go after the "default mounts" -- and have the opportunity to override the defaults

Is this code behaving the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the ticket requires allowing to existing mount paths with a custom hostpath
this function is triggered at last as this will allow us to add more mount paths as well as modify any existing mount paths that is utilized by our application

})
} else {
volInfos[existingVolIdx].hostPath = v.HostPath.String()
volInfos[existingVolIdx].mountPropagation = v.MountPropagation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we overriding only the host-path and the mount-propagation?

Shouldn't we at least copy the mount-options (i.e. read-only)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for existing mounts, if a stc yaml overrides the mount options then it could affect the functionality of porx
imagine someone makes the oci log file to be read only then we do not get to write the logs

@zoxpx
Copy link
Collaborator

zoxpx commented Aug 17, 2023

Ping?

@github-actions
Copy link

This PR is stale because it has been in review for 3 days with no activity.

@Glitchfix
Copy link
Contributor Author

Hi @zoxpx @ggriffiths

I have added the output of the requested changes as well

image: cnbuautomation800/px-operator:pwx30455

stc

# SOURCE: https://install.portworx.com/?operator=true&mc=false&kbver=1.25.0&ns=portworx&b=true&c=px-cluster-4d617504-5340-4a93-81d6-9255d71136c8&stork=true&csi=true&mon=true&tel=true&st=k8s&promop=true
kind: StorageCluster
apiVersion: core.libopenstorage.org/v1
metadata:
  name: px-30455
  namespace: px
  annotations:
    portworx.io/install-source: "https://install.portworx.com/?operator=true&mc=false&kbver=1.25.0&ns=portworx&b=true&c=px-cluster-4d617504-5340-4a93-81d6-9255d71136c8&stork=true&csi=true&mon=true&tel=true&st=k8s&promop=true"
spec:
  deleteStrategy:
    type: UninstallAndWipe
  volumes:
  - hostPath:
      path: /tmp/extra-mount
    mountPath: /tmp/extra-mount
    name: extra-mount
  - hostPath:
      path: /tmp/px-cores
    mountPath: /var/cores
    name: px-cores
  image: portworx/oci-monitor:2.13.7
  imagePullPolicy: Always
  kvdb:
    internal: true
  storage:
    useAll: true
  secretsProvider: k8s
  stork:
    enabled: true
    args:
      webhook-controller: "true"
  autopilot:
    enabled: true
  csi:
    enabled: true
  monitoring:
    telemetry:
      enabled: true
    prometheus:
      enabled: true
      exportMetrics: true

ls inside one of the node

 ~]# ls -la /tmp/extra-mount/
total 8
drwxr-xr-x   2 root root 4096 Aug 23 00:49 .
drwxrwxrwt. 15 root root 4096 Aug 23 00:59 ..
~]# ls -la /tmp/px-cores/
total 12
drwxr-xr-x   3 root root 4096 Aug 23 00:54 .
drwxrwxrwt. 15 root root 4096 Aug 23 01:00 ..
drw-r--r--   2 root root 4096 Aug 23 00:49 .alerts

@zoxpx
Copy link
Collaborator

zoxpx commented Aug 23, 2023

Uh.. that doesn't seem right (I'm glad we tested this)

  • the /tmp/px-cores should have a lot more files -- (same as our original /var/cores has)
  • so, looks like the "override" did not work correctly

Could you run jq .mounts /opt/pwx/oci/config.json on your node, and search for /cores?

  • for the override to work correctly, there should be only 1 mount present (source:/tmp/px-cores, destination:/var/cores)

Signed-off-by: Shivanjan Chakravorty <schakravorty@purestorage.com>
@zoxpx
Copy link
Collaborator

zoxpx commented Aug 23, 2023

Ok.. did some extra testing, using px-operator image cnbuautomation800/px-operator:pwx30455

TEST's input modification -- using 2 modifications with the same destination (/var/cores)

apiVersion: core.libopenstorage.org/v1
kind: StorageCluster
spec:
  volumes:
  - hostPath:
      path: /mnt/px-cores
    mountPath: /var/cores
    name: px-cores
  - hostPath:
      path: /mnt/px-cores22
    mountPath: /var/cores
    name: px-cores22

RESULT= FAIL: Portworx POX has

  • [minor issue] some excess host-paths (which is OK)
  • [CRITICAL] it is MISSING container-mount for user-px-cores22
kind: Pod
spec:
  containers:
  - name: portworx
    volumeMounts:
    - mountPath: /var/cores
      name: diagsdump
...
  volumes:
  - hostPath:
      path: /var/cores
      type: ""
    name: diagsdump
  - hostPath:
      path: /mnt/px-cores
      type: ""
    name: user-px-cores
  - hostPath:
      path: /mnt/px-cores22
      type: ""
    name: user-px-cores22

RESULT= FAIL: Actual portworx.service did NOT override the /var/cores mount

  • see jq .mounts /opt/pwx/oci/config.json on the node -- the mount is still the original /var/cores:/var/cores
[
  {
    "destination": "/var/cores",
    "type": "bind",
    "source": "/var/cores",
    "options": [
      "rbind",
      "rprivate"
    ]
  },
  ...
]

Please DO NOT merge this PR!

@Glitchfix
Copy link
Contributor Author

Glitchfix commented Aug 23, 2023

Hi, @zoxpx appoligies for the confusion

the later results I sent you uses this image

cnbuautomation800/px-operator:pwx30455-2

the previous image was missing this commit e105361

it uses the same storage cluster file and doesn't crash

~]# ls -la /tmp/px-cores/ /tmp/extra-mount/
/tmp/extra-mount/:
total 8
drwxr-xr-x   2 root root 4096 Aug 23 04:47 .
drwxrwxrwt. 15 root root 4096 Aug 23 05:00 ..

/tmp/px-cores/:
total 2768
drwxr-xr-x   4 root root    4096 Aug 23 04:51 .
drwxrwxrwt. 15 root root    4096 Aug 23 05:00 ..
drw-r--r--   2 root root    4096 Aug 23 04:47 .alerts
-rw-------   1 root root 3334144 Aug 23 04:48 core-ld-sig11-user0-group0-pid1651-time1692766095
-rw-r--r--   1 root root       0 Aug 23 04:51 pwx_fstrim.log
-rw-r--r--   1 root root    1941 Aug 23 04:59 px_cache_mon.log
-rw-r--r--   1 root root     969 Aug 23 04:59 px_cache_mon_watch.log
-rw-r--r--   1 root root      81 Aug 23 04:48 px_diag_watch.log
-rw-r--r--   1 root root   11390 Aug 23 04:58 px_etcd_watch.log
-rw-r--r--   1 root root     434 Aug 23 04:48 px_event_watch.log
-rw-r--r--   1 root root   52815 Aug 23 04:51 px_exec_watch.log
-rw-r--r--   1 root root      85 Aug 23 04:48 px_healthmon_watch.log
-rw-r--r--   1 root root    1011 Aug 23 04:48 px_info.log
-rw-r--r--   1 root root     241 Aug 23 04:48 px_patch_fs.log
-rw-r--r--   1 root root       0 Aug 23 04:48 px_reboot_diags.log
drwxr-xr-x   2 root root    4096 Aug 23 04:51 px_status

@harsh-px
Copy link
Contributor

Closing this as we will use #1226 for this fix.

@harsh-px harsh-px closed this Aug 23, 2023
@Glitchfix Glitchfix deleted the feat/PWX-30455 branch August 30, 2023 21:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants