
Conversation

@elezar elezar (Member) commented Oct 23, 2024

This change adds an init container to GFD that checks whether the /driver-root/etc/nvidia-imex/nodes_config.cfg file exists and, if so, copies it to /config/etc/nvidia-imex/nodes_config.cfg (where /config is a shared volume). This gives the main GFD container access to this config without requiring that the whole driver root be mounted.

The ability to set the IMEX nodes config path through the GFD_IMEX_NODES_CONFIG_FILE envvar has also been removed. Instead, the roots [/, /config, ${DRIVER_ROOT_CTR_PATH}] are searched for the hardcoded path /etc/nvidia-imex/nodes_config.cfg.
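For illustration, a minimal Go sketch of this lookup, assuming hypothetical helper names (only imexNodesConfigFilePathSearchRoots appears in the actual change):

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveImexNodesConfigFile returns the first existing instance of the
// hardcoded config path under the given search roots.
func resolveImexNodesConfigFile(searchRoots []string) (string, error) {
	const configFilePath = "etc/nvidia-imex/nodes_config.cfg"
	for _, root := range searchRoots {
		candidate := filepath.Join(root, configFilePath)
		if _, err := os.Stat(candidate); err == nil {
			return candidate, nil
		}
	}
	return "", fmt.Errorf("no IMEX nodes config file found under %v", searchRoots)
}

func main() {
	// The roots mirror those listed above; DRIVER_ROOT_CTR_PATH is read
	// from the environment at runtime.
	roots := []string{"/", "/config", os.Getenv("DRIVER_ROOT_CTR_PATH")}
	if path, err := resolveImexNodesConfigFile(roots); err == nil {
		fmt.Println("using IMEX nodes config:", path)
	}
}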

Testing

Using kind.

No IMEX file:

$ docker exec -ti k8s-device-plugin-cluster-worker bash -c "cat /etc/nvidia-imex/nodes_config.cfg"
cat: /etc/nvidia-imex/nodes_config.cfg: No such file or directory

Deploy GFD:

$ helm upgrade nvidia-device-plugin -i deployments/helm/nvidia-device-plugin \
    --namespace nvidia-device-plugin \
    --create-namespace \
    --set runtimeClassName=nvidia \
    --set devicePlugin.enabled=false \
    --set gfd.enabled=true
$ kubectl logs -n nvidia-device-plugin nvidia-device-plugin-gpu-feature-discovery-76fsp -c gpu-feature-discovery-imex-init
No IMEX nodes config path detected; Skipping

Confirm that the folder is not present in the container:

$ kubectl exec -ti -n nvidia-device-plugin nvidia-device-plugin-gpu-feature-discovery-76fsp -c gpu-feature-discovery-ctr -- ls /config/etc/nvidia-imex
ls: cannot access '/config/etc/nvidia-imex': No such file or directory
command terminated with exit code 2

Create the folder and config file on the node:

$ docker exec -ti k8s-device-plugin-cluster-worker bash -c "mkdir -p /etc/nvidia-imex; echo 0.0.0.0 > /etc/nvidia-imex/nodes_config.cfg"
$ docker exec -ti k8s-device-plugin-cluster-worker bash -c "cat /etc/nvidia-imex/nodes_config.cfg"
0.0.0.0

Delete the GFD pod so that it is recreated:

$ kubectl delete pod -n nvidia-device-plugin nvidia-device-plugin-gpu-feature-discovery-76fsp
$ kubectl logs -n nvidia-device-plugin nvidia-device-plugin-gpu-feature-discovery-dlnxd -c gpu-feature-discovery-imex-init
Copying IMEX nodes config

The file is available in the main container:

$ kubectl exec -ti -n nvidia-device-plugin nvidia-device-plugin-gpu-feature-discovery-dlnxd -c gpu-feature-discovery-ctr -- cat /config/etc/nvidia-imex/nodes_config.cfg
0.0.0.0

@copy-pr-bot copy-pr-bot bot commented Oct 23, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@elezar elezar force-pushed the add-imex-init-container branch from 179e7fa to 5b20d46 Compare October 23, 2024 12:55
@elezar elezar force-pushed the add-imex-init-container branch 2 times, most recently from d35857d to affb6fe Compare October 23, 2024 13:46
volumeMounts:
  - name: config
    mountPath: /config
    mountPropagation: Bidirectional
Contributor

Why does this need to be bidirectional? At worst I think we only need HostToContainer -- but I'm curious if we even care about future mounts in either direction?

Member Author

The issue I was seeing was that the mount made in the init container was not visible to other containers using the empty volume. Maybe there is a better way to ensure that the mounts are shared than propagating this back to the empty volume on the host.

Contributor

OK, yeah, that makes sense. The init container creates the mount, but if this mount is not reflected back on the host, then when it exits, the mount will be removed and never visible to the app container when it eventually starts up. Let's leave it this way for now and think a bit more about this for future releases.
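
For reference, a hypothetical sketch of the two propagation modes discussed here (not the merged chart): the init container needs Bidirectional so that a mount it creates is reflected back to the host, while a container that only consumes mounts made elsewhere can use HostToContainer.

# Init container: mounts it creates must propagate back to the host.
# (Bidirectional also requires the container to run privileged.)
volumeMounts:
  - name: config
    mountPath: /config
    mountPropagation: Bidirectional

# App container: only needs to observe mounts created elsewhere.
volumeMounts:
  - name: config
    mountPath: /config
    mountPropagation: HostToContainer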

Member Author

The one issue that I did see was that the mount was not cleaned up. What is the best practice for doing this? (we don't do this for MPS either ... )

Contributor

Where does it live after the pod has been shut down? I would assume the shutdown of the pod and the removal of the temporary space for the mounted volume would have cleaned it up.

Contributor

Do they hang terminating forever, or are they terminated / cleaned up after the "standard" 30s timeout?

Contributor

Should we just go back to a copying approach instead of a mounting one? It means we will require a restart of the plugin if the config file changes on the host, but this is probably OK for this first iteration.

Member Author

It's longer than 30s. I have one that's in terminating for more than 6 hours.

It could be that the mounting logic we have for the MPS daemon doesn't have this issue, since more complex logic is being applied there?

Member Author

For what it's worth, I'm happy to resort to copying and require a restart; alternatively, we can do something like:

diff --git a/deployments/helm/nvidia-device-plugin/templates/daemonset-gfd.yml b/deployments/helm/nvidia-device-plugin/templates/daemonset-gfd.yml
index cb3f9d36..fd1ffb8a 100644
--- a/deployments/helm/nvidia-device-plugin/templates/daemonset-gfd.yml
+++ b/deployments/helm/nvidia-device-plugin/templates/daemonset-gfd.yml
@@ -214,6 +214,9 @@ spec:
         {{- end }}
           - name: config
             mountPath: /config
+          - name: driver-root-etc
+            mountPath: /driver-root/etc
+            readOnly: true
         {{- with .Values.resources }}
         resources:
           {{- toYaml . | nindent 10 }}
@@ -225,9 +228,9 @@ spec:
         - name: host-sys
           hostPath:
             path: "/sys"
-        - name: driver-root
+        - name: driver-root-etc
           hostPath:
-            path: {{ clean ( join "/" ( list "/" .Values.nvidiaDriverRoot ) ) | quote }}
+            path: {{ clean ( join "/" ( list "/" .Values.nvidiaDriverRoot "etc" ) ) | quote }}
       {{- if $options.hasConfigMap }}
         - name: available-configs
           configMap:

but this doesn't address the operator case?

Contributor

It could work in the operator case if we updated both the host-root and driver-root mounts to /etc --> /host/etc and /run/nvidia/driver/etc --> /driver-root/etc respectively.
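
A hypothetical sketch of those mounts (volume names here are illustrative; /run/nvidia/driver is the driver root referenced in the comment above):

volumes:
  - name: host-root-etc
    hostPath:
      path: /etc
  - name: driver-root-etc
    hostPath:
      path: /run/nvidia/driver/etc

volumeMounts:
  - name: host-root-etc
    mountPath: /host/etc
    readOnly: true
  - name: driver-root-etc
    mountPath: /driver-root/etc
    readOnly: true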

klueska previously approved these changes Oct 24, 2024
@elezar elezar force-pushed the add-imex-init-container branch from affb6fe to c655aa3 Compare October 24, 2024 13:41
@elezar elezar force-pushed the add-imex-init-container branch from c655aa3 to 3c38cec Compare October 25, 2024 13:51
@elezar elezar marked this pull request as ready for review October 25, 2024 13:53
@elezar elezar dismissed klueska’s stale review October 25, 2024 14:05

change in implementation

Comment on lines 64 to 66
command:
- "/bin/bash"
- "-c"
- |
Contributor

nit: Can you make this:

Suggested change
command:
- "/bin/bash"
- "-c"
- |
command: ["/bin/bash", "-c"]
args:
- |

Member Author

Updated.

echo "Copying IMEX nodes config"
mkdir -p /config/etc/nvidia-imex
cp /driver-root/etc/nvidia-imex/nodes_config.cfg /config/etc/nvidia-imex/nodes_config.cfg
Contributor

Can we make /etc/nvidia-imex/nodes_config.cfg a local envvar in this script instead of repeating it everywhere?

Member Author

Done in latest.
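
The updated script is not quoted in this thread; a sketch of what the envvar-based version might look like, reusing the paths and log lines from this PR (the variable name is hypothetical):

command: ["/bin/bash", "-c"]
args:
  - |
    # Hypothetical sketch; the merged chart contains the actual script.
    IMEX_NODES_CONFIG_FILE=/etc/nvidia-imex/nodes_config.cfg
    if [[ ! -f "/driver-root${IMEX_NODES_CONFIG_FILE}" ]]; then
      echo "No IMEX nodes config path detected; Skipping"
      exit 0
    fi
    echo "Copying IMEX nodes config"
    mkdir -p "/config$(dirname "${IMEX_NODES_CONFIG_FILE}")"
    cp "/driver-root${IMEX_NODES_CONFIG_FILE}" "/config${IMEX_NODES_CONFIG_FILE}"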


// imexNodesConfigFilePathSearchRoots returns a list of roots to search for the IMEX nodes config file.
func imexNodesConfigFilePathSearchRoots(config *spec.Config) []string {
// By default se search / and /config for config files.
Contributor

Suggested change
// By default se search / and /config for config files.
// By default, search / and /config for config files.

@elezar elezar force-pushed the add-imex-init-container branch from 3c38cec to 62ebcc2 Compare October 25, 2024 16:30
@elezar elezar force-pushed the add-imex-init-container branch from 62ebcc2 to c816dc7 Compare October 25, 2024 16:42
This change adds an init container to GFD that checks whether the
/driver-root/etc/nvidia-imex/nodes_config.cfg file exists and copies
it to /config/etc/nvidia-imex/nodes_config.cfg (where /config is
a shared volume used by the containers in the GFD pod).

This allows the main GFD container access to this file -- used for generating
IMEX domain labels -- without requiring continuous access to the driver root.
Note that since the file is copied, a change in this config file requires a restart
of the GFD pod.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
This change removes the IMEX nodes config option. Instead, the path to the
config file, /etc/nvidia-imex/nodes_config.cfg, is assumed, with the following
roots being searched for this file: /, /config, ${DRIVER_ROOT_CTR_PATH}.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
@elezar elezar force-pushed the add-imex-init-container branch from c816dc7 to 7fc6642 Compare October 25, 2024 16:44
@tariq1890 tariq1890 (Contributor) commented Oct 25, 2024

Just a thought (not blocking), it feels a bit much to have a file with the path below:

config/etc/nvidia-imex/nodes_config.cfg

Since config/ is already the designated config directory of the GFD application, it would be enough to copy the file from /driver-root/etc/nvidia-imex/nodes_config.cfg over to a path like /config/nvidia-imex-nodes-config.cfg, or even /config/nvidia-imex/nodes_config.cfg.

The /etc part seems a bit redundant.

@klueska klueska (Contributor) commented Oct 25, 2024


It's done the way it is so that the filepath is the same for all search directories, whether rooted at /, /driver-root, or /config.

@elezar elezar merged commit c09799f into NVIDIA:main Oct 28, 2024
8 checks passed
@elezar elezar deleted the add-imex-init-container branch October 28, 2024 13:26