Add kustomization for using GPU plugin with fake devices
Change GPU plugin NFD init container to run-time container:
* To work around kustomize's inability to enforce the correct init container order
* This is also closer to how things are likely to work once NFD drops support for hooks:
  kubernetes-sigs/node-feature-discovery#856

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
eero-t committed Dec 12, 2022
1 parent a1a81e5 commit 7e7c54d
Showing 6 changed files with 136 additions and 0 deletions.
@@ -0,0 +1,30 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: intel-gpu-plugin
spec:
  template:
    spec:
      containers:
      - name: intel-gpu-nfd
        # convert generated sysfs content to NFD feature labels file
        image: intel/intel-gpu-initcontainer:devel
        imagePullPolicy: IfNotPresent
        securityContext:
          readOnlyRootFilesystem: true
          allowPrivilegeEscalation: false
          capabilities:
            drop: [ "ALL" ]
        volumeMounts:
        - name: nfd-features
          mountPath: /nfd
          readOnly: false
        workingDir: /usr/local/bin/gpu-sw
        # needed until GPU plugin drops NFD hook usage due to:
        # https://github.com/kubernetes-sigs/node-feature-discovery/issues/856
        command: ["sh", "-c", "while true; do ./intel-gpu-nfdhook | tee /nfd/fake-gpu; sleep 99999; done"]
      volumes:
      - name: nfd-features
        hostPath:
          path: /etc/kubernetes/node-feature-discovery/features.d/
          type: DirectoryOrCreate
@@ -0,0 +1,10 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: intel-gpu-plugin
spec:
  template:
    spec:
      initContainers:
      - name: intel-gpu-initcontainer
        $patch: delete
@@ -0,0 +1,49 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: intel-gpu-plugin
spec:
  template:
    spec:
      initContainers:
      - name: fakedev-generator
        # container runtime prevents writing to /sys & /dev,
        # so volumes need to be mounted elsewhere
        volumeMounts:
        - name: devfs
          mountPath: /tmp/fakedev/dev
          readOnly: false
        - name: sysfs
          mountPath: /tmp/fakedev/sys
          readOnly: false
        # files are generated under CWD
        workingDir: /tmp/fakedev
      containers:
      - name: intel-gpu-nfd
        # expects sysfs here
        volumeMounts:
        - name: sysfs
          mountPath: /host-sys
          readOnly: true
      - name: intel-gpu-plugin
        args: [
          "-prefix=/tmp/fakedev",
          "-shared-dev-num=2",
          "-enable-monitoring",
          "-resource-manager"
        ]
        # devfs host & container paths must match for everything to work
        volumeMounts:
        - name: devfs
          mountPath: /tmp/fakedev/dev
          readOnly: true
        - name: sysfs
          mountPath: /tmp/fakedev/sys
          readOnly: true
      volumes:
      - name: devfs
        hostPath:
          path: /tmp/fakedev/dev
          type: DirectoryOrCreate
      - name: sysfs
        emptyDir: {}
@@ -0,0 +1,8 @@
{
  "Info": "8x 4 GiB DG1 [Iris Xe MAX Graphics] GPUs",
  "DevCount": 8,
  "DevMemSize": 4294967296,
  "Capabilities": {
    "platform": "fake_DG1"
  }
}
@@ -0,0 +1,24 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: intel-gpu-plugin
spec:
  template:
    spec:
      volumes:
      - name: fake-conf
        configMap:
          name: fakedev-config
      initContainers:
      - name: fakedev-generator
        image: intel/intel-gpu-fakedev:devel
        securityContext:
          runAsUser: 0
          readOnlyRootFilesystem: false
          allowPrivilegeEscalation: false
        volumeMounts:
        - name: fake-conf
          mountPath: /config
          readOnly: true
        # generate fake sysfs / devfs files for GPU plugin based on config
        command: ["/generator", "-json", "/config/fakedev.json", "-verbose"]
15 changes: 15 additions & 0 deletions deployments/gpu_plugin/overlays/fake_devices/kustomization.yaml
@@ -0,0 +1,15 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
  - ../fractional_resources
configMapGenerator:
  - name: fakedev-config
    files:
      - fakedev-config.json
patches:
  - fake-device-volumes.yaml
  - generate-fake-devices.yaml
  # NFD feature file changes become obsolete once the GPU plugin moves away from NFD hooks:
  # https://github.com/kubernetes-sigs/node-feature-discovery/issues/856
  - del-intel-gpu-initcontainer.yaml
  - add-nfd-feature-file.yaml
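
The fake device config added in this commit describes 8 devices with a DevMemSize of 4294967296 each, which should agree with its "Info" string ("8x 4 GiB"). Below is a minimal Python sketch of that sanity check. It assumes DevMemSize is given in bytes, which the diff itself does not state, and `summarize` is a hypothetical helper for illustration, not part of the repository or of the fake device generator.

```python
import json

# Copy of the fake device config added in this commit.
FAKEDEV_CONFIG = """
{
  "Info": "8x 4 GiB DG1 [Iris Xe MAX Graphics] GPUs",
  "DevCount": 8,
  "DevMemSize": 4294967296,
  "Capabilities": {
    "platform": "fake_DG1"
  }
}
"""

GIB = 1024 ** 3  # bytes per GiB


def summarize(config_text):
    """Hypothetical helper: parse the config and derive memory totals.

    Assumption: DevMemSize is expressed in bytes.
    """
    cfg = json.loads(config_text)
    count = cfg["DevCount"]
    mem_bytes = cfg["DevMemSize"]
    if count < 1 or mem_bytes < 1:
        raise ValueError("DevCount and DevMemSize must be positive")
    return {
        "devices": count,
        "gib_per_device": mem_bytes / GIB,
        "total_gib": count * mem_bytes / GIB,
        "platform": cfg["Capabilities"]["platform"],
    }


summary = summarize(FAKEDEV_CONFIG)
print(summary)
```

Under the bytes assumption, the derived values match the "Info" description: 8 devices at 4 GiB each. A check like this could run before regenerating the ConfigMap, so that an edited device count or memory size does not drift from the human-readable description.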
