
amg2023 is working, mixbench is too
For mixbench we need to decide whether running on each GPU separately (and very quickly) is something we want to do to get basic metrics.

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch committed May 10, 2024
1 parent 8583392 commit 69fc728
Showing 6 changed files with 276 additions and 17 deletions.
10 changes: 8 additions & 2 deletions performance/README.md
@@ -18,7 +18,11 @@ The following containers / setups are working (tested) on some size (and may nee
- [google/lammps](google/lammps)
- [google/kripke](google/kripke)
- [google/resnet](google/resnet)
- [google/amg2023](google/amg2023)

The [google/mixbench](google/mixbench) might only run on nodes / GPUs a la carte, so we are not sure if it is of interest.

Next step is testing on a larger cluster.

### Not interested

@@ -30,13 +34,15 @@ The following containers / setups are working (tested) on some size (and may nee

- **multi-gpu-models** needs testing on largest size to get slowest time, then do ETA for experiments
- **lammps**: choose kokkos - next step is to bring up on larger cluster to get time for larger size.
- **kripke**: needs a new container built - the one that built and works doesn't work on the GPUs.
- **kripke**: needs testing on larger size cluster to get upper limit
- **resnet**
- **amg2023**

### Next Hackathon Tasks

For the next hackathon, my goal is to get as far as possible with each container and minicluster setup. Ideally I will have a mostly working container that needs some debugging. Here are notes for each - for this first set, the containers have been tested on 2 nodes with 4 GPUs each.

- **mixbench**: the cu file is not compatible, likely needs a tweak and rebuild.
- **mixbench**: working, but do we want to use it if only one GPU?
- **nekrs5000**: seems to work! But GPU utilization goes down - need to decide on what/how to run it.
- **osu**: the point to point seems to work OK; not sure if all_reduce (or any collective) is a thing. I think we are supposed to use NCCL? See the sketch after this list.
- **mt-gem**: runs successfully on 2 nodes, 8 GPU, about 28 seconds.
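
For reference, a hedged sketch of what the two options might look like under Flux. The binary names and paths are assumptions based on the upstream OSU micro-benchmarks and nccl-tests projects, not on a container in this repo:

```bash
# MPI point-to-point latency between two nodes (OSU micro-benchmarks, hypothetical path)
flux run -N2 -n 2 ./osu_latency

# GPU all_reduce via NCCL (nccl-tests, hypothetical path), one GPU per task across both nodes
flux run -N2 -n 8 -g 1 ./all_reduce_perf -b 8 -e 128M -f 2 -g 1
```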
129 changes: 129 additions & 0 deletions performance/google/amg2023/size-2/README.md
@@ -0,0 +1,129 @@
# Amg2023 on Kubernetes

> V100 nodes
```bash
GOOGLE_PROJECT=myproject
gcloud container clusters create test-cluster --threads-per-core=1 --accelerator type=nvidia-tesla-v100,count=4 --num-nodes=2 --machine-type=n1-standard-32 --region=us-central1-a --project=${GOOGLE_PROJECT}
```
- pull to running time: 18 seconds (includes layers from multi-gpu-benchmarks)
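
If kubectl is not already pointed at the new cluster, fetch credentials first (a standard step, not specific to this setup):

```bash
# Point kubectl at the cluster created above
gcloud container clusters get-credentials test-cluster --region=us-central1-a --project=${GOOGLE_PROJECT}
```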

We have to be sure the nodes have GPUs. This output is wrong (no GPUs listed):

```bash
$ kubectl get nodes -o json | jq .items[].status.allocatable
```
```console
{
"cpu": "15890m",
"ephemeral-storage": "47060071478",
"hugepages-1Gi": "0",
"hugepages-2Mi": "0",
"memory": "114327136Ki",
"pods": "110"
}
{
"cpu": "15890m",
"ephemeral-storage": "47060071478",
"hugepages-1Gi": "0",
"hugepages-2Mi": "0",
"memory": "114327136Ki",
"pods": "110"
}
```

What I had to do was install a custom daemonset that installs the driver:

```bash
kubectl apply -f daemonset.yaml
```
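
To find the installer pod names for the log check below, you can list the pods by label (a sketch; the label selector comes from daemonset.yaml in this commit):

```bash
# One installer pod runs per node; the label is set in daemonset.yaml
kubectl get pods -n kube-system -l k8s-app=nvidia-driver-installer -o wide
```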

Check the log to see it installed:

```bash
kubectl logs -n kube-system nvidia-driver-installer-94v2h
```
```console
I0427 01:52:44.079644 8897 modules.go:50] Updating host's ld cache
```

This is now correct!

```bash
$ kubectl get nodes -o json | jq .items[].status.allocatable
```
```console
{
"cpu": "15890m",
"ephemeral-storage": "47060071478",
"hugepages-1Gi": "0",
"hugepages-2Mi": "0",
"memory": "114327136Ki",
"nvidia.com/gpu": "4",
"pods": "110"
}
{
"cpu": "15890m",
"ephemeral-storage": "47060071478",
"hugepages-1Gi": "0",
"hugepages-2Mi": "0",
"memory": "114327136Ki",
"nvidia.com/gpu": "4",
"pods": "110"
}
```
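
If you only want the GPU counts, a jq filter can pull out just that field (a small convenience, same data as above):

```bash
# Print the allocatable GPU count per node
kubectl get nodes -o json | jq '.items[].status.allocatable["nvidia.com/gpu"]'
```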

Install the flux operator:

```bash
kubectl apply -f https://raw.githubusercontent.com/flux-framework/flux-operator/main/examples/dist/flux-operator.yaml
```
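
A quick sanity check that the operator is running (the namespace name below is an assumption about what the dist manifest creates; adjust if yours differs):

```bash
# The operator deployment pod should show as Running
kubectl get pods -n operator-system
```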

And create the minicluster:

```bash
kubectl apply -f minicluster.yaml
```
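
It can help to watch the pods come up before shelling in (the flux-sample-* pod names follow from the MiniCluster name above):

```bash
# Wait until the flux-sample pods are Running
kubectl get pods -o wide --watch
```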

This should be fine for now - we can test either, and in either case we can re-generate R to target the GPUs. The MiniCluster is interactive, so shell inside:

```bash
kubectl exec -it flux-sample-0-qdckd bash
. /etc/profile.d/z10_spack_environment.sh
flux proxy local:///mnt/flux/view/run/flux/local bash
```
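
As a quick check that the GPUs are visible from inside the container (assuming the NVIDIA utilities are on the path, which may not be the case for the slim image):

```bash
# Should list the V100 devices mounted into the pod
nvidia-smi
```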

You should see the GPU in the resources:

```bash
flux resource list
```
```console
# flux resource list
STATE NNODES NCORES NGPUS NODELIST
free 1 6 1 flux-sample-0
allocated 0 0 0
down 0 0 0
```
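
If the resource list looks off (above, only one GPU and six cores show), R can be regenerated the same way the minicluster.yaml pre block does it - a sketch; ${hosts} and ${viewroot} come from the Flux Operator view environment, and the broker has to reload R before the change shows up:

```bash
# Same command as the pre block in minicluster.yaml: 16 cores and 4 GPUs per node
flux R encode --hosts=${hosts} --cores=0-15 --gpu=0-3 > ${viewroot}/etc/flux/system/R
cat ${viewroot}/etc/flux/system/R
```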

Example runs (single GPU locally, then under Flux):

```bash
# Local run
amg

# Flux with one node, 4 tasks, one GPU per task
flux run -N1 -n 4 -g 1 amg

# Flux with two nodes, 8 tasks
flux run -N2 -n 8 -g 1 amg

# Two nodes and a larger problem size
flux run -N2 -n 8 -g 1 amg -P 4 2 1 -n 64 128 128
```
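
To see how long each run took afterwards, listing completed jobs works from the same flux proxy shell (a sketch; the format string is optional):

```bash
# List all jobs, including completed ones, with their runtimes
flux jobs -a -o "{id} {name} {status} {runtime}"
```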

### Clean Up

When you are done:

```bash
gcloud container clusters delete test-cluster --region=us-central1-a
```
103 changes: 103 additions & 0 deletions performance/google/amg2023/size-2/daemonset.yaml
@@ -0,0 +1,103 @@
# Copyright 2017 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# The Dockerfile and other source for this daemonset are in
# https://cos.googlesource.com/cos/tools/+/refs/heads/master/src/cmd/cos_gpu_installer/
#
# This is the same as ../../daemonset.yaml except that it assumes that the
# docker image is present on the node instead of downloading from GCR. This
# allows easier upgrades because GKE can preload the correct image on the
# node and the daemonset can just use that image.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-driver-installer
  namespace: kube-system
  labels:
    k8s-app: nvidia-driver-installer
spec:
  selector:
    matchLabels:
      k8s-app: nvidia-driver-installer
  template:
    metadata:
      labels:
        name: nvidia-driver-installer
        k8s-app: nvidia-driver-installer
    spec:
      priorityClassName: system-node-critical
      tolerations:
      - operator: "Exists"
      hostNetwork: true
      hostPID: true
      volumes:
      - name: dev
        hostPath:
          path: /dev
      - name: vulkan-icd-mount
        hostPath:
          path: /home/kubernetes/bin/nvidia/vulkan/icd.d
      - name: nvidia-install-dir-host
        hostPath:
          path: /home/kubernetes/bin/nvidia
      - name: root-mount
        hostPath:
          path: /
      - name: cos-tools
        hostPath:
          path: /var/lib/cos-tools
      - name: nvidia-config
        hostPath:
          path: /etc/nvidia
      containers:
      - image: "cos-nvidia-installer:fixed"
        imagePullPolicy: Never
        name: nvidia-driver-installer
        resources:
          requests:
            cpu: 150m
        securityContext:
          privileged: true
        env:
        - name: NVIDIA_INSTALL_DIR_HOST
          value: /home/kubernetes/bin/nvidia
        - name: NVIDIA_INSTALL_DIR_CONTAINER
          value: /usr/local/nvidia
        - name: VULKAN_ICD_DIR_HOST
          value: /home/kubernetes/bin/nvidia/vulkan/icd.d
        - name: VULKAN_ICD_DIR_CONTAINER
          value: /etc/vulkan/icd.d
        - name: ROOT_MOUNT_DIR
          value: /root
        - name: COS_TOOLS_DIR_HOST
          value: /var/lib/cos-tools
        - name: COS_TOOLS_DIR_CONTAINER
          value: /build/cos-tools
        volumeMounts:
        - name: nvidia-install-dir-host
          mountPath: /usr/local/nvidia
        - name: vulkan-icd-mount
          mountPath: /etc/vulkan/icd.d
        - name: dev
          mountPath: /dev
        - name: root-mount
          mountPath: /root
        - name: cos-tools
          mountPath: /build/cos-tools
        command:
        - bash
        - -c
        - "/cos-gpu-installer install"

24 changes: 24 additions & 0 deletions performance/google/amg2023/size-2/minicluster.yaml
@@ -0,0 +1,24 @@
apiVersion: flux-framework.org/v1alpha2
kind: MiniCluster
metadata:
  name: flux-sample
spec:
  size: 2
  interactive: true

  # This disables installing flux via the view
  flux:
    container:
      disable: true

  containers:
  - image: ghcr.io/converged-computing/metric-amg2023:spack-slim
    commands:
      pre: |
        . /etc/profile.d/z10_spack_environment.sh
        echo "Regenerated resources"
        flux R encode --hosts=${hosts} --cores=0-15 --gpu=0-3 > ${viewroot}/etc/flux/system/R
        cat ${viewroot}/etc/flux/system/R
    resources:
      limits:
        nvidia.com/gpu: "4"
2 changes: 1 addition & 1 deletion performance/google/amgx/size-2/README.md
@@ -1,6 +1,6 @@
# Amgx on Kubernetes

> V100 nodes
> V100 nodes (note that this setup did not work and we fell back to [amg2023](../amg2023))
Here we are trying to use v100 for [amgx](https://github.com/NVIDIA/AMGX). The [container is here](https://github.com/converged-computing/metrics-operator-experiments/pkgs/container/metric-amgx).

25 changes: 11 additions & 14 deletions performance/google/mixbench/size-2/README.md
@@ -1,19 +1,18 @@
# Mixbench on Kubernetes

> V100 nodes
- https://github.com/ekondis/mixbench/tree/master

```bash
GOOGLE_PROJECT=myproject
gcloud container clusters create test-cluster --threads-per-core=1 --accelerator type=nvidia-tesla-v100,count=4 --num-nodes=2 --machine-type=n1-standard-32 --region=us-central1-a --project=${GOOGLE_PROJECT}
```
- pull to running time: 18 seconds (includes layers from multi-gpu-benchmarks)

We have to be sure the nodes have GPUs. This output is wrong (no GPUs listed):

```bash
$ kubectl get nodes -o json | jq .items[].status.allocatable
```
```console
{
@@ -111,20 +110,18 @@ Single GPU example (in ./build)

```bash
# Single GPU
./examples/amgx_capi -m ../examples/matrix.mtx -c ../src/configs/FGMRES_AGGREGATION.json

# Multiple GPU (2,3) seem to work (still too fast, almost instant)
mpirun --allow-run-as-root -n 2 examples/amgx_mpi_capi -m ../examples/matrix.mtx -c ../src/configs/FGMRES_AGGREGATION.json
mpirun --allow-run-as-root -n 3 examples/amgx_mpi_capi -m ../examples/matrix.mtx -c ../src/configs/FGMRES_AGGREGATION.json
mixbench-cuda

# 4 GPU does not converge
mpirun --allow-run-as-root -n 4 examples/amgx_mpi_capi -m ../examples/matrix.mtx -c ../src/configs/FGMRES_AGGREGATION.json

# In practice, this only seems to run on 1 GPU
flux run -N1 -n 4 -g 1 mixbench-cuda
flux run -N2 -n 8 -g 1 mixbench-cuda

# Flux with one node (does not converge with 4)
flux run -N1 -n 3 -g 1 ./examples/amgx_mpi_capi -m ../examples/matrix.mtx -c ../src/configs/FGMRES_AGGREGATION.json

# Two nodes (same, doesn't converge if greater than 3)
flux run -N2 -n 3 -g 1 ./examples/amgx_mpi_capi -m ../examples/matrix.mtx -c ../src/configs/FGMRES_AGGREGATION.json
```

Note that it looks like this is intended to run on one node, so the correct means to run is probably just:

```bash
mixbench-cuda
mixbench-cpu
```
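
Related to the open question in the commit message (running on each GPU separately, and very quickly), one hedged option is to launch one task per GPU and let Flux constrain each task to its own device - a sketch, assuming the gpu-affinity shell option behaves as documented:

```bash
# One mixbench-cuda process per GPU; per-task sets CUDA_VISIBLE_DEVICES for each task
flux run -N2 -n 8 -g 1 -o gpu-affinity=per-task mixbench-cuda
```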

### Clean Up
