Add nvbandwidth sample #19
Conversation
multinodeNVL/nvbandwidth/Dockerfile (outdated)
@@ -0,0 +1,30 @@
FROM nvcr.io/nvidia/cuda:12.5.0-devel-ubuntu22.04 AS builder
Nit: License header.
We should also make the CUDA version here an argument so that we can easily update to a newer one without having to change this file.
Yes, we need the license header and the CUDA version as an ARG.
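For illustration, a minimal sketch of what that could look like at the top of the Dockerfile; the exact header text and the default value are assumptions, not the final wording:

# Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0

# CUDA version can be overridden at build time:
#   docker build --build-arg CUDA_VERSION=<version> ...
ARG CUDA_VERSION=12.5.0
FROM nvcr.io/nvidia/cuda:${CUDA_VERSION}-devel-ubuntu22.04 AS builder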
@klueska on other projects we have moved to defining the versions in the FROM line directly since this allows us to leverage dependabot automation to update our base images instead of doing this manually.
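For context, the dependabot side of that is just a docker package-ecosystem entry pointing at the directory that holds the Dockerfile; a rough sketch, where the directory path is an assumption based on where the file sits in this PR:

# Sketch of a .github/dependabot.yml entry (path is illustrative)
version: 2
updates:
  - package-ecosystem: "docker"
    directory: "/multinodeNVL/nvbandwidth"
    schedule:
      interval: "weekly"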
multinodeNVL/nvbandwidth/Dockerfile (outdated)
    cmake -DMULTINODE=1 . && \
    make -j$(nproc)

FROM mpioperator/openmpi:v0.6.0
Is this required because the application needs MPI?
This was my suggestion, because yes, all multi-node tests rely on MPI, and it lets us create jobs such as:
---
apiVersion: resource.nvidia.com/v1beta1
kind: ComputeDomain
metadata:
  name: mpi-memcpy-imex-test-compute-domain
spec:
  numNodes: 4
  channel:
    resourceClaimTemplate:
      name: mpi-memcpy-imex-test-compute-domain-resource-claim
---
apiVersion: kubeflow.org/v2beta1
kind: MPIJob
metadata:
  name: mpi-memcpy-imex-test
spec:
  slotsPerWorker: 2
  launcherCreationPolicy: WaitForWorkersReady
  runPolicy:
    cleanPodPolicy: Running
  sshAuthMountPath: /home/mpiuser/.ssh
  mpiReplicaSpecs:
    Launcher:
      replicas: 1
      template:
        metadata:
          labels:
            mpi-memcpy-dra-test-replica: mpi-launcher
        spec:
          containers:
          - image: nvcr.io/nvidia/nvbandwidth:v0.7
            name: mpi-launcher
            securityContext:
              runAsUser: 1000
            command:
            - mpirun
            args:
            - --bind-to
            - core
            - --map-by
            - ppr:2:node
            - -np
            - "8"
            - --report-bindings
            - -q
            - nvbandwidth
            - -t
            - multinode_device_to_device_memcpy_read_ce
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: node-role.kubernetes.io/control-plane
                    operator: Exists
    Worker:
      replicas: 4
      template:
        metadata:
          labels:
            mpi-memcpy-dra-test-replica: mpi-worker
        spec:
          containers:
          - image: nvcr.io/nvidia/nvbandwidth:v0.7
            name: mpi-worker
            securityContext:
              runAsUser: 1000
            env:
            command:
            - /usr/sbin/sshd
            args:
            - -De
            - -f
            - /home/mpiuser/.sshd_config
            resources:
              limits:
                nvidia.com/gpu: 2
              claims:
              - name: compute-domain
          resourceClaims:
          - name: compute-domain
            resourceClaimTemplateName: mpi-memcpy-imex-test-compute-domain-resource-claim
One should note that this image may then need additional steps to be approved for publication.
Yes, I will take care of the approvals.
.github/workflows/image.yaml (outdated)
      PUSH_ON_BUILD: true
    run: |
      echo "nvbandwidth ${NVBANDWIDTH_VERSION}"
      docker build --platform=linux/arm64 \
Could you add a Makefile at multinodeNVL/nvbandwidth so users can reproduce this command if they want to build the image locally, and also so we can call the make target from CI to align with all our CI image builds?
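Something minimal along these lines would probably do; the target name, variables, and defaults below are illustrative rather than final (recipe lines must be indented with a tab):

# Hypothetical sketch of multinodeNVL/nvbandwidth/Makefile
CUDA_VERSION        ?= 12.5.0
NVBANDWIDTH_VERSION ?= v0.7
IMAGE_NAME          ?= ghcr.io/nvidia/k8s-samples:nvbandwidth

# Resolve paths relative to this Makefile so the target also works from the repo root.
DOCKERFILE_DIR := $(dir $(abspath $(lastword $(MAKEFILE_LIST))))

.PHONY: build
build:
	docker build \
		--build-arg CUDA_VERSION="$(CUDA_VERSION)" \
		--build-arg NVBANDWIDTH_VERSION="$(NVBANDWIDTH_VERSION)" \
		--tag "$(IMAGE_NAME)" \
		-f $(DOCKERFILE_DIR)Dockerfile $(DOCKERFILE_DIR)

CI could then invoke the same target, for example with `make -f multinodeNVL/nvbandwidth/Makefile build`, instead of calling docker build directly.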
Yes -- also it should not build specifically for arm -- it should build a multi-arch image like all of our other projects
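For reference, a single multi-arch invocation could look roughly like the following; the tag and platform list are illustrative, and buildx with QEMU (or native runners) is needed for the non-host architecture:

docker buildx build \
    --platform linux/amd64,linux/arm64 \
    --build-arg CUDA_VERSION="12.5.0" \
    --build-arg NVBANDWIDTH_VERSION="v0.7" \
    --tag ghcr.io/nvidia/k8s-samples:nvbandwidth \
    --push \
    -f multinodeNVL/nvbandwidth/Dockerfile .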
I didn't add the amd64 build because we only tested it on ARM, and I was not sure how to test it on x86.
Please add the multi-arch build, similar to what we have for other images across our projects.
You can run one of the single-node tests. Here are the results of running on a DGX A100, for example:
kklues@luna-08:~/nvbandwidth$ docker run --rm -it --gpus all nvcr.io/nvidia/nvbandwidth:v0.7 nvbandwidth -t device_to_device_memcpy_read_ce
nvbandwidth Version: v0.7
Built from Git version: v0.7
MPI version: Open MPI v4.1.4, package: Debian OpenMPI, ident: 4.1.4, repo rev: v4.1.4, May 26, 2022
CUDA Runtime Version: 12070
CUDA Driver Version: 12070
Driver Version: 565.57.01
Process 0 (2f68f68068f0): device 0: NVIDIA A100-SXM4-40GB (00000000:07:00)
Running device_to_device_memcpy_read_ce.
memcpy CE GPU(row) -> GPU(column) bandwidth (GB/s)
0 1 2 3 4 5 6 7
0 N/A 266.54 267.13 266.40 266.37 266.27 267.88 267.39
1 277.42 N/A 278.47 278.58 271.68 267.18 266.98 267.62
2 278.58 277.56 N/A 278.59 277.56 267.23 268.16 266.20
3 278.77 278.86 278.22 N/A 278.49 278.76 278.77 266.96
4 278.84 277.63 278.76 277.65 N/A 278.39 278.86 269.22
5 278.76 278.86 278.49 278.74 277.65 N/A 278.76 278.67
6 278.86 278.58 276.19 267.67 267.44 267.20 N/A 267.24
7 278.69 278.94 278.86 278.95 278.67 278.86 278.39 N/A
SUM device_to_device_memcpy_read_ce 15372.40
NOTE: The reported results may not reflect the full capabilities of the platform.
Performance can vary with software drivers, hardware clocks, and system topology.
multinodeNVL/nvbandwidth/Dockerfile (outdated)
@@ -0,0 +1,46 @@
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
2025?
Force-pushed e94c3dc to f853c5a.
    cmake -DMULTINODE=1 . && \
    make -j$(nproc)

FROM mpioperator/openmpi:v0.6.0
Can we make the openmpi version configurable as well?
Note that we try to use dependabot to update base container images where applicable, and if we wanted to leverage that for openmpi as well, we would not want to make the version a build argument.
multinodeNVL/nvbandwidth/Makefile (outdated)
Instead of duplicating the make logic here, would it make sense to leverage the Makefile in deployments/container/ instead? This makefile follows the required pattern for all our other projects and also ensures that our base images are up to date.
Moving the Dockerfile from this folder to /deployments/container/nvbandwidth/Dockerfile.ubuntu, and applying the following diff to deployments/container/Makefile:
diff --git a/deployments/container/Makefile b/deployments/container/Makefile
index 4a74687..d25bb0f 100644
--- a/deployments/container/Makefile
+++ b/deployments/container/Makefile
@@ -80,7 +80,11 @@ endif
endif
build-%: DIST = $(*)
+ifneq ($(SAMPLE),nvbandwidth)
build-%: DOCKERFILE = $(CURDIR)/deployments/container/Dockerfile.$(DOCKERFILE_SUFFIX)
+else
+build-%: DOCKERFILE = $(CURDIR)/deployments/container/$(SAMPLE)/Dockerfile.$(DOCKERFILE_SUFFIX)
+endif
build-ubuntu%: DOCKERFILE_SUFFIX = ubuntu
This allows the existing CI to be leveraged with:
diff --git a/.github/workflows/image.yaml b/.github/workflows/image.yaml
index b066bf1..2b7de81 100644
--- a/.github/workflows/image.yaml
+++ b/.github/workflows/image.yaml
@@ -40,11 +40,14 @@ jobs:
- vectorAdd
- nbody
- deviceQuery
+ - nvbandwidth
exclude:
- dist: ubi8
sample: deviceQuery
- dist: ubi8
sample: nbody
+ - dist: ubi8
+ sample: nvbandwidth
steps:
- uses: actions/checkout@v4
@@ -86,30 +89,3 @@ jobs:
echo "${VERSION}"
SAMPLE=${{ matrix.sample }} make -f deployments/container/Makefile build-${{ matrix.dist }}
- build-nvbandwidth:
- runs-on: ubuntu-latest
- steps:
- - name: Check out code
- uses: actions/checkout@v4
- - name: Set up QEMU
- uses: docker/setup-qemu-action@v3
- with:
- image: tonistiigi/binfmt:master
- - name: Set up Docker Buildx
- uses: docker/setup-buildx-action@v3
- - name: Login to GitHub Container Registry
- uses: docker/login-action@v3
- with:
- registry: ghcr.io
- username: ${{ github.actor }}
- password: ${{ secrets.GITHUB_TOKEN }}
- - name: Build nvbandwidth image
- env:
- IMAGE_NAME: ghcr.io/nvidia/k8s-samples:nvbandwidth
- CUDA_VERSION: 12.5.0
- NVBANDWIDTH_VERSION: v0.7
- PUSH_ON_BUILD: true
- BUILD_MULTI_ARCH_IMAGES: true
- run: |
- echo "nvbandwidth ${NVBANDWIDTH_VERSION}"
- make -f multinodeNVL/nvbandwidth/Makefile build
See for example: #20
Unlike other cuda-sample images, this image seems more version-sensitive (CUDA, driver, and nvbandwidth release versions). It should not be updated or built in the same way we do the other sample images. I agree with the structure alignment, but they should be separate builds IMO.
I can update it based on Carlos's changes:
main...ArangoGutierrez:k8s-samples:pr-19
Based on Arjun's work here, https://github.com/NVIDIA/gpu-operator/pull/1262/files, a platform-specific build seems like a better option:
nvbandwidth ${COMMIT_SHORT_SHA}-amd64
Building Docker image: ghcr.io/guptanswati/nvbandwidth:v0.7-485225f2-amd64
DOCKER_BUILDKIT=1 \
docker build --pull \
    --provenance=false --sbom=false \
    --platform=linux/aarch64 \
    --tag ghcr.io/guptanswati/nvbandwidth:v0.7-485225f2-amd64 \
    --build-arg CUDA_VERSION="12.5.0" \
    --build-arg NVBANDWIDTH_VERSION="v0.7" \
    --build-arg OPENMPI_VERSION="v0.6.0" \
    -f /home/nvidia/swati-imex-test/k8s-samples/multinodeNVL/nvbandwidth/Dockerfile \
    /home/nvidia/swati-imex-test/k8s-samples
[+] Building 0.3s (11/11) FINISHED                                docker:default
 => [internal] load build definition from Dockerfile                        0.0s
 => => transferring dockerfile: 1.66kB                                      0.0s
 => [internal] load metadata for docker.io/mpioperator/openmpi:v0.6.0       0.3s
 => [internal] load metadata for nvcr.io/nvidia/cuda:12.5.0-devel-ubuntu2   0.1s
 => [internal] load .dockerignore                                           0.0s
 => => transferring context: 2B                                             0.0s
 => [builder 1/4] FROM nvcr.io/nvidia/cuda:12.5.0-devel-ubuntu22.04@sha25   0.0s
 => => resolve nvcr.io/nvidia/cuda:12.5.0-devel-ubuntu22.04@sha256:4b46c6   0.0s
 => [stage-1 1/2] FROM docker.io/mpioperator/openmpi:v0.6.0@sha256:6d6bf8   0.0s
 => CACHED [builder 2/4] RUN apt-get update && apt-get install -y --no-in   0.0s
 => CACHED [builder 3/4] WORKDIR /bandwidthtest                             0.0s
 => CACHED [builder 4/4] RUN git clone --branch v0.7 --depth 1 --single-b   0.0s
 => CACHED [stage-1 2/2] COPY --from=builder /bandwidthtest/nvbandwidth/n   0.0s
 => exporting to image                                                      0.0s
 => => exporting layers                                                     0.0s
 => => writing image sha256:a45c0e75199c4a129a178b132ed254f61706e2842ee54   0.0s
 => => naming to ghcr.io/guptanswati/nvbandwidth:v0.7-485225f2-amd64        0.0s
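A per-architecture CI job along those lines might look roughly like the sketch below; the runner labels, tags, and paths are assumptions rather than the final workflow, and a follow-up job would still need to merge the per-arch images into a single manifest:

# Hypothetical sketch: build each architecture natively on a matching runner.
jobs:
  build-nvbandwidth:
    strategy:
      matrix:
        include:
          - arch: amd64
            runner: ubuntu-latest      # assumed x86_64 hosted runner
          - arch: arm64
            runner: ubuntu-24.04-arm   # assumed arm64 hosted runner label
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v4
      - name: Build nvbandwidth image natively for ${{ matrix.arch }}
        run: |
          docker build \
            --platform=linux/${{ matrix.arch }} \
            --build-arg CUDA_VERSION="12.5.0" \
            --build-arg NVBANDWIDTH_VERSION="v0.7" \
            --build-arg OPENMPI_VERSION="v0.6.0" \
            --tag ghcr.io/nvidia/k8s-samples:nvbandwidth-${{ matrix.arch }} \
            -f multinodeNVL/nvbandwidth/Dockerfile .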
Although I don't disagree that using platform-specific runners is a good option here -- especially since we're not cross compiling a language like go -- I would argue that we should do that for ALL our samples and not a single one. Does it make sense to push this forward as is, and then switch to native runners as a follow-up once we have ironed out the changes required to the makefiles and the CI as are being discussed in NVIDIA/gpu-operator#1262?
On:

> Unlike other cuda-sample images, this image seems more version sensitive (CUDA, DRIVER and nvbandwidth release version). This should not be updated or built in the same way we do the other sample images. I agree with the structure alignment but they should be separate builds IMO.
To be clear: they are separate builds, and all versions are currently specified in the Dockerfile. I know that in the past we have leveraged build args to set values so that these are more flexible. I believe the intent was to be able to set these "globally" in CI variables, but that was never followed through on, and version bumps were file edits anyway. If these examples are so version-dependent, we should define the versions in one place -- the Dockerfile, for example -- and then only change them there. This also allows users to more easily replicate the builds themselves.
@guptaNswati please also add a meaningful description to the PR.
Force-pushed debc9ed to 485225f.
Once the merge conflict is resolved, we can proceed.
Force-pushed 60b2428 to 6c3323c.
This change adds an nvbandwidth sample that can be used to test both single and multi-node GPU interconnectivity. The multi-arch images are generated with the following image root: nvcr.io/ghcr.io/nvidia/k8s-samples:nvbandwidth

Signed-off-by: Swati Gupta <swatig@nvidia.com>
Signed-off-by: Evan Lezar <elezar@nvidia.com>
Force-pushed 6c3323c to 5410628.
elezar left a comment:
Thanks @guptaNswati
This builds a container image for the NVIDIA nvbandwidth GPU benchmarking tool. It helps us demo and test multi-node NVLink functionality, which requires MPI, IMEX, and DRA.