Signed-off-by: vsoch <vsoch@users.noreply.github.com> Co-authored-by: vsoch <vsoch@users.noreply.github.com>
Showing 5 changed files with 231 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
# Mpich Example | ||
|
||
You should be able to create a MiniKube cluster, create the namespace, and install the operator: | ||
|
||
```bash | ||
$ minikube start | ||
$ kubectl create namespace flux-operator | ||
$ kubectl apply -f ../../dist/flux-operator.yaml | ||
``` | ||
|
||
You might want to pre-pull the container: | ||
|
||
```bash | ||
$ minikube ssh docker pull ghcr.io/rse-ops/mpich:tag-mamba | ||
``` | ||
|
||
And then create the MiniCluster: | ||
|
||
```bash | ||
$ kubectl create -f minicluster.yaml | ||
``` | ||
|
||
And watch the example run! | ||
|
||
```bash | ||
$ kubectl logs -n flux-operator flux-sample-0-5gjqt -f | ||
``` | ||
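The pod name above carries a random suffix, so it will differ on every run. A minimal sketch of looking it up instead of hard-coding it, assuming the default table output of `kubectl get pods` (the helper name `pod_with_prefix` is hypothetical, not part of the operator):

```shell
# Print the name of the first pod whose name starts with the given prefix.
# Reads the table printed by `kubectl get pods` on stdin and skips its header row.
pod_with_prefix() {
  awk -v prefix="$1" 'NR > 1 && index($1, prefix) == 1 { print $1; exit }'
}

# Usage (assumes kubectl points at the MiniKube cluster):
#   pod=$(kubectl get pods -n flux-operator | pod_with_prefix flux-sample-0-)
#   kubectl logs -n flux-operator "$pod" -f
```

Matching on the `flux-sample-0-` prefix selects the pod whose log contains the `broker.info[0]` lines shown in this example.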
|
||
A successful run prints a greeting from each of the four MPI ranks: | ||
|
||
```console | ||
broker.info[0]: rc1.0: running /etc/flux/rc1.d/02-cron | ||
broker.info[0]: rc1.0: /etc/flux/rc1 Exited (rc=0) 0.5s | ||
broker.info[0]: rc1-success: init->quorum 0.543544s | ||
broker.info[0]: online: flux-sample-0 (ranks 0) | ||
broker.info[0]: online: flux-sample-[0-3] (ranks 0-3) | ||
broker.info[0]: quorum-full: quorum->run 0.369278s | ||
Hello, world! I am 1 of 4(Open MPI v4.0.3, package: Debian OpenMPI, ident: 4.0.3, repo rev: v4.0.3, Mar 03, 2020, 87) | ||
Hello, world! I am 0 of 4(Open MPI v4.0.3, package: Debian OpenMPI, ident: 4.0.3, repo rev: v4.0.3, Mar 03, 2020, 87) | ||
Hello, world! I am 2 of 4(Open MPI v4.0.3, package: Debian OpenMPI, ident: 4.0.3, repo rev: v4.0.3, Mar 03, 2020, 87) | ||
Hello, world! I am 3 of 4(Open MPI v4.0.3, package: Debian OpenMPI, ident: 4.0.3, repo rev: v4.0.3, Mar 03, 2020, 87) | ||
broker.info[0]: rc2.0: flux submit -N 4 -n 4 --quiet --watch ./hello_cxx Exited (rc=0) 0.8s | ||
broker.info[0]: rc2-success: run->cleanup 0.843814s | ||
broker.info[0]: cleanup.0: flux queue stop --quiet --all --nocheckpoint Exited (rc=0) 0.1s | ||
broker.info[0]: cleanup.1: flux cancel --user=all --quiet --states RUN Exited (rc=0) 0.1s | ||
broker.info[0]: cleanup.2: flux queue idle --quiet Exited (rc=0) 0.1s | ||
broker.info[0]: cleanup-success: cleanup->shutdown 0.320065s | ||
broker.info[0]: children-complete: shutdown->finalize 61.2525ms | ||
broker.info[0]: rc3.0: running /etc/flux/rc3.d/01-sched-fluxion | ||
broker.info[0]: rc3.0: /etc/flux/rc3 Exited (rc=0) 0.3s | ||
broker.info[0]: rc3-success: finalize->goodbye 0.310701s | ||
broker.info[0]: goodbye: goodbye->exit 0.037999ms | ||
``` | ||
|
||
Once the job finishes, each pod should report a status of Completed: | ||
|
||
```bash | ||
$ kubectl get -n flux-operator pods | ||
``` | ||
```console | ||
NAME READY STATUS RESTARTS AGE | ||
flux-sample-0-5gjqt 0/1 Completed 0 2m40s | ||
flux-sample-1-j4zlc 0/1 Completed 0 2m40s | ||
flux-sample-2-wdzz7 0/1 Completed 0 2m40s | ||
flux-sample-3-vp8rx 0/1 Completed 0 2m40s | ||
``` |
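If you want to check the result in a script rather than by eye, the table above can be parsed directly. This is a rough sketch, assuming the default column layout of `kubectl get pods` with STATUS in the third column (the helper name `all_completed` is hypothetical):

```shell
# Succeed only if at least one pod is listed and every listed pod is Completed.
# Reads the table printed by `kubectl get pods` on stdin and skips its header row.
all_completed() {
  awk 'NR > 1 { total++; if ($3 == "Completed") ok++ }
       END { exit !(total > 0 && ok == total) }'
}

# Usage (assumes kubectl points at the MiniKube cluster):
#   kubectl get -n flux-operator pods | all_completed && echo "MiniCluster finished"
```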
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
apiVersion: flux-framework.org/v1alpha1 | ||
kind: MiniCluster | ||
metadata: | ||
  name: flux-sample | ||
  namespace: flux-operator | ||
spec: | ||
  # Number of pods to create for MiniCluster | ||
  size: 4 | ||
  tasks: 4 | ||
|
||
  # suppress all output except for test run | ||
  logging: | ||
    quiet: false | ||
|
||
  # This is a list because a pod can support multiple containers | ||
  containers: | ||
    # The container URI to pull (currently needs to be public) | ||
    - image: ghcr.io/rse-ops/mpich:tag-mamba | ||
|
||
      # Note that there are many examples here! | ||
      # flux run -n 4 ./hello_c | ||
      # flux run -n 4 ./hello_cxx | ||
      # flux run -n 4 ./connectivity_c | ||
      # flux run -n 4 ./hello_usempi | ||
      # flux run -n 4 ./ring_c | ||
      # flux run -n 4 ./ring_usempi | ||
      # flux run -n 4 ./ring_mpifh | ||
      command: ./hello_cxx | ||
      workingDir: /opt/ompi | ||
      environment: | ||
        LD_LIBRARY_PATH: /opt/conda/lib | ||
        PYTHONPATH: /opt/conda/lib/python3.10/site-packages |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
# OpenMPI Example | ||
|
||
You should be able to create a MiniKube cluster, create the namespace, and install the operator: | ||
|
||
```bash | ||
$ minikube start | ||
$ kubectl create namespace flux-operator | ||
$ kubectl apply -f ../../dist/flux-operator.yaml | ||
``` | ||
|
||
You might want to pre-pull the container: | ||
|
||
```bash | ||
$ minikube ssh docker pull ghcr.io/rse-ops/ompi:flux-sched-focal | ||
``` | ||
|
||
And then create the MiniCluster: | ||
|
||
```bash | ||
$ kubectl create -f minicluster.yaml | ||
``` | ||
|
||
And watch the example run! | ||
|
||
```bash | ||
$ kubectl logs -n flux-operator flux-sample-0-flg28 -f | ||
``` | ||
|
||
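A successful run prints a greeting from each of the four MPI ranks. The version string reported here is MPICH's, which is very verbose, and output from different ranks can interleave: | ||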
|
||
```console | ||
broker.info[0]: rc1.0: running /etc/flux/rc1.d/02-cron | ||
broker.info[0]: rc1.0: /etc/flux/rc1 Exited (rc=0) 0.6s | ||
broker.info[0]: rc1-success: init->quorum 0.602697s | ||
broker.info[0]: online: flux-sample-0 (ranks 0) | ||
broker.info[0]: online: flux-sample-[0-3] (ranks 0-3) | ||
broker.info[0]: quorum-full: quorum->run 0.361781s | ||
Hello, world! I am 0 of 4(MPICH Version: 3.3a2 | ||
MPICH Release date: Sun Nov 13 09:12:11 MST 2016 | ||
MPICH Device: ch3:nemesis | ||
MPICH configure: --build=x86_64-linux-gnu --prefix=/usr --includedir=${prefix}/include --mandir=${prefix}/share/man --infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var --disable-silent-rules --libdir=${prefix}/lib/x86_64-linux-gnu --libexecdir=${prefix}/lib/x86_64-linux-gnu --disable-maintainer-mode --disable-dependency-tracking --with-libfabric --enable-shared --prefix=/usr --enable-fortran=all --disable-rpath --disable-wrapper-rpath --sysconfdir=/etc/mpich --libdir=/usr/lib/x86_64-linux-gnu --includedir=/usr/include/mpich --docdir=/usr/share/doc/mpich --with-hwloc-prefix=system --enable-checkpointing --with-hydra-ckpointlib=blcr CPPFLAGS= CFLAGS= CXXFLAGS= FFLAGS= FCFLAGS= | ||
MPICH CC: gcc -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -Wformat -Werror=format-security -O2 | ||
MPICH CXX: g++ -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -Wformat -Werror=format-security -O2 | ||
MPICH F77: gfortran -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -O2 | ||
MPICH FC: gfortran -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -O2 | ||
, 1297) | ||
Hello, world! I am 2 of 4(MPICH Version: 3.3a2 | ||
MPICH Release date: Sun Nov 13 09:12:11 MST 2016 | ||
MPICH Device: ch3:nemesis | ||
MPICH configure: --build=x86_64-linux-gnu --prefix=/usr --includedir=${prefix}/include --mandir=${prefix}/share/man --infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var --disable-silent-rules --libdir=${prefix}/lib/x86_64-linux-gnu --libexecdir=${prefix}/lib/x86_64-linux-gnu --disable-maintainer-mode --disable-dependency-tracking --with-libfabric --enable-shared --prefix=/usr --enable-fortran=all --disable-rpath --disable-wrapper-rpath --sysconfdir=/etc/mpich --libdir=/usr/lib/x86_64-linux-gnu --includedir=/usr/include/mpich --docdir=/usr/share/doc/mpich --with-hwloc-prefix=system --enable-checkpointing --with-hydra-ckpointlib=blcr CPPFLAGS= CFLAGS= CXXFLAGS= FFLAGS= FCFLAGS= | ||
MPICH CC: gcc -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -Wformat -Werror=format-security -O2 | ||
MPICH CXX: g++ -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -Wformat -Werror=format-security -O2 | ||
MPICH F77: gfortran -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -O2 | ||
MPICH FC: gfortran -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -O2 | ||
, 1297) | ||
Hello, world! I am 3 of 4(MPICH Version: 3.3a2 | ||
MPICH Release date: Sun Nov 13 09:12:11 MST 2016 | ||
MPICH Device: ch3:nemesis | ||
MPICH configure: --build=x86_64-linux-gnu --prefix=/usr --includedir=${prefix}/include --mandir=${prefix}/share/man --infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var --disable-silent-rules --libdir=${prefix}/lib/x86_64-linux-gnu --libexecdir=${prefix}/lib/x86_64-linux-gnu --disable-maintainer-mode --disable-dependency-tracking --with-libfabric --enable-shared --prefix=/usr --enable-fortran=all --disable-rpath --disable-wrapper-rpath --sysconfdir=/etc/mpich --libdir=/usr/lib/x86_64-linux-gnu --includedir=/usr/include/mpich --docdir=/usr/share/doc/mpich --with-hwloc-prefix=system --enable-checkpointing --with-hydra-ckpointlib=blcr CPPFLAGS= CFLAGS= CXXFLAGS= FFLAGS= FCFLAGS= | ||
MPICH CC: gcc -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -Wformat -Werror=format-security -O2 | ||
MPICH CXX: g++ -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -Wformat -Werror=format-security -O2 | ||
Hello, world! I am 1 of 4(MPICH Version: 3.3a2 | ||
MPICH F77: gfortran -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -O2 | ||
MPICH Release date: Sun Nov 13 09:12:11 MST 2016 | ||
MPICH FC: gfortran -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -O2 | ||
MPICH Device: ch3:nemesis | ||
, 1297) | ||
MPICH configure: --build=x86_64-linux-gnu --prefix=/usr --includedir=${prefix}/include --mandir=${prefix}/share/man --infodir=${prefix}/share/info --sysconfdir=/etc --localstatedir=/var --disable-silent-rules --libdir=${prefix}/lib/x86_64-linux-gnu --libexecdir=${prefix}/lib/x86_64-linux-gnu --disable-maintainer-mode --disable-dependency-tracking --with-libfabric --enable-shared --prefix=/usr --enable-fortran=all --disable-rpath --disable-wrapper-rpath --sysconfdir=/etc/mpich --libdir=/usr/lib/x86_64-linux-gnu --includedir=/usr/include/mpich --docdir=/usr/share/doc/mpich --with-hwloc-prefix=system --enable-checkpointing --with-hydra-ckpointlib=blcr CPPFLAGS= CFLAGS= CXXFLAGS= FFLAGS= FCFLAGS= | ||
MPICH CC: gcc -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -Wformat -Werror=format-security -O2 | ||
MPICH CXX: g++ -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -Wformat -Werror=format-security -O2 | ||
MPICH F77: gfortran -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -O2 | ||
MPICH FC: gfortran -g -O2 -fdebug-prefix-map=/build/mpich-O9at2o/mpich-3.3~a2=. -fstack-protector-strong -O2 | ||
, 1297) | ||
broker.info[0]: rc2.0: flux submit -N 4 -n 4 --quiet --watch ./hello_cxx Exited (rc=0) 0.4s | ||
broker.info[0]: rc2-success: run->cleanup 0.380367s | ||
broker.info[0]: cleanup.0: flux queue stop --quiet --all --nocheckpoint Exited (rc=0) 0.1s | ||
broker.info[0]: cleanup.1: flux cancel --user=all --quiet --states RUN Exited (rc=0) 0.1s | ||
broker.info[0]: cleanup.2: flux queue idle --quiet Exited (rc=0) 0.1s | ||
broker.info[0]: cleanup-success: cleanup->shutdown 0.264937s | ||
broker.info[0]: children-complete: shutdown->finalize 62.0603ms | ||
broker.info[0]: rc3.0: running /etc/flux/rc3.d/01-sched-fluxion | ||
broker.info[0]: rc3.0: /etc/flux/rc3 Exited (rc=0) 0.2s | ||
broker.info[0]: rc3-success: finalize->goodbye 0.217901s | ||
broker.info[0]: goodbye: goodbye->exit 0.028526ms | ||
``` | ||
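Because the ranks write at the same time, their version strings can interleave in the log, which makes it hard to confirm by eye that all four reported. A small sketch that counts the distinct `I am N of M` greetings instead (the helper name `count_ranks` is hypothetical, and `grep -o` is assumed to be available, as it is in GNU and BSD grep):

```shell
# Count the distinct "I am <rank> of <size>" greetings in a log read from stdin.
# Duplicate greetings collapse to one, so repeated lines do not inflate the count.
count_ranks() {
  grep -o 'I am [0-9][0-9]* of [0-9][0-9]*' | sort -u | wc -l | tr -d ' '
}

# Usage (assumes kubectl points at the MiniKube cluster and "$pod" holds the rank-0 pod name):
#   kubectl logs -n flux-operator "$pod" | count_ranks
```

For the run shown above, the count should match the `tasks` value in `minicluster.yaml`.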
|
||
Once the job finishes, each pod should report a status of Completed: | ||
|
||
```bash | ||
$ kubectl get -n flux-operator pods | ||
``` | ||
```console | ||
NAME READY STATUS RESTARTS AGE | ||
flux-sample-0-flg28 0/1 Completed 0 9m39s | ||
flux-sample-1-fplvv 0/1 Completed 0 9m39s | ||
flux-sample-2-7bltz 0/1 Completed 0 9m39s | ||
flux-sample-3-p8mtj 0/1 Completed 0 9m39s | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
apiVersion: flux-framework.org/v1alpha1 | ||
kind: MiniCluster | ||
metadata: | ||
  name: flux-sample | ||
  namespace: flux-operator | ||
spec: | ||
  # Number of pods to create for MiniCluster | ||
  size: 4 | ||
  tasks: 4 | ||
|
||
  # suppress all output except for test run | ||
  logging: | ||
    quiet: false | ||
|
||
  # This is a list because a pod can support multiple containers | ||
  containers: | ||
    # The container URI to pull (currently needs to be public) | ||
    - image: ghcr.io/rse-ops/ompi:flux-sched-focal | ||
|
||
      # Note that there are many examples here! | ||
      # flux run -n 4 ./hello_c | ||
      # flux run -n 4 ./hello_cxx | ||
      # flux run -n 4 ./connectivity_c | ||
      # flux run -n 4 ./hello_usempi | ||
      # flux run -n 4 ./ring_c | ||
      # flux run -n 4 ./ring_usempi | ||
      # flux run -n 4 ./ring_mpifh | ||
      workingDir: /opt/ompi | ||
      command: ./hello_cxx |