add example of using flux tree with variables (#147)
* add script to run

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
Showing 4 changed files with 136 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,81 @@ | ||
# Tree with Variables | ||
|
||
We can use [flux tree](https://github.com/flux-framework/flux-sched/blob/master/t/t2001-tree-real.t#L43-L51) | ||
to create instances inside of instances. For this example, we will start with a root, create | ||
two instances under it, and two instances under each of those. We will (instead of running hostname) run | ||
a script that demonstrates the environment available to each subinstance. | ||
You can read more about [the utility here](https://github.com/flux-framework/flux-sched/blob/master/resource/utilities/README.md). | ||
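
To make the shape of the hierarchy concrete, here is a minimal sketch that lists the instance IDs a topology string such as `2x2` implies. The helper name `tree_ids` is hypothetical (it is not part of `flux tree`), and the `tree.N.M` naming is assumed from the `TreeID` column in the example output further down.

```bash
#!/bin/bash
# Hypothetical helper, not part of flux tree: print the instance IDs
# implied by a topology string such as "2x2" (root, children, grandchildren).
tree_ids() {
    local topology="$1"
    local prefix="${2:-tree}"
    local fanout rest i

    # Every call prints the ID of the instance it represents.
    echo "${prefix}"

    # No levels left: this instance is a leaf.
    [ -z "${topology}" ] && return 0

    # Split "2x2" into the fanout for this level ("2") and the remainder ("2").
    fanout="${topology%%x*}"
    case "${topology}" in
        *x*) rest="${topology#*x}" ;;
        *)   rest="" ;;
    esac

    i=1
    while [ "${i}" -le "${fanout}" ]; do
        tree_ids "${rest}" "${prefix}.${i}"
        i=$((i + 1))
    done
}

tree_ids 2x2
```

For `2x2` this lists seven IDs: the root, two children, and four leaves.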
|
||
## Usage | ||
|
||
First, let's create a kind cluster. From the context of this directory: | ||
|
||
```bash | ||
$ kind create cluster --config ../../kind-config.yaml | ||
``` | ||
|
||
And then install the operator, create the namespace, and apply the MiniCluster YAML here. | ||
|
||
```bash | ||
$ kubectl apply -f ../../dist/flux-operator.yaml | ||
$ kubectl create namespace flux-operator | ||
$ kubectl apply -f ./minicluster.yaml | ||
``` | ||
|
||
When the cluster is created, the present working directory (where you are reading this file) is | ||
bound to `/tmp/workflow`, and the `flux tree` command runs there. You can follow the logs | ||
for the run via (the suffix of your pod name will differ): | ||
|
||
```bash | ||
$ kubectl logs -n flux-operator flux-sample-0-7tx7s -f | ||
``` | ||
|
||
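When it's done, `tree.out` (written to `/tmp/workflow` in the cluster) will appear in the present working directory. | ||
The MiniCluster runs a command like the one shown here (see `minicluster.yaml` for the exact invocation), and the contents of `tree.out` follow: | ||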
|
||
```bash | ||
$ flux tree -T2x2 -J 4 -N 4 -c 4 -o /tmp/workflow/tree.out -Q easy:fcfs /bin/bash ./run-on-instance.sh | ||
``` | ||
```console | ||
$ cat tree.out | ||
TreeID    Elapsed(sec)  Begin(Epoch)       End(Epoch)         Match(usec)  NJobs  NNodes  CPN  GPN | ||
tree      3.646440      1682094481.024492  1682094484.670933  0.000000     4      4       4    0 | ||
tree.2    1.847760      1682094482.167398  1682094484.015160  0.000000     2      2       4    0 | ||
tree.2.2  0.146933      1682094483.195491  1682094483.342424  0.000000     1      1       4    0 | ||
tree.2.1  0.098842      1682094483.068877  1682094483.167719  0.000000     1      1       4    0 | ||
tree.1    1.789910      1682094482.071364  1682094483.861272  0.000000     2      2       4    0 | ||
tree.1.2  0.102510      1682094483.056029  1682094483.158540  0.000000     1      1       4    0 | ||
tree.1.1  0.119904      1682094482.937050  1682094483.056954  0.000000     1      1       4    0 | ||
``` | ||
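
If you only care about the leaf instances, something like the following could pull their timings out of `tree.out`. This is a sketch under assumptions: `leaf_times` is a hypothetical helper name, and it assumes the whitespace-separated layout shown above, with the tree ID in the first column and the elapsed seconds in the second.

```bash
#!/bin/bash
# Hypothetical helper: print "TreeID Elapsed(sec)" for leaf instances only.
# A leaf is an ID that no other ID extends (e.g. tree.1.1 has no tree.1.1.N).
leaf_times() {
    awk 'NR > 1 && NF >= 2 { ids[$1] = $2 }   # skip the header and blank lines
         END {
             for (id in ids) {
                 is_leaf = 1
                 for (other in ids)
                     if (index(other, id ".") == 1) { is_leaf = 0; break }
                 if (is_leaf) print id, ids[id]
             }
         }' "$1" | sort
}
```

You would call it as `leaf_times tree.out` from the directory that holds the output file.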
|
||
This information is repeated from the [basic tree](../tree) example, and you can look there for details about what the above means. | ||
For this example, we focus on the variables available in the script, and we write files that are named by the tree id! You | ||
should be able to see them in the present working directory: | ||
|
||
```bash | ||
$ ls | ||
``` | ||
```console | ||
minicluster.yaml README.md run-on-instance.sh tree.1.1-output.txt tree.1.2-output.txt tree.2.1-output.txt tree.2.2-output.txt tree.out | ||
``` | ||
|
||
If we look in one of the output files, we can see the variables available to the instance: | ||
|
||
```bash | ||
$ cat tree.1.2-output.txt | ||
``` | ||
```console | ||
FLUX_TREE_ID tree.1.2 | ||
FLUX_TREE_JOBSCRIPT_INDEX 1 | ||
FLUX_TREE_NNODES 1 | ||
FLUX_TREE_NCORES_PER_NODE 1 | ||
FLUX_TREE_NGPUS_PER_NODE 0 | ||
``` | ||
|
||
Note that for this example the script only runs on the leaves, which is why `FLUX_TREE_NNODES` is 1 above. The `NNodes` column in the table above | ||
shows the allocation going from `4 > 2 > 1` as we descend the tree. You could add custom logic to this little script to control execution of your job, likely with different instances using different resources. | ||
It's super cool! | ||
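
As a rough idea of what that custom logic might look like, here is a minimal sketch that branches on `FLUX_TREE_ID`. The function name `choose_task` and the task labels (`preprocess`, `simulate`, `default`) are hypothetical placeholders; only the `FLUX_TREE_*` variable names come from the example above.

```bash
#!/bin/bash
# Hypothetical sketch: pick a different workload depending on which
# branch of the tree this instance sits under.
choose_task() {
    local tree_id="${FLUX_TREE_ID:-unknown}"
    local nnodes="${FLUX_TREE_NNODES:-1}"

    case "${tree_id}" in
        tree.1.*) echo "preprocess on ${nnodes} node(s)" ;;  # leaves under tree.1
        tree.2.*) echo "simulate on ${nnodes} node(s)" ;;    # leaves under tree.2
        *)        echo "default on ${nnodes} node(s)" ;;     # anything else
    esac
}

choose_task
```

In a real job script you would replace the `echo` lines with the commands you want each branch to run.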
|
||
```bash | ||
$ kubectl delete -f minicluster.yaml | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
apiVersion: flux-framework.org/v1alpha1 | ||
kind: MiniCluster | ||
metadata: | ||
  name: flux-sample | ||
  namespace: flux-operator | ||
spec: | ||
  # set quiet to true to suppress all output except for the test run | ||
  logging: | ||
    quiet: false | ||
|
||
  # Number of pods to create for MiniCluster | ||
  size: 4 | ||
  tasks: 4 | ||
|
||
  # Make this kind of persistent volume and claim available to pods | ||
  volumes: | ||
    data: | ||
      storageClass: hostpath | ||
      path: /tmp/workflow | ||
|
||
  # See examples in this test file: | ||
  # https://github.com/flux-framework/flux-sched/blob/master/t/t2001-tree-real.t#L43-L51 | ||
  # And documentation here: | ||
  # https://github.com/flux-framework/flux-sched/blob/master/resource/utilities/README.md | ||
  containers: | ||
    - image: ghcr.io/flux-framework/flux-restful-api:latest | ||
      launcher: true | ||
      cores: 4 | ||
|
||
      # provide the /tmp/workflow as an output directory for each tree to write to! | ||
      command: flux tree -T2x2 -J 4 -N 4 -c 4 -o /tmp/workflow/tree.out -Q easy:fcfs /bin/bash /tmp/workflow/run-on-instance.sh /tmp/workflow | ||
      volumes: | ||
        data: | ||
          path: /tmp/workflow |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
#!/bin/bash | ||
|
||
# It's hard to see this for a quick job, so let's write to a file! | ||
outdir=${1} | ||
outfile="${outdir}/${FLUX_TREE_ID}-output.txt" | ||
|
||
# ID string uniquely identifying the hierarchical path of the Flux instance on which Jobscript is being executed | ||
echo "FLUX_TREE_ID ${FLUX_TREE_ID}" > "${outfile}" | ||
|
||
# the integer ID of each jobscript invocation local to the Flux instance. It starts from 1 and sequentially increases. | ||
echo "FLUX_TREE_JOBSCRIPT_INDEX ${FLUX_TREE_JOBSCRIPT_INDEX}" >> "${outfile}" | ||
|
||
# the number of nodes assigned to the instance | ||
echo "FLUX_TREE_NNODES ${FLUX_TREE_NNODES}" >> "${outfile}" | ||
|
||
# the number of cores per node assigned to the instance | ||
echo "FLUX_TREE_NCORES_PER_NODE ${FLUX_TREE_NCORES_PER_NODE}" >> "${outfile}" | ||
|
||
# the number of GPUs per node assigned to the instance. | ||
echo "FLUX_TREE_NGPUS_PER_NODE ${FLUX_TREE_NGPUS_PER_NODE}" >> "${outfile}" |
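
The five `echo` lines above could also be written as a loop. This is only a sketch of an alternative, not what the commit ships: `dump_flux_tree_vars` is a hypothetical name, and it assumes the `FLUX_TREE_*` values are exported environment variables so that `printenv` can read them.

```bash
#!/bin/bash
# Hypothetical alternative: print each FLUX_TREE_* variable as "NAME value".
# printenv prints nothing for an unset variable, so the line is just "NAME ".
dump_flux_tree_vars() {
    local name
    for name in FLUX_TREE_ID FLUX_TREE_JOBSCRIPT_INDEX FLUX_TREE_NNODES \
                FLUX_TREE_NCORES_PER_NODE FLUX_TREE_NGPUS_PER_NODE; do
        echo "${name} $(printenv "${name}")"
    done
}
```

Inside the job script this would be used as `dump_flux_tree_vars > "${outfile}"`, which writes the same lines in the same order as the original.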