In [129]:
# Full code and examples can be found here: https://github.com/flux-framework/flux-workflow-examples.git

import json
import os
import re
import sys
import flux
from flux.job import JobspecV1
from flux.job.JobID import JobID

Flux includes a sub-command for bootstrapping: `flux start`.  On an HPC system, you would use this in the same way as an MPI application; for example on a Slurm cluster, you would run `srun flux start`.  For local development and testing purposes, you can start multiple broker ranks on a single node by passing the `--size=N` flag to `flux start`.  For example, to start a Flux session with 4 brokers on the local node:

In [23]:
!flux start --size=4 flux getattr size

4


Flux uses hwloc to detect the resources on each node and then to populate its resource graph.  You can access the hwloc topology information that Flux collects with the `flux hwloc` subcommand:

In [27]:
!flux hwloc info
!flux hwloc topology

4 Machines, 28 Cores, 28 PUs
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topology SYSTEM "hwloc.dtd">
<topology>
  <object type="Machine" os_index="0" cpuset="0x0000007f" complete_cpuset="0x0000007f" online_cpuset="0x0000007f" allowed_cpuset="0x0000007f" nodeset="0x00000001" complete_nodeset="0x00000001" allowed_nodeset="0x00000001">
    <info name="DMIProductName" value="BHYVE"/>
    <info name="DMIProductVersion" value="1.0"/>
    <info name="DMIChassisVendor" value=""/>
    <info name="DMIChassisType" value="2"/>
    <info name="DMIChassisVersion" value="1.0"/>
    <info name="DMIChassisAssetTag" value="None"/>
    <info name="DMIBIOSVendor" value="BHYVE"/>
    <info name="DMIBIOSVersion" value="1.00"/>
    <info name="DMIBIOSDate" value="03/14/2014"/>
    <info name="DMISysVendor" value=""/>
    <info name="Backend" value="Linux"/>
    <info name="LinuxCgroup" value="/docker/a1f80548f61c765f5e97d7a3dc12389d8f7da4f778d7870005578b03bcc52f1f"/>
    <info name="OSName" value="Linu

Flux can also bootstrap its resource graph from static input files, as in the case of a multi-user system instance set up by site administrators ([more information on Flux's static resource configuration files](https://flux-framework.readthedocs.io/en/latest/adminguide.html#resource-configuration)).  Flux also provides a more standard interface for listing available resources that works regardless of the resource input source: `flux resource`.

In [34]:
# To view status of resources
!flux resource status
# To view the scheduler's perspective on resources (allocated, free, etc.)
!flux resource list

    STATUS NNODES RANKS           NODELIST
     avail      4 0-3             a1f80548f61c,a1f80548f61c,a1f80548f61c,a1f80548f61c
     STATE NNODES   NCORES    NGPUS NODELIST
      free      4       28        0 a1f80548f61c,a1f80548f61c,a1f80548f61c,a1f80548f61c
 allocated      0        0        0 
      down      0        0        0 


Flux has a command for controlling the queue within the `job-manager`: `flux queue`.  This includes disabling job submission, re-enabling it, waiting for the queue to become idle or empty, and checking the queue status:

In [76]:
!flux queue disable "maintenance outage"
!flux queue enable
!flux queue -h

flux-queue: Job submission is disabled: maintenance outage
flux-queue: Job submission is enabled
Usage: flux-queue [OPTIONS] COMMAND ARGS
  -h, --help             Display this message.

Common commands from flux-queue:
   enable          Enable job submission
   disable         Disable job submission
   start           Start scheduling
   stop            Stop scheduling
   status          Get queue status
   drain           Wait for queue to become empty.
   idle            Wait for queue to become idle.


Each Flux instance has a set of attributes that are set at startup that affect the operation of Flux, such as `rank`, `size`, and `local-uri` (the Unix socket usable for communicating with Flux).  Many of these attributes can be modified at runtime, such as `log-stderr-level` (1 logs only critical messages to stderr while 7 logs everything, including debug messages).

In [44]:
!flux getattr rank
!flux getattr size
!flux getattr local-uri
!flux setattr log-stderr-level 3
!flux lsattr -v

0
4
local:///tmp/flux-T7FWFk/local-0
broker.mapping                          (vector,(0,1,4))
broker.pid                              16
broker.quorum                           0-3
broker.rc1_path                         /etc/flux/rc1
broker.rc3_path                         /etc/flux/rc3
conf.connector_path                     /usr/lib/flux/connectors
conf.exec_path                          /usr/libexec/flux/cmd
conf.module_path                        /usr/lib/flux/modules
conf.pmi_library_path                   /usr/lib/flux/libpmi.so
conf.shell_initrc                       /etc/flux/shell/initrc.lua
conf.shell_pluginpath                   /usr/lib/flux/shell/plugins
config.path                             -
content.acct-dirty                      0
content.acct-entries                    468
content.acct-size                       112091
content.acct-valid                      468
content.backing-module                  content-sqlite
content.backing-path                    /tmp/flux
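
The same attributes can also be read from Python. The following is a minimal sketch, assuming the Flux handle object exposes an `attr_get(name)` method that returns the attribute as a string. `snapshot_attrs` and `FakeHandle` are hypothetical names introduced here; the fake handle is seeded with the values printed above so the sketch runs without a live Flux instance.

```python
def snapshot_attrs(handle, names=("rank", "size", "local-uri")):
    """Read several broker attributes through handle.attr_get() into a dict."""
    return {name: handle.attr_get(name) for name in names}


class FakeHandle:
    """Hypothetical stand-in for flux.Flux(), seeded with the output shown above."""

    _attrs = {
        "rank": "0",
        "size": "4",
        "local-uri": "local:///tmp/flux-T7FWFk/local-0",
    }

    def attr_get(self, name):
        return self._attrs[name]


attrs = snapshot_attrs(FakeHandle())
for name, value in attrs.items():
    print(f"{name:10s} {value}")
```

Inside a running instance, passing `flux.Flux()` in place of `FakeHandle()` should give the live values, provided the assumption about `attr_get` holds for your Flux version.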

Services within a Flux instance are implemented by modules. To query and manage broker modules, use `flux module`.  Modules that we have already directly interacted with in this tutorial include `resource` and `job-manager` (via `flux queue`), and we will interact with the `kvs` module in a few cells. For the most part, services are implemented by modules of the same name (e.g., `kvs` implements the `kvs` service and thus the `kvs.lookup` RPC).  In some circumstances, where multiple implementations for a service exist, a module of a different name implements a given service (e.g., in this instance, `sched-fluxion-qmanager` provides the `sched` service and thus `sched.alloc`, but in another instance `sched-simple` might provide the `sched` service).

In [45]:
!flux module list

Module                       Size Digest  Idle  S Service
job-ingest                1453656 AB27448 idle  R 
job-info                  1638128 E3A21EA idle  R 
kvs                       1835336 9E19B98 idle  R 
content-sqlite            1326664 368815B idle  R content-backing,kvs-checkpoint
job-manager               1722016 C480039 idle  R 
connector-local           1298368 5972E0B    0  R 
resource                  1706968 B5C4125 idle  R 
barrier                   1312256 8402DD6 idle  R 
job-exec                  1509296 ED8BF74 idle  R 
heartbeat                 1291720 16D0F76    0  R 
cron                      1407496 7D62B82 idle  R 
kvs-watch                 1528416 B7B3A25 idle  R 
sched-fluxion-qmanager    6374080 BEDC833 idle  R sched
sched-fluxion-resource   31145912 80BC659 idle  R 
job-list                  1710264 C3A95FB idle  R 


The key-value store (KVS) is a core component of a Flux instance. The `flux kvs` command provides a utility to list and manipulate values of the KVS. Modules of Flux use the KVS to persistently store information and retrieve it later on (potentially after a restart of Flux).  One example of KVS use by Flux is the `resource` module, which stores the resource set `R` of the current Flux instance:

In [47]:
!flux kvs ls 
!flux kvs ls resource
!flux kvs get resource.R

job         resource
R           eventlog
{"version": 1, "execution": {"R_lite": [{"rank": "0-3", "children": {"core": "0-6"}}], "starttime": 0.0, "expiration": 0.0, "nodelist": ["a1f80548f61c,a1f80548f61c,a1f80548f61c,a1f80548f61c"]}}
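
Because `R` is plain JSON, it is easy to inspect from Python. The sketch below parses the value printed above and counts the cores it describes. `expand_idset` and `count_cores` are hypothetical helpers written for this example, and `expand_idset` handles only the plain comma-and-range form seen here (such as `0-3`), not every idset notation Flux may accept.

```python
import json

# resource.R exactly as printed by `flux kvs get resource.R` above
R_TEXT = (
    '{"version": 1, "execution": {"R_lite": [{"rank": "0-3", '
    '"children": {"core": "0-6"}}], "starttime": 0.0, "expiration": 0.0, '
    '"nodelist": ["a1f80548f61c,a1f80548f61c,a1f80548f61c,a1f80548f61c"]}}'
)


def expand_idset(idset):
    """Expand a simple idset such as "0-3" or "0,2-3" into a list of ints."""
    ids = []
    for part in idset.split(","):
        lo, _, hi = part.partition("-")
        ids.extend(range(int(lo), int(hi or lo) + 1))
    return ids


def count_cores(r):
    """Sum the cores over every rank listed in R_lite."""
    total = 0
    for entry in r["execution"]["R_lite"]:
        ranks = expand_idset(entry["rank"])
        cores = expand_idset(entry["children"]["core"])
        total += len(ranks) * len(cores)
    return total


R = json.loads(R_TEXT)
print(count_cores(R))  # 4 ranks x 7 cores each = 28
```

The result agrees with the 28 cores reported earlier by `flux hwloc info` and `flux resource list`.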


Flux provides a built-in mechanism for executing commands on nodes without requiring a job or resource allocation: `flux exec`.  `flux exec` is typically used by system administrators to execute administrative commands and load/unload modules across multiple ranks simultaneously.

In [50]:
!flux exec -r 0 flux getattr rank # only execute on rank 0
!flux exec flux getattr rank # execute on all ranks

0
2
1
3
1: a1f80548f61c
2: a1f80548f61c
0: a1f80548f61c
3: a1f80548f61c


To submit jobs to Flux, you can use the `flux mini` command, which has several sub-commands: `submit`, `run`, `bulksubmit`, `batch`, and `alloc`.  The `flux mini submit` command submits a job to Flux and prints out the jobid. 

In [52]:
!flux mini submit hostname

ƒQ41Hd5eo


The `flux mini run` command submits a job to Flux (similar to `flux mini submit`) but then attaches to the job (with `flux job attach`), printing the job's stdout/stderr to the terminal and exiting with the same exit code as the job:

In [59]:
!flux mini run /bin/false

flux-job: task(s) exited with exit code 1


`flux mini submit` and `flux mini run` also support many other useful flags:

In [71]:
!flux mini run -n4 --label-io --time-limit=5s --env-remove=LD_LIBRARY_PATH hostname
!flux mini run --help

1: a1f80548f61c
0: a1f80548f61c
3: a1f80548f61c
2: a1f80548f61c
usage: flux-mini run [-h] [-t FSD] [--urgency N] [--job-name NAME] [-o OPT]
                     [--setattr ATTR=VAL] [--env RULE] [--env-remove PATTERN]
                     [--env-file FILE] [--input FILENAME] [--output FILENAME]
                     [--error FILENAME] [-l] [--flags FLAGS] [--dry-run]
                     [-N N] [-n N] [-c N] [-g N] [-v]
                     ...

positional arguments:
  command                   Job command and arguments

optional arguments:
  -h, --help                show this help message and exit
  -t, --time-limit=FSD      Time limit in Flux standard duration, e.g. 2d,
                            1.5h
      --urgency=N           Set job urgency (0-31), hold=0, default=16,
                            expedite=31
      --job-name=NAME       Set an optional name for job to NAME
  -o, --setopt=OPT          Set shell option OPT. An optional value is
                            supported 
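
The `--time-limit=5s` flag used above takes a Flux standard duration (FSD), which the help text illustrates with `2d` and `1.5h`. As a rough sketch of how such strings map to seconds, the hypothetical helper below assumes the suffixes `s`, `m`, `h`, and `d`, and assumes that a bare number means seconds. It is not Flux's own parser and may not cover every form Flux accepts.

```python
import re

# Seconds per unit; the set of suffixes is an assumption of this sketch
_FSD_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def fsd_to_seconds(text):
    """Convert a duration such as "5s", "1.5h", or "2d" to seconds (float)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([smhd]?)", text.strip())
    if match is None:
        raise ValueError(f"not a duration this sketch understands: {text!r}")
    value, unit = match.groups()
    return float(value) * _FSD_UNITS[unit or "s"]


for example in ("5s", "1.5h", "2d"):
    print(example, "->", fsd_to_seconds(example), "seconds")
```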

The `flux mini bulksubmit` command makes submitting the same executable repeatedly very simple.  It uses the same syntax as GNU Parallel:

In [69]:
!flux mini bulksubmit --watch --wait echo {} ::: foo bar baz

ƒZc7xA5PV
ƒZc7xA5PW
ƒZc7ye4fq
foo
baz
bar
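
To make the `{}` and `:::` syntax concrete, here is a small Python illustration of the substitution: everything before `:::` is a command template, and each input after it replaces `{}` to produce one job. `split_on_separator` and `expand_bulk` are hypothetical helpers for this example. They cover only a single `:::` input list and do not reflect how Flux implements `bulksubmit`.

```python
def split_on_separator(argv, sep=":::"):
    """Split an argument list into (command template, inputs) at the separator."""
    idx = argv.index(sep)
    return argv[:idx], argv[idx + 1:]


def expand_bulk(template, inputs):
    """Return one argument list per input, with each "{}" replaced by that input."""
    return [[arg.replace("{}", value) for arg in template] for value in inputs]


template, inputs = split_on_separator(["echo", "{}", ":::", "foo", "bar", "baz"])
commands = expand_bulk(template, inputs)
for command in commands:
    print(command)
```

This yields three commands, matching the three job IDs printed above. The jobs run concurrently, which is why their output (`foo`, `baz`, `bar`) need not appear in input order.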


Of course, Flux can launch more than just single-node, single-core jobs.  We can submit multiple heterogeneous jobs, and Flux will co-schedule them while also ensuring no oversubscription of resources (e.g., cores).

Note: in this tutorial, we cannot assume that the host you are running on has multiple cores, so the examples below vary only the number of nodes per job.  Varying `cores-per-task` is also possible in Flux when the underlying hardware supports it (e.g., a multi-core node).

In [79]:
!flux mini submit --nodes=2 --ntasks=2 --cores-per-task=1 --job-name simulation sleep inf
!flux mini submit --nodes=1 --ntasks=1 --cores-per-task=1 --job-name analysis sleep inf

ƒfLk5gM6P
ƒfLuyRZ9q


We can now list out the jobs in the queue with `flux jobs`, and we should see both jobs that we just submitted.  We can also check the resources managed by Flux, and we should see that most of them are now allocated.

In [80]:
!flux jobs
!flux resource list

       JOBID USER     NAME       ST NTASKS NNODES  RUNTIME NODELIST
   ƒfLuyRZ9q fluxuser analysis    R      1      1   1.780s a1f80548f61c
   ƒfLk5gM6P fluxuser simulation  R      2      2   2.165s a1f80548f61c,a1f80548f61c
   ƒfBrYE6KZ fluxuser simulation  R      2      2   22.34s a1f80548f61c,a1f80548f61c
   ƒdza8Gjwu fluxuser analysis    R      1      1   2.992m a1f80548f61c
   ƒdzQ49cv3 fluxuser simulation  R      2      2   2.999m a1f80548f61c,a1f80548f61c
     STATE NNODES   NCORES    NGPUS NODELIST
      free      4       20        0 a1f80548f61c,a1f80548f61c,a1f80548f61c,a1f80548f61c
 allocated      2        8        0 a1f80548f61c,a1f80548f61c
      down      0        0        0 


Since those jobs won't ever exit (and we didn't specify a time limit), let's kill them off now and free up the resources.

In [84]:
!flux job killall -f
!flux jobs

flux-job: Command matched 0 jobs
       JOBID USER     NAME       ST NTASKS NNODES  RUNTIME NODELIST


We can use the `flux mini batch` command to easily create nested Flux instances.  When `flux mini batch` is invoked, Flux automatically creates a nested instance that spans the resources allocated to the job, and then runs the batch script passed to `flux mini batch` on rank 0 of the nested instance. While a batch script is expected to launch parallel jobs using `flux mini run` or `flux mini submit` at this level, nothing prevents the script from submitting further nested batch jobs through the same `flux mini batch` interface, if desired.

Note: Flux also provides `flux mini alloc`, which is an interactive version of `flux mini batch`, but demonstrating that in a Jupyter notebook is difficult due to the lack of a pseudo-terminal.

In [85]:
!flux mini batch --nslots=2 --cores-per-slot=1 --nodes=2 ./sleep_batch.sh
!flux mini batch --nslots=2 --cores-per-slot=1 --nodes=2 ./sleep_batch.sh

ƒmFZVSBZq
ƒmFjQfNud


The contents of `sleep_batch.sh`:

``` bash 
#!/bin/bash

echo "Starting my batch job"
echo "Print the resources allocated to this batch job"
flux resource list

echo "Use sleep to emulate a parallel program"
echo "Run the program at a total of 2 processes each requiring"
echo "1 core. These processes are equally spread across 2 nodes."
flux mini run -N 2 -n 2 sleep 30
flux mini run -N 2 -n 2 sleep 30
```

In [93]:
!flux jobs

# Copy the Job ID of one of the `flux mini batch`s here to examine the job's resources and output
JOBID="ƒmFjQfNud"
!flux job info {JOBID} R
!flux job attach {JOBID}
!cat flux-{JOBID}.out

       JOBID USER     NAME       ST NTASKS NNODES  RUNTIME NODELIST
{"version": 1, "execution": {"R_lite": [{"rank": "0-1", "children": {"core": "5"}}], "nodelist": ["a1f80548f61c,a1f80548f61c"], "starttime": 1618372783, "expiration": 1618977583}}

0: stdout redirected to flux-ƒmFjQfNud.out
0: stderr redirected to flux-ƒmFjQfNud.out
Starting my batch job
Print the resources allocated to this batch job
2 Machines, 2 Cores, 2 PUs
Use sleep to emulate a parallel program
Run the program at a total of 2 processes each requiring
1 core. These processes are equally spread across 2 nodes.
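
The `R` printed by `flux job info` above has the same JSON layout as `resource.R`, but with real `starttime` and `expiration` values. As a small worked example, the snippet below reads both fields as seconds since the Unix epoch and computes the length of the allocation window. It uses only the values shown in the output above.

```python
import json
from datetime import timedelta

# R for the batch job, exactly as printed by `flux job info` above
JOB_R_TEXT = (
    '{"version": 1, "execution": {"R_lite": [{"rank": "0-1", '
    '"children": {"core": "5"}}], "nodelist": ["a1f80548f61c,a1f80548f61c"], '
    '"starttime": 1618372783, "expiration": 1618977583}}'
)

execution = json.loads(JOB_R_TEXT)["execution"]

# Length of the window between starttime and expiration
window = timedelta(seconds=execution["expiration"] - execution["starttime"])
print(window)  # 7 days, 0:00:00

# Which ranks and cores the job was given
for entry in execution["R_lite"]:
    print("ranks", entry["rank"], "cores", entry["children"]["core"])
```

One core on each of ranks 0-1 is consistent with the `2 Machines, 2 Cores, 2 PUs` line in the batch job's own output.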


Flux also provides first-class Python bindings that can be used to submit jobs programmatically. The following script shows this with the `flux.job.submit()` call:

In [123]:
f = flux.Flux()
compute_jobreq = JobspecV1.from_command(
    command=["./compute.py", "120"], num_tasks=1, num_nodes=1, cores_per_task=1
)
compute_jobreq.cwd = os.path.expanduser("~/flux-workflow-examples/job-submit-api/")
print(JobID(flux.job.submit(f, compute_jobreq)))

ƒufpKudHq


In [124]:
!flux jobs

       JOBID USER     NAME       ST NTASKS NNODES  RUNTIME NODELIST
   ƒufpKudHq fluxuser compute.py  R      1      1   1.009s a1f80548f61c
   ƒuDBYHyGB fluxuser io-forward  R      1      1   1.024m a1f80548f61c
   ƒuDB76BUK fluxuser compute.py  R      4      2   1.024m a1f80548f61c,a1f80548f61c


In [125]:
compute_jobreq = JobspecV1.from_command(
    command=["./compute.py", "120"], num_tasks=4, num_nodes=2, cores_per_task=2
)
compute_jobreq.cwd = os.path.expanduser("~/flux-workflow-examples/job-submit-api/")
print(JobID(flux.job.submit(f, compute_jobreq)))

io_jobreq = JobspecV1.from_command(
    command=["./io-forwarding.py", "120"], num_tasks=1, num_nodes=1, cores_per_task=1
)
io_jobreq.cwd = os.path.expanduser("~/flux-workflow-examples/job-submit-api/")
print(JobID(flux.job.submit(f, io_jobreq)))

ƒuh4jxYej
ƒuh5CeKiw


In [126]:
!flux jobs

       JOBID USER     NAME       ST NTASKS NNODES  RUNTIME NODELIST
   ƒuh5CeKiw fluxuser io-forward  R      1      1   0.929s a1f80548f61c
   ƒuh4jxYej fluxuser compute.py  R      4      2   0.956s a1f80548f61c,a1f80548f61c
   ƒufpKudHq fluxuser compute.py  R      1      1   3.789s a1f80548f61c
   ƒuDBYHyGB fluxuser io-forward  R      1      1    1.07m a1f80548f61c
   ƒuDB76BUK fluxuser compute.py  R      4      2   1.071m a1f80548f61c,a1f80548f61c


We can use the `FluxExecutor` class to submit large numbers of jobs to Flux. This method uses Python's `concurrent.futures` interface.  Example snippet from `~/flux-workflow-examples/async-bulk-job-submit/bulksubmit_executor.py`:

``` python 
with FluxExecutor() as executor:
    compute_jobspec = JobspecV1.from_command(args.command)
    futures = [executor.submit(compute_jobspec) for _ in range(args.njobs)]
    # wait for the jobid for each job, as a proxy for the job being submitted
    for fut in futures:
        fut.jobid()
    # all jobs submitted - print timings
```

In [127]:
# Submit a FluxExecutor based script.
%run ./flux-workflow-examples/async-bulk-job-submit/bulksubmit_executor.py -n200 /bin/sleep 0

bulksubmit_executor: submitted 200 jobs in 0.49s. 405.55job/s
bulksubmit_executor: First job finished in about 0.977s
|██████████████████████████████████████████████████████████| 100.0% (48.3 job/s)
bulksubmit_executor: Ran 200 jobs in 4.1s. 48.2 job/s
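
For readers new to `concurrent.futures`, the submit-then-collect pattern in the snippet above can be tried without Flux. The sketch below uses the standard library's `ThreadPoolExecutor` purely as a stand-in for `FluxExecutor`, and `fake_job` is a hypothetical placeholder for real work. The shape is the same: each `submit()` call returns a future immediately, and results are gathered afterwards.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_job(index):
    """Hypothetical stand-in for a job; returns its index as the result."""
    return index


with ThreadPoolExecutor(max_workers=4) as executor:
    # submit() returns right away with a future for each job
    futures = [executor.submit(fake_job, i) for i in range(200)]
    # as_completed() yields futures in the order they finish
    results = sorted(fut.result() for fut in as_completed(futures))

print(len(results), "jobs completed")
```

In the Flux snippet, the futures also offer the `jobid()` method shown above, which the script uses to confirm that each job has been submitted.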
