<div>
<center><img src="../assets/Flux-logo.svg" width="400"/></center>
</div>

# Welcome to the Flux Tutorial

> What is Flux Framework? 🤔️
 
Flux is a flexible framework for resource management, built for your site. The framework consists of a suite of projects, tools, and libraries that may be used to build site-custom resource managers for High Performance Computing centers and cloud environments. Flux is a next-generation resource manager and scheduler with many transformative capabilities like hierarchical scheduling and resource management (you can think of it as "fractal scheduling") and directed-graph based resource representations.

## I'm ready! How do I do this tutorial? 😁️

This tutorial is split into 3 chapters, each of which has a notebook.

Let's get started!

<br>

# Getting started with Flux

The code and examples that this tutorial is based on can be found at [flux-framework/Tutorials](https://github.com/flux-framework/Tutorials/tree/master/2024-HPCIC-AWS). You can also find Python examples in the `assets/flux-workflow-examples` directory from the sidebar navigation in this JupyterLab instance.

<div class="alert alert-block alert-info">
<span style="font-weight:600">Tip:</span> Did you know you can get help for flux or a flux command? For example, try "flux help" and "flux help jobs"
</div>

In [1]:
!flux help

Usage: flux [OPTIONS] COMMAND ARGS
  -h, --help             Display this message.
  -v, --verbose          Be verbose about environment and command search
  -V, --version          Display command and component versions
  -p, --parent           Set environment of parent instead of current instance
  -r, --root             Set environment of root instead of current instance

For general Flux documentation, please visit
    https://flux-framework.readthedocs.io

run and submit jobs, allocate resources
   submit             submit a job to a Flux instance
   run                run a Flux job interactively
   bulksubmit         submit jobs in bulk to a Flux instance
   alloc              allocate a new Flux instance for interactive use
   batch              submit a batch script to Flux

list and interact with jobs
   jobs               list jobs submitted to Flux
   top                display running Flux jobs
   pstree             display job hierarchies
   cancel             cancel one o

In [2]:
!flux help jobs

FLUX-JOBS(1)                       flux-core                      FLUX-JOBS(1)

NAME
       flux-jobs - list jobs submitted to Flux

SYNOPSIS
       flux jobs [OPTIONS] [JOBID ...]

DESCRIPTION
       flux  jobs is used to list jobs run under Flux. By default only pending
       and running jobs for the current user are listed. Additional  jobs  and
       information  can  be  listed  using options listed below.  Alternately,
       specific job ids can be listed on the command line to only  list  those
       job IDs.

OPTIONS
       -a     List  jobs  in  all  states,  including  inactive jobs.  This is
              shorthand for --filter=pending,running,inactive.

       -A     List jobs of all users. This is shorthand for --user=all.

       -n, --no-header
              For default output, do not output column headers.

       -u, --user=[USERNAME|UID]
              List jobs for a specific username or userid. Specify all for all
              users.

       --name=[JOB NAME]
  

In [3]:
!flux submit --help

usage: flux submit [OPTIONS...] COMMAND [ARGS...]

enqueue a job

positional arguments:
  command                     Job command and arguments

options:
  -h, --help                  show this help message and exit
  -B, --bank=BANK             Submit a job to a specific named bank
  -q, --queue=NAME            Submit a job to a specific named queue
  -t, --time-limit=MIN|FSD    Time limit in minutes when no units provided,
                              otherwise in Flux standard duration, e.g. 30s,
                              2d, 1.5h
      --urgency=N             Set job urgency (0-31), hold=0, default=16,
                              expedite=31
      --job-name=NAME         Set an optional name for job to NAME
  -o, --setopt=OPT            Set shell option OPT. An optional value is
                              supported with OPT=VAL (default VAL=1) (multiple
                              use OK)
  -S, --setattr=ATTR          Set job attribute ATTR. An optional value is
       

### What does the terminal prompt mean?
For cases when you need a terminal, we will <button data-commandLinker-command="terminal:open" data-name="flux" href="#">provide you with a button</button>! However, you can also select `File -> New -> Terminal` to open one on the fly. Let's next talk about visualizing Flux resources.

When you run a shell command in a Jupyter cell, prefix it with `!`. Without the prefix, Jupyter assumes the cell contains Python code.

## Flux Resources

When you are interacting with Flux, you will commonly want to know what resources are available to you. Flux uses [hwloc](https://github.com/open-mpi/hwloc) to detect the resources on each node and then to populate its resource graph.

You can access the topology information that Flux collects with the `flux resource` subcommand. Let's run `flux resource list` to see the resources available to us in this notebook:

In [4]:
!flux resource list

     STATE NNODES NCORES NGPUS NODELIST
      free      4     40     0 a100cf62dc4f,a100cf62dc4f,a100cf62dc4f,a100cf62dc4f
 allocated      0      0     0 
      down      0      0     0 


Flux can also bootstrap its resource graph based on static input files, as in the case of a multi-user system instance set up by site administrators. [More information on Flux's static resource configuration files](https://flux-framework.readthedocs.io/projects/flux-core/en/latest/guide/admin.html#configuration). Whatever the resource input source, Flux provides a standard interface for listing available resources: `flux resource`.

In [5]:
# To view status of resources
!flux resource status

       STATE UP NNODES NODELIST
       avail  ✔      4 a100cf62dc4f,a100cf62dc4f,a100cf62dc4f,a100cf62dc4f


It might also be the case that you need to see queues. Here is how to do that:

In [6]:
!flux queue list

 EN ST TDEFAULT   TLIMIT     NNODES     NCORES      NGPUS
  ✔  ✔      inf      inf        0-4       0-40        0-0


<br>

# Flux Commands 

Here is how Flux commands map to a scheduler you are likely familiar with, Slurm. A larger table with similar mappings for LSF, Moab, and Slurm can be [viewed here](https://hpc.llnl.gov/banks-jobs/running-jobs/batch-system-cross-reference-guides). For submitting jobs, you can use the `flux` `submit`, `run`, `bulksubmit`, `batch`, and `alloc` commands.

<table>
    <tr>
        <th>Operation</th>
        <th>Slurm</th>
        <th>Flux</th>
    </tr>
    <tr>
        <td>One-off run of a single job (blocking)</td>
        <td><code>srun</code></td>
        <td><code>flux run</code></td>
    </tr>
    <tr>
        <td>One-off run of a single job (interactive)</td>
        <td><code>srun --pty</code></td>
        <td><code>flux run -o pty.interactive</code></td>
    </tr>
    <tr>
        <td>One-off run of a single job (not blocking)</td>
        <td><code>NA</code></td>
        <td><code>flux submit</code></td>
    </tr>
    <tr>
        <td>Bulk submission of jobs (not blocking)</td>
        <td><code>NA</code></td>
        <td><code>flux bulksubmit</code></td>
    </tr>    
    <tr>
        <td>Watching jobs</td>
        <td><code>NA</code></td>
        <td><code>flux watch</code></td>
    </tr>
    <tr>
        <td>Querying the status of jobs</td>
        <td><code>squeue</code>/<code>scontrol show job <i>job_id</i></code></td>
        <td><code>flux jobs</code>/<code>flux job info <i>job_id</i></code></td>
    </tr>
    <tr>
        <td>Canceling running jobs</td>
        <td><code>scancel</code></td>
        <td><code>flux cancel</code></td>
    </tr>
    <tr>
        <td>Allocation for an interactive instance</td>
        <td><code>salloc</code></td>
        <td><code>flux alloc</code></td>
    </tr>
    <tr>
        <td>Submitting batch jobs</td>
        <td><code>sbatch</code></td>
        <td><code>flux batch</code></td>
    </tr>
</table>

## flux run

<div class="alert alert-block" style="background-color:skyblue">
<span style="font-weight:600">Description:</span> Running a single job (blocking)
</div>

The `flux run` command submits a job to Flux (similar to `flux submit`) but then attaches to the job with `flux job attach`, printing the job's stdout/stderr to the terminal and exiting with the same exit code as the job. It's basically doing an interactive submit, because you will be able to watch the output in your terminal, and it will block your terminal until the job completes.

In [7]:
!flux run sh -c 'sleep 5 && echo hello'

hello


The output from the previous command is `hello`, printed once the five-second sleep finishes. If the job exits with a non-zero exit code, this is reported by `flux job attach` (which `flux run` invokes implicitly). For example, execute the following:

In [8]:
!flux run /bin/false

flux-job: task(s) exited with exit code 1


A job submitted with `run` can be canceled with two rapid `Ctrl-C`s in succession, or a user can detach from the job with `Ctrl-C Ctrl-Z`. The user can then re-attach to the job by using `flux job attach JOBID`.

`flux submit` and `flux run` also support many other useful flags:

In [9]:
!flux run -N1 -n1 -c1 hostname

a100cf62dc4f


In [10]:
!flux run -n4 --label-io --time-limit=5s --env-remove=LD_LIBRARY_PATH hostname

1: a100cf62dc4f
0: a100cf62dc4f
3: a100cf62dc4f
2: a100cf62dc4f
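
The `--time-limit=5s` value above uses the format described in the `flux submit --help` output: a bare number is read as minutes, and otherwise a unit suffix applies (`30s`, `2d`, `1.5h`). As a rough illustration only, here is a minimal Python sketch of that conversion. The function name `time_limit_seconds` is hypothetical, it handles only the `s`, `m`, `h`, and `d` suffixes, and it is not Flux's own parser.

```python
# Seconds per unit for the suffixes handled by this sketch (an assumption:
# only s, m, h, and d are covered here).
UNIT_SECONDS = {"s": 1.0, "m": 60.0, "h": 3600.0, "d": 86400.0}


def time_limit_seconds(value):
    """Convert a --time-limit style string to seconds.

    A bare number is treated as minutes; otherwise the trailing
    letter selects the unit.
    """
    unit = value[-1]
    if unit in UNIT_SECONDS:
        return float(value[:-1]) * UNIT_SECONDS[unit]
    return float(value) * 60.0


for text in ["5s", "30", "1.5h", "2d"]:
    print(text, "->", time_limit_seconds(text), "seconds")
```

Under these assumptions, `--time-limit=5s` gives the job five seconds, while `--time-limit=5` would give it five minutes.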


In [11]:
# Uncomment and run this help command if you want to see all the flags for flux run
# !flux run --help

In [12]:
!flux run make

mpicc -o hello hello.c


In [13]:
!flux run -N1 -n8 -c2 --exclusive ./hello

0.081s: flux-shell: ERROR: error distributing 8 tasks over R
0.083s: flux-shell: FATAL: failed to initialize shell info
flux-job: task(s) exited with exit code 1


## flux submit

<div class="alert alert-block" style="background-color:skyblue">
<span style="font-weight:600">Description:</span> Running a single job (not blocking)
</div>


The `flux submit` command submits a job to Flux and prints out the jobid.

In [14]:
# Let's peek at the help for flux submit!
!flux submit --help | head -n 15

usage: flux submit [OPTIONS...] COMMAND [ARGS...]

enqueue a job

positional arguments:
  command                     Job command and arguments

options:
  -h, --help                  show this help message and exit
  -B, --bank=BANK             Submit a job to a specific named bank
  -q, --queue=NAME            Submit a job to a specific named queue
  -t, --time-limit=MIN|FSD    Time limit in minutes when no units provided,
                              otherwise in Flux standard duration, e.g. 30s,
                              2d, 1.5h
      --urgency=N             Set job urgency (0-31), hold=0, default=16,


In [15]:
!flux submit hostname

ƒ3Z6mwRWPH


But how does one get output? To quickly see output (which will block the terminal if the job is still running) after a submit, you can do:

```bash
flux job attach $(flux job last)
```

To provide a custom path to an output or error file, you can provide `--out` and `--err`, respectively. Let's try those both now.

In [16]:
# What was the last job id again?
! flux job last

# Attach to the last job id that was submitted (will block if still running and stream output)
! flux job attach $(flux job last)

ƒ3Z6mwRWPH
a100cf62dc4f


In [17]:
# Now let's submit another one, and give it the same output and error file
! flux submit --out /tmp/harry-potter.txt --err /tmp/harry-potter.txt echo "Yer a wizard, $(whoami)!"

# Take a look!
! cat /tmp/harry-potter.txt

ƒ3Z7Moo6nB
Yer a wizard, jovyan!


`submit` supports common options like `--nnodes`, `--ntasks`, and `--cores-per-task`. There are short option equivalents (`-N`, `-n`, and `-c`, respectively) of these options as well. `--cores-per-task=1` is the default.
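
To see how these options combine, here is a small Python sketch of the core arithmetic. It is a simplified capacity check, not the Flux scheduler's algorithm, and the helper names are hypothetical. It assumes 10 cores per node, inferred from the 40 cores over 4 nodes reported by `flux resource list` in this notebook.

```python
def cores_requested(ntasks, cores_per_task=1):
    """Total cores a request asks for: one slot per task."""
    return ntasks * cores_per_task


def fits(ntasks, cores_per_task, nnodes, cores_per_node):
    """Rough check: does the request fit within nnodes * cores_per_node?"""
    return cores_requested(ntasks, cores_per_task) <= nnodes * cores_per_node


# -N1 -n2 with the default of one core per task
print(cores_requested(2), fits(2, 1, nnodes=1, cores_per_node=10))

# -N1 -n8 -c2 asks for 16 cores on a single node
print(cores_requested(8, 2), fits(8, 2, nnodes=1, cores_per_node=10))
```

By this estimate, `-N1 -n8 -c2` needs more cores than one node offers, which may be why the earlier `flux run -N1 -n8 -c2 --exclusive ./hello` failed to distribute its tasks. That is our inference rather than something the error message states.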

In [18]:
!flux submit -N1 -n2 sleep inf

ƒ3Z7jwgNpo


## flux bulksubmit

<div class="alert alert-block" style="background-color:skyblue">
<span style="font-weight:600">Description:</span> Submitting jobs in bulk (not blocking)
</div>

The `flux bulksubmit` command enqueues jobs based on a set of inputs which are substituted on the command line, similar to `xargs` and the GNU `parallel` utility, except the jobs have access to the resources of an entire Flux instance instead of only the local system.
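
Conceptually, the substitution for a single input list works like the Python sketch below. The function `expand_bulksubmit` is a hypothetical illustration of the idea, not how Flux implements it, and it covers only the plain `{}` placeholder.

```python
def expand_bulksubmit(command, inputs):
    """Return one argument list per input, replacing "{}" with that input."""
    return [[arg.replace("{}", value) for arg in command] for value in inputs]


jobs = expand_bulksubmit(["echo", "{}"], ["harry", "ron", "hermione"])
for argv in jobs:
    print(" ".join(argv))
```

Each resulting command line becomes its own job, so three inputs produce three job IDs.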

In [19]:
!flux bulksubmit --wait echo {} ::: harry ron hermione

ƒ3Z7z2ZZ6o
ƒ3Z7z2ZZ6p
ƒ3Z7z2ZZ6q


### carbon copy

The `--cc` option (akin to "carbon copy") to `submit` makes repeated submission even easier via `flux submit --cc=IDSET`:

In [20]:
!flux submit --watch --cc=1-4 echo {cc}

ƒ3Z8KqXxW3
ƒ3Z8KqXxW4
ƒ3Z8KqXxW5
ƒ3Z8Ks1wnP
4
1
3
2


Try it in the <button data-commandLinker-command="terminal:open" data-name="flux" href="#">JupyterLab terminal</button> with a progress bar and jobs/s rate report: `flux submit --cc=1-100 --watch --progress --jps hostname`

Note that `--wait` is implied by `--watch`, meaning that when you are watching jobs, you are also waiting for them to finish. Here are some other carbon copy commands that are useful to try:

In [21]:
# Use flux carbon copy to submit identical jobs with different inputs
!flux submit --cc="1-2" echo "Hello I am job {cc}"

ƒ3Z8fEnZPm
ƒ3Z8fGGYg7


Here are some "carbon copy" jobs to try in the <button data-commandLinker-command="terminal:open" data-name="flux" href="#">JupyterLab terminal</button>:

```bash
# Use flux carbon copy to submit identical jobs with different inputs
flux submit --cc="1-10" echo "Hello I am job {cc}"

# Submits scripts flux-workflow-examples/bulksubmit/0.sh through 6.sh
flux submit --cc=0-6 flux-workflow-examples/bulksubmit/{cc}.sh

# Bypass the key value store and write output to file with jobid
flux submit --cc=1-10 --output=job-{{id}}.out echo "This is job {cc}"

# Combine bulksubmit inputs with carbon copy (--dry-run shows the jobs without submitting them)
flux bulksubmit --dry-run --cc={1} echo {0} ::: a b c ::: 0-1 0-3 0-7
```
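
The `--cc=IDSET` substitution used above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Flux's implementation: the helper names are hypothetical, and support for comma-separated idsets such as `0-2,5` is an assumption that goes beyond the ranges shown in this notebook.

```python
def expand_idset(idset):
    """Expand an idset such as "1-4" or "0-2,5" into a list of integers."""
    ids = []
    for part in idset.split(","):
        if "-" in part:
            low, high = part.split("-")
            ids.extend(range(int(low), int(high) + 1))
        else:
            ids.append(int(part))
    return ids


def carbon_copies(template, idset):
    """Return one command string per id, with {cc} replaced by the id."""
    return [template.replace("{cc}", str(i)) for i in expand_idset(idset)]


copies = carbon_copies("echo Hello I am job {cc}", "1-2")
for line in copies:
    print(line)
```

Each entry corresponds to one submitted job, which matches the two job IDs printed by the `--cc="1-2"` cell earlier.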

Of course, Flux can launch more than just single-node, single-core jobs.  We can submit multiple heterogeneous jobs and Flux will co-schedule the jobs while also ensuring no oversubscription of resources (e.g., cores). Let's run the second example here, and add a clever trick to ask for output as we submit the jobs. This is a fun one, I promise!

In [22]:
! for jobid in $(flux submit --cc=0-7 /bin/bash bulksubmit/{cc}.sh); do flux job attach ${jobid}; sleep 1; done

Enter, stranger, but take heed
Of what awaits the sin of greed,
For those who take, but do not earn,
Must pay most dearly in their turn,
So if you seek beneath our floors
A treasure that was never yours,
Thief, you have been warned, beware
Of finding more than treasure there.


Note: in this tutorial, we cannot assume that the host you are running on has multiple cores, so the examples below only vary the number of nodes per job. Varying `--cores-per-task` is also possible in Flux when the underlying hardware supports it (e.g., a multi-core node).

In [23]:
!flux submit --nodes=2 --ntasks=2 --cores-per-task=1 --job-name magic sleep inf
!flux submit --nodes=1 --ntasks=1 --cores-per-task=1 --job-name moremagic sleep inf

ƒ3ZDdkGo9q
ƒ3ZDshk327


## flux watch

<div class="alert alert-block" style="background-color:skyblue">
<span style="font-weight:600">Description:</span> 👀️ Watching jobs
</div>

Wouldn't it be cool to submit a job and then watch it? Well, yeah! We can do this now with flux watch. Let's run a fun example, and then watch the output. We have sleeps in here interspersed with echos only to show you the live action! 🥞️
Also note a nice trick - you can always use `flux job last` to get the last JOBID.
Here is an example for getting and saving a job ID. It is not runnable here because each `!` command in a notebook runs in its own shell, so the variable would not persist between commands:

```bash
flux submit hostname
JOBID=$(flux job last)
```

And then you could use the variable `$JOBID` in your subsequent script or interactions with Flux! So what makes `flux watch` different from `flux job attach`? Aside from the fact that `flux watch` is read-only, `flux watch` can watch many jobs at once, or even all of them with `flux watch --all`!

In [24]:
!flux submit ./job-watch.sh
!flux watch $(flux job last)

ƒ3ZE7s5B99
Oh you may not think me pretty,
But don’t judge on what you see,
I’ll eat myself if you can find
A smarter hat than me.
[A scheduler smarter than me]!


## flux jobs

<div class="alert alert-block" style="background-color:skyblue">
<span style="font-weight:600">Description:</span> Querying the status of jobs
</div>

We can now list the jobs in the queue with `flux jobs` and we should see both jobs that we just submitted. Jobs that are instances are colored blue in output, red jobs are failed jobs, and green jobs are those that completed successfully. Note that the JupyterLab notebook may not display these colors. You will be able to see them in the terminal.
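
Because the notebook may not show the colors, the `ST` column is the more dependable signal. Here is a minimal Python sketch that tallies jobs by state from `flux jobs` text output. The sample rows are copied from output in this notebook. The state labels in the comments (R running, CD completed, CA canceled, F failed) are our reading of those outputs. The function `count_states` is hypothetical and assumes the default columns, with no spaces inside job names.

```python
from collections import Counter

# Rows copied from `flux jobs` output in this notebook.
# ST values seen here: R (running), CD (completed), CA (canceled), F (failed).
sample = """\
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
  ƒ3ZLfDXQSj jovyan   echo       CD      1      1   0.141s a100cf62dc4f
  ƒ3ZDdkGo9q jovyan   magic      CA      2      2   13.69s a100cf62dc4f,a100cf62dc4f
  ƒ3Z4qpypNs jovyan   false       F      1      1   0.123s a100cf62dc4f
  ƒ3ZJzX4FNF jovyan   sleep       R      1      1   0.507s a100cf62dc4f
"""


def count_states(table):
    """Count jobs per value of the ST column in whitespace-separated output."""
    lines = table.strip().splitlines()
    st_index = lines[0].split().index("ST")
    return Counter(line.split()[st_index] for line in lines[1:])


print(count_states(sample))
```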

In [25]:
!flux submit sleep inf
!flux jobs

ƒ3ZJzX4FNF
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
  ƒ3ZJzX4FNF jovyan   sleep       R      1      1   0.507s a100cf62dc4f
  ƒ3ZDshk327 jovyan   moremagic   R      1      1   12.13s a100cf62dc4f
  ƒ3ZDdkGo9q jovyan   magic       R      2      2   12.66s a100cf62dc4f,a100cf62dc4f
  ƒ3Z7jwgNpo jovyan   sleep       R      2      1   26.04s a100cf62dc4f


You might also want to see "all" jobs with `-a`.

In [26]:
!flux jobs

       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
  ƒ3ZJzX4FNF jovyan   sleep       R      1      1   1.026s a100cf62dc4f
  ƒ3ZDshk327 jovyan   moremagic   R      1      1   12.64s a100cf62dc4f
  ƒ3ZDdkGo9q jovyan   magic       R      2      2   13.18s a100cf62dc4f,a100cf62dc4f
  ƒ3Z7jwgNpo jovyan   sleep       R      2      1   26.56s a100cf62dc4f


## flux cancel

<div class="alert alert-block" style="background-color:skyblue">
<span style="font-weight:600">Description:</span> Canceling running jobs
</div>

Since some of the jobs we see in the table above won't ever exit (and we didn't specify a time limit), let's cancel them all now and free up the resources.

In [27]:
# This was previously flux cancelall -f
!flux cancel --all
!flux jobs

flux-cancel: Canceled 4 jobs (0 errors)
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO


## flux alloc

<div class="alert alert-block" style="background-color:skyblue">
<span style="font-weight:600">Description:</span> Allocation for an interactive instance
</div>

You might want to request a set of resources (an allocation) and then attach to them interactively. This is the goal of `flux alloc`. Since we can't easily do that in a cell, try opening up the <button data-commandLinker-command="terminal:open" data-name="flux" href="#">JupyterLab terminal</button> and doing: 

```bash
# Look at the resources you have outside of the allocation
flux resource list

# Request an allocation with 2 "nodes" - a subset of what you have in total
flux alloc -N 2

# See the resources you are given
flux resource list

# You can exit from the allocation like this!
exit
```
When you want to automate this by submitting work to an allocation non-interactively, you would use `flux batch`.

## flux batch

<div class="alert alert-block" style="background-color:skyblue">
<span style="font-weight:600">Description:</span> Submitting batch jobs
</div>

We can use the `flux batch` command to easily create nested Flux instances. When `flux batch` is invoked, Flux will automatically create a nested instance that spans the resources allocated to the job, and then Flux runs the batch script passed to `flux batch` on rank 0 of the nested instance. "Rank" refers to the rank of the Tree-Based Overlay Network (TBON) used by the [Flux brokers](https://flux-framework.readthedocs.io/projects/flux-core/en/latest/man1/flux-broker.html).

While a batch script is expected to launch parallel jobs using `flux run` or `flux submit` at this level, nothing prevents the script from further batching other sub-batch-jobs using the `flux batch` interface, if desired.

In [28]:
!flux batch --nslots=2 --cores-per-slot=1 --nodes=2 ./sleep_batch.sh
!flux batch --nslots=2 --cores-per-slot=1 --nodes=2 ./sleep_batch.sh

ƒ3ZL9YHnGP
ƒ3ZLQDGg2K


Take a quick look at [sleep_batch.sh](sleep_batch.sh) to see what we are about to run.

In [29]:
# Here we are submitting a job that generates output, and asking to write it to /tmp/mad-eye.txt
!flux submit --out /tmp/mad-eye.txt echo "CONSTANT VIGILANCE!"

# This will show us JOBIDs
!flux jobs

ƒ3ZLfDXQSj
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
  ƒ3ZLQDGg2K jovyan   ./sleep_b+  R      2      2   1.138s a100cf62dc4f,a100cf62dc4f
  ƒ3ZL9YHnGP jovyan   ./sleep_b+  R      2      2   1.710s a100cf62dc4f,a100cf62dc4f

### `flux job`

Let's next inspect the last job we ran with `flux job info` and target the last job identifier with `flux job last`. 

In [30]:
# Note here we are using flux job last to see the last job id
# The "R" here asks for the resource spec
!flux job info $(flux job last) R

# When we attach it will direct us to our output file
!flux job attach $(flux job last)

# And we can look at the output file to see our expected output!
from IPython.display import Code
Code(filename='/tmp/mad-eye.txt', language='text')

{"version": 1, "execution": {"R_lite": [{"rank": "0", "children": {"core": "1"}}], "starttime": 1747878622.6577082, "expiration": 0.0, "nodelist": ["a100cf62dc4f"]}}
0: stdout redirected to /tmp/mad-eye.txt
0: stderr redirected to /tmp/mad-eye.txt
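
The `R` document printed by `flux job info` is JSON, so you can inspect it from Python as well. Here is a minimal sketch that parses the exact output shown above. Reading `children.core` as the core ID(s) assigned on each rank is our interpretation of the format, not something this notebook states.

```python
import json

# The R document copied from the `flux job info ... R` output above.
r_text = (
    '{"version": 1, "execution": {"R_lite": [{"rank": "0", '
    '"children": {"core": "1"}}], "starttime": 1747878622.6577082, '
    '"expiration": 0.0, "nodelist": ["a100cf62dc4f"]}}'
)

R = json.loads(r_text)
execution = R["execution"]

# Each R_lite entry pairs broker rank(s) with the resources assigned there.
for entry in execution["R_lite"]:
    print(f'rank(s) {entry["rank"]}: core id(s) {entry["children"]["core"]}')

print("nodes:", ", ".join(execution["nodelist"]))
```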


We can again see a list of all jobs, including completed ones, with `flux jobs -a`:

In [31]:
!flux jobs -a

       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
  ƒ3ZLQDGg2K jovyan   ./sleep_b+  R      2      2   2.654s a100cf62dc4f,a100cf62dc4f
  ƒ3ZL9YHnGP jovyan   ./sleep_b+  R      2      2   3.227s a100cf62dc4f,a100cf62dc4f
  ƒ3ZLfDXQSj jovyan   echo       CD      1      1   0.141s a100cf62dc4f
  ƒ3ZDdkGo9q jovyan   magic      CA      2      2   13.69s a100cf62dc4f,a100cf62dc4f
  ƒ3ZDshk327 jovyan   moremagic  CA      1      1   13.15s a100cf62dc4f
  ƒ3Z7jwgNpo jovyan   sleep      CA      2      1   27.07s a100cf62dc4f
  ƒ3ZJzX4FNF jovyan   sleep      CA      1      1   1.535s a100cf62dc4f
  ƒ3ZE7s5B99 jovyan   job-watch+ CD      1      1   10.28s a100cf62dc4f
  ƒ3Z8uKfjfn jovyan   bash       CD      1      1   1.312s a100cf62dc4f
  ƒ3Z8uNdiET jovyan   bash       CD      1      1   1.311s a100cf62dc4f
  ƒ3Z8uM9ix7 jovyan   bash       CD      

To restrict the output to failed (i.e., jobs that exit with nonzero exit code, time out, or are canceled or killed) jobs, run:

In [32]:
!flux jobs -f failed

       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
  ƒ3Z6H49Vwd jovyan   hello       F      8      1   0.056s a100cf62dc4f
  ƒ3Z4qpypNs jovyan   false       F      1      1   0.123s a100cf62dc4f
  ƒ3EEVpmWNU jovyan   sleep       F      1      1   0.135s a100cf62dc4f
  ƒ3EEVpmWNV jovyan   sleep       F      1      1   0.134s a100cf62dc4f
  ƒ3EEVpmWNT jovyan   sleep       F      1      1   0.135s a100cf62dc4f

### flux submit from within a batch

Next open up [hello-batch.sh](hello-batch.sh) to see an example of using `flux batch` to submit jobs within the instance, and then wait for them to finish. This script is going to:

1. Create a flux instance with the top level resources you specify
2. Submit jobs to the scheduler controlled by the broker of that sub-instance
3. Run the four jobs, with `--flags=waitable` and `flux job wait --all` to wait for the output file
4. Within the batch script, you can add `--wait` or `--flags=waitable` to individual jobs, and use `flux queue drain` to wait for the queue to drain, _or_ `flux job wait --all` to wait for the jobs you flagged to finish. 

Note that when you submit a batch job, you'll get a job id back for the _batch job_, and usually when you look at the output of that with `flux job attach $jobid` you will see the output file(s) where the internal contents are written. Since we want to print the output file easily to the terminal, we are waiting for the batch job by adding the `--flags=waitable` and then waiting for it. Let's try to run our batch job now.

In [33]:
! flux batch --flags=waitable --out /tmp/flux-batch.out -N2 ./hello-batch.sh
! flux job wait
! cat /tmp/hello-batch-1.out
! cat /tmp/hello-batch-2.out
! cat /tmp/hello-batch-3.out
! cat /tmp/hello-batch-4.out

ƒ3ZN74Gnmm
ƒ3ZN74Gnmm
Hello job 1 from a100cf62dc4f 💛️
Hello job 2 from a100cf62dc4f 💚️
Hello job 3 from a100cf62dc4f 💙️
Hello job 4 from a100cf62dc4f 💜️


Each of `flux batch` and `flux alloc` hints at creating a Flux instance. How deep can we go into that rabbit hole, perhaps for jobs and workflows with nested logic or more orchestration complexity?

### The Flux Hierarchy 🍇️

One feature of the Flux Framework scheduler that is unique is its ability to submit jobs within instances, where an instance can be thought of as a level in a graph. Let's start with a basic image - this is what it might look like to submit to a scheduler that is not graph-based (left), where all jobs go to a central job queue or database. Note that our maximum job throughput is one job per second. The throughput is limited by the workload manager's ability to process a single job. We can improve upon this by simply adding another level, perhaps with three instances. For example, let's say we create a flux allocation or batch that has control of some number of child nodes. We might launch three new instances (each with its own scheduler and queue, right image) at that level two, and all of a sudden, we get a throughput of 1x3, or three jobs per second.

<table>
<tr>
    <td>
<img src="../img/single-submit.png" style="float:left; margin-top:30px" width="350px">        
    </td>
    <td>
<img src="../img/instance-submit.png" style="float:right; margin-top:-20px" width="550px">        
    </td>
    </tr>
</table>

All of a sudden, the throughput can increase exponentially with the number of levels, because we are essentially submitting to different schedulers. The example above is not impressive, but our [learning guide](https://flux-framework.readthedocs.io/en/latest/guides/learning_guide.html#fully-hierarchical-resource-management-techniques) (Figure 10) has a beautiful example of how it can scale, done via an actual experiment. We were able to submit 500 jobs/second using only three levels, vs. close to 1 job/second with one level. As an interesting detail, you can vary the scheduler algorithm or topology within each sub-instance, meaning that you can do some fairly interesting things with scheduling work, all without stressing the top-level system instance. 
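
The arithmetic behind the "1x3" example can be written as a tiny idealized model. This Python sketch assumes every instance schedules jobs at the same rate, fully in parallel and with no overhead. The function `ideal_throughput` is hypothetical, and the model is not meant to reproduce the measured 500 jobs/second figure.

```python
def ideal_throughput(jobs_per_second, fanout, levels):
    """Aggregate rate at the deepest level of a tree of schedulers.

    With `fanout` child instances under each instance, the deepest of
    `levels` levels holds fanout ** (levels - 1) independent schedulers.
    """
    return jobs_per_second * fanout ** (levels - 1)


# One level: a single central queue.
print(ideal_throughput(1, fanout=3, levels=1))

# Two levels with three child instances: the "1x3" case from the text.
print(ideal_throughput(1, fanout=3, levels=2))

# A third level triples the rate again in this idealized model.
print(ideal_throughput(1, fanout=3, levels=3))
```

The rate multiplies by the fanout at each added level, which is where the exponential growth comes from.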

Now that we understand nested instances, let's look at another batch example that better uses them. Here we have two job scripts:

- [sub_job1.sh](sub_job1.sh): Is going to be run with `flux batch` and submit sub_job2.sh
- [sub_job2.sh](sub_job2.sh): Is going to be submitted by sub_job1.sh.

Take a look at each script to see how they work, and then submit it!

In [41]:
!flux batch ./sub_job1.sh

ƒ3aSwyH5No


And now that we've submitted, let's look at the hierarchy for all the jobs we just ran. `flux pstree` normally shows the jobs in an instance, but it has limited functionality in a notebook. So instead of running the bare command, let's add `-a` to indicate "show me ALL jobs."
More complex jobs and in a different environment would have deeper nesting. You can [see examples here](https://flux-framework.readthedocs.io/en/latest/jobs/hierarchies.html?h=pstree#flux-pstree-command).

In [42]:
!flux pstree -a

.
├── ./sub_job1.sh
│   └── ./sub_job2.sh
│       └── sleep:R
├── 2*[./sub_job1.sh:CD]
├── 2*[hello:F]
├── ./hello-batch.sh:CD
├── 2*[./sleep_batch.sh:CD]
├── 56*[echo:CD]
├── magic:CA
├── moremagic:CA
├── 5*[sleep:CA]
├── job-watch.sh:CD
├── 8*[bash:CD]
├── 3*[hostname:CD]
├── make:CD
├── false:F
├── sh:CD
├── 442*[sleep:CD]
├── 50*[true:CD]
└── 3*[sleep:F]


You can also try a more detailed view with `flux pstree -a -X`!

### flux start

<div class="alert alert-block" style="background-color:lightgreen">
<span style="font-weight:600">Description:</span> Interactively starting a set of resources
</div>

Sometimes you need to interactively start a set of compute resources. We call this subset a flux instance. You can launch jobs under this instance, akin to how you've done above! In fact, this entire tutorial is started (to give you 4 faux nodes) with a `flux start` command: 

```bash
flux start --test-size=4
```

A Flux instance may be running as the default resource manager on a cluster, a job in a resource manager such as Slurm, LSF, or Flux itself, or as a test instance launched locally. This is really neat because it means you can launch Flux under other resource managers where it is not installed as the system workload manager. You can also execute "one off" commands to it, for example, to see the instance size:

In [38]:
!flux start --test-size=4 flux getattr size

4


When you run `flux start` without a command, it will give you an interactive shell to the instance. When you provide a command (as we do above) it will run it and exit. This is what happens for the command above! The output indicates the number of brokers started successfully. As soon as we get and print the size, we exit.

#### **Wrap up: All the Different Ways to Do Work (from the CLI)**
Here's a basic table that shows the four submission commands we use in Flux. 

|                        | creates subinstance           | runs distributed application          |
|------------------------|-------------------------------|---------------------------------------|
| interactive            | `flux alloc`                  | `flux run`                            |
| backgrounded           | `flux batch`                  | `flux submit`👀                       |

* `flux alloc` will allocate resources and start an interactive Flux sub-instance underneath those resources. Within that subinstance, you can submit as many jobs as you like, with no worry about backing up the parent (usually system) instance.
* `flux batch` will also allocate resources and start a Flux sub-instance, but the job is not interactive, and thus `batch` requires a script outlining the work to do.
* `flux run` runs a program under a Flux instance. It does not create a new sub-instance, and will watch until the program completes.
* `flux submit` does not exist in other resource managers, notably Slurm. It does the same thing as `flux run`, but does not watch for job output. You can retrieve the output later with `flux job attach`, or write it to a file with `--out`.

It can be kind of hard to think about this in the abstract, so here's a program that we'll run under each of the commands to display different behavior.

```c
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <time.h>
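#include <stdbool.h>

/* Timing helpers. These are a minimal stand-in added so the listing is
 * self-contained; the tutorial's hello.c is assumed to provide equivalents.
 * monotime_since() returns elapsed milliseconds. */
static void monotime (struct timespec *tp)
{
    clock_gettime (CLOCK_MONOTONIC, tp);
}

static double monotime_since (struct timespec t0)
{
    struct timespec now;

    monotime (&now);
    return (now.tv_sec - t0.tv_sec) * 1000.0
           + (now.tv_nsec - t0.tv_nsec) / 1.0e6;
}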

int main (int argc, char *argv[])
{
    int id, ntasks;
    struct timespec t;
    const char *label;

    if (!(label = getenv ("FLUX_JOB_CC")))
        if (!(label = getenv ("FLUX_JOB_ID")))
            label = "0";

    monotime (&t);
    MPI_Init (&argc, &argv);
    MPI_Comm_rank (MPI_COMM_WORLD, &id);
    MPI_Comm_size (MPI_COMM_WORLD, &ntasks);
    if (id == 0) {
        printf ("%s: completed MPI_Init in %0.3fs.  There are %d tasks\n",
                label,
                monotime_since (t) / 1000, ntasks);
        fflush (stdout);
    }

    monotime (&t);
    MPI_Barrier (MPI_COMM_WORLD);
    if (id == 0) {
        printf ("%s: completed first barrier in %0.3fs\n",
                label,
                monotime_since (t) / 1000);
        fflush (stdout);
    }

    monotime (&t);
    MPI_Finalize ();
    if (id == 0) {
        printf ("%s: completed MPI_Finalize in %0.3fs\n",
                label,
                monotime_since (t) / 1000);
        fflush (stdout);
    }
    return 0;
}
```

Try not to dwell too much on what the program is doing -- it's basically an MPI "Hello, World" with a barrier in the middle. Let's try it with `flux run` first.


In [43]:
!flux run make

mpicc -o hello hello.c


In [46]:
!flux run -N1 -n10 ./hello

ƒ3b1EwHkXh: completed MPI_Init in 0.379s.  There are 10 tasks
ƒ3b1EwHkXh: completed first barrier in 0.004s
ƒ3b1EwHkXh: completed MPI_Finalize in 0.020s


The `-N1` flag told Flux to allocate one node, and the `-n10` flag told Flux to run 10 tasks, hence there were 10 MPI tasks communicating in `MPI_COMM_WORLD`. Let's try the same flags with `flux alloc`.

In [47]:
!flux alloc -N1 -n10 ./hello

0: completed MPI_Init in 0.443s.  There are 1 tasks
0: completed first barrier in 0.000s
0: completed MPI_Finalize in 0.063s
[detached: session exiting]

The `-N1 -n10` flags mean something different in this context. `./hello` is passed as the "initial program" to `flux alloc`, which allocates 1 node with 10 cores and then starts the initial program -- only one copy of it, which is why the output reports a single task. This makes sense when you consider `flux batch`, which must take a script: 

In [48]:
!flux batch -N1 -n10 ./hello

flux-batch: ERROR: ./hello does not appear to be a script, or failed to encode as utf-8


In [49]:
!flux batch -N1 -n10 --wrap --output=./hello.out --error=./hello.err ./hello

ƒ3b88at9b5


In [50]:
!flux watch $(flux job last)

0: stdout redirected to ./hello.out
0: stderr redirected to ./hello.err


In [52]:
## Use this to `cat` the file
!cat ./hello.out
!cat ./hello.err

0: completed MPI_Init in 0.438s.  There are 1 tasks
0: completed first barrier in 0.000s
0: completed MPI_Finalize in 0.061s


#### Some submission flags of note

* `-N` specifies a number of nodes
* `-n` specifies a number of tasks for distributed applications, and cores for interactive allocations
* `-c` specifies a number of cores per task
* `--requires` constrains a job to run on a specific rank or hostname
* `--dependency` makes a job depend on another job
* `--cc` submits carbon copies of the same job many times
* `--out` and `--err` redirect output and error to files
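
If you drive Flux from a script, it can help to assemble these flags programmatically. The Python sketch below only builds and prints a command line and does not run it. The function `submit_argv` and the hostname `node1` are hypothetical, and the flags are limited to ones that appear in this tutorial.

```python
import shlex


def submit_argv(command, nodes=None, tasks=None, cores_per_task=None,
                requires=None, dependency=None, output=None, error=None):
    """Build a `flux submit` argument list from optional settings."""
    argv = ["flux", "submit"]
    if nodes is not None:
        argv.append(f"-N{nodes}")
    if tasks is not None:
        argv.append(f"-n{tasks}")
    if cores_per_task is not None:
        argv.append(f"-c{cores_per_task}")
    if requires:
        argv.append(f"--requires={requires}")
    if dependency:
        argv.append(f"--dependency={dependency}")
    if output:
        argv.append(f"--output={output}")
    if error:
        argv.append(f"--error={error}")
    return argv + list(command)


argv = submit_argv(["./hello"], nodes=1, tasks=4, cores_per_task=2,
                   requires="hosts:node1", output="/tmp/file.txt",
                   error="/dev/null")
print(shlex.join(argv))
```

Building the argument list separately from running it makes the command easy to inspect before anything is submitted.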

Let's work through some examples of these flags!

In [60]:
!flux submit -N1 -n4 -c2 --requires=hosts:$(hostname) --output /tmp/file.txt --error /dev/null ./hello

ƒ3bhmY93Y3


In [59]:
!flux run -N1 --dependency=afterok:$(flux job last) cat /tmp/file.txt

ƒ3bdXQiH1h: completed MPI_Init in 0.337s.  There are 4 tasks
ƒ3bdXQiH1h: completed first barrier in 0.002s
ƒ3bdXQiH1h: completed MPI_Finalize in 0.012s


# This concludes Chapter 1! 📗️

In this module, we covered:
1. Submitting jobs with Flux
2. The differences in submission commands.

To continue with the tutorial, open [Chapter 2](../ch2/02_flux_framework.ipynb)