
***

**Disclaimer:** The `SFAPI` module is still a work in progress! We encourage Discussions, Issues, Bug Reports, and Pull Requests here: https://github.com/NERSC/Superfacility.jl

***


# Superfacility.jl Demo

In Julia we use `Project.toml` to define our dependencies:

In [None]:
using Pkg
Pkg.activate(@__DIR__)

We're now using the local project. Let's make sure everything is installed by using `Pkg.instantiate`:

In [None]:
Pkg.instantiate()

This will create a `Manifest.toml`, which contains the precise versions and dependencies of all packages. We recommend instantiating a fresh manifest for each machine you work on rather than copying the `Manifest.toml` file, because package versions depend on the Julia version, operating system, and system libraries. So it's not meant to be portable!

For the paranoid, you can use `Pkg.status` to check that you're in the right environment and which version each dependency is at:

In [None]:
Pkg.status()
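
As a programmatic cross-check, `Base.active_project()` reports which project file is currently active. The following is a minimal sketch; the variable name `project_file` is only for illustration:

```julia
# Path of the currently active project file (or `nothing` if there is none).
project_file = Base.active_project()

if project_file === nothing
    println("No active project")
else
    println("Active project file: ", project_file)
    println("Environment directory: ", dirname(project_file))
end
```

If the environment directory is not the directory of this notebook, re-run the `Pkg.activate(@__DIR__)` cell above.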

I'm going to import a bunch of packages that we will use later:

In [None]:
using JSON
using Dates
using TimeZones
using Chain
using ResultTypes
using PrettyTables

using Base: @kwdef

To use the Superfacility API from Julia, import the `SFAPI` module from the `Superfacility` package (the `Project.toml` lists `Superfacility` as one of the project's dependencies):

In [None]:
using Superfacility: SFAPI

***
# Intro and Exercise 1 - Un-Authenticated Client
## Check NERSC Status
### These can all be done without a superfacility client
***
Before we start any computing, let's check that Perlmutter is up.

Superfacility.jl tries to give you low-level access to the SFAPI, so most of the work is done by the `Query` module. This will return any query results as a dictionary:

In [None]:
SFAPI.Query.get("status")

We provide some convenience classes to make working with SFAPI query results easier. Note that those are a work in progress: as you will see later, for more advanced use cases you will need to write your own convenience functions.

In [None]:
SFAPI.Status.CenterStatus(SFAPI.Query.get("status"))

In [None]:
center_status = SFAPI.Status.CenterStatus(SFAPI.Query.get("status"))
perlmutter_status = only(filter(x->x.name == "perlmutter", center_status))

Julia has a pipeline syntax -- which might be easier to understand for folks that think in pipes:

In [None]:
SFAPI.Status.CenterStatus(SFAPI.Query.get("status")) |> filter(x->x.name == "perlmutter") |> only
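
To see how the pipe and `only` behave without contacting the API, here is a small sketch on made-up data (the `systems` entries are hypothetical stand-ins, not real status output). It uses an explicit anonymous function for the `filter` stage, which works the same way as the one-argument `filter` form used above:

```julia
# Hypothetical stand-in data; the real entries come from the status endpoint.
systems = [(name="perlmutter", status="active"), (name="dtns", status="active")]

# `x |> f` is the same as `f(x)`, so each stage receives the previous result.
pm = systems |> (xs -> filter(x -> x.name == "perlmutter", xs)) |> only
println(pm.status)

# `only` throws an ArgumentError unless there is exactly one element.
no_match = filter(x -> x.name == "no-such-system", systems)
threw = try
    only(no_match)
    false
catch err
    err isa ArgumentError
end
println("only() on an empty result threw: ", threw)
```

Using `only` rather than `first` is a deliberate choice here: a misspelled system name fails loudly instead of silently returning the wrong entry.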

Or just query the right endpoint directly (recommended anyway to keep overhead down):

In [None]:
SFAPI.Status.StatusEntry(SFAPI.Query.get("status/perlmutter"))

In [None]:
perlmutter_status.status

Note that for complex pipelines, we recommend the `Chain.jl` package, which lets you chain function calls more ergonomically:

In [None]:
@chain begin
    SFAPI.Query.get("status")
    SFAPI.Status.CenterStatus
    filter(x->x.name == "perlmutter", _)
    only
    _.status
end

You can get information about a particular type by typing `?` in front of its name:

In [None]:
?SFAPI.Status.StatusEntry

Let's put it all together to make a table of resources and their status:

In [None]:
center = @chain begin
    SFAPI.Query.get("status")
    SFAPI.Status.CenterStatus
end

pretty_table(hcat(
    getproperty.(center, :name), 
    getproperty.(center, :description), 
    getproperty.(center, :status)
); header=["resource name", "description", "status"])
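
The table above is built from two standard-library pieces: broadcasting `getproperty` over a collection, and `hcat` to glue the resulting columns together. A minimal sketch on hypothetical entries (not real API output):

```julia
# Hypothetical stand-in entries (not real API output).
entries = [
    (name="perlmutter", description="System is active", status="active"),
    (name="dtns", description="Scheduled maintenance", status="unavailable"),
]

# `getproperty.(entries, :name)` reads the `name` field of every entry.
names = getproperty.(entries, :name)
statuses = getproperty.(entries, :status)

# `hcat` places the column vectors side by side, giving one row per entry.
table = hcat(names, statuses)
println(size(table))
println(table[2, :])
```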

Outages don't have a Julia class representing an outage entry, so more manual work will be needed (PRs always welcome):

In [None]:
SFAPI.Query.get("status/outages")

Note that `SFAPI.Query` returns a `Result` or an `Error` type (like in Rust!). This way we can check for errors without loads of `try/catch` blocks. If you're confident that you're not going to receive an error (or if you checked for errors already) then you can use `unwrap` to access its contents:

In [None]:
pm_outages = @chain begin
    SFAPI.Query.get("status/outages")
    unwrap
    filter(x->first(x)[:name] == "perlmutter", _)
    only
end

pretty_table(hcat(
    getindex.(pm_outages, :start_at), 
    getindex.(pm_outages, :end_at), 
    getindex.(pm_outages, :description)
); header=["start", "end", "description"])
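
If you are not sure that a query succeeded, check before unwrapping. The sketch below uses the `ResultTypes` package that was imported earlier; `integer_division` is a hypothetical function that only serves to produce one successful and one failed `Result`, and does not touch the SFAPI:

```julia
using ResultTypes

# Hypothetical function for illustration: the return-type annotation converts
# a returned value or a returned exception into a `Result`.
function integer_division(x::Int, y::Int)::Result{Int, DivideError}
    if y == 0
        return DivideError()
    end
    return div(x, y)
end

ok = integer_division(6, 3)
bad = integer_division(1, 0)

# Check for an error first, then unwrap only the successful result.
for r in (ok, bad)
    if ResultTypes.iserror(r)
        println("failed: ", unwrap_error(r))
    else
        println("succeeded: ", unwrap(r))
    end
end
```

Calling `unwrap` on a failed result throws the stored error, so the check matters whenever the request might fail (for example, with an expired token).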

Let's filter these to only show recent and upcoming outages:

In [None]:
tz_min = now(tz"America/Los_Angeles") - Month(2)
pm_outages = @chain begin
    pm_outages
    filter(x->ZonedDateTime(x[:start_at]) < tz_min + Month(4), _)
    filter(x->ZonedDateTime(x[:start_at]) > tz_min, _)
end

pretty_table(hcat(
    getindex.(pm_outages, :start_at), 
    getindex.(pm_outages, :end_at), 
    getindex.(pm_outages, :description)
); header=["start", "end", "description"])
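
The window logic above keeps outages that start between two months ago and two months from now. Here is the same idea as a standalone sketch, assuming plain `DateTime` values without time zones, a fixed reference time, and made-up outage records:

```julia
using Dates

# Hypothetical outage records; timestamps are plain ISO strings without zones.
outages = [
    (start_at="2024-01-10T08:00:00", description="old maintenance"),
    (start_at="2024-05-20T06:00:00", description="recent maintenance"),
    (start_at="2024-12-01T06:00:00", description="far-future maintenance"),
]

# Fixed reference time so that the example is reproducible.
reference = DateTime(2024, 6, 1)
window_start = reference - Month(2)
window_end = reference + Month(2)

# Keep only outages that start strictly inside the window.
in_window = filter(outages) do o
    window_start < DateTime(o.start_at) < window_end
end

foreach(o -> println(o.start_at, "  ", o.description), in_window)
```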


# Exercise 2 - Authenticated Client 
## Setup keys and get user and project information
***
Let's set up a client, with which we fetch SFAPI tokens.
I've stored my keys in `~/.superfacility/`. Change the path below to where you stored your keys.

**Important:** `SFAPI.Token.Client` expects keys in PEM format called `priv_key.pem` and `pub_key.pem`. The client ID string needs to be stored in `clientid.txt`. The idea is that you can just download these straight from iris.nersc.gov without needing to modify or rename anything :)

In [None]:
client = SFAPI.Token.Client(joinpath(homedir(), ".superfacility"))

In [None]:
tc = SFAPI.Token.fetch(client) |> unwrap

In [None]:
account = SFAPI.Account.User(SFAPI.Query.get("account", tc.token))

In [None]:
? SFAPI.Account.User

Tokens can go stale, so that's what the `SFAPI.Token.refresh` function is for:

In [None]:
tc = SFAPI.Token.refresh(tc)

It only refreshes the token if it's very close to expiring (based on expiration date/time). So it's good practice to call this function every time you make an authenticated API call:

In [None]:
tc = SFAPI.Token.refresh(tc)
projects = SFAPI.Query.get("account/projects", tc.token)
JSON.print(projects, 4)
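
To illustrate the "refresh only when close to expiring" idea, here is a self-contained sketch. It is not the implementation of `SFAPI.Token.refresh`: `DemoToken`, `refresh_if_needed`, the one-minute margin, and `new_token` are all hypothetical names and values chosen for this example.

```julia
using Dates

# Hypothetical token type for illustration; this is NOT the SFAPI.Token type.
struct DemoToken
    value::String
    expires_at::DateTime
end

# Return the same token unless it expires within `margin`; otherwise obtain a
# new one via `fetch_new` (a stand-in for a real token request).
function refresh_if_needed(tok::DemoToken, fetch_new; margin=Minute(1), now_time=now())
    if tok.expires_at - now_time > margin
        return tok
    end
    return fetch_new()
end

t0 = DateTime(2024, 1, 1, 12, 0, 0)
fresh = DemoToken("fresh", t0 + Minute(10))
stale = DemoToken("stale", t0 + Second(5))
new_token() = DemoToken("renewed", t0 + Minute(10))

println(refresh_if_needed(fresh, new_token; now_time=t0).value)
println(refresh_if_needed(stale, new_token; now_time=t0).value)
```

Because the check is a cheap local comparison, calling it before every request costs almost nothing when the token is still valid.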


# Exercise 3 - Filesystem interactions, executing commands, and small file upload/download
## Interact with NERSC Data Transfer Nodes 
***
Now that we have an authenticated client, we can interact with NERSC systems.

Let's make some useful variables for our home and scratch directories that we'll use in the next exercises.

Your home and scratch paths are based on your username:

* `/global/homes/username_first_letter/username`
* `/pscratch/sd/username_first_letter/username`

Bonus points for using the `account` object to generate them automatically:

In [None]:
home = "/global/homes/$(account.name[1])/$(account.name)"
scratch = "/pscratch/sd/$(account.name[1])/$(account.name)"
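
The same path pattern can be wrapped in a small function so it can be tried without an authenticated client. This is a sketch: `nersc_paths` is a hypothetical helper name and `"elvis"` is a placeholder username.

```julia
# Build the home and scratch paths from a username, following the
# `/global/homes/<first letter>/<username>` pattern described above.
function nersc_paths(username::AbstractString)
    isempty(username) && throw(ArgumentError("username must not be empty"))
    initial = first(username)  # first character of the username
    return (
        home = "/global/homes/$(initial)/$(username)",
        scratch = "/pscratch/sd/$(initial)/$(username)",
    )
end

paths = nersc_paths("elvis")  # "elvis" is a placeholder username
println(paths.home)
println(paths.scratch)
```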

In [None]:
tc = SFAPI.Token.refresh(tc)
ls = SFAPI.Ls.Dir(SFAPI.Query.get("utilities/ls/dtns/$(scratch)", tc.token))
for e in ls.entries
    println(e)
end

### Excursion: Remote Command Execution

In [None]:
tc = SFAPI.Token.refresh(tc)
cmd = SFAPI.Executable.run("ls $(scratch)", tc)

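Julia has built-in support for asynchronous tasks. SFAPI tasks are greedy (they start right away), so `SFAPI.Executable.result` gives you a task that you can poll with `istaskdone` and collect with `fetch`: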

In [None]:
t = SFAPI.Executable.result(cmd, tc)

In [None]:
istaskdone(t)

In [None]:
fetch(t)

In [None]:
cmd_str = "sleep 10\necho hi"
println(cmd_str)

In [None]:
tc = SFAPI.Token.refresh(tc)
cmd = SFAPI.Executable.run( 
    "cat << EOF | bash\n$(cmd_str)\nEOF" ,
    tc
)

In [None]:
t = SFAPI.Executable.result(cmd, tc)
while ! istaskdone(t)
    println("Waiting for result ...")
    sleep(1)
end
println(fetch(t))
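
The polling loop above is repeated several times in the rest of this notebook, so it can be worth wrapping in a helper. Below is a minimal sketch using only Base Julia: `wait_for` is a hypothetical name (not part of SFAPI), the timeout is an addition for safety, and a local `@async` task stands in for a remote command.

```julia
# Poll a task until it finishes or a timeout is reached, then return its result.
# `wait_for` is a hypothetical helper name, not part of SFAPI.
function wait_for(t::Task; poll=0.1, timeout=10.0)
    start = time()
    while !istaskdone(t)
        if time() - start > timeout
            error("task did not finish within $(timeout) seconds")
        end
        sleep(poll)
    end
    return fetch(t)
end

# A local stand-in for a remote command: sleeps briefly, then returns a string.
t = @async begin
    sleep(0.3)
    "hi"
end

result = wait_for(t)
println(result)
```

If you do not need progress messages, calling `fetch(t)` directly also blocks until the task is done.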

## Using Remote Command Execution to Create Files

The `upload` endpoint is currently not supported :( -- again, PRs welcome ;)

So we're going to hack this in here using the `command` endpoint:

First, let's make sure that the directory in which we'll be working at NERSC exists:

In [None]:
tc = SFAPI.Token.refresh(tc)
cmd = SFAPI.Executable.run( 
    "mkdir -p $(scratch)/sfapi",
    tc
)

t = SFAPI.Executable.result(cmd, tc)
while ! istaskdone(t)
    println("Waiting for result ...")
    sleep(1)
end
println(fetch(t))

Let's create a job script:

In [None]:
job_script = """#!/bin/bash
#SBATCH -q regular
#SBATCH -A nstaff
#SBATCH -N 1
#SBATCH -C cpu
#SBATCH -t 00:05:00
#SBATCH -J sfapi-test
#SBATCH --output=$(scratch)/sfapi/test.out
#SBATCH --error=$(scratch)/sfapi/test.error

echo "hi"
"""

job_script_file = "$(scratch)/sfapi/test.sh"

println(job_script)

And let's upload the jobscript using our `SFAPI.Executable.run` hack:

In [None]:
tc = SFAPI.Token.refresh(tc)
cmd = SFAPI.Executable.run( 
    "cat > $(job_script_file) << EOF\n$(job_script)EOF",
    tc
)

t = SFAPI.Executable.result(cmd, tc)
while ! istaskdone(t)
    println("Waiting for result ...")
    sleep(1)
end
println(fetch(t))
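
One caveat with the heredoc hack: with an unquoted delimiter (`<< EOF`), the remote shell expands `$VAR`, `$(...)` and backticks inside the body before writing the file. The job script above contains none of these, so it is unaffected, but a script that uses something like `$SLURM_JOB_ID` would be. Quoting the delimiter (`<< 'EOF'`) disables that expansion. A sketch of a helper that builds such a command string (`heredoc_write_cmd` and the `/tmp/example.sh` path are hypothetical; the path is not quoted, so it must not contain spaces):

```julia
# Build a shell command that writes `content` to `path` through a heredoc.
# Quoting the delimiter ('EOF') tells the shell not to expand `$VAR`, `$(...)`
# or backticks inside the body. `heredoc_write_cmd` is a hypothetical helper.
function heredoc_write_cmd(path::AbstractString, content::AbstractString; delim="EOF")
    # A body line equal to the delimiter would end the heredoc early.
    any(==(delim), split(content, '\n')) &&
        throw(ArgumentError("content contains the heredoc delimiter line"))
    body = endswith(content, "\n") ? content : content * "\n"
    return "cat > $(path) << '$(delim)'\n$(body)$(delim)"
end

script = "#!/bin/bash\necho \"job id: \$SLURM_JOB_ID\"\n"
cmd_text = heredoc_write_cmd("/tmp/example.sh", script)
println(cmd_text)
```

The resulting string can be passed to `SFAPI.Executable.run` in the same way as the hand-written command above.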

Because we're paranoid, let's check that the jobscript is where we expect it to be:

In [None]:
tc = SFAPI.Token.refresh(tc)
cmd = SFAPI.Executable.run(
    "ls $(job_script_file)",
    tc
)

t = SFAPI.Executable.result(cmd, tc)
while ! istaskdone(t)
    println("Waiting for result ...")
    sleep(1)
end
r = fetch(t)

If you see the path to `test.sh`, then we know that it worked :)


# Exercise 4 - Interacting with Perlmutter
## Getting job information and submitting batch work
***

Now we'll connect to Perlmutter and interact with Slurm to get information about past jobs as well as submit work to Slurm. Let's check how many jobs are currently running.

**Important:** This interface is a little unergonomic at the moment, so you need to keep the following in mind:

* use `cached=false` to make sure that we're getting an up-to-date list of jobs (there is some caching that speeds things up but also means that you don't get a list of jobs on Perlmutter _right now!_). This comes at the cost of speed: the endpoint runs `squeue` or `sacct` and processes the result.
* use `user=<user_name>` to limit `squeue` / `sacct` to jobs relevant to your user account. Since we've already loaded the user account data into the `account` variable, we can pass `"kwargs" => "user=$(account.name)"`.

We'll probably clean this up soon, so your preferences/priorities/use cases (via a GitHub issue) would help move this along. 

In [None]:
tc = SFAPI.Token.refresh(tc)
x = SFAPI.Query.get(
    "compute/jobs/perlmutter", tc.token;
    parameters=Dict(
        "index" => "0",
        "sacct" => "false",
        "cached" => "false",
        "kwargs" => "user=$(account.name)"
    )
) |> unwrap
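
Until the interface is cleaned up, a small helper can keep the parameter dictionary consistent between calls. This is only a sketch: `job_query_parameters` is a hypothetical name, and it simply reproduces the string-valued `Dict` used above.

```julia
# Hypothetical helper that assembles the query parameters used above.
# All values are strings, matching the `Dict` passed to `SFAPI.Query.get`.
function job_query_parameters(user::AbstractString; index=0, sacct=false, cached=false)
    return Dict(
        "index" => string(index),
        "sacct" => string(sacct),
        "cached" => string(cached),
        "kwargs" => "user=$(user)",
    )
end

params = job_query_parameters("elvis")  # "elvis" is a placeholder username
for key in sort(collect(keys(params)))
    println(key, " => ", params[key])
end
```

The result can then be passed as `parameters=job_query_parameters(account.name)`.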

Let's look at any job data in a more user-friendly way using the `JSON` library:

In [None]:
JSON.print(x[:output], 4)

Next we'll submit a new job. Often you want to submit the job from the directory that contains the job script (so that Slurm uses the "right" working directory for stdout). In Julia we can use `@chain` to break up the job script path, drop the last element of the resulting array (the file name), and join the remaining pieces back into a directory path:

In [None]:
job_script_dir = @chain begin
    job_script_file
    splitpath         # break the path up into its individual pieces
    _[1:end-1]        # drop the last element (the file name part)
    joinpath          # create a fresh path (without the file name)
end
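
For this particular task, Base Julia also provides `dirname`, which returns the directory part of a path in one call. The sketch below compares both approaches on a placeholder path (on a Unix-like system):

```julia
# Placeholder path for illustration.
script_path = "/pscratch/sd/e/elvis/sfapi/test.sh"

# `dirname` returns everything before the final path component.
dir_a = dirname(script_path)

# Equivalent construction by splitting and re-joining the components.
dir_b = joinpath(splitpath(script_path)[1:end-1]...)

println(dir_a)
println(dir_a == dir_b)
```

The `@chain` version is still a useful pattern when you need to manipulate individual path components.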

And now we can submit the job (`cd`'ing into the job dir first):

In [None]:
tc = SFAPI.Token.refresh(tc)
cmd = SFAPI.Executable.run(
    "cd $(job_script_dir)\nsbatch $(job_script_file)",
    tc
)

t = SFAPI.Executable.result(cmd, tc)
while ! istaskdone(t)
    println("Waiting for result ...")
    sleep(1)
end
r = fetch(t)

Let's see if we can see our new job in the jobs list:

In [None]:
tc = SFAPI.Token.refresh(tc)
x = SFAPI.Query.get(
    "compute/jobs/perlmutter", tc.token;
    parameters=Dict(
        "index" => "0",
        "sacct" => "false",
        "cached" => "false",
        "kwargs" => "user=$(account.name)"
    )
) |> unwrap
JSON.print(x[:output], 4)