# Getting Started with pyUIT

This notebook shows how to get up and running with pyUIT. It covers initial configuration and some of the most common commands.

## Configuration

Before you can use pyUIT to interact with the HPC, you must register a client application with UIT+ (see the [UIT+ documentation](https://www.uitplus.hpc.mil/files/README.pdf)). Be sure to save the client ID and the client secret. Then create a UIT configuration file at `~/.uit` in your home directory and copy the client ID and client secret into it in the following format:

```
client_id: <YOUR_CLIENT_ID_HERE>
client_secret: <YOUR_CLIENT_SECRET_HERE>
```

Once you have registered a client and set up the configuration file, you can proceed with this notebook.
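
If you prefer to create the file from Python, the following is a minimal sketch that writes the two keys in the format shown above. The helper name `write_uit_config` is hypothetical (it is not part of pyUIT), the ID and secret are placeholders, and the demonstration writes to a temporary directory rather than your real `~/.uit`. Restricting the file permissions is a suggested precaution, not a documented pyUIT requirement.

```python
import tempfile
from pathlib import Path


def write_uit_config(path: Path, client_id: str, client_secret: str) -> Path:
    """Write a UIT config file in the two-line format shown above (hypothetical helper)."""
    path.write_text(f'client_id: {client_id}\nclient_secret: {client_secret}\n')
    # Suggested precaution: make the file readable and writable by the owner only.
    path.chmod(0o600)
    return path


# Demonstration in a temporary directory with placeholder credentials.
demo_dir = Path(tempfile.mkdtemp())
config_path = write_uit_config(demo_dir / '.uit', 'EXAMPLE_ID', 'EXAMPLE_SECRET')
config_text = config_path.read_text()
print(config_text)
```

To write the real file, you would pass `Path.home() / '.uit'` together with your actual client ID and secret.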

In [None]:
import uit

## (Optional) Enable Debug Logging

This will display every command sent to the HPC through UIT+, which login node was used, how long each command took, and a very brief stacktrace.

In [None]:
import logging
handler = logging.StreamHandler()
formatter = logging.Formatter('%(asctime)s %(levelname)s:%(name)s:%(message)s')
handler.setFormatter(formatter)
logger = logging.getLogger('uit')
logger.handlers.clear()
logger.addHandler(handler)
logger.setLevel('DEBUG')
logger.debug('Test pyuit debug logging')

## Authenticating and Connecting

The first step in using pyUIT is to create a `uit.Client` and authenticate a user to the UIT+ server. Users must have a pIE account to access the HPC. If your pIE account was created relatively recently (roughly after 2018), you must request that your account be synced to the UIT+ server.

Note: If you add `notebook=True` as an argument to the authentication call, the output will be an IPython IFrame that displays the OAuth authentication page for UIT+. If you omit this argument, the page opens in a new tab in your system browser.

In [None]:
c = uit.Client()
c.authenticate()

Next, we need to connect to a specific HPC system. Currently `onyx`, `narwhal`, and `mustang` are the available systems. Other DSRC systems can be added in the future. 

In [None]:
c.connect('narwhal')

We are now connected to a login node and can make calls, upload or retrieve files, and submit jobs to the queue.

## Basic Usage

By default, the `call` method executes the command in the user's `$HOME` directory. You can optionally pass in a `working_dir` argument to specify a different directory.

In [None]:
c.call('pwd', working_dir=c.WORKDIR)

Note that we passed in `c.WORKDIR` as the value for the `working_dir` argument. The `uit.Client` object has a few properties for common environment variables that are returned as `PosixPath` objects. Other environment variables that can be accessed as properties include:

In [None]:
c.HOME

In [None]:
c.CENTER
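
Because these properties are path objects, they can be combined with the `/` operator to build remote paths. The sketch below illustrates this offline with `pathlib.PurePosixPath`, which behaves the same way for joining but runs on any operating system. The home directory and file names are made up for illustration.

```python
from pathlib import PurePosixPath

# Made-up stand-in for a value such as c.HOME.
home = PurePosixPath('/p/home/example_user')

# The / operator appends path components.
remote_file = home / 'pyuit_test'
nested = home / 'projects' / 'run_01' / 'input.dat'

print(remote_file)
print(nested.name)
print(nested.parent)
```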

You can access other environment variables directly through the `uit.Client.env` attribute:

In [None]:
c.env.MODULEPATH

The `call` method, by default, returns a raw string of the `stdout` and `stderr` output from the HPC.

In [None]:
c.call('ls -la')

To make the output easier to read, it is recommended to `print` it:

In [None]:
print(c.call('ls -la'))
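
The difference comes from how a notebook displays a bare string: it shows the `repr`, with newlines as literal `\n` escapes, whereas `print` renders them as line breaks. The sketch below demonstrates this with a made-up sample string standing in for the raw output; it does not contact the HPC.

```python
# Made-up sample standing in for raw `ls -la` output.
raw_output = (
    'total 8\n'
    'drwx------ 2 user group 4096 Jan  1 00:00 .\n'
    'drwxr-xr-x 5 user group 4096 Jan  1 00:00 ..\n'
)

# What a notebook shows for a bare string: one long line with \n escapes.
print(repr(raw_output))

# What print shows: one entry per line.
print(raw_output)

# The raw string can also be split into lines for simple processing,
# skipping the leading "total" line.
entries = raw_output.splitlines()[1:]
```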

Alternatively, for a few common commands, pyUIT provides special methods that parse the output into a Python data structure. By default the return value is a `list` or `dict`, but if you have the `pandas` package installed, you can specify the argument `as_df=True` to get the result as a `pandas.DataFrame`:

In [None]:
c.list_dir(c.HOME)
# If you have Pandas installed then you can uncomment the following line.
# c.list_dir(c.HOME, as_df=True)

Other methods that have special parsing include `show_usage` and `status`. These methods are useful when submitting jobs to the queue.

## Uploading and Retrieving Files

You can copy files to and from an HPC system by using the `put_file` and `get_file` methods.

In [None]:
local_file = './data/hello_world.pbs'
remote_file = c.HOME/'pyuit_test'
c.put_file(local_path=local_file, remote_path=remote_file)

In [None]:
local_file = './data/pyuit_test.pbs'
c.get_file(remote_path=remote_file, local_path=local_file)

## Submitting Jobs to the Queue

The `show_usage` method can be used to look up the subproject ID, which is needed when submitting jobs to the HPC queuing system.

In [None]:
subproject = c.show_usage()[0]['subproject']
subproject

The `uit.Client.submit` method accepts a PBS script as one of the following types:
 * file path
 * string
 * `uit.PbsScript` object
 
So, if you already have a PBS script file, you can submit it directly with the `uit.Client`. Alternatively, you can use the `uit.PbsScript` API to create a new PBS script programmatically.

In [None]:
job_name = 'hello_world_with_pyuit'

pbs_script = uit.PbsScript(
    name=job_name,
    project_id=subproject,
    num_nodes=1,
    queue='debug',
    processes_per_node=1,
    node_type='compute',
    max_time='00:01:00',
    system=c.system,
)

pbs_script.execution_block = "echo Hello World!"
print(pbs_script.render())

In [None]:
job_id = c.submit(pbs_script=pbs_script)
job_id
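
Since `submit` also accepts a plain string, a script can be assembled by hand. The following is a minimal sketch using standard PBS directives (`-N`, `-A`, `-q`, `-l`). The helper `build_pbs_script` is hypothetical, the project ID is a placeholder, and the `select` resource line is an assumption that may need adjusting for a given system. The text produced by `pbs_script.render()` may look different.

```python
def build_pbs_script(name, project_id, queue, walltime, command):
    """Assemble a small PBS script as a string (hypothetical helper, not part of pyUIT)."""
    lines = [
        '#!/bin/bash',
        f'#PBS -N {name}',          # job name
        f'#PBS -A {project_id}',    # account / subproject to charge
        f'#PBS -q {queue}',         # destination queue
        # Assumed resource request: one chunk with one core and one MPI process.
        '#PBS -l select=1:ncpus=1:mpiprocs=1',
        f'#PBS -l walltime={walltime}',
        '',
        command,
    ]
    return '\n'.join(lines) + '\n'


script_text = build_pbs_script(
    name='hello_world_with_pyuit',
    project_id='PROJECT_ID_PLACEHOLDER',
    queue='debug',
    walltime='00:01:00',
    command='echo Hello World!',
)
print(script_text)
```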

We can monitor the status of the job by calling `status` and passing it the job ID. Run this cell repeatedly until the job is finished (status = 'F').

In [None]:
# Without Pandas, use the following line instead and adapt the cells below,
# which assume that `status` is a DataFrame:
# status = c.status(job_id=job_id)
status = c.status(job_id=job_id, as_df=True)
status
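
Instead of rerunning the cell by hand, the check can be wrapped in a polling loop. The sketch below is an illustration only: `wait_until_finished` and `fetch_state` are hypothetical names, and `fetch_state` stands in for whatever expression returns the one-letter job state (for example, something built on `c.status`). The demonstration uses a fake sequence of states (`Q` for queued, `R` for running, `F` for finished) so that it runs without an HPC connection.

```python
import time
from typing import Callable


def wait_until_finished(fetch_state: Callable[[], str],
                        interval: float = 30.0,
                        max_checks: int = 20) -> bool:
    """Poll fetch_state until it returns 'F'; give up after max_checks attempts."""
    for _ in range(max_checks):
        if fetch_state() == 'F':
            return True
        # Wait between checks to avoid flooding the server with requests.
        time.sleep(interval)
    return False


# Offline demonstration with a fake sequence of job states.
fake_states = iter(['Q', 'R', 'R', 'F'])
finished = wait_until_finished(lambda: next(fake_states), interval=0.0)
print(finished)

# A job that never finishes within the allowed number of checks.
timed_out = wait_until_finished(lambda: 'R', interval=0.0, max_checks=3)
print(timed_out)
```

Capping the number of checks keeps the loop from running indefinitely if the job is held or stuck in the queue.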

This job will have written its stdout and stderr to files in the workdir, with names based on the job name and the job ID. We can list these files to confirm that the job has run:

In [None]:
if status.status[0] == 'F':
    job_number = job_id.split('.')[0]
    print(c.list_dir(c.WORKDIR/f'{job_name}.*{job_number}', parse=False))
else:
    print('Your job has not finished yet...')

We can `cat` the contents of the stdout file to see what output the job created.

In [None]:
if status.status[0] == 'F':
    job_stdout = c.WORKDIR/f'{job_name}.o{job_number}'
    print(c.call(f'cat {job_stdout}'))
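
The file-name logic used in the cells above can be traced offline. In this sketch the job ID and work directory are made-up values. The `.o` suffix for stdout matches the cell above, while the `.e` suffix for stderr follows the usual PBS naming convention and is an assumption here.

```python
from pathlib import PurePosixPath

job_name = 'hello_world_with_pyuit'
job_id = '1234567.pbs01'                       # made-up job ID
workdir = PurePosixPath('/p/work1/example_user')  # made-up stand-in for c.WORKDIR

# The numeric part before the first dot identifies the job.
job_number = job_id.split('.')[0]

stdout_path = workdir / f'{job_name}.o{job_number}'
stderr_path = workdir / f'{job_name}.e{job_number}'  # assumed PBS convention

print(stdout_path)
print(stderr_path)
```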

Alternatively, we can copy these files to the local machine to continue working with them.

In [None]:
if status.status[0] == 'F':
    stdout_path = c.get_file(job_stdout)
    with stdout_path.open() as out:
        print(out.read())