# Command Line Tools

As mentioned before, running Pegasus in a Jupyter notebook is very convenient for tutorials and smaller workflows, but production workflows are most commonly submitted from dedicated HTCondor submit nodes using command line tools. This section of the tutorial uses the same workflow as the previous sections, generated inside the notebook. Planning, submitting, and checking status are done with the command line tools.

First, execute the following cell to generate the workflow. Note that the cell only writes the workflow out at the end; it does not plan or submit it.

import logging

from pathlib import Path

from Pegasus.api import *

logging.basicConfig(level=logging.DEBUG)

# --- Properties ---------------------------------------------------------------
props = Properties()
props["pegasus.monitord.encoding"] = "json"                                                                    
props["pegasus.catalog.workflow.amqp.url"] = "amqp://friend:donatedata@msgs.pegasus.isi.edu:5672/prod/workflows"
props["pegasus.mode"] = "tutorial" # speeds up tutorial workflows - remove for production ones
props.write() # written to ./pegasus.properties 

# --- Replicas -----------------------------------------------------------------
with open("f.a", "w") as f:
    f.write("This is sample input to KEG")

fa = File("f.a").add_metadata(creator="ryan")
rc = ReplicaCatalog().add_replica("local", fa, Path(".").resolve() / "f.a")

# --- Transformations ----------------------------------------------------------
preprocess = Transformation(
               "preprocess",
               site="condorpool",
               pfn="/usr/bin/pegasus-keg",
               is_stageable=False,
               arch=Arch.X86_64,
               os_type=OS.LINUX
            )

findrange = Transformation(
               "findrange",
               site="condorpool",
               pfn="/usr/bin/pegasus-keg",
               is_stageable=False,
               arch=Arch.X86_64,
               os_type=OS.LINUX
            )

analyze = Transformation(
               "analyze",
               site="condorpool",
               pfn="/usr/bin/pegasus-keg",
               is_stageable=False,
               arch=Arch.X86_64,
               os_type=OS.LINUX
            )

tc = TransformationCatalog().add_transformations(preprocess, findrange, analyze)

# --- Workflow -----------------------------------------------------------------
r'''
                     [f.b1] - (findrange) - [f.c1]
                     /                             \
[f.a] - (preprocess)                               (analyze) - [f.d]
                     \                             /
                     [f.b2] - (findrange) - [f.c2]

'''
wf = Workflow("blackdiamond")

fb1 = File("f.b1")
fb2 = File("f.b2")
job_preprocess = Job(preprocess)\
                     .add_args("-a", "preprocess", "-T", "3", "-i", fa, "-o", fb1, fb2)\
                     .add_inputs(fa)\
                     .add_outputs(fb1, fb2)

fc1 = File("f.c1")
job_findrange_1 = Job(findrange)\
                     .add_args("-a", "findrange", "-T", "3", "-i", fb1, "-o", fc1)\
                     .add_inputs(fb1)\
                     .add_outputs(fc1)

fc2 = File("f.c2")
job_findrange_2 = Job(findrange)\
                     .add_args("-a", "findrange", "-T", "3", "-i", fb2, "-o", fc2)\
                     .add_inputs(fb2)\
                     .add_outputs(fc2)

fd = File("f.d")
job_analyze = Job(analyze)\
               .add_args("-a", "analyze", "-T", "3", "-i", fc1, fc2, "-o", fd)\
               .add_inputs(fc1, fc2)\
               .add_outputs(fd)

wf.add_jobs(job_preprocess, job_findrange_1, job_findrange_2, job_analyze)
wf.add_replica_catalog(rc)
wf.add_transformation_catalog(tc)
wf.write()



## 1. Opening the Jupyter terminal

To open a new terminal window, navigate back to the listings tab of the Jupyter notebook, from which you have been opening all the sections. In the top right corner of the listing, click `New` and then `Terminal`. It looks something like this:

![Terminal Start](../images/terminal-start.png)

Once started, arrange your browser tabs/windows side by side so that you can see these instructions and the terminal window at the same time. In the following sections, a line starting with `$` is a command you can type or copy and paste into the terminal window (without the `$` itself). Where you need to substitute your own value, the placeholder is marked with square brackets, as in `[wfdir]`.

First, `cd` to the correct directory:

    $ cd ~/notebooks/03-Command-Line-Tools/
    
If you run `ls`, you should see these files:

    $ ls
    03-Command-Line-Tools.ipynb
    f.a
    pegasus.properties
    workflow.yml
    
The last three were generated by the cell above.
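
If you would rather check this from Python than with `ls`, a minimal sketch along these lines works. The helper name `missing_files` is hypothetical (it is not part of Pegasus), and the file names are the three generated files from the listing above.

```python
from pathlib import Path

# File names taken from the `ls` listing above.
EXPECTED = ["f.a", "pegasus.properties", "workflow.yml"]


def missing_files(directory=".", expected=EXPECTED):
    """Return the expected file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in expected if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing, re-run the cell above:", ", ".join(missing))
    else:
        print("All generated files are present.")
```

Run it from the notebook directory; an empty result means the cell wrote everything the command line tools need.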

## 2. Planning and submitting

We can now plan and submit the workflow by running:

    $ pegasus-plan --submit workflow.yml
    
In the output of the plan command, you will see references to several other Pegasus commands, such as `pegasus-status`. More importantly, a workflow directory was generated for the new workflow instance. This directory is the handle to the workflow instance and is used by the Pegasus command line tools. Some useful tools to know about:

 * **pegasus-status -v [wfdir]** Provides status on a workflow instance
 * **pegasus-analyzer [wfdir]** Provides debugging clues as to why a workflow failed. Run this after a workflow has failed
 * **pegasus-statistics [wfdir]** Provides statistics, such as walltimes, on a workflow after it has completed
 * **pegasus-remove [wfdir]** Removes a workflow from the system
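
Because each of these tools takes the workflow directory as its argument, the `[wfdir]` substitution can be scripted. The following is a minimal sketch, not part of Pegasus: the `TOOLS` table and the `build_command` helper are hypothetical names, the command templates are copied from the list above, and the example path is made up for illustration.

```python
import shlex

# Command templates copied from the list above; [wfdir] is the placeholder.
TOOLS = {
    "status": "pegasus-status -v [wfdir]",
    "analyzer": "pegasus-analyzer [wfdir]",
    "statistics": "pegasus-statistics [wfdir]",
    "remove": "pegasus-remove [wfdir]",
}


def build_command(tool, wfdir):
    """Return the argument list for `tool`, with [wfdir] replaced by `wfdir`."""
    args = shlex.split(TOOLS[tool])
    # Substitute after splitting so a path containing spaces stays one argument.
    return [wfdir if arg == "[wfdir]" else arg for arg in args]


# Hypothetical path for illustration; use the directory printed by pegasus-plan.
print(build_command("status", "/path/to/wfdir"))
```

On a machine with Pegasus installed, the resulting list could be passed to `subprocess.run`; here it is only printed.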


## 3. Workflow status

Use the workflow directory given in the output of the `pegasus-plan` command to determine the status of your workflow:

    $ pegasus-status -v [wfdir]

The flags `-l` and `-v` are just two different versions of more verbose output. Please see `pegasus-status --help` for all available options.

You can keep running `pegasus-status` until the workflow has completed, or you can use the `-w` flag to mimic the `wait()` function we used in the API. This flag will make `pegasus-status` run periodically until the workflow is complete:

    $ pegasus-status -v -w [wfdir]
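
Conceptually, waiting on a workflow is a polling loop: check, sleep, and repeat until done. The sketch below illustrates that idea only; it is not how `pegasus-status -w` or the API's `wait()` is implemented. The names `wait_until` and `is_done` and the interval values are hypothetical, and `is_done` stands in for whatever completion check you have available.

```python
import time


def wait_until(is_done, interval=5.0, max_checks=None, sleep=time.sleep):
    """Call `is_done()` periodically and return how many checks it took.

    Raises TimeoutError if `max_checks` checks pass without completion.
    """
    checks = 0
    while True:
        checks += 1
        if is_done():
            return checks
        if max_checks is not None and checks >= max_checks:
            raise TimeoutError(f"not done after {checks} checks")
        sleep(interval)


# Stand-in completion check: reports "done" on the third call.
answers = iter([False, False, True])
print(wait_until(lambda: next(answers), interval=0.01))
```

The `max_checks` limit is a design choice for scripts that should not block forever; leaving it as `None` keeps polling until the check succeeds.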
    

## 4. Workflow statistics

Once the workflow is complete, you can extract statistics from the provenance database:

    $ pegasus-statistics -s all [wfdir]

 
## What's Next?

The next notebook, `04-Containers/`, shows you how to use a Docker container to execute the jobs in your workflow.