# Kale <3 Parsl

<img src=https://www.fodmapeveryday.com/wp-content/uploads/2017/12/fennel-kale-and-parsley-for-salad.jpg width=500px/>
<br />

Here, we demonstrate using Kale to visualize and delay execution of Parsl workflows.

### Two-way communication
- Kale intercepts Parsl execution by overriding the Parsl DataFlowKernel
- Kale executes tasks via Parsl
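
As a rough illustration of the interception idea (not Kale's actual implementation), the sketch below uses a hypothetical `RecordingKernel` and a hypothetical `app` decorator. Instead of running a function when it is called, the kernel records the call and notes which earlier calls it depends on. All names here are assumptions made for the example; none of them come from Kale or Parsl.

```python
from concurrent.futures import Future


class RecordingKernel:
    """Hypothetical stand-in for a kernel that records tasks instead of running them."""

    def __init__(self):
        self.tasks = []  # (task_id, function, args) tuples in submission order
        self.edges = []  # (parent_id, child_id) dependency pairs

    def submit(self, func, *args):
        task_id = len(self.tasks)
        future = Future()
        future.task_id = task_id  # tag so later submissions can find the parent
        # Any argument that is itself a recorded future becomes a dependency.
        for arg in args:
            if isinstance(arg, Future) and hasattr(arg, "task_id"):
                self.edges.append((arg.task_id, task_id))
        self.tasks.append((task_id, func, args))
        future.set_result(None)  # nothing has run yet, so there is no real result
        return future


def app(kernel):
    """Hypothetical decorator: calling the function submits it to the kernel."""
    def decorator(func):
        def wrapper(*args):
            return kernel.submit(func, *args)
        return wrapper
    return decorator


kernel = RecordingKernel()


@app(kernel)
def add(*values):
    return sum(values)


root = add(add(), add())
print(len(kernel.tasks))  # number of recorded tasks
print(kernel.edges)       # dependency edges, parent -> child
print(root.result())      # None, because the task was only recorded
```

Nesting the calls is enough to build the graph: the inner calls are recorded first, and the futures they return tell the outer call who its parents are.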

### Scope
The following is a brief demonstration of this interaction. So far, only Python functions have been implemented. I haven't yet tried to integrate bash functions or batch tasks as we have in Kale.

### Passing Data in Kale
To match the Parsl model, it was necessary to allow data to be passed from one Kale Task to another via arguments, which was not previously possible. So that's a cool new feature!

For example, if the output of `Kale Task A` should be an argument to `Kale Task B`, then `Kale Task A` itself should be passed as an argument, and its result will be evaluated immediately before `Kale Task B` is executed.
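
The argument-passing rule above can be sketched with a tiny, hypothetical `Task` class. This is not the Kale `Task` API, just a minimal model under the assumption that a task argument is swapped for its result right before the dependent function is called.

```python
class Task:
    """Hypothetical minimal task: a function plus arguments that may be other tasks."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args
        self.done = False
        self.result = None

    def run(self):
        if self.done:
            return self.result
        # Replace each Task argument with its result just before calling func.
        # (Assumption for this sketch: an unfinished parent is run on demand.)
        resolved = [a.run() if isinstance(a, Task) else a for a in self.args]
        self.result = self.func(*resolved)
        self.done = True
        return self.result


task_a = Task(lambda: 21)
task_b = Task(lambda x: 2 * x, task_a)  # task_a itself is the argument
print(task_b.run())
```

Passing the task object rather than a value is what lets the workflow be defined before anything has run: the dependency is visible in the argument list, and the value is only looked up at execution time.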

### Breaking changes
Also, I've made some changes to achieve this which have probably broken the previous Fireworks model, so some modifications will be necessary before merging to master.

In [None]:
from parsl import DataFlowKernel, ThreadPoolExecutor, App

from kale.workflow_widgets import WorkflowWidget, WorkerPoolWidget
from kale.parsl_dflow import KaleDFK

# Initialization

The `WorkerPoolWidget` defines how Kale tasks will be executed. The interface is definitely in progress. For example, the `Location` field doesn't do anything yet, but could be used to specify that a Workflow should execute on remote resources.

`Workers` is passed to `max_workers` in the internal `ThreadPoolExecutor`, which Parsl will ultimately use to execute the tasks.
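
To get a feel for what `max_workers` controls, here is a small sketch using the standard-library `concurrent.futures.ThreadPoolExecutor` (not the Parsl executor, and not the widget). The `n_workers` variable is a stand-in for the widget's `Workers` value; that mapping is an assumption for illustration only.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

lock = threading.Lock()
running = 0
peak = 0


def work(i):
    """Sleep briefly while tracking how many tasks run at the same time."""
    global running, peak
    with lock:
        running += 1
        peak = max(peak, running)
    time.sleep(0.05)
    with lock:
        running -= 1
    return i


n_workers = 2  # stands in for the widget's Workers value (assumption)
with ThreadPoolExecutor(max_workers=n_workers) as pool:
    results = list(pool.map(work, range(6)))

print(results)
print("peak concurrency:", peak)
```

However many tasks are queued, the peak number running at once never exceeds `max_workers`, so this value is effectively the parallelism cap for the workflow.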

In [None]:
wpw = WorkerPoolWidget()
wpw

Initialize the Kale `DataFlowKernel`. I don't think this `ThreadPoolExecutor` is actually used. This interface can be cleaned up.

In [None]:
workers = ThreadPoolExecutor()
kale_dfk = KaleDFK(executors=[workers])

# Define Parsl function

Define a Parsl function as normal, except the `KaleDFK` is passed in place of the standard Parsl DFK.

In [None]:
@App('python', kale_dfk)
def rand_add(*prev_list):
    """Add a random number to the previous ones."""
    import random
    import time
    
    # Random int between 0 & 10, inclusive.
    myrand = random.randint(0, 10)
    mysum = myrand + sum(prev_list)
    
    print("My number is {}. I was given {}. The sum is {}.\n".format(myrand, prev_list, mysum), end='')
    time.sleep(2)
    
    return mysum


# Define workflow
Specify the name of the new workflow. By requiring that a Workflow be initialized, multiple workflows can be defined sequentially with the same `KaleDFK` object by calling this function in between workflow definitions.

In [None]:
kale_dfk.new_workflow('Random Tree')

Now, instead of executing the tasks, the `KaleDFK` intercepts them and combines them into a Kale `Workflow`. As you can see, the only result of this intercepted execution is a Parsl `AppFuture` returning `None`.

In [None]:
rand_add(
    rand_add(
        rand_add(),
        rand_add()
    ),
    rand_add(
        rand_add(),
        rand_add()
    )
)

rand_add(
    rand_add(
        rand_add(),
        rand_add()
    ),
    rand_add(
        rand_add(),
        rand_add()
    )
)

# Launch workflow
A workflow widget is used to visualize and interact with the workflow once it has been defined.

Tasks can be selected in the plot using the mouse. `ctrl-click` for multi-select.

### `Workflow`
The **Run Workflow** button in the `Workflow` tab starts execution. 
If some tasks are selected, only those tasks are run. Currently, an error will be raised if there's an unselected task in the middle of a dependency chain.

Also, this is likely to cause some issues in the case of data being passed explicitly between functions as we have here.
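
One way to think about the "unselected task in the middle of a dependency chain" rule is as a graph check. The sketch below is my own reading of that rule, not Kale's actual validation code: the hypothetical `find_gaps` function reports every unselected task that has both a selected ancestor and a selected descendant.

```python
def find_gaps(edges, selected):
    """Return unselected tasks that sit between two selected tasks."""
    children, parents = {}, {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
        parents.setdefault(child, set()).add(parent)

    def reachable(start, neighbours):
        # Depth-first walk collecting everything reachable from start.
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in neighbours.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    selected = set(selected)
    nodes = set(children) | set(parents)
    gaps = set()
    for node in nodes - selected:
        has_selected_ancestor = reachable(node, parents) & selected
        has_selected_descendant = reachable(node, children) & selected
        if has_selected_ancestor and has_selected_descendant:
            gaps.add(node)
    return gaps


# Edges are (parent, child) pairs; the task numbers are made up for the example.
edges = [(0, 2), (1, 2), (2, 6), (5, 6)]
print(find_gaps(edges, selected={0, 2, 6}))  # chain 0 -> 2 -> 6 fully selected
print(find_gaps(edges, selected={0, 6}))     # task 2 is skipped in the middle
```

A check like this could run before execution starts, so the user gets a clear message about which tasks to add to the selection rather than an error partway through the run.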

### `Task`
In the `Task` tab, some basic information about the selected task is displayed, including the function name and its arguments.

### `Tags`
In the `Tags` tab, various selection utilities are available. Also, we have the concept of assigning tags to Kale tasks for easier selection and grouping. I haven't implemented this for workflows generated from Parsl.

### `Widget Log`
The `Widget Log` tab is generally where output from the functions is displayed, but I currently have it disabled, so all output will be printed below the widget.

In [None]:
ww = WorkflowWidget(kale_dfk.kale_workflow, wpw)
ww

# View results
Results of individual Kale Tasks are stored as futures on the task object. One way of accessing the result by task number (as shown in the DAG visualization) is shown below. There's probably a nicer way to do this.

In [None]:
# Look up the future for task 6; append .result() to get its value.
kale_dfk.kale_workflow.get_future(6)  # .result()

We also have a `kale_workflow.get_task_by_name(name)`, but in this case, all tasks have the same name since they all come from the same Parsl App, so that won't work here.

## Kale-Parsl Improvements
- Only Python tasks are currently supported with Parsl. This should be expanded to bash and batch tasks.
- In the same vein, I'm not sure whether IPyParallel execution of Python tasks will work. It should be doable with a few small changes.
- In the spirit of Parsl's on-the-fly workflow definition, it should be possible to dynamically update the widget if new tasks are added to the workflow. This isn't yet implemented.
- Error handling is a bit obfuscated. This could definitely be improved somehow.

## Kale Improvements
- We're in the process of developing a Kale service which will facilitate monitoring and controlling the status of workflows and tasks between multiple notebooks and across resources.
- General UI improvements are underway. Suggestions are welcome!