# Automating workflows: FireWorks

* Make sure a MongoDB server is running on your computer.
* Workflows are made of three main components:
 * <b>FireTask:</b> an atomic computing job. It can call a single shell script or execute a single Python function that you define (either within FireWorks, or in an external package).
 * <b>Firework:</b> JSON spec that includes all the information needed to bootstrap your job. *spec* has all the information required to run the FireTasks.
 * <b>Workflow:</b> a set of Fireworks with dependencies between them.

In [None]:
# Import the necessary FireWorks tools:
from fireworks import Firework, Workflow, LaunchPad, ScriptTask
from fireworks.core.rocket_launcher import rapidfire, launch_rocket

In [None]:
# Set up the LaunchPad and reset it
launchpad = LaunchPad()
launchpad.reset('', require_password=False)

* Now, if you haven't done so yet, go to your Jupyter home and launch a New Terminal.
* Let's check if we have any workflows. In your terminal, type:
```bash
lpad get_wflows
```
* lpad gives you full control of your workflows.
 * To learn more:
```bash
lpad -h
```
 * Or for help on a specific argument, e.g.:
```bash
lpad get_wflows -h
```

# My first "hello world" workflow

Goal: create a Workflow that contains two fireworks, and each firework will run a ScriptTask that prints a message to the stdout.

In [None]:
# Let's create a task that prints 'hello' to standard output
task1 = ScriptTask.from_str('echo "hello"')

In [None]:
# Let's create another task that prints 'goodbye'
task2 = ScriptTask.from_str('echo "goodbye"')

In [None]:
# A firework that will run the first task
fw1 = Firework(task1, name="hello") 

In [None]:
# A firework that will run the second task, after the first firework
fw2 = Firework(task2, name="goodbye")

In [None]:
# Let's combine the fws in a workflow:
wf = Workflow([fw1, fw2], {fw1:fw2}, name="test workflow")

In [None]:
# Before adding the workflow to our LaunchPad and running it, let's investigate it a little:
wf.as_dict()

In [None]:
# What are the dependencies of our fireworks?
wf.links

In [None]:
# Now let's add the workflow in the LaunchPad!
launchpad.add_wf(wf)

Check again the workflows we have using lpad:
```bash
lpad get_wflows
lpad get_wflows -d more
lpad get_wflows -d all
```

THE BIG MOMENT, LET'S LAUNCH OUR WORKFLOW! In your terminal:

```bash
rlaunch -s rapidfire
```

In [None]:
# You can also launch the workflow from Python (look for the task output between the log lines in your terminal!)
launchpad.add_wf(wf)
rapidfire(launchpad)

* Congratulations! You have successfully run your first workflow!
* You should see "hello" and "goodbye" printed in your stdout.
* Check your workflow with lpad again. The state should read <b>COMPLETED</b>.
* Remember, the `-d more` option prints more information:
```bash
lpad get_wflows -d more
```
* We can now see the state of each firework as well. Check what these look like:
 * state (of workflow)
 * states (of fws)
 * launch_dirs

# Dependencies of fws
When we described our workflow, we told fireworks that *fw2* depends on *fw1*:<br>
```python
Workflow([fw1, fw2], {fw1:fw2}, name="test workflow")
```
<br>
We can also define the same dependency when creating the fireworks, using the <b>parents</b> argument:

In [None]:
fw1 = Firework(task1, name="hello") 
fw2 = Firework(task2, name="goodbye", parents=[fw1])
wf = Workflow([fw1, fw2], name="test workflow")
launchpad.reset('', require_password=False)
launchpad.add_wf(wf)

Now let's track what happens to states of our fws as they are run:
```bash
lpad get_wflows -d more
```
You can ask lpad about fws directly:
```bash
lpad get_fws -d more
```
Note that our first firework (hello) is <b>READY</b>, and the second one is <b>WAITING</b>.<br>
Let's launch one:
```bash
rlaunch -s singleshot
```
We only saw "hello". Let's check our workflow and states of our fws again:
```bash
lpad get_wflows -d more
```
Now our workflow is RUNNING, because:
* The first fw is COMPLETED
* The second fw is READY

Let's launch the second fw too:
```bash
rlaunch -s singleshot
```
Now our workflow is COMPLETED:
* The first fw is COMPLETED
* The second fw is COMPLETED
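The state changes above follow one simple rule: a firework becomes READY once all of its parents are COMPLETED. The sketch below is a plain-Python model of that rule, not FireWorks code. The `fw_states` helper and the use of firework names as ids are assumptions made purely for illustration.

```python
# Toy model of dependency-driven states (NOT the FireWorks implementation).
# `links` maps each parent to its children, like {fw1: fw2} in the workflow above.

def fw_states(links, completed):
    """Return the state of every firework, given the set of completed ones."""
    # Invert the links so we know the parents of each firework.
    parents = {name: set() for name in links}
    for parent, children in links.items():
        for child in children:
            parents.setdefault(child, set()).add(parent)

    states = {}
    for name, deps in parents.items():
        if name in completed:
            states[name] = "COMPLETED"
        elif deps <= completed:
            states[name] = "READY"    # no parents, or all parents are done
        else:
            states[name] = "WAITING"  # at least one parent is still pending
    return states


links = {"hello": ["goodbye"], "goodbye": []}

print(fw_states(links, completed=set()))                 # before any launch
print(fw_states(links, completed={"hello"}))             # after the first singleshot
print(fw_states(links, completed={"hello", "goodbye"}))  # after the second singleshot
```

Each call corresponds to one of the `lpad get_wflows -d more` checks in this section.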




# Exercise 1 (goal: understand dependencies)
- Our company has 4 employees: Ingrid (CEO), Jack (Manager), Jill (Manager), and Kip (Intern).
- Your goal is to design a workflow that prints out the organization chart according to the hierarchy.

In [None]:
# define four individual FireWorks used in the Workflow
task1 = ScriptTask.from_str('echo "Ingrid is the CEO."')
task2 = ScriptTask.from_str('echo "Jill is a manager."')
task3 = ScriptTask.from_str('echo "Jack is a manager."')
task4 = ScriptTask.from_str('echo "Kip is an intern."')

fw1 = Firework(task1)
fw2 = Firework(task2)
fw3 = Firework(task3)
fw4 = Firework(task4)

# assemble Workflow from FireWorks and their connections by id
workflow = Workflow([fw1, fw2, fw3, fw4], {fw1: [fw2, fw3], fw2: [fw4], fw3: [fw4]})

# check the links of your workflow now.
print(workflow.links)

# reset your LaunchPad (optional)
launchpad.reset('',require_password=False)

# store workflow in LaunchPad
launchpad.add_wf(workflow)

# check the states of your Workflow and Fireworks.

# launch the workflow in terminal in rapidfire mode.

# check the states of your Workflow and Fireworks.
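To predict what you should see when checking the states, it can help to work out the order in which the four fireworks become ready. The sketch below does this with the standard-library `graphlib` module (Python 3.9+) instead of FireWorks. The names stand in for the fireworks, and the `waves` list is a hypothetical helper variable.

```python
from graphlib import TopologicalSorter

# Same diamond-shaped hierarchy as the exercise: parent -> children.
links = {"Ingrid": ["Jill", "Jack"], "Jill": ["Kip"], "Jack": ["Kip"]}

sorter = TopologicalSorter()
for parent, children in links.items():
    for child in children:
        sorter.add(child, parent)  # the child depends on the parent

# Collect the jobs that are ready at the same time into "waves".
sorter.prepare()
waves = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())
    waves.append(ready)
    sorter.done(*ready)  # mark the whole wave as completed

print(waves)
```

Kip only appears in the last wave because both managers must complete first. The two managers share a wave, so their relative order is not fixed by the dependencies.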

# Web GUI

To run the webgui, go to your terminal and type:
```bash
lpad webgui 
```
What do the states mean? (See the [reference material](https://pythonhosted.org/FireWorks/reference.html) for details.)

<b>ARCHIVED:</b> treated as deleted, although the data remains in the database.<br>
<b>DEFUSED:</b> canceled/paused. Child FireWorks won’t run.<br>
<b>WAITING:</b> waiting for a parent Firework to complete.<br>
<b>READY:</b> Firework is ready to run, but hasn’t started running yet. The Rocket Launcher must pull it.<br>
<b>RESERVED:</b> (Queue Launcher in reservation mode only). The Firework is waiting in a queue to run.<br>
<b>FIZZLED:</b> Firework has failed; it was executed but threw an error during the process.<br>
<b>RUNNING:</b> Firework is currently running.<br>
<b>COMPLETED:</b> Firework has successfully finished running.<br>

# Job management

* Cancel, restart, delete wflows
* Rerun a firework or workflow
* Debugging

In [None]:
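# Let's add a workflow with two fireworks: "hello" runs first, then "goodbye"
task1 = ScriptTask.from_str('echo "hello"')
task2 = ScriptTask.from_str('echo "goodbye"')
fw1 = Firework(task1, name="hello")
fw2 = Firework(task2, name="goodbye")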
wf = Workflow([fw1, fw2], {fw1:fw2}, name="test workflow")
launchpad.reset('', require_password=False)
launchpad.add_wf(wf)

## Cancel, restart, delete workflows or fireworks

It is more convenient to do the rest of the job management from the command line.

Let's see the workflows we have:

```bash
lpad get_wflows -d more
```

Let's pause the workflow:

```bash
lpad defuse_wflows -i 1
```

Check the workflows again:

```bash
lpad get_wflows -d more
```

Also refresh your webgui.

You can resume the DEFUSED workflows: 

```bash
lpad reignite_wflows -s DEFUSED
```

And check the status of the workflow again.

We can delete the workflow:
```bash
lpad delete_wflows -i 1
```

## Dealing with crashes

Go to your terminal and:
```bash
lpad add_scripts 'echo "starting"; sleep 10; echo "ending"' 
```
* This is just shorthand for adding a workflow made of ScriptTasks.
* Here we have a firework that prints "starting", sleeps for 10 seconds and prints "ending".<br>

Now let's launch this job:
```bash
rlaunch -s singleshot
```
Let this run and finish. 

Now add the script again with `lpad add_scripts` and launch it, but this time press <b>Ctrl+C</b> to abort the job before it ends.

Ooops! Our DFT code crashed!
<br>
Now if you refresh your Web GUI or check wflows with lpad, you will see your fw and wf are FIZZLED!

To check the status of your firework:
```bash
lpad get_fws -i <FW_ID> -d all
```

To query fireworks based on their state:
```bash
lpad get_fws -s FIZZLED
```

We can print a report of our jobs:
```bash
lpad report
```

We can rerun a FIZZLED firework as:
```bash
lpad rerun_fws -i <FW_ID>
```
Or we can tell lpad to rerun all FIZZLED fireworks: 
```bash
lpad rerun_fws -s FIZZLED
```
Check the state of your fws; it should say READY again!

### More advanced reruns:

In [None]:
# Can you put these tasks in a Firework and create a Workflow?
task1 = ScriptTask.from_str('echo "Start"')
task2 = ScriptTask.from_str('sleep 5')
task3 = ScriptTask.from_str('echo "Wait more for cool things to happen."')
task4 = ScriptTask.from_str('sleep 5')
task5 = ScriptTask.from_str('echo "The End."')

fw = Firework([task1, task2, task3, task4, task5], name="the looong wait")
wf = Workflow([fw], name="test workflow")
launchpad.reset('', require_password=False)
launchpad.add_wf(wf)

* Go to your terminal, and inspect the workflow and firework we added.
* Launch the job:
```bash
rlaunch -s singleshot
```
* AFTER seeing "Wait more for cool things to happen." hit Ctrl+C.
* Ooops we broke things right in the middle of a firework.
```bash
lpad get_fws
```
* We had some progress already, and when we restart, we don't want to repeat all the tasks in the firework from the beginning!
* We can use the --task-level option of rerun_fws
```bash
lpad rerun_fws -i 1 --task-level
lpad get_fws
rlaunch -s singleshot
```
The firework picked up from the interrupted task and printed only "The End."!
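The idea behind a task-level rerun can be sketched in plain Python: remember which task was interrupted and resume from that index rather than from the start. This is a toy model under that assumption, not how FireWorks stores its recovery information. `run_tasks` and `interrupt_at` are hypothetical names, and the "sleep" entries do not actually sleep.

```python
# Toy model of a task-level rerun (an illustration, NOT FireWorks internals).

def run_tasks(tasks, start=0, interrupt_at=None):
    """Run tasks[start:] in order.

    Returns (echoed lines, index of the interrupted task or None).
    """
    output = []
    for index in range(start, len(tasks)):
        if index == interrupt_at:
            return output, index  # simulate Ctrl+C during this task
        kind, arg = tasks[index]
        if kind == "echo":
            output.append(arg)    # "sleep" tasks produce no output
    return output, None


tasks = [
    ("echo", "Start"),
    ("sleep", 5),
    ("echo", "Wait more for cool things to happen."),
    ("sleep", 5),
    ("echo", "The End."),
]

# First launch: interrupted during the second sleep (index 3).
first_output, failed_index = run_tasks(tasks, interrupt_at=3)

# Task-level rerun: resume at the interrupted task instead of index 0.
second_output, _ = run_tasks(tasks, start=failed_index)

print(first_output)
print(second_output)
```

A rerun without `--task-level` corresponds to calling `run_tasks(tasks)` again, which repeats every task from the beginning.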

# Let's take a closer look into a Firework
* Goal is to learn about important components such as spec, state etc.

In [None]:
# Let's add a workflow
task1 = ScriptTask.from_str('echo "hello"')
task2 = ScriptTask.from_str('echo "goodbye"')
fw1 = Firework(task1, name="hello") 
fw2 = Firework(task2, name="goodbye")
wf = Workflow([fw1, fw2], {fw1:fw2}, name="test workflow")
launchpad.reset('', require_password=False)
launchpad.add_wf(wf)
launchpad.get_fw_by_id(1).as_dict()

* <b>spec</b> contains all the information about the job:
 * <b>\_tasks</b> is a list of FireTasks. E.g. ScriptTask is just one kind of task. There can be many different tasks in a firework.
* <b>state</b>
* <b>name</b>

# Designing FireTasks
* FireTasks are where your code is actually run!
* ScriptTask is only one type of FireTask, among others such as TemplateWriterTask built into FireWorks.
* FWAction: encapsulates the output of a FireTask. Can modify the workflow!
 * stored_data
 * mod_spec
 * additions, detours

All FireTasks must inherit from the `FireTaskBase` base class.
```python
from fireworks import FireTaskBase, FWAction


class MyTask(FireTaskBase):
    
    _fw_name = "My Amazing Task"
    
    def run_task(self, fw_spec):
        
        # INSERT FAVORITE DFT CODE HERE!
        
        return FWAction()
```

* For example, see what ScriptTask actually looks like [here](https://github.com/materialsproject/fireworks/blob/master/fireworks/user_objects/firetasks/script_task.py).




# Example 2: custom FireTask to add numbers

```python
from fireworks import FireTaskBase, FWAction


class AdditionTask(FireTaskBase):
    
    _fw_name = "Addition Task"
    
    def run_task(self, fw_spec):
        input_array = fw_spec['input_array']
        m_sum = sum(input_array)

        print("The sum of {} is: {}".format(input_array, m_sum))

        return FWAction(stored_data={'sum': m_sum}, mod_spec=[{'_push': {'input_array': m_sum}}])
```

In [None]:
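# AdditionTask also ships with the FireWorks tutorials package
from fw_tutorials.firetask.addition_task import AdditionTask

# create the Firework consisting of a custom "Addition" task
firework = Firework([AdditionTask()], spec={"input_array": [1, 2, 3]})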
launchpad.reset('', require_password=False)
launchpad.add_wf(firework)

Run the workflow in your terminal.

In [None]:
# What happens if you add more tasks?
firework = Firework([AdditionTask(), AdditionTask(), AdditionTask()], spec={"input_array": [1, 2, 3]})
launchpad.reset('', require_password=False)
launchpad.add_wf(firework)
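The `mod_spec` returned by `AdditionTask` uses `_push` to append the sum to `input_array`. Assuming each task in a Firework sees the spec as modified by the tasks before it, three chained `AdditionTask`s would each sum a longer list. The plain-Python sketch below models that behaviour without FireWorks or MongoDB. `addition_task` and `apply_push` are hypothetical stand-ins, not library functions.

```python
# Toy model of chained AdditionTasks and the `_push` modifier
# (an illustration, NOT the FireWorks implementation).

def addition_task(spec):
    """Mimic AdditionTask.run_task, returning the action as a plain dict."""
    input_array = spec["input_array"]
    m_sum = sum(input_array)
    print("The sum of {} is: {}".format(input_array, m_sum))
    return {
        "stored_data": {"sum": m_sum},
        "mod_spec": [{"_push": {"input_array": m_sum}}],
    }


def apply_push(spec, mod_spec):
    """Append each pushed value to the list stored under its key."""
    for mod in mod_spec:
        for key, value in mod.get("_push", {}).items():
            spec.setdefault(key, []).append(value)


spec = {"input_array": [1, 2, 3]}
sums = []
for _ in range(3):  # three tasks in the same Firework
    action = addition_task(spec)
    sums.append(action["stored_data"]["sum"])
    apply_push(spec, action["mod_spec"])

print(sums)
```

Compare the printed sums with what you see in the terminal when you launch the three-task Firework.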

# Dynamic workflows (Optional)

In [None]:
from fireworks import FireTaskBase, FWAction


class FibonacciAdderTask(FireTaskBase):

    _fw_name = "Fibonacci Adder Task"

    def run_task(self, fw_spec):
        smaller = fw_spec['smaller']
        larger = fw_spec['larger']
        stop_point = fw_spec['stop_point']

        m_sum = smaller + larger
        if m_sum < stop_point:
            print('The next Fibonacci number is: {}'.format(m_sum))
            # create a new Fibonacci Adder to add to the workflow
            new_fw = Firework(FibonacciAdderTask(), {'smaller': larger, 'larger': m_sum, 'stop_point': stop_point})
            return FWAction(stored_data={'next_fibnum': m_sum}, additions=new_fw)

        else:
            print('We have now exceeded our limit; (the next Fibonacci number would have been: {})'.format(m_sum))
            return FWAction()

In [None]:
# set up the LaunchPad and reset it
launchpad = LaunchPad()
launchpad.reset('', require_password=False)

# create the Firework consisting of a custom "Fibonacci" task
firework = Firework(FibonacciAdderTask(), spec={"smaller": 0, "larger": 1, "stop_point": 10})

# store the workflow; launch it from your terminal (e.g. rlaunch -s rapidfire)
launchpad.add_wf(firework)
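Before launching, it is worth predicting how many fireworks this workflow will end up containing. The sketch below replays the logic of `FibonacciAdderTask` in plain Python, with a queue standing in for the LaunchPad. `fibonacci_adder`, `queue`, and `launches` are hypothetical names for this model and are not part of FireWorks.

```python
from collections import deque

# Toy model of the dynamic Fibonacci workflow (NOT FireWorks code).

def fibonacci_adder(spec):
    """Return (sum, spec of the firework to add), or (sum, None) to stop."""
    m_sum = spec["smaller"] + spec["larger"]
    if m_sum < spec["stop_point"]:
        new_spec = {
            "smaller": spec["larger"],
            "larger": m_sum,
            "stop_point": spec["stop_point"],
        }
        return m_sum, new_spec
    return m_sum, None


# The queue plays the role of the LaunchPad: it starts with one firework
# and grows whenever a task returns an addition.
queue = deque([{"smaller": 0, "larger": 1, "stop_point": 10}])
fib_numbers = []
launches = 0

while queue:
    spec = queue.popleft()
    launches += 1
    m_sum, new_spec = fibonacci_adder(spec)
    if new_spec is not None:
        fib_numbers.append(m_sum)
        queue.append(new_spec)

print(fib_numbers, launches, m_sum)
```

Although only one Firework is added to the LaunchPad, the model performs six launches: five that each add a successor and a final one that stops.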


# Passing job information between fireworks (Optional)

In [None]:
from fw_tutorials.dynamic_wf.printjob_task import PrintJobTask

# create the Workflow that passes job info
fw1 = Firework([ScriptTask.from_str('echo "This is the first FireWork"')], spec={"_pass_job_info": True}, fw_id=1)
fw2 = Firework([PrintJobTask()], parents=[fw1], fw_id=2)
wf = Workflow([fw1, fw2])

# store the workflow; launch it from your terminal (e.g. rlaunch -s rapidfire)
launchpad.reset('', require_password=False)
launchpad.add_wf(wf)
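As a rough mental model of `_pass_job_info`: when a parent sets this key, its children receive a `_job_info` list in their spec, and each entry describes one earlier job. The sketch below assumes entries carry `fw_id`, `name`, and `launch_dir`, which are the fields `PrintJobTask` reads. The `launch` helper, the firework names, and the directory path are hypothetical.

```python
# Toy model of `_pass_job_info` (an illustration, NOT the FireWorks implementation).

def launch(fw, launch_dir):
    """Return the job info list that this firework hands to its children."""
    job_info = list(fw["spec"].get("_job_info", []))
    if fw["spec"].get("_pass_job_info"):
        job_info.append(
            {"fw_id": fw["fw_id"], "name": fw["name"], "launch_dir": launch_dir}
        )
    return job_info


fw1 = {"fw_id": 1, "name": "first", "spec": {"_pass_job_info": True}}
fw2 = {"fw_id": 2, "name": "second", "spec": {}}

# Launching the parent updates the child's spec before the child runs.
fw2["spec"]["_job_info"] = launch(fw1, launch_dir="/tmp/launcher_1")  # hypothetical path

# The child can now look up where its parent ran.
prev_job = fw2["spec"]["_job_info"][-1]
print("The name of the previous job was: {}".format(prev_job["name"]))
print("The id of the previous job was: {}".format(prev_job["fw_id"]))
print("The location of the previous job was: {}".format(prev_job["launch_dir"]))
```

This is useful when a later step needs files written by an earlier one, because the child learns the parent's launch directory at run time.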

While one could construct an entire workflow by chaining together multiple FireTasks within a single Firework, this is often not ideal. For example, we might want to switch between different FireWorkers for different parts of the workflow depending on the computing requirements for each step. Or, we might have a restriction on walltime that necessitates breaking up the workflow into more atomic steps. Finally, we might want to employ complex branching logic or error-correction that would be cumbersome to employ within a single Firework. 