# 1 - Beginning Workflows

In this lesson, we'll cover the basics of using atomate to run calculations. This will be a hands-on lesson where we dive into running a full workflow and break it down into components to understand how the various moving parts let us scale from one calculation to tens of thousands.

In [None]:
import mp_workshop.atomate

In [None]:
!pmg config --add PMG_MAPI_KEY <MAPI_KEY>

# Building a workflow

To begin, we'll grab a structure from the Materials Project using pymatgen and the MPRester interface we learned about in a previous course.

In [None]:
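# MPRester lives in pymatgen.ext.matproj; the top-level import was removed in newer pymatgen releases
from pymatgen.ext.matproj import MPRester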

mpr = MPRester()

struct = mpr.get_structure_by_material_id("mp-27")
print(struct)

Now, let's construct a workflow using atomate to optimize this structure with DFT.

In [None]:
from atomate.vasp.workflows.presets.core import wf_structure_optimization

In [None]:
import os
db_file = os.path.join(os.environ['HOME'], 'mp_workshop/fireworks_config/db.json')
wf_config = {"DB_FILE": db_file}
wf = wf_structure_optimization(struct, wf_config)
print(wf)

We can get more information on the workflow by converting it to a dictionary:

In [None]:
wf.as_dict()
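
The dictionary printed above can be long. As a rough aid for reading it, here is a minimal sketch that condenses a workflow-like dictionary into one line per firework. The toy dictionary, its `fws`, `links`, and `_tasks` keys, and the task names are assumptions made for illustration, and `summarize_workflow` is a hypothetical helper, not part of atomate or Fireworks.

```python
# Toy dictionary that mimics the general shape of a serialized workflow.
# The layout and the task names are assumptions for illustration only.
toy_wf = {
    "name": "toy workflow",
    "fws": [
        {"fw_id": -1, "name": "structure optimization",
         "spec": {"_tasks": [{"_fw_name": "WriteInputs"},
                             {"_fw_name": "RunCalculation"},
                             {"_fw_name": "ParseOutputs"}]}},
        {"fw_id": -2, "name": "static",
         "spec": {"_tasks": [{"_fw_name": "RunCalculation"}]}},
    ],
    # Maps each firework id (as a string) to the ids of its children
    "links": {"-1": [-2], "-2": []},
}


def summarize_workflow(wf_dict):
    """Return one summary line per firework: id, name, task count, children."""
    lines = []
    for fw in wf_dict["fws"]:
        children = wf_dict["links"].get(str(fw["fw_id"]), [])
        n_tasks = len(fw["spec"].get("_tasks", []))
        lines.append(f"{fw['fw_id']}: {fw['name']} ({n_tasks} tasks) -> {children}")
    return lines


summary = summarize_workflow(toy_wf)
for line in summary:
    print(line)
```

If the real dictionary has the same layout, `summarize_workflow(wf.as_dict())` would be the analogous call; check the keys in your own output before relying on it.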

# Let's configure the workflow to use Fake VASP to simulate a DFT calculation

Due to a combination of licensing issues and the fact that VASP cannot run quickly on the Jupyter server, we're going to simulate a VASP run with a stand-in function. You will learn later about powerups, which let you modify a workflow. For this exercise, we'll use a powerup that replaces the normal VASP run with a step that simply copies the VASP output files we've prepared for you.

In [None]:
from mp_workshop.atomate import si_struct_opt_path

# The output files are stored here
print(si_struct_opt_path)

In [None]:
from atomate.vasp.powerups import use_fake_vasp
wf = use_fake_vasp(wf, ref_dirs={"Si-structure optimization": si_struct_opt_path})
wf.as_dict()
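
To make the idea of a powerup concrete, here is a toy version of the pattern: a function that takes a workflow description and returns a modified copy. This is only a sketch of the concept. The dictionary layout, the task names, the substring matching on firework names, and `use_fake_runner` itself are all hypothetical and are not taken from atomate.

```python
import copy


def use_fake_runner(wf_dict, ref_dirs):
    """Toy powerup: swap the run step for a copy step in matching fireworks.

    A firework matches when its name contains one of the keys of ref_dirs.
    The input dictionary is left untouched; a modified copy is returned.
    """
    new_wf = copy.deepcopy(wf_dict)
    for fw in new_wf["fws"]:
        for label, ref_dir in ref_dirs.items():
            if label in fw["name"]:
                fw["tasks"] = [
                    {"task": "CopyPreparedOutputs", "ref_dir": ref_dir}
                    if task["task"] == "RunCalculation" else task
                    for task in fw["tasks"]
                ]
    return new_wf


# Hypothetical workflow description with a single firework
toy_wf = {"fws": [
    {"name": "Si-structure optimization",
     "tasks": [{"task": "WriteInputs"},
               {"task": "RunCalculation"},
               {"task": "ParseOutputs"}]},
]}

faked = use_fake_runner(toy_wf, {"structure optimization": "/path/to/prepared/outputs"})
original_tasks = [task["task"] for task in toy_wf["fws"][0]["tasks"]]
faked_tasks = [task["task"] for task in faked["fws"][0]["tasks"]]
print(original_tasks)
print(faked_tasks)
```

Returning a modified copy rather than editing in place keeps the original description available for comparison, which is convenient when experimenting in a notebook.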

## Now we have to get ourselves a LaunchPad so that we can submit this workflow to our database


Atomate uses Fireworks as its workflow engine. Fireworks hides the database behind an object called a LaunchPad, which lets you submit and query workflows from anywhere you have database access. We need a LaunchPad object so we can submit our workflow.

![fireworks_start_0.png](attachment:fireworks_start_0.png)
![fireworks_start_1.png](attachment:fireworks_start_1.png)

In [None]:
from fireworks.core.launchpad import LaunchPad

lp = LaunchPad.auto_load()

Because this is a fresh database, we have to initialize it. In everyday use this is done only once, since resetting wipes everything stored in the database.

In [None]:
lp.reset(None,require_password=False)

We can use the launchpad to add a workflow to our database:

In [None]:
lp.add_wf(wf)

# Adding other workflows

In [None]:
from atomate.vasp.workflows.presets.core import wf_elastic_constant, wf_bulk_modulus, \
                                                wf_gibbs_free_energy

elastic_wf = wf_elastic_constant(struct, wf_config)
lp.add_wf(elastic_wf)

gibb_wf = wf_gibbs_free_energy(struct, wf_config)
lp.add_wf(gibb_wf)

modulus_wf = wf_bulk_modulus(struct, wf_config)
lp.add_wf(modulus_wf)


# Monitoring Workflows

Fireworks lets you monitor the status of workflows and fireworks using both Python and the command line. Let's start by looking at the status of our workflow. For each bit of Python code, I'll include a cell with the equivalent command-line command using Jupyter's '!' functionality. In practice we use the command-line tools quite a bit, so they are emphasized in this notebook.

**Command Line Access in Jupyter**: Jupyter lets you run command-line commands by prefacing them with an exclamation mark, for example ```!lpad --help```.

In [None]:
lp.get_wf_summary_dict(1)

In [None]:
lp.get_wf_ids()

**Let's write a function to print the state of all workflows**

In [None]:
def get_wflows():
    for wf_id in lp.get_wf_ids():
        for key,value in lp.get_wf_summary_dict(wf_id).items():
            print(key, ": ",value)
        print("\n")

In [None]:
get_wflows()

**Let's write a function to print the state of all fireworks**

In [None]:
def get_fws():
    for fw_id in lp.get_fw_ids():
        fw = lp.get_fw_dict_by_id(fw_id)
        for prop in ["fw_id","updated_on","state","name"]:
            print(prop, ": ",fw[prop])

        print("\n")
        
get_fws()
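
When there are many fireworks, a count per state is often easier to read than the full listing. The following is a minimal sketch that tallies states from a list of firework dictionaries. The sample data is made up, and `count_states` is a hypothetical helper; it only assumes each dictionary has a `state` key, as the ones printed above do.

```python
from collections import Counter


def count_states(fw_dicts):
    """Tally how many fireworks are in each state."""
    return Counter(fw["state"] for fw in fw_dicts)


# Made-up sample resembling the fields printed by get_fws()
sample = [
    {"fw_id": 1, "name": "job A", "state": "COMPLETED"},
    {"fw_id": 2, "name": "job B", "state": "READY"},
    {"fw_id": 3, "name": "job C", "state": "WAITING"},
    {"fw_id": 4, "name": "job D", "state": "WAITING"},
]

counts = count_states(sample)
for state, n in sorted(counts.items()):
    print(f"{state:<10} {n}")
```

Against the live database, the same helper could be fed with `count_states(lp.get_fw_dict_by_id(i) for i in lp.get_fw_ids())`, reusing the two LaunchPad calls from `get_fws()`.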

**Let's defuse the extra workflows**

In [None]:
ids = lp.get_wf_ids()
print(ids)

In [None]:
for i in ids[1:]:
    lp.defuse_wf(i)

**The command line allows you to get the same information**

For more information on available commands, use: ```lpad --help```

In [None]:
!lpad get_wflows -i 1

In [None]:
!lpad get_fws -i 1

In [None]:
!lpad --help

In [None]:
# Let's look at what this command can do:
!lpad get_fws --help

# Now let's run this workflow


There are a few different ways to run a workflow. The first is to just run it within this notebook directly.

In [None]:
from fireworks.core.rocket_launcher import launch_rocket

In [None]:
# Let's move into a temporary working directory
import os

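# exist_ok=True lets this cell be re-run without failing if "temp" already exists
os.makedirs("temp", exist_ok=True)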
os.chdir("temp")
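
The cell above leaves the notebook inside `temp` for all later cells, which is what we want here. If you would rather return to the starting directory automatically, one option is a small context manager. This is a sketch using only the standard library; `working_directory` is a hypothetical helper name, and the scratch location is created with `tempfile` purely for the demonstration.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Create path if needed, change into it, and change back afterwards."""
    previous = os.getcwd()
    os.makedirs(path, exist_ok=True)
    os.chdir(path)
    try:
        yield path
    finally:
        # Always return to where we started, even if the body raised
        os.chdir(previous)


# Demonstration inside a throwaway scratch location
with tempfile.TemporaryDirectory() as scratch:
    start = os.getcwd()
    with working_directory(os.path.join(scratch, "temp")):
        inside = os.getcwd()
    after = os.getcwd()

print(os.path.basename(inside))
```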

In [None]:
launch_rocket(lp)

Now, let's see how that changed our fireworks:

In [None]:
get_fws()

This let me run a single firework in the notebook. What if I wanted to run multiple fireworks? First, let's reset the old firework and then add some more workflows to our database.

In [None]:
lp.rerun_fw(1)

We can do the same thing using the command line: ```lpad rerun_fws -i 1```

In [None]:
get_fws()

In [None]:
# Let's add the workflow a few more times to have multiple fireworks in the database
lp.add_wf(wf)
lp.add_wf(wf)

We can run all of the available fireworks with two lines of Python:

In [None]:
from fireworks.core.rocket_launcher import rapidfire
rapidfire(lp)

This let us run fireworks until we no longer had any to run. But we're still running fireworks in our Jupyter notebook. If I want to run this on another machine, I need to do something else. Normally, we would want to launch these jobs to our supercomputing queue and let it run them as resources become available.
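
Before moving to the queue, it may help to picture what a run-until-empty loop does. The toy below is not how Fireworks is implemented; it is a simplified sketch in which a job becomes READY once all of its parents are COMPLETED, and the loop stops when nothing is READY. The job names, the `parents` mapping, and `toy_rapidfire` are hypothetical.

```python
def toy_rapidfire(states, parents):
    """Run READY jobs until none remain; return the order in which they ran.

    states:  dict mapping job name -> "READY", "WAITING", or "COMPLETED"
    parents: dict mapping job name -> list of jobs that must finish first
    """
    launched = []
    while True:
        ready = [job for job, state in states.items() if state == "READY"]
        if not ready:
            break
        job = ready[0]
        states[job] = "COMPLETED"  # stand-in for actually running the job
        launched.append(job)
        # Promote waiting jobs whose parents have all completed
        for other, state in states.items():
            if state == "WAITING" and all(
                states[p] == "COMPLETED" for p in parents.get(other, [])
            ):
                states[other] = "READY"
    return launched


# Hypothetical four-step chain: relax -> static -> (uniform, line)
states = {"relax": "READY", "static": "WAITING",
          "uniform": "WAITING", "line": "WAITING"}
parents = {"static": ["relax"], "uniform": ["static"], "line": ["static"]}

order = toy_rapidfire(states, parents)
print(order)
```

The point of the sketch is that the caller never specifies an execution order; it falls out of the dependency links and the job states.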

### Using the queue launcher:

Setting up the queue launcher unfortunately takes some work. There are configuration files that tell atomate how to submit jobs, where the database is, and what special parameters to use for a given supercomputer. For more guidance, visit [the atomate installation guide](https://atomate.org/installation.html#configure-database-connections-and-computing-center-parameters) after this workshop.

This has all been set up for you in this workshop. Once it is set up, using the queue is simply a matter of launching the fireworks to it.

**Let's start off by clearing the database of fireworks**

This is not something we usually do in everyday calculations.

In [None]:
lp.reset(None,require_password=False)

In [None]:
from atomate.vasp.workflows.presets.core import wf_bandstructure

wf = wf_bandstructure(struct, wf_config)
wf.as_dict()

In [None]:
from mp_workshop.atomate import si_static_path, si_nscf_line_path, si_nscf_uniform_path

# Map each firework name to the directory holding its prepared VASP outputs
wf = use_fake_vasp(wf, ref_dirs={"Si-structure optimization": si_struct_opt_path,
                                 "Si-static": si_static_path,
                                 "Si-nscf uniform": si_nscf_uniform_path,
                                 "Si-nscf line": si_nscf_line_path})

lp.add_wf(wf)

In [None]:
get_fws()

Fireworks has a command-line tool to submit jobs to the SLURM queue:

In [None]:
!qlaunch -r rapidfire --nlaunches 1

Now the supercomputer will take care of running the jobs, and eventually we can check that they are working.

In [None]:
get_fws()

Now let's have qlaunch submit fireworks until all are done.

In [None]:
!qlaunch -r rapidfire

In [None]:
get_fws()

### Now, we have a completed workflow 