# 1 - Beginning Workflows

In this lesson, we'll cover the basics of using atomate to run calculations. This is a hands-on lesson: we'll run a full workflow and then break it down into its components to understand how the various moving parts let us scale from one calculation to tens of thousands.

In [1]:
import mp_workshop.atomate

Echo Test: MP Workshop


# Building a workflow

To begin, we'll grab a structure from the Materials Project using pymatgen and the MPRester interface we learned about in a previous lesson.

In [2]:
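# MPRester lives in pymatgen.ext.matproj; newer pymatgen releases no longer export it from the top-level package
from pymatgen.ext.matproj import MPRester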

mpr = MPRester()

struc = mpr.get_structure_by_material_id("mp-27")
print(struc)

Full Formula (Si1)
Reduced Formula: Si
abc   :   2.736139   2.736139   2.736139
angles:  60.000000  60.000000  60.000000
Sites (1)
  #  SP      a    b    c
---  ----  ---  ---  ---
  0  Si      0    0    0


Now, let's construct a workflow using atomate to optimize this structure with DFT.

In [3]:
from atomate.vasp.workflows.presets.core import wf_structure_optimization

In [4]:
wf = wf_structure_optimization(struc,{"DB_FILE": None})
print(wf)

Workflow object: (fw_ids: dict_keys([-1]) , name: Si)


We can get more information about the workflow by converting it to a dictionary:

In [5]:
wf.as_dict()

{'created_on': datetime.datetime(2018, 8, 10, 4, 38, 41, 14015),
 'fws': [{'created_on': '2018-08-10T04:38:41.013482',
   'fw_id': -1,
   'name': 'Si-structure optimization',
   'spec': {'_tasks': [{'_fw_name': 'FileWriteTask',
      'files_to_write': [{'contents': '',
        'filename': 'FW--Si-structure_optimization'}]},
     {'_fw_name': '{{atomate.vasp.firetasks.write_inputs.WriteVaspFromIOSet}}',
      'structure': {'@class': 'Structure',
       '@module': 'pymatgen.core.structure',
       'charge': None,
       'lattice': {'a': 2.7361386705337156,
        'alpha': 60.00000010462969,
        'b': 2.7361386686542817,
        'beta': 60.000000127351925,
        'c': 2.73613867,
        'gamma': 60.000000062179055,
        'matrix': [[2.3695656, 0.0, 1.36806933],
         [0.7898552, 2.23404787, 1.36806933],
         [0.0, 0.0, 2.73613867]],
        'volume': 14.484360157964268},
       'sites': [{'abc': [0.0, 0.0, 0.0],
         'label': 'Si',
         'species': [{'element': 'Si',
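
The `lattice` entry in the dictionary above stores both the lattice matrix and the derived parameters (`a`, `b`, `c`, the angles, and the volume). As a quick sanity check, here is a minimal sketch in plain Python that recomputes those parameters from the matrix. The helper names (`norm`, `angle_deg`, `det3`) are our own and not part of pymatgen; the matrix values are copied from the output above.

```python
import math

# Lattice matrix copied from the workflow dictionary above; each row is one lattice vector.
matrix = [
    [2.3695656, 0.0, 1.36806933],
    [0.7898552, 2.23404787, 1.36806933],
    [0.0, 0.0, 2.73613867],
]


def norm(v):
    """Length of a 3-vector."""
    return math.sqrt(sum(x * x for x in v))


def angle_deg(u, v):
    """Angle between two 3-vectors, in degrees."""
    cos = sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v))
    return math.degrees(math.acos(cos))


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


a_vec, b_vec, c_vec = matrix
lengths = [norm(v) for v in matrix]   # a, b, c
alpha = angle_deg(b_vec, c_vec)       # angle between b and c
beta = angle_deg(a_vec, c_vec)        # angle between a and c
gamma = angle_deg(a_vec, b_vec)       # angle between a and b
volume = abs(det3(matrix))            # cell volume

print("abc   :", [round(x, 6) for x in lengths])
print("angles:", [round(x, 6) for x in (alpha, beta, gamma)])
print("volume:", round(volume, 4))
```

The recomputed lengths and angles should agree with the `abc` and `angles` lines printed for the structure earlier, to within rounding.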

# Running with Fake VASP to simulate a DFT calculation

Due to a combination of licensing issues and the fact that real calculations would not run quickly on the Jupyter server, we're going to simulate running VASP. You will later learn about powerups, which let you modify a workflow. For this exercise, we're going to use a powerup that replaces the normal VASP run step with one that just copies files we've prepared for you.

In [6]:
from atomate.vasp.powerups import use_fake_vasp

## Let's do some work to get the path to the fake VASP files

In [7]:
from mp_workshop.atomate import si_struct_opt_path

print(si_struct_opt_path)

/home/jovyan/work/workshop-2018/mp_workshop/fake_vasp/Si_structure_opt


In [8]:
wf = use_fake_vasp(wf, ref_dirs={"Si-structure optimization": si_struct_opt_path})
wf.as_dict()

{'created_on': datetime.datetime(2018, 8, 10, 4, 38, 41, 14015),
 'fws': [{'created_on': '2018-08-10T04:38:41.013482',
   'fw_id': -1,
   'name': 'Si-structure optimization',
   'spec': {'_tasks': [{'_fw_name': 'FileWriteTask',
      'files_to_write': [{'contents': '',
        'filename': 'FW--Si-structure_optimization'}]},
     {'_fw_name': '{{atomate.vasp.firetasks.write_inputs.WriteVaspFromIOSet}}',
      'structure': {'@class': 'Structure',
       '@module': 'pymatgen.core.structure',
       'charge': None,
       'lattice': {'a': 2.7361386705337156,
        'alpha': 60.00000010462969,
        'b': 2.7361386686542817,
        'beta': 60.000000127351925,
        'c': 2.73613867,
        'gamma': 60.000000062179055,
        'matrix': [[2.3695656, 0.0, 1.36806933],
         [0.7898552, 2.23404787, 1.36806933],
         [0.0, 0.0, 2.73613867]],
        'volume': 14.484360157964268},
       'sites': [{'abc': [0.0, 0.0, 0.0],
         'label': 'Si',
         'species': [{'element': 'Si',
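
The printed dictionary is cut off before the run step, so the effect of the powerup is not visible above. To make the idea concrete, here is a toy sketch of what a powerup does: it takes a workflow description and returns a modified copy. This is not atomate's implementation. The function `toy_use_fake_vasp`, the shortened task names, the substring matching on the firework name, and the path are all assumptions made for illustration.

```python
import copy


def toy_use_fake_vasp(wf_dict, ref_dirs):
    """Toy powerup: return a copy of wf_dict with the run task swapped for a fake one.

    ref_dirs maps a firework name to a directory of prepared output files.
    """
    new_wf = copy.deepcopy(wf_dict)  # leave the caller's workflow untouched
    for fw in new_wf["fws"]:
        # Pick the reference directory whose key appears in this firework's name.
        ref_dir = next((d for key, d in ref_dirs.items() if key in fw["name"]), None)
        if ref_dir is None:
            continue
        fw["spec"]["_tasks"] = [
            {"_fw_name": "RunVaspFake", "ref_dir": ref_dir}
            if task["_fw_name"] == "RunVaspCustodian"
            else task
            for task in fw["spec"]["_tasks"]
        ]
    return new_wf


# A heavily simplified stand-in for the dictionary printed above.
toy_wf = {
    "fws": [
        {
            "name": "Si-structure optimization",
            "spec": {
                "_tasks": [
                    {"_fw_name": "WriteVaspFromIOSet"},
                    {"_fw_name": "RunVaspCustodian"},
                    {"_fw_name": "PassCalcLocs"},
                ]
            },
        }
    ]
}

# Hypothetical path, for illustration only.
patched = toy_use_fake_vasp(
    toy_wf, {"Si-structure optimization": "/path/to/fake/Si_structure_opt"}
)
print([task["_fw_name"] for task in patched["fws"][0]["spec"]["_tasks"]])
```

Only the run step changes; the tasks that write inputs and pass along the calculation location stay in place, which is why the rest of the workflow behaves as it would with real VASP.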

## Now we have to get ourselves a LaunchPad so that we can submit this workflow to our database


Atomate uses FireWorks as its workflow engine. FireWorks wraps the database in an object called a LaunchPad, which lets you submit and query workflows from anywhere you have database access. We need a LaunchPad object before we can submit our workflow.

In [9]:
from fireworks.core.launchpad import LaunchPad

lp = LaunchPad.auto_load()

Before first use, we have to initialize the database. In everyday use you do this only once, because resetting wipes everything stored in the LaunchPad. In this lesson, we'll reset a few times to start from a clean slate:

In [10]:
lp.reset(None,require_password=False)

2018-08-10 04:38:44,090 INFO Performing db tune-up
2018-08-10 04:38:44,099 INFO LaunchPad was RESET.


We can use the LaunchPad to add a workflow to our database:

In [11]:
lp.add_wf(wf)

2018-08-10 04:38:44,547 INFO Added a workflow. id_map: {-1: 1}


{-1: 1}
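
The `id_map` in the log shows the placeholder id the firework had before submission (`-1`) and the id it received in the database (`1`). The following toy sketch mimics that bookkeeping. It is not how FireWorks implements it; `assign_ids` and the simple counter are assumptions for illustration.

```python
from itertools import count


def assign_ids(placeholder_ids, next_id):
    """Map each placeholder id to the next free database id, in the order given."""
    return {old: next(next_id) for old in placeholder_ids}


# A fresh (just reset) database starts handing out ids at 1.
fresh_db_ids = count(1)
single = assign_ids([-1], fresh_db_ids)
print(single)

# A workflow with four fireworks submitted to another fresh database.
multi = assign_ids([-5, -4, -3, -2], count(1))
print(multi)
```

The practical point is that the ids you see on a workflow object before submission are temporary; use the ids reported by the LaunchPad when querying or rerunning fireworks.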

# Monitoring Workflows

FireWorks lets you monitor the status of workflows and fireworks using both Python and the command line. Let's start by looking at the status of our workflow. For each bit of Python code, I'll include a cell with the equivalent command line command, using the Jupyter notebook's '!' functionality. In practice, we use the command line tools quite a bit, so they are emphasized in this notebook.

**Command Line Access in Jupyter**: Jupyter lets you run command line commands by prefacing them with an exclamation mark.

In [12]:
# Let's print a summary of every workflow in the LaunchPad

def get_wflows():
    for wf_id in lp.get_wf_ids():
        for key,value in lp.get_wf_summary_dict(wf_id).items():
            print(key, ": ",value)
        print("\n")

get_wflows()

launch_dirs :  OrderedDict([('Si-structure optimization--1', [])])
created_on :  2018-08-10 04:38:41.014000
name :  Si
state :  READY
states :  OrderedDict([('Si-structure optimization--1', 'READY')])
updated_on :  2018-08-10 04:38:41.014000




This is how you get workflow information on the command line:

In [13]:
!lpad get_wflows

Echo Test: MP Workshop
{
    "name": "Si--1",
    "states_list": "REA",
    "created_on": "2018-08-10T04:38:41.014000",
    "state": "READY"
}
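
The `!` prefix only works inside Jupyter. In a plain Python script, a similar effect can be had with the standard `subprocess` module. The sketch below runs a harmless stand-in command (a second Python process that prints a message) so that it works anywhere; `run_command` is our own helper name.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as a list of arguments and return its standard output as text."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


# Stand-in command: ask the current Python interpreter to print a line.
output = run_command([sys.executable, "-c", "print('hello from a subprocess')"])
print(output)
```

Assuming `lpad` is on your PATH and configured as in this workshop, passing `["lpad", "get_wflows"]` to `run_command` should return the same text as the cell above.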


In [14]:
def get_fws():
    for fw_id in lp.get_fw_ids():
        fw = lp.get_fw_dict_by_id(fw_id)
        for prop in ["fw_id","updated_on","state","name"]:
            print(prop, ": ",fw[prop])

        print("\n")
        
get_fws()

fw_id :  1
updated_on :  2018-08-10T04:38:44.538972
state :  READY
name :  Si-structure optimization




This command gets you the same information from the command line:

In [15]:
!lpad get_fws

Echo Test: MP Workshop
{
    "created_on": "2018-08-10T04:38:41.013482",
    "fw_id": 1,
    "name": "Si-structure optimization",
    "updated_on": "2018-08-10T04:38:44.538972",
    "state": "READY"
}


In [16]:
!lpad --help

Echo Test: MP Workshop
usage: lpad [-h] [-o {json,yaml}] [-l LAUNCHPAD_FILE] [-c CONFIG_DIR]
            [--logdir LOGDIR] [--loglvl LOGLVL] [-s]
            {version,init,reset,add,check_wflow,get_launchdir,append_wflow,dump_wflow,add_scripts,get_fws,track_fws,rerun_fws,defuse_fws,pause_fws,reignite_fws,resume_fws,update_fws,get_wflows,defuse_wflows,pause_wflows,reignite_wflows,archive_wflows,delete_wflows,get_qids,cancel_qid,detect_unreserved,detect_lostruns,set_priority,webgui,recover_offline,forget_offline,admin,report,introspect}
            ...

A command line interface to FireWorks. For more help on a specific command,
type "lpad <command> -h".

positional arguments:
  {version,init,reset,add,check_wflow,get_launchdir,append_wflow,dump_wflow,add_scripts,get_fws,track_fws,rerun_fws,defuse_fws,pause_fws,reignite_fws,resume_fws,update_fws,get_wflows,defuse_wflows,pause_wflows,reignite_wflows,archive_wflows,delete_wflows,get_qids,cancel_qid,detect_unreserved,detect_lostruns,set_prio

In [17]:
# Let's look at what this command can do:
!lpad get_fws --help

Echo Test: MP Workshop
usage: lpad get_fws [-h] [-i FW_ID] [-n NAME]
                    [-s {FIZZLED,RESERVED,PAUSED,WAITING,RUNNING,READY,ARCHIVED,COMPLETED,DEFUSED}]
                    [-q QUERY] [-lm] [--qid QID]
                    [-d {all,more,less,ids,count,reservations}] [-m MAX]
                    [--sort {created_on,updated_on}]
                    [--rsort {created_on,updated_on}]

optional arguments:
  -h, --help            show this help message and exit
  -i FW_ID, --fw_id FW_ID
                        fw_id
  -n NAME, --name NAME  get FWs with this name
  -s {FIZZLED,RESERVED,PAUSED,WAITING,RUNNING,READY,ARCHIVED,COMPLETED,DEFUSED}, --state {FIZZLED,RESERVED,PAUSED,WAITING,RUNNING,READY,ARCHIVED,COMPLETED,DEFUSED}
                        Select by state.
  -q QUERY, --query QUERY
                        Query (enclose pymongo-style dict in single-quotes,
                        e.g. '{"state":"COMPLETED"}')
  -lm, --launches_mode  Query the launches collection (enclos

# Now let's run this workflow


There are a few different ways to run a workflow. The first is to just run it within this notebook directly.

In [18]:
from fireworks.core.rocket_launcher import launch_rocket

In [19]:
# Let's move into a temporary working directory
import os

os.makedirs("temp", exist_ok=True)  # does not fail if the directory already exists
os.chdir("temp")
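
In this lesson we deliberately stay inside `temp` for the remaining cells. In your own scripts, you may prefer to enter a working directory only for the duration of a run and return afterwards. Here is a minimal sketch of that pattern; `working_directory` is our own helper, and the scratch location is created with `tempfile` purely for the demonstration.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Create path if needed, change into it, and restore the previous directory on exit."""
    previous = os.getcwd()
    os.makedirs(path, exist_ok=True)
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


start = os.getcwd()
scratch = os.path.join(tempfile.mkdtemp(), "temp")

with working_directory(scratch):
    inside = os.getcwd()  # anything launched here writes its files into scratch

after = os.getcwd()
print("ran in:", inside)
print("back in:", after)
```

Because the directory change is undone in a `finally` block, the original directory is restored even if the code inside the `with` block raises an exception.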

In [20]:
launch_rocket(lp)

2018-08-10 04:38:53,174 INFO Launching Rocket
2018-08-10 04:38:53,226 INFO RUNNING fw_id: 1 in directory: /home/jovyan/work/workshop-2018/lessons/atomate/temp
2018-08-10 04:38:53,237 INFO Task started: FileWriteTask.
2018-08-10 04:38:53,238 INFO Task completed: FileWriteTask 
2018-08-10 04:38:53,242 INFO Task started: {{atomate.vasp.firetasks.write_inputs.WriteVaspFromIOSet}}.
2018-08-10 04:38:53,287 INFO Task completed: {{atomate.vasp.firetasks.write_inputs.WriteVaspFromIOSet}} 
2018-08-10 04:38:53,302 INFO Task started: {{atomate.vasp.firetasks.run_calc.RunVaspFake}}.
2018-08-10 04:38:53,358 INFO atomate.vasp.firetasks.run_calc RunVaspFake: verified inputs successfully
2018-08-10 04:38:53,381 INFO atomate.vasp.firetasks.run_calc RunVaspFake: ran fake VASP, generated outputs
2018-08-10 04:38:53,382 INFO Task completed: {{atomate.vasp.firetasks.run_calc.RunVaspFake}} 
2018-08-10 04:38:53,386 INFO Task started: {{atomate.common.firetasks.glue_tasks.PassCalcLocs}}.
2018-08-10 04:38:53,38

True

Now, let's see how that changed our fireworks:

In [21]:
!lpad get_fws

Echo Test: MP Workshop
{
    "fw_id": 1,
    "name": "Si-structure optimization",
    "created_on": "2018-08-10T04:38:41.013482",
    "updated_on": "2018-08-10T04:38:54.110877",
    "state": "COMPLETED"
}


This let us run a single firework in the notebook. What if we wanted to run multiple fireworks? First, let's reset the firework we just ran and add some more workflows to our database.

In [22]:
# Mark the completed firework to be rerun, using the command line:
!lpad rerun_fws

Echo Test: MP Workshop
2018-08-10 04:38:59,325 INFO Finished setting 1 FWs to rerun


In [23]:
!lpad get_fws

Echo Test: MP Workshop
{
    "state": "READY",
    "name": "Si-structure optimization",
    "updated_on": "2018-08-10T04:38:59.316635",
    "created_on": "2018-08-10T04:38:41.013482",
    "fw_id": 1
}


In [24]:
# Let's add the workflow two more times so we have multiple fireworks in the database
lp.add_wf(wf)
lp.add_wf(wf)

2018-08-10 04:39:02,682 INFO Added a workflow. id_map: {1: 2}
2018-08-10 04:39:02,692 INFO Added a workflow. id_map: {2: 3}


{2: 3}

We can run all of the available fireworks using two lines of Python, an import and a single function call:

In [25]:
from fireworks.core.rocket_launcher import rapidfire
rapidfire(lp)

2018-08-10 04:39:02,785 INFO Created new dir /home/jovyan/work/workshop-2018/lessons/atomate/temp/launcher_2018-08-10-04-39-02-785216
2018-08-10 04:39:02,786 INFO Launching Rocket
2018-08-10 04:39:02,807 INFO RUNNING fw_id: 1 in directory: /home/jovyan/work/workshop-2018/lessons/atomate/temp/launcher_2018-08-10-04-39-02-785216
2018-08-10 04:39:02,815 INFO Task started: FileWriteTask.
2018-08-10 04:39:02,816 INFO Task completed: FileWriteTask 
2018-08-10 04:39:02,818 INFO Task started: {{atomate.vasp.firetasks.write_inputs.WriteVaspFromIOSet}}.
2018-08-10 04:39:02,830 INFO Task completed: {{atomate.vasp.firetasks.write_inputs.WriteVaspFromIOSet}} 
2018-08-10 04:39:02,834 INFO Task started: {{atomate.vasp.firetasks.run_calc.RunVaspFake}}.
2018-08-10 04:39:02,861 INFO atomate.vasp.firetasks.run_calc RunVaspFake: verified inputs successfully
2018-08-10 04:39:02,883 INFO atomate.vasp.firetasks.run_calc RunVaspFake: ran fake VASP, generated outputs
2018-08-10 04:39:02,884 INFO Task completed

This let us run fireworks until there were none left to run. But we're still running fireworks inside our Jupyter notebook. If we want to run them on another machine, we need to do something else. Normally, we would launch these jobs to our supercomputer's queue and let it run them as resources become available.
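
Conceptually, a rapid-fire launcher is a loop that keeps pulling the next runnable job until nothing is left, or until a launch limit is reached. The toy sketch below captures that idea without FireWorks or a database. `toy_rapidfire`, the job names, and the convention that `nlaunches=0` means "no limit" are assumptions of this sketch, not a description of the library's internals.

```python
from collections import deque


def toy_rapidfire(ready_queue, run, nlaunches=0):
    """Run jobs from ready_queue until it is empty or nlaunches jobs have run.

    In this toy, nlaunches=0 means "no limit". Returns the number of jobs launched.
    """
    launched = 0
    while ready_queue and (nlaunches == 0 or launched < nlaunches):
        job = ready_queue.popleft()  # take the next runnable job
        run(job)                     # "launch" it
        launched += 1
    return launched


finished = []
queue = deque(["fw-1", "fw-2", "fw-3"])

n_first = toy_rapidfire(queue, finished.append, nlaunches=1)  # run exactly one job
n_rest = toy_rapidfire(queue, finished.append)                # then drain the queue

print(n_first, n_rest, finished)
```

The real launcher additionally has to cope with jobs that only become runnable once their parents finish, which is why it re-queries the database between launches rather than working from a fixed list.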

### Using the queue launcher:

Setting up the queue launcher unfortunately takes some work. There are configuration files that tell atomate how to submit jobs, where the database is, and what special parameters to use for this supercomputer.

This has all been set up for you in this workshop. Once that is done, using the queue is simple: we just launch the fireworks to the queue.

Let's start off by clearing the database of fireworks:

In [26]:
lp.reset(None,require_password=False)

2018-08-10 04:39:05,374 INFO Performing db tune-up
2018-08-10 04:39:05,382 INFO LaunchPad was RESET.


In [27]:
from atomate.vasp.workflows.presets.core import wf_bandstructure

wf = wf_bandstructure(struc,{"DB_FILE": None})
wf.as_dict()

{'created_on': datetime.datetime(2018, 8, 10, 4, 39, 5, 426934),
 'fws': [{'created_on': '2018-08-10T04:39:05.426807',
   'fw_id': -5,
   'name': 'Si-nscf line',
   'spec': {'_tasks': [{'_fw_name': 'FileWriteTask',
      'files_to_write': [{'contents': '', 'filename': 'FW--Si-nscf_line'}]},
     {'_fw_name': '{{atomate.vasp.firetasks.glue_tasks.CopyVaspOutputs}}',
      'additional_files': ['CHGCAR'],
      'calc_loc': True},
     {'_fw_name': '{{atomate.vasp.firetasks.write_inputs.WriteVaspNSCFFromPrev}}',
      'mode': 'line',
      'prev_calc_dir': '.',
      'reciprocal_density': 20,
      'small_gap_multiply': [0.5, 5]},
     {'_fw_name': '{{atomate.vasp.firetasks.run_calc.RunVaspCustodian}}',
      'auto_npar': '>>auto_npar<<',
      'gamma_vasp_cmd': '>>gamma_vasp_cmd<<',
      'scratch_dir': '>>scratch_dir<<',
      'vasp_cmd': '>>vasp_cmd<<'},
     {'_fw_name': '{{atomate.common.firetasks.glue_tasks.PassCalcLocs}}',
      'name': 'nscf'},
     {'_fw_name': '{{atomate.vasp.fire

In [28]:
from mp_workshop.atomate import si_static_path,si_nscf_line_path,si_nscf_uniform_path
wf = use_fake_vasp(wf,{"Si-structure optimization":si_struct_opt_path,
                       "Si-static": si_static_path,
                       "Si-nscf uniform" : si_nscf_uniform_path,
                       "Si-nscf line": si_nscf_line_path})

lp.add_wf(wf)

2018-08-10 04:39:05,528 INFO Added a workflow. id_map: {-5: 1, -4: 2, -3: 3, -2: 4}


{-5: 1, -4: 2, -3: 3, -2: 4}

In [29]:
!lpad get_fws

Echo Test: MP Workshop
[
    {
        "updated_on": "2018-08-10T04:39:05.426809",
        "state": "WAITING",
        "name": "Si-nscf line",
        "created_on": "2018-08-10T04:39:05.426807",
        "fw_id": 1
    },
    {
        "updated_on": "2018-08-10T04:39:05.426671",
        "state": "WAITING",
        "name": "Si-nscf uniform",
        "created_on": "2018-08-10T04:39:05.426668",
        "fw_id": 2
    },
    {
        "updated_on": "2018-08-10T04:39:05.426519",
        "state": "WAITING",
        "name": "Si-static",
        "created_on": "2018-08-10T04:39:05.426516",
        "fw_id": 3
    },
    {
        "updated_on": "2018-08-10T04:39:05.522307",
        "state": "READY",
        "created_on": "2018-08-10T04:39:05.426367",
        "name": "Si-structure optimization",
        "fw_id": 4
    }
]
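
Only the structure optimization is `READY`; the other fireworks are `WAITING` because they depend on results from earlier steps. The toy sketch below shows how such states follow from a dependency graph. The `parents` mapping is an assumption for illustration (in particular, that both non-self-consistent runs depend on the static run), and `resolve_states` is our own helper, not a FireWorks function.

```python
def resolve_states(parents, completed):
    """Return a state for each firework given its parents and the set of completed fireworks."""
    states = {}
    for name, deps in parents.items():
        if name in completed:
            states[name] = "COMPLETED"
        elif all(dep in completed for dep in deps):
            states[name] = "READY"      # every parent is done (or there are none)
        else:
            states[name] = "WAITING"    # at least one parent still has to finish
    return states


# Assumed dependency graph: each firework lists the fireworks it waits for.
parents = {
    "Si-structure optimization": [],
    "Si-static": ["Si-structure optimization"],
    "Si-nscf uniform": ["Si-static"],
    "Si-nscf line": ["Si-static"],
}

initial = resolve_states(parents, completed=set())
after_opt = resolve_states(parents, completed={"Si-structure optimization"})

for name in parents:
    print(f"{name:28s} {initial[name]:10s} -> {after_opt[name]}")
```

With nothing completed, the sketch reproduces the states listed above. Once the optimization is marked completed, the static calculation becomes `READY` while the two band structure steps keep waiting.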


FireWorks has a command line tool, `qlaunch`, for submitting jobs to the SLURM queue:

In [30]:
!qlaunch -r rapidfire --nlaunches 1

Echo Test: MP Workshop
2018-08-10 04:39:10,592 INFO getting queue adapter
2018-08-10 04:39:10,593 INFO Created new dir /home/jovyan/work/workshop-2018/lessons/atomate/temp/block_2018-08-10-04-39-10-593376
2018-08-10 04:39:10,603 INFO The number of jobs currently in the queue is: 0
2018-08-10 04:39:10,603 INFO 0 jobs in queue. Maximum allowed by user: 0
2018-08-10 04:39:10,621 INFO Launching a rocket!
2018-08-10 04:39:10,644 INFO reserved FW with fw_id: 4
2018-08-10 04:39:10,644 INFO Created new dir /home/jovyan/work/workshop-2018/lessons/atomate/temp/block_2018-08-10-04-39-10-593376/launcher_2018-08-10-04-39-10-644473
2018-08-10 04:39:10,647 INFO moving to launch_dir /home/jovyan/work/workshop-2018/lessons/atomate/temp/block_2018-08-10-04-39-10-593376/launcher_2018-08-10-04-39-10-644473
2018-08-10 04:39:10,648 INFO submitting queue script
2018-08-10 04:39:10,666 INFO Job submission was successful and job_id is 6
2018-08-10 04:39:10,669 INFO Launched allowed number of jobs: 1


Now the supercomputer takes care of running the jobs, and after a short wait we can check on their progress:

In [31]:
!lpad get_fws

Echo Test: MP Workshop
[
    {
        "created_on": "2018-08-10T04:39:05.426807",
        "fw_id": 1,
        "name": "Si-nscf line",
        "updated_on": "2018-08-10T04:39:05.426809",
        "state": "WAITING"
    },
    {
        "created_on": "2018-08-10T04:39:05.426668",
        "fw_id": 2,
        "name": "Si-nscf uniform",
        "updated_on": "2018-08-10T04:39:05.426671",
        "state": "WAITING"
    },
    {
        "created_on": "2018-08-10T04:39:05.426516",
        "name": "Si-static",
        "updated_on": "2018-08-10T04:39:14.263505",
        "fw_id": 3,
        "state": "READY"
    },
    {
        "created_on": "2018-08-10T04:39:05.426367",
        "fw_id": 4,
        "updated_on": "2018-08-10T04:39:14.262112",
        "name": "Si-structure optimization",
        "state": "COMPLETED"
    }
]
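
With more than a handful of fireworks, reading this listing by eye gets tedious. Since the output is JSON, it is easy to summarize. The sketch below tallies states from a shortened copy of the text above; `count_states` is our own helper, and it skips any leading non-JSON text such as this workshop's echo line.

```python
import json
from collections import Counter


def count_states(lpad_output):
    """Tally firework states from the text printed by `lpad get_fws`."""
    # Skip anything printed before the JSON starts (for example, an echo line).
    starts = [i for i in (lpad_output.find("["), lpad_output.find("{")) if i != -1]
    if not starts:
        return Counter()
    data = json.loads(lpad_output[min(starts):])
    if isinstance(data, dict):  # a single firework is printed as one object, not a list
        data = [data]
    return Counter(fw["state"] for fw in data)


# Shortened copy of the output above (timestamps omitted).
sample = """Echo Test: MP Workshop
[
    {"fw_id": 1, "name": "Si-nscf line", "state": "WAITING"},
    {"fw_id": 2, "name": "Si-nscf uniform", "state": "WAITING"},
    {"fw_id": 3, "name": "Si-static", "state": "READY"},
    {"fw_id": 4, "name": "Si-structure optimization", "state": "COMPLETED"}
]
"""

counts = count_states(sample)
for state, n in sorted(counts.items()):
    print(state, n)
```

In a notebook, IPython can capture a command's output with `out = !lpad get_fws`, which gives a list of lines; joining them with `"\n".join(out)` produces a string that can be passed to `count_states`.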


Now let's have qlaunch keep submitting fireworks until all of them are done.

In [32]:
!qlaunch -r rapidfire

Echo Test: MP Workshop
2018-08-10 04:39:17,312 INFO getting queue adapter
2018-08-10 04:39:17,313 INFO Found previous block, using /home/jovyan/work/workshop-2018/lessons/atomate/temp/block_2018-08-10-04-39-10-593376
2018-08-10 04:39:17,323 INFO The number of jobs currently in the queue is: 0
2018-08-10 04:39:17,323 INFO 0 jobs in queue. Maximum allowed by user: 0
2018-08-10 04:39:17,343 INFO Launching a rocket!
2018-08-10 04:39:17,359 INFO reserved FW with fw_id: 3
2018-08-10 04:39:17,360 INFO Created new dir /home/jovyan/work/workshop-2018/lessons/atomate/temp/block_2018-08-10-04-39-10-593376/launcher_2018-08-10-04-39-17-359938
2018-08-10 04:39:17,361 INFO moving to launch_dir /home/jovyan/work/workshop-2018/lessons/atomate/temp/block_2018-08-10-04-39-10-593376/launcher_2018-08-10-04-39-17-359938
2018-08-10 04:39:17,362 INFO submitting queue script
2018-08-10 04:39:17,370 INFO Job submission was successful and job_id is 7
2018-08-10 04:39:17,374 INFO Sleeping for 5 seconds...zzz...
2

In [34]:
!lpad get_fws

Echo Test: MP Workshop
[
    {
        "created_on": "2018-08-10T04:39:05.426807",
        "fw_id": 1,
        "updated_on": "2018-08-10T04:40:35.163561",
        "name": "Si-nscf line",
        "state": "COMPLETED"
    },
    {
        "created_on": "2018-08-10T04:39:05.426668",
        "fw_id": 2,
        "updated_on": "2018-08-10T04:40:28.771991",
        "name": "Si-nscf uniform",
        "state": "COMPLETED"
    },
    {
        "created_on": "2018-08-10T04:39:05.426516",
        "fw_id": 3,
        "updated_on": "2018-08-10T04:39:23.034604",
        "name": "Si-static",
        "state": "COMPLETED"
    },
    {
        "created_on": "2018-08-10T04:39:05.426367",
        "fw_id": 4,
        "updated_on": "2018-08-10T04:39:14.262112",
        "name": "Si-structure optimization",
        "state": "COMPLETED"
    }
]


### Now we have a completed workflow