# 2 - Workflow management with atomate

In this lesson we cover a few more advanced features of the atomate software package.  More specifically, we'll go over an example workflow that includes parent and child fireworks, talk more about where you can find preset workflows, and cover a basic example of analyzing workflow-generated materials data.

In [2]:
from mp_workshop.atomate import wf_to_graph, use_fake_vasp_workshop
from tqdm import tqdm_notebook

## Why use atomate?

To motivate this lesson a bit more, here is an example that illustrates the value of atomate.  Suppose you wanted to calculate the band structure of every polymorph of SiO$_2$.  Normally, you would need to get all of the CIFs from MP or the ICSD, construct the POSCARs by hand (or convert them with your own infrastructure), run the calculations, manage the directory structure, and aggregate the results in a way that lets you analyze them together.  In atomate, all of this takes only a short snippet:

In [None]:
from atomate.vasp.workflows.presets.core import get_wf
from fireworks import LaunchPad

lpad = LaunchPad.auto_load()
lpad.reset("", require_password=False)

from pymatgen import MPRester
mpr = MPRester()

structures = mpr.get_structures("SiO2")
for structure in tqdm_notebook(structures):
    wf = get_wf(structure, "bandstructure.yaml")
    lpad.add_wf(wf)

This snippet illustrates the so-called "high-throughput" approach.  It is valuable for two reasons: it lets you aggregate a large amount of data quickly, in a way that is only possible in computational materials science, and it lets you examine trends across a large dataset, which is typically the most valuable way to use DFT-based simulation.

Before we move on, let's reset the workflow database.

In [None]:
lpad.reset(password="", require_password=False, max_reset_wo_password=10000)

In [None]:
lpad.fireworks.database

## The elastic tensor: A multi-step preset workflow

In addition to enabling "high-throughput" functionality, atomate allows you to execute a relatively complex set of instructions using very simple constructions.  Our canonical example of this functionality is contained in the elastic workflow.

In [None]:
structure = mpr.get_structure_by_material_id("mp-149", conventional_unit_cell=True)
structure

In [None]:
from mp_workshop import get_viewer_link

In [None]:
get_viewer_link(structure)

In [None]:
from atomate.vasp.workflows.presets.core import wf_elastic_constant, wf_elastic_constant_minimal

In [None]:
wf = wf_elastic_constant(structure)

Here's a rendering of the control flow associated with the workflow, which you can see both here and in the FireWorks dashboard.

In [None]:
from mp_workshop.atomate import wf_to_graph
wf_to_graph(wf)

From our exploration of the workflow, we can see that the workflow has a "root" firework, which is the structure optimization firework, and a single "leaf" firework, which is the analysis task.  In between, we have several fireworks which have "elastic deformation" in their titles, each of which represents a calculation which will transform the output of the optimization firework by applying a unique strain.
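The root and leaf terminology above can be made concrete with a toy graph. This is a minimal sketch in plain Python, not the FireWorks data model: the dictionary layout and the firework names are assumptions chosen for illustration.

```python
# Toy model of a workflow graph: each key is a firework name and each value
# lists its children. The names are illustrative, not atomate's exact labels.
links = {
    "structure optimization": ["elastic deformation 0", "elastic deformation 1"],
    "elastic deformation 0": ["elastic analysis"],
    "elastic deformation 1": ["elastic analysis"],
    "elastic analysis": [],
}


def roots_and_leaves(links):
    """Return (roots, leaves): nodes with no parents and nodes with no children."""
    children = {child for kids in links.values() for child in kids}
    roots = [name for name in links if name not in children]
    leaves = [name for name, kids in links.items() if not kids]
    return roots, leaves


roots, leaves = roots_and_leaves(links)
print("roots:", roots)
print("leaves:", leaves)
```

In this toy graph the optimization is the only root and the analysis is the only leaf, mirroring the structure described for the elastic workflow.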

In the analysis step, the stresses from those deformations and the strains are aggregated and used to fit an elastic tensor.  Let's examine the different types of fireworks using the graph tool.

In [None]:
print(wf.fws[0].name)
wf_to_graph(wf.fws[0])

In [None]:
print(wf.fws[1].name)
wf_to_graph(wf.fws[1])

In [None]:
print(wf.fws[-1].name)
wf_to_graph(wf.fws[-1])
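The analysis step described above fits elastic constants to stress-strain pairs. As a rough one-dimensional analogue (not atomate's actual fitting code), the sketch below fits a single stiffness `c` in `stress = c * strain` by least squares. All numbers are made up for illustration, including the "noise" added to the stresses.

```python
def fit_stiffness(strains, stresses):
    """Least-squares slope through the origin for stress = c * strain."""
    numerator = sum(e * s for e, s in zip(strains, stresses))
    denominator = sum(e * e for e in strains)
    return numerator / denominator


true_c = 150.0  # GPa, made-up value used to generate synthetic data
strains = [-0.01, -0.005, 0.005, 0.01]
noise = [0.03, -0.02, -0.02, 0.01]  # GPa, made-up numerical noise
stresses = [true_c * e + n for e, n in zip(strains, noise)]

# Estimate from a single deformation versus a fit over all four.
single_point = stresses[2] / strains[2]
fitted = fit_stiffness(strains, stresses)

print("single-point error:", abs(single_point - true_c))
print("fitted error:", abs(fitted - true_c))
```

With these synthetic values the fit over several deformations lands closer to the generating constant than the single-point estimate does, which is the motivation for running more deformations than strictly necessary.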

Note that the standard preset workflow for the elastic tensor uses many more calculations than are strictly required to determine the elastic tensor.  This redundancy yields a higher-quality tensor because some of the numerical noise averages out.  You can also generate a "minimal" elastic workflow, which uses neither the more expensive DFT parameters nor the extended set of calculations.  Tensors generated using this workflow are typically not as accurate, but often work for simple semiconductors with a lot of symmetry.  Let's try it with bulk silicon.

In [None]:
wf = wf_elastic_constant_minimal(structure)

In [None]:
wf_to_graph(wf)

Note that for silicon, the two deformation tasks are a single normal deformation along the x-axis (which is equivalent to those along the y- and z-axes) and a single shear deformation (equivalent to the two others normally included).  Since this is considerably simpler, we'll run this as our example.
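To see why two deformations can be enough for a cubic crystal, recall that a cubic elastic tensor has three independent constants in Voigt notation: C11, C12, and C44. The sketch below recovers them from one normal and one shear deformation. It is a simplified illustration, not atomate's fitting routine: it assumes linear response, engineering shear strain, ignores stress sign conventions, and uses synthetic stresses generated from round, made-up constants.

```python
def cubic_elastic_constants(e_xx, sigma_xx, sigma_yy, gamma_yz, sigma_yz):
    """Recover (C11, C12, C44) for a cubic crystal from two deformations.

    Normal strain e_xx gives sigma_xx = C11 * e_xx and sigma_yy = C12 * e_xx.
    Engineering shear strain gamma_yz gives sigma_yz = C44 * gamma_yz.
    """
    c11 = sigma_xx / e_xx
    c12 = sigma_yy / e_xx
    c44 = sigma_yz / gamma_yz
    return c11, c12, c44


# Synthetic "calculation results" generated from assumed constants (GPa).
assumed = {"c11": 160.0, "c12": 60.0, "c44": 80.0}
e_xx = 0.01
gamma_yz = 0.02
sigma_xx = assumed["c11"] * e_xx
sigma_yy = assumed["c12"] * e_xx
sigma_yz = assumed["c44"] * gamma_yz

c11, c12, c44 = cubic_elastic_constants(e_xx, sigma_xx, sigma_yy, gamma_yz, sigma_yz)
print(c11, c12, c44)
```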

To run the workflow, we import our launchpad and rocket launcher tools.  Since we can't run VASP here, we're also going to "fake" VASP by copying files rather than running the binary.

In [None]:
from fireworks import LaunchPad
from mp_workshop.atomate import use_fake_vasp_workshop

In [None]:
wf = use_fake_vasp_workshop(wf)
lpad.add_wf(wf)

Now we can run the workflow either in the notebook using the Python rocket launcher from FireWorks or using `qlaunch` from the Linux command line.

In [None]:
from fireworks.core.rocket_launcher import rapidfire

In [None]:
rapidfire(lpad)
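Conceptually, running a workflow to completion means repeatedly launching whatever fireworks are ready, that is, those whose parents have all finished. The toy loop below mirrors only that dependency logic; the real `rapidfire` talks to the LaunchPad database, and the `parents` dictionary and `run` callback here are hypothetical stand-ins.

```python
def run_until_done(parents, run):
    """Run every node once, never before all of its parents have completed.

    parents: dict mapping a node name to the list of its parent names.
    run: callable invoked with each node name when that node is launched.
    """
    completed = []
    remaining = set(parents)
    while remaining:
        ready = sorted(
            name for name in remaining
            if all(parent in completed for parent in parents[name])
        )
        if not ready:
            raise ValueError("nothing is ready: cycle or missing parent")
        for name in ready:
            run(name)
            completed.append(name)
            remaining.remove(name)
    return completed


parents = {
    "structure optimization": [],
    "elastic deformation 0": ["structure optimization"],
    "elastic deformation 1": ["structure optimization"],
    "elastic analysis": ["elastic deformation 0", "elastic deformation 1"],
}

order = run_until_done(parents, run=lambda name: print("running", name))
```

The optimization runs first, the deformations follow once it has completed, and the analysis runs last.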

## Powerups

Atomate contains a number of tools for modifying workflows, known as "powerups": functions that modify workflows in commonly desired ways.  For example, if you want to change the parameters of a VASP calculation, you can use the add_modify_incar powerup to change them for every firework in the workflow.

### add_modify_incar

In [None]:
from atomate.vasp.powerups import add_modify_incar

In [None]:
wf = get_wf(structure, "bandstructure.yaml")

In [None]:
wf_to_graph(wf.fws[0])

In [None]:
modified = add_modify_incar(wf, {"incar_update": {"ENCUT": 700}})

In [None]:
wf_to_graph(modified.fws[0])

Prior to the execution of the VASP calculation, a ModifyIncar firetask is inserted, which reads the INCAR and modifies the ENCUT parameter so that it matches our powerup specification.
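The insertion described above can be pictured with a toy model. This sketch is not atomate's implementation: the workflow is a plain list of dictionaries, the task names `WriteInputs`, `RunVasp`, and `StoreResults` are hypothetical placeholders, and the function returns a modified copy rather than changing the workflow it was given.

```python
import copy


def add_modify_incar_sketch(workflow, incar_update):
    """Return a copy of the toy workflow with a ModifyIncar step before each run."""
    modified = copy.deepcopy(workflow)
    for firework in modified:
        new_tasks = []
        for task in firework["tasks"]:
            if task["name"] == "RunVasp":
                # Insert the modification immediately before the run task.
                new_tasks.append(
                    {"name": "ModifyIncar", "incar_update": dict(incar_update)}
                )
            new_tasks.append(task)
        firework["tasks"] = new_tasks
    return modified


def apply_incar_update(incar, incar_update):
    """What the inserted step does at run time: overwrite the listed tags."""
    updated = dict(incar)
    updated.update(incar_update)
    return updated


workflow = [
    {"name": "structure optimization",
     "tasks": [{"name": "WriteInputs"}, {"name": "RunVasp"}, {"name": "StoreResults"}]},
    {"name": "static",
     "tasks": [{"name": "WriteInputs"}, {"name": "RunVasp"}, {"name": "StoreResults"}]},
]

modified = add_modify_incar_sketch(workflow, {"ENCUT": 700})
task_names = [task["name"] for task in modified[0]["tasks"]]
final_incar = apply_incar_update({"ENCUT": 520, "ISMEAR": 0}, {"ENCUT": 700})
print(task_names)
print(final_incar)
```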

### add_tags, add_structure_metadata

In [None]:
from atomate.vasp.powerups import add_tags

In [None]:
print("WF metadata: ", wf.metadata)
print("Last task: ", wf.fws[0].tasks[-1])

In [None]:
wf = add_tags(wf, ["photocathodes"])

In [None]:
print("WF metadata: ", wf.metadata)
print("Last task: ", wf.fws[0].tasks[-1])

## A few other "complex" workflows

Atomate contains a few workflows that do more complicated things, like calculations on surfaces.  Let's say I want to determine the hydrogen adsorption energy on every low-index facet of Pt.

In [None]:
from atomate.vasp.workflows.base.adsorption import get_wf_surface_all_slabs
from pymatgen import MPRester, Molecule

In [None]:
# "Pt" is a formula, not a material ID; mp-126 is assumed here to be the fcc Pt entry
pt = mpr.get_structure_by_material_id("mp-126", conventional_unit_cell=True)
adsorbate = Molecule("H", [[0, 0, 0]])

In [None]:
wf = get_wf_surface_all_slabs(pt, [adsorbate])

In [None]:
wf_to_graph(wf)

The following preset runs an HSE calculation after the PBE band structure calculation, which you can use to find the HSE band gap.

In [None]:
from atomate.vasp.workflows.presets.core import wf_bandstructure_plus_hse

In [None]:
wf = wf_bandstructure_plus_hse(structure)

In [None]:
wf_to_graph(wf)

There's also support for workflows in FEFF and QChem.

In [None]:
from atomate.feff.workflows.core import get_wf_xas

In [None]:
wf = get_wf_xas("Si", structure)
wf_to_graph(wf)

In [None]:
wf_to_graph(wf.fws[0])

In [None]:
# TODO: double check that this will actually work
from atomate.qchem.workflows.base import double_FF_opt

## Analyzing workflow results

Our final example for atomate will take a set of tasks from our database and construct a phase diagram.

In [None]:
structures = mpr.get_structures("Al-Cr")
structures += mpr.get_structures("Al")
structures += mpr.get_structures("Cr")
for structure in structures:
    wf = get_wf(structure, "bandstructure.yaml")
    lpad.add_wf(wf)

In [None]:
from maggma.stores import MongoStore

In [None]:
tasks = MongoStore(database="vasp_db", collection_name="tasks")

In [None]:
tasks.connect()

In [None]:
doc = tasks.query_one(criteria={"chemsys": "Al"})

In [None]:
doc.keys()

In [None]:
doc['chemsys']

In [None]:
doc = tasks.query_one(criteria={"chemsys": "Al"}, properties=['output'])

In [None]:
docs = tasks.query(criteria={"chemsys": "Al-Cr", 
                             "task_label": "structure optimization"},
                   properties=['output', 'formula_pretty'])
docs

In [None]:
docs = list(docs)

In [None]:
[d['formula_pretty'] for d in docs]
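The `criteria` and `properties` arguments used above follow the usual MongoDB pattern: `criteria` selects documents and `properties` limits which fields come back. The sketch below imitates only exact-match criteria on a list of dictionaries. It is not maggma's implementation, and the documents and energies are placeholders invented for illustration.

```python
def query(docs, criteria, properties=None):
    """Yield documents whose fields equal every criteria value.

    If properties is given, only those fields are kept in each result.
    Results are produced lazily, which is why the cells above wrap the
    returned object in list() before indexing into it.
    """
    for doc in docs:
        if all(doc.get(key) == value for key, value in criteria.items()):
            if properties is None:
                yield dict(doc)
            else:
                yield {key: doc[key] for key in properties if key in doc}


# Placeholder documents; the energies are not real calculation results.
fake_tasks = [
    {"chemsys": "Al", "task_label": "structure optimization",
     "formula_pretty": "Al", "output": {"energy": -1.0}},
    {"chemsys": "Al-Cr", "task_label": "structure optimization",
     "formula_pretty": "AlCr2", "output": {"energy": -2.0}},
    {"chemsys": "Al-Cr", "task_label": "static",
     "formula_pretty": "AlCr2", "output": {"energy": -2.1}},
]

results = list(query(
    fake_tasks,
    criteria={"chemsys": "Al-Cr", "task_label": "structure optimization"},
    properties=["output", "formula_pretty"],
))
print(results)
```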

In [None]:
# Collect the structure optimization tasks for Al, Cr, and the Al-Cr binaries
al_cr_tasks = []
for chemsys in ['Al', 'Cr', 'Al-Cr']:
    task_list = tasks.query(criteria={"chemsys": chemsys,
                                      "task_label": "structure optimization"},
                            properties=["output", "composition_unit_cell"])
    al_cr_tasks.extend(list(task_list))

In [None]:
len(al_cr_tasks)

In [None]:
from pymatgen.entries.computed_entries import ComputedEntry
from pymatgen import Composition

In [None]:
# Build one entry (composition plus total energy) per task
entries = []
for task in al_cr_tasks:
    energy = task['output']['energy']
    composition = Composition.from_dict(task['composition_unit_cell'])
    entry = ComputedEntry(composition, energy)
    entries.append(entry)

In [None]:
from pymatgen.analysis.phase_diagram import PhaseDiagram, PDPlotter

In [None]:
pd = PhaseDiagram(entries)

In [None]:
plotter = PDPlotter(pd)

In [None]:
plotter.show()

In [None]:
plotter.show_unstable = True
plotter.show()
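A phase diagram of this kind rests on a convex hull of formation energies: phases on the lower hull are stable, and phases above it are unstable. The sketch below works through a two-component version in plain Python. It is a simplified analogue rather than pymatgen's implementation, and the phases "A", "B", "A3B", and so on, along with their energies, are made up.

```python
def formation_energy(x, energy_per_atom, ref_a, ref_b):
    """Energy per atom relative to a mixture of the elements A and B."""
    return energy_per_atom - ((1 - x) * ref_a + x * ref_b)


def lower_hull(points):
    """Return the (x, y, label) points on the lower convex hull, sorted by x."""
    hull = []
    for point in sorted(points):
        while len(hull) >= 2:
            (x1, y1, _), (x2, y2, _) = hull[-2], hull[-1]
            cross = (x2 - x1) * (point[1] - y1) - (y2 - y1) * (point[0] - x1)
            if cross <= 0:  # middle point lies on or above the new segment
                hull.pop()
            else:
                break
        hull.append(point)
    return hull


# (label, fraction of B, energy per atom in eV); all values are made up.
toy_entries = [
    ("A", 0.0, -4.0),
    ("A (metastable)", 0.0, -3.9),
    ("A3B", 0.25, -5.3),
    ("AB", 0.5, -6.1),
    ("AB3", 0.75, -6.8),
    ("B", 1.0, -8.0),
]

# Keep only the lowest-energy phase at each composition.
lowest = {}
for label, x, energy in toy_entries:
    e_form = formation_energy(x, energy, ref_a=-4.0, ref_b=-8.0)
    if x not in lowest or e_form < lowest[x][1]:
        lowest[x] = (x, e_form, label)

stable = [label for _, _, label in lower_hull(list(lowest.values()))]
print("stable phases:", stable)
```

In this toy data "AB" has a negative formation energy but still sits above the line joining "A3B" and "B", so it is not on the hull. This is the kind of phase that appears once unstable entries are shown in the plot.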