# Workflow Manager

This Notebook provides a means of preparing and/or running iprPy calculations.

Note that this Notebook mostly outlines the code/steps associated with the work and is *not* the optimal means of preparing and running calculations. This is especially true for runners, as ideally each runner should be a truly separate process so that you can observe and manage how each is behaving.

For preparing and running on a cluster, the suggestions are:
- Copy the prepare cells below that you wish to use into a Python script.  Submit a job to the cluster for the prepare script.  It only needs to be a serial process, but can take a long time depending on how many calculations are being prepared.
- Submit separate jobs for each runner you wish to be active.  These can easily be based on the iprPy runner command line.

Example prepare Python scripts that correspond to the content below can be found in the bin/prepare/ directory of the iprPy repository.

In [1]:
# import libraries
import numpy as np
import atomman as am

# https://github.com/usnistgov/iprPy
import iprPy
print('iprPy version', iprPy.__version__)

iprPy version 0.11.2


---

## 1. Load the database

This is the database to which new calculation records will be added, and which is searched for existing calculations to skip.

In [2]:
database = iprPy.load_database('master')
print(database)

database style mongo at localhost:27017.iprPy


---

## 2. Define global prepare terms

All prepare terms are collected into a dictionary making it easy to pass along to the underlying prepare methods.

In [3]:
prepare_terms = {}

### 2.1. Executable terms

These are basic terms that specify executables and some options that are required by most calculations in the workflow.

- __lammps_command__ is the primary LAMMPS executable to use.
- __mpi_command__ is the MPI command to use.  Leave {np_per_runner} as a variable.

In [4]:
prepare_terms['lammps_command'] =        'E:/LAMMPS/2020-03-03/bin/lmp_mpi'
prepare_terms['mpi_command'] =           'mpiexec -localonly {np_per_runner}'
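
The {np_per_runner} placeholder reads like a standard Python format field. The following is a minimal sketch of how such a template could be filled in, assuming plain `str.format` substitution; how iprPy performs the substitution internally is not shown in this Notebook, and `build_mpi_command` is a hypothetical helper used only for illustration.

```python
def build_mpi_command(template: str, np_per_runner: int) -> str:
    """Fill the {np_per_runner} placeholder in an MPI command template.

    Hypothetical helper: it only illustrates why the placeholder should be
    left as a variable rather than hard-coded to a number.
    """
    return template.format(np_per_runner=np_per_runner)


mpi_template = 'mpiexec -localonly {np_per_runner}'

# The same template can serve pools that use different processor counts
for n in (1, 4):
    print(build_mpi_command(mpi_template, n))
```

Leaving the placeholder in the template means one mpi_command value can be reused by every pool, whatever np_per_runner each pool is given.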

### 2.2. Old LAMMPS executables (optional)

Some older implementations of potentials no longer work with the most current version of LAMMPS.  These options allow alternate LAMMPS executables to be selected automatically as needed.  Note that this is only important if you want to compare different versions of a given potential, as all currently active potentials in the NIST database are compatible with the newest LAMMPS.

- __lammps_command_snap_1__: SNAP version 1 needs LAMMPS between 8 Oct 2014 and 30 May 2017.
- __lammps_command_snap_2__: SNAP version 2 needs LAMMPS between 3 Dec 2018 and 12 June 2019.
- __lammps_command_old__: Some older implementations of potentials need LAMMPS before 30 Oct 2019.

In [5]:
prepare_terms['lammps_command_snap_1'] = 'E:/LAMMPS/2017-01-27/bin/lmp_mpi'
prepare_terms['lammps_command_snap_2'] = 'E:/LAMMPS/2019-06-05/bin/lmp_mpi'
prepare_terms['lammps_command_old'] =    'E:/LAMMPS/2019-06-05/bin/lmp_mpi'
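
As a quick sanity check, the date windows listed above can be compared against the LAMMPS versions being used. This is only a sketch: it assumes that the dated directory names in the example paths (2017-01-27 and 2019-06-05) are the LAMMPS version dates, that the "between" windows are inclusive, and that "before" is exclusive. `WINDOWS` and `version_ok` are hypothetical names, not part of iprPy.

```python
from datetime import date

# Date windows copied from the list above.  Inclusive "between" bounds and
# an exclusive "before" bound are assumptions of this sketch.
WINDOWS = {
    'lammps_command_snap_1': (date(2014, 10, 8), date(2017, 5, 30)),
    'lammps_command_snap_2': (date(2018, 12, 3), date(2019, 6, 12)),
    'lammps_command_old':    (None, date(2019, 10, 30)),
}


def version_ok(term: str, version: date) -> bool:
    """Check that a LAMMPS version date falls inside the window for a term."""
    start, end = WINDOWS[term]
    if start is None:
        return version < end
    return start <= version <= end


# Version dates assumed from the directory names in the example paths
chosen = {
    'lammps_command_snap_1': date(2017, 1, 27),
    'lammps_command_snap_2': date(2019, 6, 5),
    'lammps_command_old':    date(2019, 6, 5),
}

for term, version in chosen.items():
    print(term, version.isoformat(), version_ok(term, version))
```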

---

## 3. Specify the LAMMPS potentials to use

Any terms in prepare_terms that start with "potential_" are identified as terms that modify which LAMMPS potentials are to be used to prepare the calculations.  All "potential_" terms are passed to database.potdb.get_lammps_potentials() as kwargs with the "potential_" part of their name removed.  Then, only calculations that correspond to the matching LAMMPS potentials returned are prepared.

The most useful related terms are:

- __potential_status__ *(None, str or list, optional)* Limits the search by the status of the LAMMPS implementations: "active", "superseded" and/or "retracted".  Most users should set this to "active".
- __potential_id__ *(str or list)* The unique record id(s) labeling the records to parse by.
- __potential_potid__ *(str or list, optional)* The unique record id(s) labeling the associated potential records to parse by.
- __potential_pair_style__ *(str or list, optional)* LAMMPS pair_style(s) to parse by.
- __potential_symbols__ *(str or list, optional)* Model symbol(s) to parse by.  Typically correspond to elements for atomic potential models.
- __potential_elements__ *(str or list, optional)* Element(s) in the model to parse by.

In [7]:
prepare_terms['potential_status'] =      'active'
prepare_terms['potential_id'] = [
    '2022--Xu-Y--Ni-Rh--LAMMPS--ipr1',
    '2022--Xu-Y--Ni-Pd--LAMMPS--ipr1',
    '2019--Plummer-G--Ti-Al-C--LAMMPS--ipr1',
    '2019--Plummer-G--Ti-Si-C--LAMMPS--ipr1',
    '2021--Plummer-G--Ti-Al-C--LAMMPS--ipr1',
]
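
The prefix handling described in this section can be pictured with a small stand-alone sketch. It is not iprPy's own code: `split_potential_terms` is a hypothetical helper that only mimics the described behavior, namely that "potential_" terms are separated out and passed on as kwargs with the prefix removed.

```python
def split_potential_terms(terms: dict) -> tuple:
    """Separate 'potential_' terms from the rest and strip their prefix.

    Hypothetical helper mirroring the behavior described in the text.
    """
    prefix = 'potential_'
    potential_kwargs = {}
    other_terms = {}
    for key, value in terms.items():
        if key.startswith(prefix):
            # 'potential_status' becomes the kwarg 'status', and so on
            potential_kwargs[key[len(prefix):]] = value
        else:
            other_terms[key] = value
    return potential_kwargs, other_terms


example_terms = {
    'lammps_command': 'lmp_mpi',
    'potential_status': 'active',
    'potential_id': ['2022--Xu-Y--Ni-Rh--LAMMPS--ipr1'],
}

potential_kwargs, other_terms = split_potential_terms(example_terms)
print(potential_kwargs)
print(other_terms)
```

In this picture, `potential_kwargs` holds the keyword arguments that would be handed to database.potdb.get_lammps_potentials(), while `other_terms` holds everything else.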

---

## 4. Prepare pools

The prepared calculations are divided into separate "pools" of similar calculation styles.  This is done for a variety of reasons.

- Different pools may benefit from a larger number of processors to work with.
- Separating the calculation styles helps in showing the overall progress on obtaining results for the different styles.  
- Some calculation results serve as inputs for later calculations.  Dividing calculations based on where they are positioned in the workflow makes it easier to determine the optimum times to prepare further calculations.

Each master prepare option used below defines a set of default prepare terms and values to use.  These can be overridden by simply adding new terms to prepare_terms that give replacement values.  Additionally, any other calculation/prepare options can be modified using prepare_terms.
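
The override behavior can be thought of as a dictionary merge in which user-supplied terms win over defaults. The sketch below assumes that simple model; the default values shown are purely illustrative placeholders and are not the actual defaults that iprPy defines for any pool.

```python
# Illustrative placeholder defaults only; NOT the real iprPy defaults
pool_defaults = {
    'sizemults': '10 10 10',
    'maxiterations': '10000',
}

# Terms supplied by the user through prepare_terms
user_terms = {
    'maxiterations': '50000',   # replaces the default value
    'forcetolerance': '1e-8',   # a term that the defaults do not set
}

# Later entries win, so user-supplied terms take precedence over defaults
effective_terms = {**pool_defaults, **user_terms}
print(effective_terms)
```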

For a given style, not all possible calculations can be prepared until all corresponding parent calculations have finished.  You should therefore either wait to prepare those calculations or prepare them multiple times.

The prepare_terms options required for each pool are:

- __styles__ lists the iprPy calculation styles to prepare in the pool.  By default, these will use the pre-defined "main" branch, but alternate branches can be selected by giving the branch name after a colon, e.g. E_vs_r_scan:bop.
- __run_directory__ is the name of the specific run directory where the pool is located.  All prepared calculations will be created in this run directory.
- __np_per_runner__ is the number of processors each runner will be assigned to use for the underlying simulations.
- __num_pots__ is the maximum number of potentials to prepare at a time.  Smaller numbers mean that any associated prepares are called more times, but it reduces the number of calculation variations to build and test prior to any calculations being created.

It is possible to prepare multiple pools at the same time by giving lists of values for the four terms above.  However, this Notebook breaks the pools into separate cells for better interactivity and control.
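
To make the styles and num_pots terms more concrete, here is a minimal sketch of how a space-delimited styles string could be split into (style, branch) pairs and how the potentials could be grouped into batches. Both `parse_styles` and `batch_ranges` are hypothetical helpers written for illustration; they are not iprPy functions, and the batching logic is an assumption based on the "Using potential #s" lines in the prepare output.

```python
def parse_styles(styles: str) -> list:
    """Split a space-delimited styles string into (style, branch) pairs.

    A style given without a ':branch' suffix is assigned the 'main' branch.
    """
    pairs = []
    for entry in styles.split():
        name, _, branch = entry.partition(':')
        pairs.append((name, branch or 'main'))
    return pairs


def batch_ranges(num_found: int, num_pots: int):
    """Yield inclusive (first, last) index ranges of at most num_pots items."""
    for start in range(0, num_found, num_pots):
        yield start, min(start + num_pots, num_found) - 1


styles = 'isolated_atom diatom_scan E_vs_r_scan:bop E_vs_r_scan'
for style, branch in parse_styles(styles):
    print('style', style, 'branch', branch)

# Five potentials fit in one batch when num_pots is 100 ...
print(list(batch_ranges(5, 100)))
# ... but are split into three batches when num_pots is 2
print(list(batch_ranges(5, 2)))
```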

### 4.1. Pool #1: Basic potential evaluations and scans

These are basic potential evaluation methods and initial energy scans.  

Parent calculations:

- None

#### 4.1.1. isolated_atom

Evaluates the energy of a single atom in isolation.

    buildcombos                     lammpspotential potential_file intpot

#### 4.1.2. diatom_scan

Evaluates the energy of a pair of atoms at various interatomic spacings.

    buildcombos                     diatom potential_file intpot
    minimum_r                       0.02 angstrom
    maximum_r                       10.0 angstrom
    number_of_steps_r               500

#### 4.1.3. E_vs_r_scan:bop

A variation of E_vs_r_scan specifically for bop potentials in which the minimum r value is increased.  __NOTE:__ This should be listed/called before E_vs_r_scan so that bop potentials are prepared with this branch and not the standard E_vs_r_scan.

    buildcombos                     crystalprototype load_file prototype
    prototype_potential_pair_style  bop
    sizemults                       10 10 10
    minimum_r                       2.0 angstrom
    maximum_r                       6.0 angstrom
    number_of_steps_r               201

#### 4.1.4. E_vs_r_scan

Evaluates the energy of crystal prototypes subjected to a volumetric scan.

    buildcombos                     crystalprototype load_file prototype
    sizemults                       10 10 10
    minimum_r                       0.5 angstrom
    maximum_r                       6.0 angstrom
    number_of_steps_r               276
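
The r ranges and step counts used by the three scans above all work out to the same spacing. The short check below assumes that number_of_steps_r counts evenly spaced evaluation points that include both minimum_r and maximum_r; `scan_step` is a hypothetical helper, not an iprPy function.

```python
def scan_step(minimum_r: float, maximum_r: float, number_of_steps_r: int) -> float:
    """Spacing between points for an evenly spaced scan including both ends."""
    return (maximum_r - minimum_r) / (number_of_steps_r - 1)


# (minimum_r, maximum_r, number_of_steps_r) in angstroms, from the text above
scans = {
    'diatom_scan':     (0.02, 10.0, 500),
    'E_vs_r_scan:bop': (2.0, 6.0, 201),
    'E_vs_r_scan':     (0.5, 6.0, 276),
}

for name, (rmin, rmax, nsteps) in scans.items():
    print(f'{name}: step = {scan_step(rmin, rmax, nsteps):.4f} angstrom')
```

Under that assumption, each scan samples r every 0.02 angstrom, so the bop branch differs from the standard E_vs_r_scan only in where the scan starts.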

In [9]:
# Specify master prepare options
prepare_terms['styles'] = ' '.join([
    'isolated_atom',
    'diatom_scan',
    'E_vs_r_scan:bop',
    'E_vs_r_scan',
])
prepare_terms['run_directory'] = 'master_1'
prepare_terms['np_per_runner'] = '1'
prepare_terms['num_pots']      = '100'

# Run master_prepare
database.master_prepare(**prepare_terms)

5 potential ids found
Preparing calculation isolated_atom branch main
Using potential #s 0 to 4

1005 existing calculation records found
5 matching interatomic potentials found
5 calculation combinations to check
0 new records to prepare

Preparing calculation diatom_scan branch main
Using potential #s 0 to 4

2175 existing calculation records found
5 matching interatomic potentials found
24 calculation combinations to check
0 new records to prepare

Preparing calculation E_vs_r_scan branch bop
Using potential #s 0 to 4

28260 existing calculation records found
19 matching crystal prototypes found
0 matching interatomic potentials found
1 invalid calculations skipped
0 calculation combinations to check

Preparing calculation E_vs_r_scan branch main
Using potential #s 0 to 4

28260 existing calculation records found
19 matching crystal prototypes found
5 matching interatomic potentials found
324 calculation combinations to check
0 new records to prepare



### 4.2. Pool #2: Crystal relaxations

These perform crystal structure relaxations based on a guess structure and an interatomic potential.  

Parent calculations:

- E_vs_r_scan
- E_vs_r_scan:bop

#### 4.2.1. relax_box

Relaxes a crystal structure by only altering box dimensions to zero pressure while keeping all atoms in the same box-relative positions.

    buildcombos                     atomicreference load_file reference
    buildcombos                     atomicparent load_file parent
    parent_record                   calculation_E_vs_r_scan
    parent_load_key                 minimum-atomic-system
    parent_status                   finished
    sizemults                       10 10 10
    atomshift                       0.05 0.05 0.05
    strainrange                     1e-6

#### 4.2.2. relax_static

Relaxes a crystal structure using energy/force minimization plus a simultaneous box relax.

    buildcombos                     atomicreference load_file reference
    buildcombos                     atomicparent load_file parent
    parent_record                   calculation_E_vs_r_scan
    parent_load_key                 minimum-atomic-system
    parent_status                   finished
    sizemults                       10 10 10
    atomshift                       0.05 0.05 0.05
    energytolerance                 0.0
    forcetolerance                  1e-10 eV/angstrom
    maxiterations                   10000
    maxevaluations                  100000
    maxatommotion                   0.01 angstrom
    maxcycles                       100
    cycletolerance                  1e-10

#### 4.2.3. relax_dynamic

Relaxes a crystal structure using an nph barostat plus a Langevin thermostat set at 0 K.  This evolves the system while damping out forces over time.

    buildcombos                     atomicreference load_file reference
    buildcombos                     atomicparent load_file parent
    parent_record                   calculation_E_vs_r_scan
    parent_load_key                 minimum-atomic-system
    parent_status                   finished
    sizemults                       10 10 10
    atomshift                       0.05 0.05 0.05
    temperature                     0.0
    integrator                      nph+l
    thermosteps                     1000
    runsteps                        10000
    equilsteps                      0

In [10]:
# Specify master prepare options
prepare_terms['styles'] = ' '.join([
    'relax_box',
    'relax_static',
    'relax_dynamic',
])
prepare_terms['run_directory'] = 'master_2'
prepare_terms['np_per_runner'] = '1'
prepare_terms['num_pots']      = '100'

# Run master_prepare
database.master_prepare(**prepare_terms)

5 potential ids found
Preparing calculation relax_box branch main
Using potential #s 0 to 4

132080 existing calculation records found
6587 matching atomic references found
5 matching interatomic potentials found
5 matching interatomic potentials found
324 matching atomic parents found
1201 calculation combinations to check
0 new records to prepare

Preparing calculation relax_static branch main
Using potential #s 0 to 4

177037 existing calculation records found
6587 matching atomic references found
5 matching interatomic potentials found
5 matching interatomic potentials found
324 matching atomic parents found
1201 calculation combinations to check
0 new records to prepare

Preparing calculation relax_dynamic branch main
Using potential #s 0 to 4

103030 existing calculation records found
6587 matching atomic references found
5 matching interatomic potentials found
5 matching interatomic potentials found
324 matching atomic parents found
1201 calculation combinations to check
0 new r

### 4.3. Pool #3: Further crystal relaxations

This performs further crystal relaxations on the results of pool #2. 

Parent calculations:

- relax_dynamic

#### 4.3.1. relax_static:from_dynamic

Takes the resulting structures of relax_dynamic and subjects them to an energy/force minimization plus box relaxation.

    buildcombos                     atomicarchive load_file archive
    archive_record                  calculation_relax_dynamic
    archive_branch                  main
    archive_load_key                final-system
    archive_status                  finished
    sizemults                       1 1 1
    energytolerance                 0.0
    forcetolerance                  1e-10 eV/angstrom
    maxiterations                   10000
    maxevaluations                  100000
    maxatommotion                   0.01 angstrom
    maxcycles                       100
    cycletolerance                  1e-10

In [11]:
# Specify master prepare options
prepare_terms['styles'] = ' '.join([
    'relax_static:from_dynamic'
])
prepare_terms['run_directory'] = 'master_3'
prepare_terms['np_per_runner'] = '1'
prepare_terms['num_pots']      = '100'

# Run master_prepare
database.master_prepare(**prepare_terms)

5 potential ids found
Preparing calculation relax_static branch from_dynamic
Using potential #s 0 to 4

177037 existing calculation records found
5 matching interatomic potentials found
1201 matching atomic archives found
1201 calculation combinations to check
191 new records to prepare



### 4.4. Pool #4: Crystal space group analysis

These evaluate the crystal space group information for the relaxed structures computed above and for the initial prototype and DFT structures used.

Parent calculations:

- relax_box
- relax_static
- relax_static:from_dynamic

#### 4.4.1. crystal_space_group:prototype

Evaluates the crystal space group information for the prototype structures.  Only needs to be done once per prototype.

    buildcombos                     crystalprototype load_file proto

#### 4.4.2. crystal_space_group:reference

Evaluates the crystal space group information for DFT relaxed structures.  Only needs to be done once per structure.

    buildcombos                     atomicreference load_file ref

#### 4.4.3. crystal_space_group:relax

Evaluates the crystal space group information for the final relaxed structures obtained from the finished relax_static and relax_box calculations.

    buildcombos                     atomicarchive load_file archive1
    buildcombos                     atomicarchive load_file archive2
    archive1_record                 calculation_relax_static
    archive1_load_key               final-system
    archive1_status                 finished
    archive2_record                 calculation_relax_box
    archive2_load_key               final-system
    archive2_status                 finished

In [10]:
# Specify master prepare options
prepare_terms['styles'] = ' '.join([
    #'crystal_space_group:prototype',
    #'crystal_space_group:reference',
    'crystal_space_group:relax',
])
prepare_terms['run_directory'] = 'master_4'
prepare_terms['np_per_runner'] = '1'
prepare_terms['num_pots']      = '100'

# Run master_prepare
database.master_prepare(**prepare_terms)

Using potential #s 0 to 2

Preparing calculation crystal_space_group branch relax
194852 existing calculation records found
583 matching atomic archives found
798 matching atomic archives found
1381 calculation combinations to check
1381 new records to prepare



### 4.5.  Further styles coming soon...

## 5. Runner

Once calculations have been prepared, you can then start runner jobs to perform them.  

Options for managing runners:

- Use the cell below to call runner() for the database.  This will perform one calculation at a time until finished or stopped.  Not recommended unless you only want one runner active at any given time.
- Open a separate terminal for each runner you wish to be active and call the "iprPy runner" command with the specific database and run directory you want each to use.  This isolates each runner in action and allows for runners to operate on the same or different databases and run directories.
- Submit runner jobs to a cluster that has access to the run directory and the database.
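
For the separate-terminal or cluster options, it can help to generate the runner command lines programmatically. The sketch below only builds the commands and does not execute them. It assumes that "iprPy runner" takes the database name followed by the run directory as positional arguments; check the iprPy command line help before relying on that order. `runner_commands` is a hypothetical helper.

```python
def runner_commands(database_name: str, run_directory: str, num_runners: int) -> list:
    """Build one argument list per runner to launch.

    The positional order (database name, then run directory) is an
    assumption of this sketch.
    """
    command = ['iprPy', 'runner', database_name, run_directory]
    return [list(command) for _ in range(num_runners)]


# One command per runner; each would be started in its own terminal or job
commands = runner_commands('master', 'master_4', 3)
for argv in commands:
    print(' '.join(argv))
```

Each argument list could be written into a job submission script or passed to `subprocess.Popen` so that every runner is a truly separate process.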


In [11]:
database.runner(run_directory='master_4')

Runner started with pid 56228
5d73c2f8-2825-4e66-a2c7-9929633404f5
sim calculated successfully

08b57466-c5cf-4354-9c93-a9f2d618863c
sim calculated successfully

9a378027-2556-4bea-9f29-b4f5231cf40d
sim calculated successfully

9f3e62d9-7ec9-42e8-99a8-b75e17eb4659
sim calculated successfully

ab93a7f9-0b8c-46f2-bf1e-f86b108e7194
sim calculated successfully

5354b670-b22f-4016-a69d-331cf907208e
sim calculated successfully

23fedbb6-2196-4e48-860e-3bd1ec20e148
sim calculated successfully

1c34eddc-90e5-404c-9fb1-d2b342787067
sim calculated successfully

339fcfaf-f1e4-4a75-a59b-13676908f5a3
sim calculated successfully

24d85c3b-6972-402e-90b7-ae815701fcca
sim calculated successfully

35427d61-7107-4896-b775-9d7ced3483f8
sim calculated successfully

7a20020c-0048-4ba3-8ccc-2547193311eb
sim calculated successfully

9db98b3c-0058-4b5a-a78a-18302b01d469
sim calculated successfully

539dadbd-173a-4da5-8727-1a3f54a02f16
sim calculated successfully

192a7a34-d662-4170-85e5-4caf338edbfa
sim calcu


81f694a0-a086-40f1-80eb-fc82c7466a79
sim calculated successfully

b2db60e3-e7ad-4e90-a63a-7002a5e3cff1
sim calculated successfully

3a120467-de8a-41bf-a0b8-c3b038b3a42b
sim calculated successfully

2eaba9cf-b3be-43fb-84b5-caf4dd391deb
sim calculated successfully

5d08b401-959c-44ce-8fc4-2b462ba36ab1
sim calculated successfully

19451b1a-f7a6-4bd6-a1ae-cbf502ca426c
sim calculated successfully

845ada04-5f69-420f-826b-ff4b8d548530
sim calculated successfully

fa0e4f2c-c7f3-4e82-803b-08cb10e1f32e
sim calculated successfully

e46ef8ba-66e0-4966-b9ec-0e38880002f4
sim calculated successfully

1ddf2181-982a-4a1d-a131-fbc1c63ae589
sim calculated successfully

e1cd3501-3747-4fb1-afc1-8ceb01eb1c28
sim calculated successfully

1d32cb5e-14a5-43af-922c-d6bb658693f6
sim calculated successfully

e1995605-1bb4-4d85-a14b-6acd8a2ca770
sim calculated successfully

a2419e6e-a907-4457-94db-a08b5be180eb
sim calculated successfully

be3b91af-03ba-42dd-ad43-4d6cc241e856
sim calculated successfully

cb2a12c5-

Exception ignored in: <function _Stream.__del__ at 0x000002805E6A4EE8>
Traceback (most recent call last):
  File "C:\Users\lmh1\AppData\Local\Continuum\anaconda3\lib\tarfile.py", line 416, in __del__
    if hasattr(self, "closed") and not self.closed:
KeyboardInterrupt


sim calculated successfully

06b235bb-0343-4866-baf8-055fde41ec60
sim calculated successfully

e5e8fdbe-fd68-4c87-80e6-e9599bedc96c
sim calculated successfully

5e29bca1-2894-47c9-af54-9fec927ebc28
sim calculated successfully

3916cb61-5c52-4d62-aa6e-fe2eb0b26edb
sim calculated successfully

8fd1ca83-11d0-4abf-acba-3e0b401155c3
sim calculated successfully

e7c7ea8d-ec75-44ba-a314-be7d8e3d3476
sim calculated successfully

44ee63f4-11af-4e3a-be76-0106a125f6f4
sim calculated successfully

dfcfdb65-cd76-4b17-b662-df4a2d37785c
sim calculated successfully

50da16d7-b1bf-4a4a-8ab5-00b20ed81e30
sim calculated successfully

2295f5e6-25f5-48d5-8ca1-da92328a3c4a
sim calculated successfully

9bcbe15f-10db-4437-8040-300fea1fe015
sim calculated successfully

116fce21-c99e-4cd3-8b86-7c8ec6f20638
sim calculated successfully

497212f5-f96e-4231-b1b4-5e50031ad8f8
sim calculated successfully

67031834-c68f-4898-a420-c6be2fa31b63
sim calculated successfully

248785da-4670-41f7-b693-e9ec66267f3e
sim calcul


7e6908d5-92a4-4aa3-8d32-e247c9c12b8f
sim calculated successfully

ab8bf9fd-22c3-48e9-a580-c31980981647
sim calculated successfully

d188fc70-9b06-4b34-a087-30034b3c63e0
sim calculated successfully

3cdebe29-36f5-4e02-a27b-a8ac89ed5b95
sim calculated successfully

ca64ec53-b2fc-4e64-b74f-e9cb494173e4
sim calculated successfully

3c0865d4-c6cd-4595-be3a-1c9d40560f13
sim calculated successfully

ace92867-fd3c-4469-8b32-688bd7e64678
sim calculated successfully

0917bafd-a45f-4a15-ac6e-dc0ccedc350d
sim calculated successfully

3fcf5947-30cb-4701-8cb8-6e6e743348db
sim calculated successfully

3e5c45de-fce2-457b-ba7f-73f195fd467b
sim calculated successfully

514ef4b6-bdfc-4678-8ec3-76c2f2f5e5ea
sim calculated successfully

4e69c712-3bed-42b8-8c89-84ac3585c46a
sim calculated successfully

5edffe02-05cb-477d-8600-1532f0efebcb
sim calculated successfully

e6f57138-1e33-4d89-87d3-21a5149c4653
sim calculated successfully

Didn't find an open simulation
No simulations left to run
