# Atomate2 Workflow Ingestor
### Kat Nykiel, Alejandro Strachan
School of Materials Engineering and Birck Nanotechnology Center, Purdue University, West Lafayette, Indiana 47907, United States

## Load Atomate2 Workflow

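This Sim2L stores atomate2 workflows by ingesting each firework individually. To export a workflow for ingestion, run the following commands against your atomate2 database. They write the workflow graph and the firework docs to two JSON files, `workflow.json` and `fireworks.json`.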

```python
from fireworks import LaunchPad
from jobflow import SETTINGS
import json

# connect to the launchpad (mongodb)
lp = LaunchPad.auto_load()

# query the launchpad for the workflow that contains firework 93701
# (an example ID; replace it with a fw_id from your own workflow)
wf = lp.workflows.find_one({"nodes": 93701})

# write the workflow graph to a json file
with open("workflow.json", "w") as f:
    json.dump(wf['links'], f, default=str)

# connect to the job store (mongodb)
store = SETTINGS.JOB_STORE
store.connect()

# query the job store for the fireworks in the given workflow
fw_docs = list(store.query({"metadata.fw_id": {"$in": wf['nodes']}}))

# write the firework docs to a json file
with open("fireworks.json", "w") as f:
    json.dump(fw_docs, f, default=str)
```
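
The `links` field written to `workflow.json` encodes the workflow graph. As a minimal sketch, assuming `links` maps each parent firework ID (a string key after the JSON round trip) to a list of child firework IDs, the root fireworks (those with no parents) can be recovered as shown below. The helper name `find_roots` and the sample IDs are illustrative and are not part of atomate2 or FireWorks.

```python
def find_roots(links):
    """Return the sorted IDs of fireworks that are never listed as a child."""
    # Keys are parent IDs; JSON stores them as strings, so cast to int.
    parents = {int(fw_id) for fw_id in links}
    # Values are lists of child IDs.
    children = {int(child) for kids in links.values() for child in kids}
    return sorted(parents - children)


# Illustrative graph with made-up IDs: 3 -> 2 -> 1
example_links = {"3": [2], "2": [1], "1": []}
roots = find_roots(example_links)
print(roots)
```

If the assumption about the layout of `links` does not hold for your database, inspect `wf['links']` directly before relying on this helper.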

In [1]:
# Import libraries
import numpy as np
import json
import os
import pandas as pd

# Import nanoHUB-specific libraries
import nanohubremote as nr
from simtool import findInstalledSimToolNotebooks, searchForSimTool
from simtool import getSimToolInputs, getSimToolOutputs, Run

Here we load an example elastic constant workflow. To ingest your own workflow, change the paths below to point to your JSON files for the workflow graph and the firework docs.

In [2]:
# Load json files
wf_path = './../examples/workflow.json'  # workflow graph JSON file
with open(wf_path, 'r') as f:
    wf_graph = json.load(f)

fws_path = './../examples/fireworks.json'  # firework docs JSON file
with open(fws_path, 'r') as f:
    fw_docs = json.load(f)
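
Before submitting anything, it can help to confirm that the two files describe the same workflow. The sketch below assumes the graph is a parent-to-children mapping of firework IDs and that each firework doc carries its ID under `metadata.fw_id` (the field used in the export query above). The helper name `missing_fireworks` and the sample data are hypothetical.

```python
def missing_fireworks(wf_graph, fw_docs):
    """Return sorted firework IDs that appear in the graph but have no doc."""
    graph_ids = {int(fw_id) for fw_id in wf_graph}
    for kids in wf_graph.values():
        graph_ids.update(int(child) for child in kids)
    doc_ids = {int(doc["metadata"]["fw_id"]) for doc in fw_docs}
    return sorted(graph_ids - doc_ids)


# Made-up data: firework 1 is in the graph but its doc was not exported
sample_graph = {"3": [2], "2": [1], "1": []}
sample_docs = [{"metadata": {"fw_id": 3}}, {"metadata": {"fw_id": 2}}]
missing = missing_fireworks(sample_graph, sample_docs)
print(missing)
```

An empty list means every firework in the graph has a matching doc.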

## Load Sim2L

In [3]:
# Load the Sim2L
simToolName = "wflowingestor"
simToolLocation = searchForSimTool(simToolName)
for key in simToolLocation.keys():
    print("%18s = %s" % (key,simToolLocation[key]))

installedSimToolNotebooks = findInstalledSimToolNotebooks(simToolName,returnString=True)
print(installedSimToolNotebooks)

      notebookPath = /home/nanohub/nykiel.4/wflow-ingestor/simtool/wflowingestor.ipynb
       simToolName = wflowingestor
   simToolRevision = None
         published = False
{}


In [4]:
# Get the list of inputs
inputs = getSimToolInputs(simToolLocation)

In [5]:
# Get the list of outputs
outputs = getSimToolOutputs(simToolLocation)

## Submit Sim2L sequentially

In [6]:
# Set the inputs for the Sim2L
inputs['wf_graph'].value = wf_graph
inputs['author'].value = "Jane Doe"
inputs['dataset'].value = "example"

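Here, we loop through the firework docs and ingest each one into the results database (resultsDB). The cell below submits only the first two docs (`fw_docs[:2]`) as a demonstration; remove the slice to ingest the whole workflow.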

In [7]:
for fw_doc in fw_docs[:2]:
    # Each run ingests a single firework doc
    inputs['fw_doc'].value = fw_doc
    r = Run(simToolLocation, inputs)
    print(r.getResultSummary())

runname = f44b89ee8c6942c1a66824f6620ab353
outdir  = RUNS/f44b89ee8c6942c1a66824f6620ab353
cached  = False
submit --local /apps/share64/debian10/anaconda/anaconda-7/bin/papermill
       --no-request-save-on-cell-execute --autosave-cell-every 0
       --no-use-black-format-injection --parameters_file inputs.yaml
       /home/nanohub/nykiel.4/wflow-ingestor/simtool/wflowingestor.ipynb
       wflowingestor.ipynb


Input Notebook:  /home/nanohub/nykiel.4/wflow-ingestor/simtool/wflowingestor.ipynb
Output Notebook: wflowingestor.ipynb
Executing notebook with kernel: python3
Executing: 100%|██████████| 15/15 [00:27<00:00,  1.81s/cell]


                       name  \
0  simToolSaveErrorOccurred   
1    simToolAllOutputsSaved   
2                     fw_id   
3                      name   
4                 structure   
5                   vasp_id   
6                   outputs   

                                                data encoder display  \
0                                                  0    text    None   
1                                                  1    text    None   
2                                              91865    text    None   
3                                          "relax 1"    text    None   
4  {"@class": "Structure", "@module": "pymatgen.c...    text    None   
5                                                 ""    text    None   
6                                                 {}    text    None   

              filename  
0  wflowingestor.ipynb  
1  wflowingestor.ipynb  
2  wflowingestor.ipynb  
3  wflowingestor.ipynb  
4  wflowingestor.ipynb  
5  wflowingestor.ipynb  

Input Notebook:  /home/nanohub/nykiel.4/wflow-ingestor/simtool/wflowingestor.ipynb
Output Notebook: wflowingestor.ipynb
Executing notebook with kernel: python3
Executing: 100%|██████████| 15/15 [01:13<00:00,  4.89s/cell]


                       name  \
0  simToolSaveErrorOccurred   
1    simToolAllOutputsSaved   
2                     fw_id   
3                      name   
4                 structure   
5                   vasp_id   
6                   outputs   

                                                data encoder display  \
0                                                  0    text    None   
1                                                  1    text    None   
2                                              91864    text    None   
3                                          "relax 2"    text    None   
4  {"@class": "Structure", "@module": "pymatgen.c...    text    None   
5                                                 ""    text    None   
6                                                 {}    text    None   

              filename  
0  wflowingestor.ipynb  
1  wflowingestor.ipynb  
2  wflowingestor.ipynb  
3  wflowingestor.ipynb  
4  wflowingestor.ipynb  
5  wflowingestor.ipynb  
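
When ingesting a full workflow, one failed run should not stop the remaining submissions. The sketch below shows one way to structure such a loop; it is an illustration under stated assumptions, not part of the Sim2L. The helper `ingest_all` and the stand-in `fake_submit` are hypothetical. In the notebook, `submit` would be a function that sets `inputs['fw_doc'].value = fw_doc` and calls `Run(simToolLocation, inputs)`. The sketch also assumes that a failed run raises an exception.

```python
def ingest_all(fw_docs, submit):
    """Call submit(fw_doc) for every doc and record which ones failed.

    Returns (succeeded, failed): a list of fw_ids and a list of
    (fw_id, error message) tuples.
    """
    succeeded, failed = [], []
    for fw_doc in fw_docs:
        fw_id = fw_doc.get("metadata", {}).get("fw_id")
        try:
            submit(fw_doc)
        except Exception as exc:  # keep going so one bad doc does not stop the loop
            failed.append((fw_id, str(exc)))
        else:
            succeeded.append(fw_id)
    return succeeded, failed


# Stand-in for the real submission; rejects docs without an "output" key
def fake_submit(fw_doc):
    if "output" not in fw_doc:
        raise ValueError("missing output")


docs = [{"metadata": {"fw_id": 1}, "output": {}}, {"metadata": {"fw_id": 2}}]
ok, bad = ingest_all(docs, fake_submit)
print(ok, bad)
```

The list of failures can then be reviewed or resubmitted after the loop finishes.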