# Atomate2 Workflow Ingestor 
###  Kat Nykiel, Alejandro Strachan
School of Materials Engineering and Birck Nanotechnology Center, Purdue University, West Lafayette, Indiana 47907, United States

## Load Atomate2 TaskDocuments

This Sim2L uses parsed VASP results in the form of TaskDocuments from [atomate2](https://github.com/materialsproject/atomate2). 

These documents are produced from VASP output by atomate2's [VaspDrone](https://materialsproject.github.io/atomate2/reference/atomate2.vasp.drones.VaspDrone.html#atomate2.vasp.drones.VaspDrone) and saved as JSON files. For example, running the following in a directory of VASP results produces a TaskDocument JSON file:

```
# Import libraries
import json

from atomate2.vasp.drones import VaspDrone
from monty.json import jsanitize

# Parse results with atomate2
drone = VaspDrone()
doc = drone.assimilate()
doc = jsanitize(doc.dict(),recursive_msonable=True)

# Save results as json file
with open('doc.json','w',encoding='utf-8') as f_o:
    json.dump(doc,f_o)
```

These documents contain most of the information about a VASP run; however, their large size makes them difficult to query. The purpose of this Sim2L is to extract relevant features from this schema for use in machine learning workflows.
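
As a rough illustration of what extracting features from a large nested document can look like, the sketch below flattens a nested dictionary into dotted keys and keeps only a few of them. The `flatten` helper and the field names in `toy_doc` are hypothetical and chosen for illustration; they are not the TaskDocument schema and not the Sim2L's own implementation.

```python
def flatten(nested, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in nested.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, extending the dotted prefix
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Toy document with hypothetical field names (not the real TaskDocument schema)
toy_doc = {"output": {"energy": -10.5, "bandgap": 1.2}, "formula": "Si2"}

flat_doc = flatten(toy_doc)

# Keep only the keys of interest, skipping any that are absent
wanted = ["formula", "output.energy"]
features = {key: flat_doc[key] for key in wanted if key in flat_doc}
print(features)
```

A flat record like `features` is easier to place in a table, one row per calculation, than the full nested document.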

In [None]:
# Import libraries
import numpy as np
import json
import os
import pandas as pd

# Import nanoHUB-specific libraries
import nanohubremote as nr
from simtool import findInstalledSimToolNotebooks,searchForSimTool
from simtool import getSimToolInputs,getSimToolOutputs,Run

Here we load an example TaskDocument. To use your own file, change `path` to point to your JSON file.

In [None]:
# Load json file
path = './../examples/doc.json' # set path to JSON file
with open(path,'r',encoding='utf-8') as f:
    doc = json.load(f)
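
When pointing the notebook at your own file, it can help to fail early with a clear message if the path is wrong or the file does not hold a JSON object. The following is a minimal sketch; `load_task_document` is a hypothetical helper, not part of atomate2 or simtool, and it checks only the top-level type, not the TaskDocument schema.

```python
import json
from pathlib import Path


def load_task_document(path):
    """Load a JSON file and check that its top level is a JSON object."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"No JSON file found at {path}")
    with path.open("r", encoding="utf-8") as f:
        loaded = json.load(f)
    if not isinstance(loaded, dict):
        # A serialized document is expected to be a dict, not a list or scalar
        raise ValueError(
            f"Expected a JSON object at the top level, got {type(loaded).__name__}"
        )
    return loaded
```

With this helper, the loading cell could be written as `doc = load_task_document(path)`.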

## Load Sim2L

In [None]:
# Load the Sim2L
simToolName = "vaspingestor"
simToolLocation = searchForSimTool(simToolName)
for key in simToolLocation.keys():
    print("%18s = %s" % (key,simToolLocation[key]))

installedSimToolNotebooks = findInstalledSimToolNotebooks(simToolName,returnString=True)
print(installedSimToolNotebooks)

In [None]:
# Get the list of inputs
inputs = getSimToolInputs(simToolLocation)
print(inputs)

In [None]:
# Get the list of outputs
outputs = getSimToolOutputs(simToolLocation)
print(outputs)

## Submit Sim2L sequentially

In [None]:
# Set the inputs for the Sim2L
inputs['doc'].value = doc
inputs['author'].value = "Jane Doe"
inputs['dataset'].value = "example"

In [None]:
# Run Sim2L
r = Run(simToolLocation,inputs)

In [None]:
# Obtain results for Sim2L
r.getResultSummary()
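
The single submission above extends naturally to several documents processed one after another. The sketch below loops over every JSON file in a directory and passes each loaded document to a caller-supplied function. Both `run_sequentially` and the `submit` argument are hypothetical names introduced here; in this notebook, `submit` could be a small function that sets `inputs['doc'].value = doc` and returns `Run(simToolLocation,inputs)`.

```python
import json
from pathlib import Path


def run_sequentially(json_dir, submit):
    """Call submit(doc) once per JSON file in json_dir, in sorted filename order."""
    results = {}
    for json_path in sorted(Path(json_dir).glob("*.json")):
        with json_path.open("r", encoding="utf-8") as f:
            doc = json.load(f)
        # Each call finishes before the next file is read (sequential execution)
        results[json_path.name] = submit(doc)
    return results


# Stand-in for a real submission function, used only to exercise the loop
def count_keys(doc):
    return len(doc)
```

Returning a dictionary keyed by filename keeps each result traceable to the document that produced it.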