# VASP Ingestor
### Kat Nykiel, Alejandro Strachan
School of Materials Engineering and Birck Nanotechnology Center, Purdue University, West Lafayette, Indiana 47907, United States

## Abstract
This [Sim2L](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0264492) allows researchers to share their density functional theory calculations performed with VASP and make them findable, accessible, interoperable, and reusable ([FAIR](https://www.go-fair.org/fair-principles/)). Users upload their raw output file (with minimal manipulation); the Sim2L extracts basic results and automatically indexes the raw and post-processed data into a globally accessible database. The table associated with this tool has a unique identifier ([DOI](doi:10.21981/4FM5-GB08)), and each entry has a unique internal identifier.

Using this Sim2L, researchers can satisfy data-sharing requirements such as those called for by the [US Office of Science and Technology Policy](https://www.whitehouse.gov/ostp/news-updates/2022/08/25/ostp-issues-guidance-to-make-federally-funded-research-freely-available-without-delay/).


## VASP Data Ingestor Input/Output Overview

The Sim2L takes three inputs: 
- the VASP run in JSON format (obtained with [atomate2](https://github.com/materialsproject/atomate2))
- the Author associated with the dataset
- a Tag to identify the specific dataset
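
The three inputs listed above can be gathered in a few lines of Python before they are handed to the Sim2L. The sketch below is illustrative only: the function name `build_inputs` and the dictionary keys `vasp_run`, `author`, and `tag` are hypothetical and are not taken from the Sim2L itself, and the atomate2 output is assumed to be a single JSON object stored on disk.

```python
import json
from pathlib import Path


def build_inputs(json_path, author, tag):
    """Bundle the three Sim2L inputs into one dictionary.

    The key names used here are hypothetical placeholders, not the
    Sim2L's actual input names.
    """
    # Reject empty metadata early, since author and tag identify the dataset.
    if not author.strip() or not tag.strip():
        raise ValueError("author and tag must be non-empty")

    # Assumption: the atomate2 result was saved as one JSON object.
    with Path(json_path).open() as handle:
        vasp_run = json.load(handle)

    return {"vasp_run": vasp_run, "author": author.strip(), "tag": tag.strip()}
```

Validating the author and tag before reading the file keeps a mislabeled dataset from reaching the database; the real input names are defined in the Sim2L notebook.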

The Sim2L ingests the results and extracts the following outputs: 

| Name                    | Type       | Unit | Description                                                                |
| ----------------------- | ---------- | ---- | -------------------------------------------------------------------------- |
| Structure (O)          | Dictionary | n/a  | pymatgen Structure object, containing lattice vectors and atomic positions |
| Composition (O)         | Dictionary | n/a  | Chemical composition of the unit cell                                      |
| lattice_parameters (O)  | Array      | Å    | a, b, c lattice parameters of unit cell                                    |
| Energy (O)              | Number     | eV   | Total energy of the system                                                 |
| Stress (O)              | Array      | kbar | External pressure of the system                                            |
| Forces (O)              | Array      | eV/Å | List of (x,y,z) forces on each atom                                        |
| max_force (O)           | Number     | eV/Å | Maximum force reported during the simulation                               |
| rms_force (O)           | Number     | eV/Å | Root mean square force reported during the simulation                      |
| KPOINTS (O)             | Array      | n/a  | Number of k-points in the x, y, and z directions                           |
| ENCUT (O)               | Number     | eV   | Kinetic energy cutoff for the plane wave basis set                         |
| XC_functional (O)       | String     | n/a  | Choice of exchange-correlation functional used, read from VASP's GGA tag   |
| Pseudopotential (O)     | String     | n/a  | Choice of pseudopotential used                                             |
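
Several of the outputs above are simple reductions of the raw arrays. The sketch below shows one plausible way to obtain `lattice_parameters`, `max_force`, and `rms_force` from a 3×3 lattice matrix and a list of per-atom forces. It assumes that each row of the lattice matrix is a lattice vector and that both force metrics are computed from per-atom force magnitudes; the Sim2L may define them differently (for example, over individual force components), and the helper names are illustrative.

```python
import math


def vector_norm(vec):
    """Euclidean length of a 3-component vector."""
    return math.sqrt(sum(c * c for c in vec))


def lattice_parameters(lattice):
    """Return [a, b, c] in Å, assuming each row is a lattice vector."""
    return [vector_norm(row) for row in lattice]


def force_summary(forces):
    """Return (max_force, rms_force) in eV/Å from per-atom force vectors.

    Assumption: both metrics use the magnitude of the force on each atom.
    """
    norms = [vector_norm(f) for f in forces]
    max_force = max(norms)
    rms_force = math.sqrt(sum(n * n for n in norms) / len(norms))
    return max_force, rms_force


# Toy data: an orthorhombic cell with two atoms.
lattice = [[3.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 5.0]]
forces = [[0.0, 0.0, 0.03], [0.0, 0.04, 0.0]]

print(lattice_parameters(lattice))
print(force_summary(forces))
```

For a well-converged relaxation both force values should be small, which makes them convenient filters when querying the database.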

All inputs and outputs are indexed in the ResultsDB: https://nanohub.org/results.  

The Sim2L itself can be found here: [vaspingestor.ipynb](simtool/vaspingestor.ipynb)


## Upload your VASP run and make your results FAIR
This notebook shows how to upload your VASP run to the Sim2L. nanoHUB does the rest: once uploaded, your data is automatically indexed into the ResultsDB.

[Upload your VASP file and share it](notebooks/workflow.ipynb)


## Query the ResultsDB and explore VASP data 
This notebook shows how to query the ResultsDB and visualize the data. The example is from the paper “High-Throughput Density Functional Theory Screening of Double Transition Metal MXenes” by Nykiel and Strachan and shows how to reproduce some of the figures in the publication.

[resultsDB_query.ipynb](notebooks/resultsDB_query.ipynb)