# Importing data from VASP calculations into Citrination

Citrination is a public, un-siloed repository of materials data coupled to analysis and modeling tools.

By putting data on Citrination, you can:
 1. Share incremental results within your group
 1. Supplement your data with similar published data
 1. Release your data to the public as you publish associated papers
 1. Receive feedback on the quality of your DFT calculations
 1. View statistical analyses of the data as it comes in
 1. Build machine learning models on the data that update as data comes in

There are two steps for getting data from any format onto Citrination:
 1. formatting the data as a PIF
 1. uploading to Citrination

## Formatting VASP outputs as PIFs

We provide scripts to extract common conditions and properties from VASP calculations.  Pass the path of a calculation directory to `directory_to_pif` and it returns a PIF!

In [1]:
from dfttopif import directory_to_pif
pif = directory_to_pif("/home/maxhutch/science/alloy.pbe/B/B.hR12")

In [2]:
from pypif.util.read_view import ReadView
rv = ReadView(pif)
print(rv.keys())
print(rv.chemical_formula)

dict_keys(['Converged', 'Cutoff Energy', 'k-Points per Reciprocal Atom', 'Relaxed', 'VASP', 'Total Energy', 'Pressure', 'Psuedopotentials', 'XC Functional', 'Density Functional Theory'])
B12


The PIF is a lightweight schema on top of the JSON format:

In [3]:
from pypif.pif import dumps
print(dumps(pif, indent=2)[:500])

{
  "chemicalFormula": "B12",
  "properties": [
    {
      "method": {
        "software": {
          "name": "VASP",
          "version": "5.2.12"
        },
        "name": "Density Functional Theory"
      },
      "units": "eV",
      "scalars": [
        {
          "value": -80.346476
        }
      ],
      "dataType": "COMPUTATIONAL",
      "conditions": [
        {
          "units": "eV",
          "scalars": [
            {
              "value": 318.6
            }
          ],
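
Because a PIF is ordinary JSON, the standard library is enough to inspect one. The sketch below is a minimal illustration, not part of `pypif`: it builds a small hand-written record that mirrors the fields visible in the truncated output above and looks up a property by name. The `"name": "Total Energy"` key on the property is an assumption (that part of the output is cut off), and `first_scalar` is a hypothetical helper.

```python
import json

# Hand-written record mirroring the fields shown in the output above.
# ASSUMPTION: the property carries a "name" key; the truncated output
# does not show it.
record = {
    "chemicalFormula": "B12",
    "properties": [
        {
            "name": "Total Energy",
            "units": "eV",
            "scalars": [{"value": -80.346476}],
            "dataType": "COMPUTATIONAL",
        }
    ],
}

# Round-trip through JSON text to show that nothing PIF-specific is needed.
text = json.dumps(record, indent=2)
loaded = json.loads(text)


def first_scalar(pif_dict, property_name):
    """Return (value, units) of the first scalar of the named property, or None."""
    for prop in pif_dict.get("properties", []):
        if prop.get("name") == property_name:
            return prop["scalars"][0]["value"], prop.get("units")
    return None


print(first_scalar(loaded, "Total Energy"))
```

For real records, `ReadView` as used earlier is the more convenient route; the point here is only that the on-disk format is plain JSON.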
  


## Uploading files to Citrination

PIFs and other files can be uploaded to Citrination via the `citrination_client` package.

### Setting up the client

The client authenticates with your API key, which is located on the "Account" page on Citrination.  Please keep this key out of your source code.  Reading it from an environment variable is a best practice.

In [4]:
from citrination_client import CitrinationClient
from os import environ
client = CitrinationClient(environ['CITRINATION_API_KEY'], 'https://stage.citrination.com')
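
The lookup `environ['CITRINATION_API_KEY']` raises a bare `KeyError` when the variable is unset. As a minimal sketch, a small wrapper can fail with a more helpful message instead. `read_api_key` is a hypothetical helper, not part of `citrination_client`, and the `env` parameter exists only so the function can be exercised without touching the real environment.

```python
import os


def read_api_key(variable="CITRINATION_API_KEY", env=None):
    """Return the API key stored in an environment variable.

    Hypothetical helper for illustration; raises RuntimeError with a
    readable message if the variable is missing or blank.
    """
    env = os.environ if env is None else env
    key = env.get(variable, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {variable} environment variable to your Citrination API key."
        )
    return key


# Demonstration with an explicit mapping instead of the real environment.
print(read_api_key(env={"CITRINATION_API_KEY": "example-key"}))
```

The returned string could then be passed as the first argument to `CitrinationClient` in place of the direct `environ[...]` lookup.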

We'll use the same client to query and download from Citrination later.

### Creating a dataset

To upload data, we'll need to specify a dataset for it to live in.

It is easiest to create a dataset via the website.  You'll only need to do this once.

You can share datasets via the groups tab.

### Uploading a PIF to a dataset

The client uploads files, so we create a temporary file to store the PIF in.

We may also want to add a tag to the PIF, which will make it easier to search and filter PIFs later.

In [5]:
from tempfile import TemporaryDirectory
from pypif.pif import dump
from os.path import join

dataset_id = 758  # ID of the dataset created above; replace with your own
pif.tags = ["my_first_upload",]  # tags make the PIF easier to find later
with TemporaryDirectory() as tmpdir:
    tempname = join(tmpdir, "pif.json")
    with open(tempname, "w") as fp:
        dump(pif, fp)
    client.upload_file(tempname, dataset_id)
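
The write-then-upload pattern in the cell above can be factored into a reusable function. The sketch below uses only the standard library and takes the upload step as a callable, so it runs without a Citrination account. `upload_via_tempfile` and `fake_upload` are hypothetical names, and a plain dict serialized with `json.dump` stands in for a real PIF object written with `pypif.pif.dump`.

```python
import json
import os
from tempfile import TemporaryDirectory


def upload_via_tempfile(record, upload, filename="pif.json"):
    """Write `record` as JSON to a temporary file and pass its path to `upload`.

    The upload runs inside the `with` block, because the directory and
    the file in it are deleted as soon as the block exits.
    """
    with TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, filename)
        with open(path, "w") as fp:
            json.dump(record, fp)
        return upload(path)


def fake_upload(path):
    """Stand-in for a real upload: read the file back and return its contents."""
    with open(path) as fp:
        return json.load(fp)


echoed = upload_via_tempfile(
    {"chemicalFormula": "B12", "tags": ["my_first_upload"]}, fake_upload
)
print(echoed)
```

With the real client, the callable could be something like `lambda path: client.upload_file(path, dataset_id)`, matching the call used in the cell above.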

## Why are we doing this again?

 1. Putting data on Citrination makes sharing easy, both privately during the study and publicly after publication
 1. Citrination will analyze and model your data as it comes in