# PBAs: E Above Hull Analysis

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

## Pre-processing and Loading

Before we can load the JSON into Python, we need to fix a few formatting issues in the file:

In [2]:
with open('pba.json', 'r') as file:
    pba_json = file.read()

In [3]:
# Replace each /* i */ record separator with a comma:
for i in range(1,700):
    j = str(i)
    pba_json = pba_json.replace('/* ' + j + ' */', ',')

print(pba_json[:100])

,
{
    "_id" : ObjectId("58d009ea48a464edfdfb435d")
}

,
{
    "_id" : ObjectId("58e5d103d95cbb63a6


In [4]:
#Adding square brackets:
pba_json = '[\n' + pba_json + '\n]'
#Deleting first comma:
pba_json = pba_json[:2] + pba_json[3:]

In [5]:
# Strip the ObjectId() wrapper from the _id value.
# Note: the second replace removes every ")" in the file, not only the ones that close ObjectId(...).
pba_json = pba_json.replace("ObjectId(", "")
pba_json = pba_json.replace(")", "")

In [6]:
# Save the cleaned text as pba_1.json
with open('pba_1.json', 'w') as pba_1:
    pba_1.write(pba_json)
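
The replacements above can also be sketched as a single helper built on regular expressions. This is only an illustration, assuming the dump contains nothing unusual beyond the `/* n */` separators and `ObjectId("...")` wrappers; the function name `mongo_shell_to_json`, the shortened sample text, and the "note" field are made up for the example. Matching the whole `ObjectId("...")` pattern leaves any other parentheses in the file untouched.

```python
import json
import re


def mongo_shell_to_json(text):
    """Turn a mongo-shell style dump into a JSON array string (hypothetical helper)."""
    # Replace each /* n */ record separator with a comma
    body = re.sub(r'/\*\s*\d+\s*\*/', ',', text)
    # Unwrap ObjectId("...") so that only the quoted id string remains
    body = re.sub(r'ObjectId\(("[^"]*")\)', r'\1', body)
    # Drop the comma left in front of the first record, then wrap in brackets
    body = body.strip().lstrip(',')
    return '[' + body + ']'


# Shortened, partly made-up sample in the same layout as pba.json
sample = '''/* 1 */
{
    "_id" : ObjectId("58d009ea48a464edfdfb435d")
}

/* 2 */
{
    "_id" : ObjectId("58e5d103d95cbb63a64878f0"),
    "note" : "f(x)"
}
'''

records = json.loads(mongo_shell_to_json(sample))
print(len(records), records[1]["note"])
```

The returned string could be written to pba_1.json in the same way as before.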

Now we are ready to load the file using monty.serialization.loadfn, which loads the JSON into a list of dictionary entries.

In [7]:
from monty.serialization import loadfn

In [8]:
data_1 = loadfn('pba_1.json')

In [9]:
len(data_1)

536

In [10]:
data_1[1]

{'_id': '58e5d103d95cbb63a64878f0', 'input': {'structure': Structure Summary
  Lattice
      abc : 9.9509025313318 9.9509025313318 9.9509025313318
   angles : 89.99613296435679 90.00386703564321 90.00386703564321
   volume : 985.3429511575596
        A : 9.95090252 -0.0003358 -0.0003358
        B : -0.0003358 9.95090252 0.0003358
        C : -0.0003358 0.0003358 9.95090252
  PeriodicSite: Ca (7.4762, 7.4759, 7.4759) [0.7514, 0.7513, 0.7513]
  PeriodicSite: Ca (2.4747, 2.4744, 7.4759) [0.2487, 0.2486, 0.7513]
  PeriodicSite: Ca (2.4747, 7.4759, 2.4744) [0.2487, 0.7513, 0.2486]
  PeriodicSite: Ca (2.4705, 7.4801, 7.4801) [0.2483, 0.7517, 0.7517]
  PeriodicSite: Fe (0.0067, 9.9439, 9.9439) [0.0007, 0.9993, 0.9993]
  PeriodicSite: Fe (4.9721, 4.9785, 9.9466) [0.4997, 0.5003, 0.9996]
  PeriodicSite: Fe (0.0039, 4.9785, 4.9785) [0.0004, 0.5003, 0.5003]
  PeriodicSite: Fe (4.9721, 9.9466, 4.9785) [0.4997, 0.9996, 0.5003]
  PeriodicSite: Co (4.9696, 9.9460, 9.9460) [0.4995, 0.9995, 0.9995]
  P

Now that the PBA data is loaded into Python, we can begin building pymatgen entries for each structure.

## Using Pymatgen

In [11]:
import pymatgen as mg

### Creating pymatgen entries

Next, we want to make pymatgen entries using the composition and energy values. Here is an example of a ComputedEntry:

In [12]:
from pymatgen.entries.computed_entries import ComputedEntry

my_entry = ComputedEntry(composition="Ni4O2",
                  energy=-28,
                  parameters={"potcar_symbols": ['pbe Ni_pv', 'pbe O'],
                              "hubbards":{'Ni': 6.2, 'O': 0.0}},
                  data={"oxide_type":"oxide"})

print(my_entry)

ComputedEntry None - Ni4 O2
Energy = -28.0000
Correction = 0.0000
Parameters:
potcar_symbols = ['pbe Ni_pv', 'pbe O']
hubbards = {'Ni': 6.2, 'O': 0.0}
Data:
oxide_type = oxide


The first step in creating a ComputedEntry is getting the composition, which can be given either as a dict or as a string.

In [13]:
struct=data_1[1]['input']['structure']

In [59]:
struct.composition

Comp: Ca6 Fe4 Co4 C24 N24

Next, we access the energy value from the 'output' section of the same record:

In [41]:
out = data_1[1]['output']
out['energy']

-476.8670732

As practice, build a pymatgen entry for the PBA in data_1[1]:

In [42]:
struct=data_1[1]['input']['structure']
pba_1 = ComputedEntry(composition=struct.composition,
                  energy=data_1[1]['output']['energy'])

print(pba_1)

ComputedEntry None - Ca4 Fe4 Co4 C24 N24
Energy = -476.8671
Correction = 0.0000
Parameters:
Data:


In [52]:
range(0, 5)

range(0, 5)

Let's use this method to make a list of pba entries:

In [58]:
pba_entries = []
for record in data_1:
    if 'input' not in record:
        # Keep a placeholder so that indices still line up with data_1
        pba_entries.append(np.nan)  # np.NaN was removed in NumPy 2.0
    else:
        struct = record['input']['structure']
        pba_entry = ComputedEntry(composition=struct.composition,
                                  energy=record['output']['energy'])
        pba_entries.append(pba_entry)
pba_entries[1]

ComputedEntry None - Ca4 Fe4 Co4 C24 N24
Energy = -476.8671
Correction = 0.0000
Parameters:
Data:

Next, let's add these entries to the main entries list generated from the Materials Project. We will then apply the MaterialsProjectCompatibility (MPC) corrections to this list and get the e_above_hull values.

In [16]:
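from pymatgen.ext.matproj import MPRester  # the top-level "from pymatgen import MPRester" no longer works in recent pymatgen releases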

In [17]:
mpr = MPRester(api_key='clRGHmBDgp1xt9zA')  # needs an API key (from the MP website -> dashboard)

In [18]:
entries = mpr.get_entries_in_chemsys('Ca-Fe-Co-C-N'.split('-'))
#this gets all of the entries containing these elements

In [19]:
len(entries)

212

In [30]:
entries[1]

ComputedEntry mp-45 - Ca1
Energy = -2.0218
Correction = 0.0000
Parameters:
run_type = GGA
is_hubbard = False
pseudo_potential = {'functional': 'PBE', 'labels': ['Ca_sv'], 'pot_type': 'paw'}
hubbards = {}
potcar_symbols = ['PBE Ca_sv']
oxide_type = None
Data:
oxide_type = None

In [31]:
type(entries[1])

pymatgen.entries.computed_entries.ComputedEntry

Now we have to take the corrections into account.

NELECT becomes nelect in parameters.

See email for all parameters and how to access them in the dictionary.

In [21]:
from pymatgen.entries.compatibility import MaterialsProjectCompatibility

In [22]:
mpc = MaterialsProjectCompatibility()

In [23]:
entries_1 = mpc.process_entries(entries)
#use correction term here in building phase diagram
#put this entries_1 into the phase diagram method

We will then have energy above hull values, which is what we will analyze.

See the example for how to access e above hull (a higher e above hull means a less stable structure).

Eventually, put the results into a DataFrame with A, P, R, # of A, and e above hull (we need an e above hull for every A = 1 to 8).

Remember that we only need the e above hull data for the PBAs; we don't need that calculation for all of the other structures with the same A, P, R, etc.

Open question: how do we calculate the e above hull for materials that aren't already in the Materials Project database?

### Starting the analysis

In [24]:
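from pymatgen.ext.matproj import MPRester  # the top-level "from pymatgen import MPRester" no longer works in recent pymatgen releases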
mpr = MPRester(api_key='clRGHmBDgp1xt9zA')
from pymatgen.entries.compatibility import MaterialsProjectCompatibility
mpc = MaterialsProjectCompatibility()
from pymatgen.analysis.phase_diagram import PhaseDiagram, PDPlotter

In [25]:
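for i in range(1, 2):  # later: range(len(data_1)) to loop over every record in data_1
    if 'input' in data_1[i]:
        struct = data_1[i]['input']['structure']
        # Fetch every MP entry in the chemical system spanned by this structure's elements
        entries = mpr.get_entries_in_chemsys(list(struct.composition.as_dict().keys()))
        entries_1 = mpc.process_entries(entries)
        # Build the hull from the corrected entries; note that `pd` shadows the pandas alias
        pd = PhaseDiagram(entries_1)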

In [26]:
plotter = PDPlotter(pd, show_unstable=True)
plotter.show()

ValueError: Only 1-4 components supported!

In [27]:
e_hull = [] #will eventually make this a dataframe
compositions = []
for e in entries_1:
    e_hull.append(pd.get_e_above_hull(e))
    compositions.append(e.composition.reduced_formula)
compositions

['Ca',
 'Ca',
 'Ca',
 'Ca',
 'Ca',
 'Ca',
 'Ca',
 'Ca',
 'Ca',
 'Fe',
 'Fe',
 'Fe',
 'Fe',
 'Fe',
 'Fe',
 'Fe',
 'Co',
 'Co',
 'Co',
 'Co',
 'Co',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'C',
 'N2',
 'N2',
 'N2',
 'N2',
 'N2',
 'N2',
 'N2',
 'N2',
 'N2',
 'N2',
 'N2',
 'N2',
 'CaC2',
 'CaC2',
 'CaC2',
 'CaC2',
 'CaC2',
 'CaC2',
 'CaC2',
 'CaC2',
 'CaC2',
 'CaC2',
 'CaC2',
 'CaC2',
 'CaC2',
 'CaC4',
 'CaC2',
 'Ca3N2',
 'CaN6',
 'Ca11N8',
 'Ca3N2',
 'Ca3N2',
 'CaN',
 'CaN2',
 'Ca3N2',
 'Ca2N',
 'Ca3N2',
 'Ca3N2',
 'Ca3N2',
 'Fe13Co3',
 'Fe3Co',
 'Fe11Co5',
 'Fe9Co7',
 'Fe15Co',
 'FeCo',
 'FeCo3',
 'Fe7Co',
 'Fe5Co3',
 'FeCo',
 'FeCo9',
 'Fe5C2',
 'Fe3C',
 'Fe5C2',
 'Fe7C3',
 'Fe3C',
 'Fe2C',
 'Fe3C',
 'Fe4C',
 'Fe5C2',
 'Fe2C',
 'Fe4C',
 'Fe4C',
 'Fe3C',
 'Fe7C3',
 'Fe3N',
 'Fe2N',
 '