# Crystal relaxation

This Notebook follows the workflow that the Interatomic Potentials Repository currently uses to identify and analyze bulk crystal structures.

**Quick Notes:**

- All input scripts take key-value pairs, where each key corresponds to one of the calculation's input parameters.

- Multiple calculations are prepared by specifying multiple values for the same key (on separate lines).

- The special "buildcombos" key accesses predefined functions that generate lists of input parameter values for certain sets of keys. See the documentation for more details.
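
As a rough illustration of the key-value convention in the notes above, the sketch below parses a script into a dictionary in which a key that appears on several lines collects a list of values. The function `parse_key_values` is hypothetical and written only for this example. It is not how `iprPy.input.parse` is implemented, it ignores `singularkeys` and `buildcombos`, and its handling of keys with no value (treated as unset) is an assumption.

```python
def parse_key_values(text):
    """Parse 'key value' lines into a dict of lists.

    Hypothetical helper for illustration only; not iprPy's parser.
    """
    params = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace
        line = line.split('#', 1)[0].strip()
        if not line:
            continue
        # The first token is the key; the rest of the line is the value
        parts = line.split(None, 1)
        if len(parts) == 1:
            # Assumption: a key with no value is left unset
            continue
        key, value = parts
        # Repeated keys accumulate multiple values
        params.setdefault(key, []).append(value)
    return params


example = """
# Commands and executables
lammps_command              lmp_mpi
mpi_command

reference_potential_name    1985--Foiles-S-M--Ni-Cu--LAMMPS--ipr1
reference_potential_name    2003--Mendelev-M-I--Fe-2--LAMMPS--ipr3
sizemults                   10 10 10
"""

parsed = parse_key_values(example)
for key, values in parsed.items():
    print(key, values)
```

In this sketch, `reference_potential_name` ends up with two values, which is the situation in which multiple calculations would be prepared.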

**Global workflow details:**

This Notebook uses:

- reference structures downloaded using the "1. Reference atomic structures" Notebook (optional).

- calculation_E_vs_r_scan records generated by the "2. Cohesive energy scans" Notebook.

The calculation_crystal_space_group records and the "unique_crystals.csv" file generated by this Notebook are used as inputs for many of the other Notebooks.

**Library imports**

In [1]:
# Standard Python libraries
from __future__ import (absolute_import, print_function,
                        division, unicode_literals)
import os

# http://www.numpy.org/
import numpy as np

from IPython.display import display, HTML

# https://pandas.pydata.org/
import pandas as pd

from DataModelDict import DataModelDict as DM

# https://github.com/usnistgov/atomman
import atomman.unitconvert as uc

# https://github.com/usnistgov/iprPy
import iprPy
print('iprPy version', iprPy.__version__)

iprPy version 0.8.3


## 0. Access database 

### Load database

In [2]:
database = iprPy.load_database('demo')

## 1. relax_box calculation

This calculation statically relaxes a given system by only adjusting the box dimensions to zero pressure without any internal relaxations, i.e. all atoms retain box-relative positions.

In [3]:
calculation = iprPy.load_calculation('relax_box')
run_directory = iprPy.load_run_directory('demo_1')

### Show calculation's allowed keys

In [4]:
print(calculation.allkeys)

['lammps_command', 'mpi_command', 'length_unit', 'pressure_unit', 'energy_unit', 'force_unit', 'potential_file', 'potential_content', 'potential_dir', 'load_file', 'load_content', 'load_style', 'family', 'load_options', 'symbols', 'box_parameters', 'a_uvw', 'b_uvw', 'c_uvw', 'atomshift', 'sizemults', 'pressure_xx', 'pressure_yy', 'pressure_zz', 'strainrange']


### Write input script

In [5]:
input_script = """
# Commands and executables
lammps_command              lmp_mpi
mpi_command                 

# Build load information based on reference structures
buildcombos                 atomicreference load_file reference

# Specify reference buildcombos limiters (only build for potentials listed)
reference_potential_name    1985--Foiles-S-M--Ni-Cu--LAMMPS--ipr1
reference_potential_name    2003--Mendelev-M-I--Fe-2--LAMMPS--ipr3

# Build load information from E_vs_r_scan results
buildcombos                 atomicparent load_file parent

# Specify parent buildcombos terms (parent record's style and the load_key to access)
parent_record               calculation_E_vs_r_scan              
parent_load_key             minimum-atomic-system

# System manipulations
a_uvw                      
b_uvw                      
c_uvw                    
atomshift                   
sizemults                   10 10 10

# Units that input/output values are in
length_unit                 
pressure_unit               
energy_unit                 
force_unit                  

# Run parameters
strainrange                 1e-6
"""
with open('input_script.in', 'w') as f:
    f.write(input_script)

### Prepare calculations

In [6]:
with open('input_script.in') as f:
    input_dict = iprPy.input.parse(f, singularkeys=calculation.singularkeys)
    
database.prepare(run_directory, calculation, **input_dict)

In [7]:
database.check_records(calculation.record_style)

In database style local at C:\Users\lmh1\Documents\calculations\ipr\demo :
- 139 of style calculation_relax_box
 - 91 are complete
 - 25 still to run
 - 23 issued errors


### Run calculations

In [8]:
database.runner(run_directory)

Runner started with pid 13976
No simulations left to run


In [9]:
results_df = database.get_records_df(style=calculation.record_style)
error_df = results_df[results_df.status=='error']
print(len(error_df), 'calculations issued errors:')
errors = []
for error in error_df.error:
    lines = error.splitlines()
    err = ''
    for i in range(len(lines)-1, -1, -1):
        if 'Error:' in lines[i]:
            err = '\n'.join(lines[i:-1])
            break
        if i == 0:
            err = error
    errors.append(err)
for error in np.unique(errors):
    print(error)

33 calculations issued errors:
b'Traceback (most recent call last):
  File "calc_relax_box.py", line 458, in <module>
    main(*sys.argv[1:])
  File "calc_relax_box.py", line 54, in main
    strainrange = input_dict[\'strainrange\'])
  File "calc_relax_box.py", line 145, in relax_box
    strainrange=strainrange, cycle=cycle)
  File "calc_relax_box.py", line 346, in calc_cij
    C = am.ElasticConstants(Cij=cij)
  File "c:\\users\\lmh1\\documents\\python-packages\\atomman\\atomman\\core\\ElasticConstants.py", line 59, in __init__
    self.Cij = kwargs[\'Cij\']
  File "c:\\users\\lmh1\\documents\\python-packages\\atomman\\atomman\\core\\ElasticConstants.py", line 115, in Cij
    assert value.max() > 0.0, \'Cij values not valid\'
AssertionError: Cij values not valid
'
b'Traceback (most recent call last):
  File "calc_relax_box.py", line 458, in <module>
    main(*sys.argv[1:])
  File "calc_relax_box.py", line 54, in main
    strainrange = input_dict[\'strainrange\'])
  F

## 2. relax_dynamic calculation

This calculation dynamically relaxes a given system for a specified number of MD integration steps at a specified temperature, pressure, etc. Here, we only perform 0 K relaxations.

In [10]:
calculation = iprPy.load_calculation('relax_dynamic')
run_directory = iprPy.load_run_directory('demo_4')

### Write input script

In [11]:
input_script = """
# Commands and executables
lammps_command              lmp_mpi
mpi_command                 c:\\Program Files\\MPICH2\\bin\\mpiexec -localonly 4

# Build load information based on reference structures
buildcombos                 atomicreference load_file reference

# Specify reference buildcombos limiters (only build for potentials listed)
reference_potential_name    1985--Foiles-S-M--Ni-Cu--LAMMPS--ipr1
reference_potential_name    2003--Mendelev-M-I--Fe-2--LAMMPS--ipr3

# Build load information from E_vs_r_scan results
buildcombos                 atomicparent load_file parent

# Specify parent buildcombos terms (parent record's style and the load_key to access)
parent_record               calculation_E_vs_r_scan              
parent_load_key             minimum-atomic-system

# System manipulations
a_uvw                      
b_uvw                      
c_uvw       
atomshift                   
sizemults                   10 10 10

# Units that input/output values are in
length_unit                 
pressure_unit               
energy_unit                 
force_unit                  

# Run parameters
temperature                 0.0
pressure_xx                 
pressure_yy                 
pressure_zz                 
pressure_xy                 
pressure_xz                 
pressure_yz                 
integrator                  nph+l
thermosteps                 1000
dumpsteps                   
runsteps                    10000
equilsteps                  0
randomseed                  
"""
with open('input_script.in', 'w') as f:
    f.write(input_script)

### Prepare calculations

In [12]:
with open('input_script.in') as f:
    input_dict = iprPy.input.parse(f, singularkeys=calculation.singularkeys)
    
database.prepare(run_directory, calculation, **input_dict)

In [13]:
database.check_records(calculation.record_style)

In database style local at C:\Users\lmh1\Documents\calculations\ipr\demo :
- 139 of style calculation_relax_dynamic
 - 0 are complete
 - 139 still to run
 - 0 issued errors


### Run calculations

In [14]:
database.runner(run_directory)

Runner started with pid 13976
No simulations left to run


In [15]:
results_df = database.get_records_df(style=calculation.record_style)
error_df = results_df[results_df.status=='error']
print(len(error_df), 'calculations issued errors:')
errors = []
for error in error_df.error:
    lines = error.splitlines()
    err = ''
    for i in range(len(lines)-1, -1, -1):
        if 'Error:' in lines[i]:
            err = '\n'.join(lines[i:-1])
            break
        if i == 0:
            err = error
    errors.append(err)
for error in np.unique(errors):
    print(error)

0 calculations issued errors:


## 3. relax_static calculation

This calculation statically relaxes a given system using energy minimizations combined with box dimension relaxations.  Here, we pass in results from both the E_vs_r_scan calculation and the relax_dynamic calculation.

In [16]:
calculation = iprPy.load_calculation('relax_static')
run_directory = iprPy.load_run_directory('demo_1')

### Write input script

In [17]:
input_script = """
# Commands and executables
lammps_command              lmp_mpi
mpi_command                 

# Build load information based on reference structures
buildcombos                 atomicreference load_file reference

# Specify reference buildcombos limiters (only build for potentials listed)
reference_potential_name    1985--Foiles-S-M--Ni-Cu--LAMMPS--ipr1
reference_potential_name    2003--Mendelev-M-I--Fe-2--LAMMPS--ipr3

# Build load information from E_vs_r_scan results
buildcombos                 atomicparent load_file parent

# Specify parent buildcombos terms (parent record's style and the load_key to access)
parent_record               calculation_E_vs_r_scan              
parent_load_key             minimum-atomic-system

# System manipulations
a_uvw                      
b_uvw                      
c_uvw                 
atomshift                   
sizemults                   10 10 10

# Units that input/output values are in
length_unit                 
pressure_unit               
energy_unit                 
force_unit                  

# Run parameters
energytolerance             0.0
forcetolerance              1e-10 eV/angstrom
maxiterations               10000
maxevaluations              100000
maxatommotion               0.01 angstrom
maxcycles                   100
cycletolerance              1e-10
"""
with open('input_script.in', 'w') as f:
    f.write(input_script)

In [20]:
input_script = """
# Commands and executables
lammps_command              lmp_mpi
mpi_command                 

# Build load information from relax_dynamic results
buildcombos                 atomicarchive load_file archive

# Specify archive parent buildcombos terms (parent record's style and the load_key to access)
archive_record              calculation_relax_dynamic
archive_load_key            final-system

# System manipulations
a_uvw                      
b_uvw                      
c_uvw                 
atomshift                   
sizemults                   1 1 1

# Units that input/output values are in
length_unit                 
pressure_unit               
energy_unit                 
force_unit                  

# Run parameters
energytolerance             0.0
forcetolerance              1e-10 eV/angstrom
maxiterations               10000
maxevaluations              100000
maxatommotion               0.01 angstrom
maxcycles                   100
cycletolerance              1e-10
"""
with open('input_script.in', 'w') as f:
    f.write(input_script)

### Prepare calculations

In [21]:
with open('input_script.in') as f:
    input_dict = iprPy.input.parse(f, singularkeys=calculation.singularkeys)
    
database.prepare(run_directory, calculation, **input_dict)

In [22]:
database.check_records(calculation.record_style)

In database style local at C:\Users\lmh1\Documents\calculations\ipr\demo :
- 278 of style calculation_relax_static
 - 0 are complete
 - 278 still to run
 - 0 issued errors


### Run calculations

In [23]:
database.runner(run_directory)

Runner started with pid 13976
No simulations left to run


In [24]:
results_df = database.get_records_df(style=calculation.record_style)
error_df = results_df[results_df.status=='error']
print(len(error_df), 'calculations issued errors:')
errors = []
for error in error_df.error:
    lines = error.splitlines()
    err = ''
    for i in range(len(lines)-1, -1, -1):
        if 'Error:' in lines[i]:
            err = '\n'.join(lines[i:-1])
            break
        if i == 0:
            err = error
    errors.append(err)
for error in np.unique(errors):
    print(error)

54 calculations issued errors:
FileNotFoundError: [Errno 2] No such file or directory: \'10000.dump\'
ValueError: Filtering failed: 12000atoms expected, 11999 found
ValueError: Filtering failed: 16000atoms expected, 15999 found
ValueError: Filtering failed: 2000atoms expected, 1182 found
ValueError: Filtering failed: 3000atoms expected, 2528 found
ValueError: Filtering failed: 8000atoms expected, 7997 found
ValueError: Filtering failed: 8000atoms expected, 7998 found
ValueError: Filtering failed: 8000atoms expected, 7999 found


## 4. crystal_space_group calculation

This calculation analyzes the space group of a given system.  Here, this is used to determine if the bulk system's structure has transformed.
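
The transformation test used later in the analysis section compares the space group number and Pearson symbol of a relaxed structure against those of its reference. A minimal sketch of that comparison follows. The function `has_transformed` and the dictionary layout are assumptions made for this example, and the fcc and bcc values are illustrative rather than taken from the database.

```python
def has_transformed(reference, relaxed):
    """Return True if the relaxed structure no longer matches its reference.

    Hypothetical helper: both arguments are dicts with the keys
    'spacegroup_number' and 'pearson_symbol'.
    """
    same = (reference['spacegroup_number'] == relaxed['spacegroup_number']
            and reference['pearson_symbol'] == relaxed['pearson_symbol'])
    return not same


# Illustrative values only: fcc (space group 225, cF4) and bcc (229, cI2)
fcc = {'spacegroup_number': 225, 'pearson_symbol': 'cF4'}
bcc = {'spacegroup_number': 229, 'pearson_symbol': 'cI2'}

print(has_transformed(fcc, fcc))  # False
print(has_transformed(fcc, bcc))  # True
```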

In [25]:
calculation = iprPy.load_calculation('crystal_space_group')
run_directory = iprPy.load_run_directory('demo_1')

In [26]:
print(calculation.allkeys)

['length_unit', 'pressure_unit', 'energy_unit', 'force_unit', 'load_file', 'load_content', 'load_style', 'family', 'load_options', 'symbols', 'box_parameters', 'symmetryprecision', 'primitivecell', 'idealcell']


### Write input script

In [27]:
input_script = """

# Build load information based on prototype records
buildcombos                 crystalprototype load_file

# Build load information based on reference structures
buildcombos                 atomicreference load_file ref

# Specify reference buildcombos limiters (only build for element sets listed)
ref_elements                Fe
ref_elements                Cu
ref_elements                Ni
ref_elements                Cu Ni

# Build load information from relax_static results
buildcombos                 atomicarchive load_file relax_static

# Specify archive parent buildcombos terms (parent record's style and the load_key to access)
relax_static_record         calculation_relax_static
relax_static_load_key       final-system

# Build load information from relax_box results
buildcombos                 atomicarchive load_file relax_box

# Specify archive parent buildcombos terms (parent record's style and the load_key to access)
relax_box_record            calculation_relax_box
relax_box_load_key          final-system

# Units that input/output values are in
length_unit                 
pressure_unit               
energy_unit                 
force_unit                  

# Run parameters
symmetryprecision           
primitivecell               
idealcell                   
"""
with open('input_script.in', 'w') as f:
    f.write(input_script)

### Prepare calculations

In [28]:
with open('input_script.in') as f:
    input_dict = iprPy.input.parse(f, singularkeys=calculation.singularkeys)
    
database.prepare(run_directory, calculation, **input_dict)

In [29]:
database.check_records(calculation.record_style)

In database style local at C:\Users\lmh1\Documents\calculations\ipr\demo :
- 445 of style calculation_crystal_space_group
 - 115 are complete
 - 330 still to run
 - 0 issued errors


### Run calculations

In [30]:
database.runner(run_directory)

Runner started with pid 13976
No simulations left to run


In [31]:
results_df = database.get_records_df(style=calculation.record_style)
error_df = results_df[results_df.status=='error']
print(len(error_df), 'calculations issued errors:')
errors = []
for error in error_df.error:
    lines = error.splitlines()
    err = ''
    for i in range(len(lines)-1, -1, -1):
        if 'Error:' in lines[i]:
            err = '\n'.join(lines[i:-1])
            break
        if i == 0:
            err = error
    errors.append(err)
for error in np.unique(errors):
    print(error)

0 calculations issued errors:


## 5. Calculation analysis

In [32]:
crystal_match_file = 'reference_prototype_match.csv'

### Load crystal_match_file

In [33]:
ref_proto_match = pd.read_csv(crystal_match_file)

### Retrieve finished calculation results

In [34]:
spg_records = database.get_records_df(style='calculation_crystal_space_group', full=True, flat=False, status='finished')

In [35]:
# Get key lists for relax_* calculations
# (if no records exist, the DataFrame has no 'key' column)
raw_df = database.get_records_df(style='calculation_relax_box', full=False, flat=True)
try:
    box_keys = raw_df.key.tolist()
except AttributeError:
    box_keys = []

raw_df = database.get_records_df(style='calculation_relax_static', full=False, flat=True)
try:
    static_keys = raw_df.key.tolist()
except AttributeError:
    static_keys = []

raw_df = database.get_records_df(style='calculation_relax_dynamic', full=False, flat=True)
try:
    dynamic_keys = raw_df.key.tolist()
except AttributeError:
    dynamic_keys = []

In [36]:
pot_records = database.get_records_df(style='potential_LAMMPS')

### Identify compositions

In [37]:
iprPy.analysis.assign_composition(spg_records, database)

### Split all spg records into references, prototypes, and relaxation calculations

In [38]:
spg_records['record_type'] = 'calc'
spg_records.loc[(spg_records.load_file == spg_records.family + '.poscar'), 'record_type'] = 'reference'
spg_records.loc[(spg_records.load_file == spg_records.family + '.json'), 'record_type'] = 'prototype'

prototype_records = spg_records[spg_records.record_type == 'prototype']
reference_records = spg_records[spg_records.record_type == 'reference']
family_records = spg_records[(spg_records.record_type == 'prototype') | (spg_records.record_type == 'reference')]

calc_records = spg_records[spg_records.record_type == 'calc'].reset_index(drop=True)

In [39]:
calc_records.keys()

Index(['error', 'family', 'idealcell', 'iprPy_version', 'key', 'load_file',
       'load_options', 'load_style', 'pearson_symbol', 'primitivecell',
       'script', 'spacegroup_Schoenflies', 'spacegroup_international',
       'spacegroup_number', 'status', 'symbols', 'symmetryprecision', 'ucell',
       'wykoff_fingerprint', 'composition', 'record_type'],
      dtype='object')

### Analyze calculation results

In [40]:
results = []
for series in calc_records.itertuples():
    results_dict = {}
    
    # Copy over values in series    
    results_dict['calc_key'] = series.key
    results_dict['composition'] = series.composition
    results_dict['family'] = series.family
    results_dict['a'] = series.ucell.box.a
    results_dict['b'] = series.ucell.box.b
    results_dict['c'] = series.ucell.box.c
    results_dict['alpha'] = series.ucell.box.alpha
    results_dict['beta'] = series.ucell.box.beta
    results_dict['gamma'] = series.ucell.box.gamma
    
    # Identify prototype
    try:
        results_dict['prototype'] = ref_proto_match[ref_proto_match.reference==series.family].prototype.values[0]
    except:
        results_dict['prototype'] = series.family
    else:
        if pd.isnull(results_dict['prototype']):
            results_dict['prototype'] = series.family
    
    # Check if structure has transformed relative to reference
    family_series = family_records[family_records.family == series.family].iloc[0]
    results_dict['transformed'] = (not (family_series.spacegroup_number == series.spacegroup_number
                                   and family_series.pearson_symbol == series.pearson_symbol))
    
    # Extract info from parent calculations
    for parent in database.get_parent_records(name=series.key):
        parent_dict = parent.todict()

        if parent_dict['key'] in box_keys:
            results_dict['method'] = 'box'
            results_dict['E_coh'] = parent_dict['E_cohesive']
            results_dict['potential_LAMMPS_key'] = parent_dict['potential_LAMMPS_key']
            continue

        elif parent_dict['key'] in dynamic_keys:
            results_dict['method'] = 'dynamic'
            continue

        elif parent_dict['key'] in static_keys:
            results_dict['method'] = 'static'
            results_dict['E_coh'] = parent_dict['E_cohesive']
            results_dict['potential_LAMMPS_key'] = parent_dict['potential_LAMMPS_key']
    
    pot_record = pot_records[pot_records.key == results_dict['potential_LAMMPS_key']].iloc[0]
    results_dict['potential_id'] = pot_record.pot_id
    results_dict['potential_key'] = pot_record.pot_key
    results_dict['potential_LAMMPS_id'] = pot_record.id
    
    results.append(results_dict)
columns = ['calc_key', 'potential_LAMMPS_key', 'potential_LAMMPS_id', 'potential_key', 'potential_id',
           'composition', 'prototype', 'family', 'method', 'transformed',
           'E_coh', 'a', 'b', 'c', 'alpha', 'beta', 'gamma']
results = pd.DataFrame(results, columns=columns)

### Save raw crystal data per crystal

In [41]:
# Settings
outputpath = 'C:/Users/lmh1/Documents/demo_results'
savecolumns = ['calc_key',
               'prototype', 'family', 'method', 'transformed', 
               'E_coh', 'a', 'b', 'c', 'alpha', 'beta', 'gamma']

In [42]:
for implementation_key in np.unique(results.potential_LAMMPS_key):
    imp_results = results[results.potential_LAMMPS_key == implementation_key]
    potential = imp_results.iloc[0].potential_id
    implementation = imp_results.iloc[0].potential_LAMMPS_id
    
    contentpath = os.path.join(outputpath, potential, implementation)
    if not os.path.isdir(contentpath):
        os.makedirs(contentpath)
    
    for composition in np.unique(imp_results.composition):
        comp_results = imp_results[imp_results.composition == composition].sort_values('E_coh')
        fstem = 'crystal.' + composition
        
        comp_results[savecolumns].to_csv(os.path.join(contentpath, fstem + '.csv'), index=False)

In [43]:
results

,calc_key,potential_LAMMPS_key,potential_LAMMPS_id,potential_key,potential_id,composition,prototype,family,method,transformed,E_coh,a,b,c,alpha,beta,gamma
0,0061c5d2-c4fa-44f2-816d-fecbf963d122,062d2ba7-3903-40ae-a772-daa471d107c6,1985--Foiles-S-M--Ni-Cu--LAMMPS--ipr1,301f04ce-9082-4542-8590-489300cd19e8,1985--Foiles-S-M--Ni-Cu,Ni,A7--alpha-As,A7--alpha-As,dynamic,True,-4.450000,3.520000,3.520000,3.520000,90.000000,90.000000,90.000000
1,01da758f-6fa6-4f5f-b630-ef6316e29ec2,062d2ba7-3903-40ae-a772-daa471d107c6,1985--Foiles-S-M--Ni-Cu--LAMMPS--ipr1,301f04ce-9082-4542-8590-489300cd19e8,1985--Foiles-S-M--Ni-Cu,CuNi,oqmd-1225682,oqmd-1225682,dynamic,True,-2.985313,4.171929,7.221554,4.825708,90.000000,93.601846,90.000000
2,0331b775-12bb-4fca-a99f-db7ce99af7b3,274c2811-3b80-4aa6-82ca-d2bdfecbd442,2003--Mendelev-M-I--Fe-2--LAMMPS--ipr3,06bf7ccd-ea6a-4744-bfbc-adefa5bcdd60,2003--Mendelev-M-I-Han-S-Srolovitz-D-J-et-al--...,Fe,oqmd-1214792,oqmd-1214792,dynamic,False,-4.034054,8.848551,8.848551,8.848551,90.000000,90.000000,90.000000
3,035fecdc-0941-43fd-8fba-eb4deb93c801,274c2811-3b80-4aa6-82ca-d2bdfecbd442,2003--Mendelev-M-I--Fe-2--LAMMPS--ipr3,06bf7ccd-ea6a-4744-bfbc-adefa5bcdd60,2003--Mendelev-M-I-Han-S-Srolovitz-D-J-et-al--...,Fe,A1--Cu--fcc,oqmd-7505,static,False,-4.002045,3.658364,3.658364,3.658364,90.000000,90.000000,90.000000
4,04892de3-cdfe-415b-beb0-8cd89cf8e8ad,062d2ba7-3903-40ae-a772-daa471d107c6,1985--Foiles-S-M--Ni-Cu--LAMMPS--ipr1,301f04ce-9082-4542-8590-489300cd19e8,1985--Foiles-S-M--Ni-Cu,Ni,A1--Cu--fcc,oqmd-676148,static,False,-4.450000,3.520000,3.520000,3.520000,90.000000,90.000000,90.000000
5,068907d3-2b7d-4c74-9e7f-923f80920930,062d2ba7-3903-40ae-a772-daa471d107c6,1985--Foiles-S-M--Ni-Cu--LAMMPS--ipr1,301f04ce-9082-4542-8590-489300cd19e8,1985--Foiles-S-M--Ni-Cu,Cu,A4--C--dc,oqmd-1215500,box,False,-2.502544,5.361341,5.361341,5.361341,90.000000,90.000000,90.000000
6,084bdc7f-39ed-4a1e-ad2d-d3f1592a7f76,274c2811-3b80-4aa6-82ca-d2bdfecbd442,2003--Mendelev-M-I--Fe-2--LAMMPS--ipr3,06bf7ccd-ea6a-4744-bfbc-adefa5bcdd60,2003--Mendelev-M-I-Han-S-Srolovitz-D-J-et-al--...,Fe,oqmd-1215950,oqmd-1215950,dynamic,True,-4.122435,2.855325,2.855325,2.855325,90.000000,90.000000,90.000000
7,094b5e29-cd22-41b6-8699-5c7adfab594c,274c2811-3b80-4aa6-82ca-d2bdfecbd442,2003--Mendelev-M-I--Fe-2--LAMMPS--ipr3,06bf7ccd-ea6a-4744-bfbc-adefa5bcdd60,2003--Mendelev-M-I-Han-S-Srolovitz-D-J-et-al--...,Fe,A2--W--bcc,oqmd-8568,static,False,-4.122435,2.855325,2.855325,2.855325,90.000000,90.000000,90.000000
8,0a25ad6f-13db-4d88-b36d-8d9af349f957,062d2ba7-3903-40ae-a772-daa471d107c6,1985--Foiles-S-M--Ni-Cu--LAMMPS--ipr1,301f04ce-9082-4542-8590-489300cd19e8,1985--Foiles-S-M--Ni-Cu,Ni,oqmd-1215975,oqmd-1215975,static,True,-3.418124,6.966523,4.011898,4.453368,90.000000,97.128239,90.000000
9,0cc38446-7b41-4632-af4b-9970809c5296,062d2ba7-3903-40ae-a772-daa471d107c6,1985--Foiles-S-M--Ni-Cu--LAMMPS--ipr1,301f04ce-9082-4542-8590-489300cd19e8,1985--Foiles-S-M--Ni-Cu,Cu,A1--Cu--fcc,mp-30,box,False,-3.540000,3.615000,3.615000,3.615000,90.000000,90.000000,90.000000


### Identify unique crystals

In [44]:
# Create empty "unique" dataframe
unique = pd.DataFrame(columns=results.columns)

# Loop over all potential implementations
for implementation_key in np.unique(results.potential_LAMMPS_key):
    imp_results = results[results.potential_LAMMPS_key == implementation_key]
    
    # Loop over the compositions found for this implementation
    for composition in np.unique(imp_results.composition):
        comp_unique = pd.DataFrame(columns=results.columns)
        comp_results = imp_results[imp_results.composition == composition]
        
        # Loop over all prototypes
        for prototype in np.unique(comp_results.prototype):
            proto_results = comp_results[comp_results.prototype == prototype]
            
            # Loop over calculation methods from most robust to least
            for method in ['dynamic', 'static', 'box']:
                
                # First try matching results where prototype == family
                for i, series in proto_results[(proto_results.prototype == proto_results.family)
                                              &(proto_results.method == method)
                                              &(~proto_results.transformed)].iterrows():
                    try:
                        matches = comp_unique[(np.isclose(comp_unique.E_coh, series.E_coh))
                                             &(np.isclose(comp_unique.a, series.a))
                                             &(np.isclose(comp_unique.b, series.b))
                                             &(np.isclose(comp_unique.c, series.c))
                                             &(np.isclose(comp_unique.alpha, series.alpha))
                                             &(np.isclose(comp_unique.beta, series.beta))
                                             &(np.isclose(comp_unique.gamma, series.gamma))]
                    except:
                        matches = []
                    if len(matches) == 0:
                        comp_unique = comp_unique.append(series)
                        
                # Next try matching results where prototype != family
                for i, series in proto_results[(proto_results.prototype != proto_results.family)
                                              &(proto_results.method == method)
                                              &(~proto_results.transformed)].iterrows():
                    try:
                        matches = comp_unique[(np.isclose(comp_unique.E_coh, series.E_coh))
                                             &(np.isclose(comp_unique.a, series.a))
                                             &(np.isclose(comp_unique.b, series.b))
                                             &(np.isclose(comp_unique.c, series.c))
                                             &(np.isclose(comp_unique.alpha, series.alpha))
                                             &(np.isclose(comp_unique.beta, series.beta))
                                             &(np.isclose(comp_unique.gamma, series.gamma))]
                    except:
                        matches = []
                    if len(matches) == 0:
                        comp_unique = comp_unique.append(series)
                        
        unique = unique.append(comp_unique)
unique.to_csv('unique_crystals.csv', index=False)
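
The selection logic in the cell above can be summarized with a simplified, pure-Python sketch for the results of a single potential implementation and composition. It is not the notebook's pandas code: the function names, the dictionary rows, and the example values are assumptions made for illustration, and `math.isclose` with explicit tolerances stands in for `np.isclose`, whose comparison rule differs slightly.

```python
import math

# Fields compared when deciding whether two crystals are the same
FIELDS = ('E_coh', 'a', 'b', 'c', 'alpha', 'beta', 'gamma')
# Methods ordered from most robust to least, as in the cell above
METHODS = ('dynamic', 'static', 'box')


def is_duplicate(row, kept, rel_tol=1e-5, abs_tol=1e-8):
    """True if every compared field of row is close to that of kept."""
    return all(math.isclose(row[f], kept[f], rel_tol=rel_tol, abs_tol=abs_tol)
               for f in FIELDS)


def unique_crystals(rows):
    """Keep one untransformed row per distinct crystal (hypothetical helper)."""
    unique = []
    for prototype in sorted({row['prototype'] for row in rows}):
        for method in METHODS:
            # Rows with prototype == family are considered first
            for same_family in (True, False):
                for row in rows:
                    if (row['prototype'] != prototype
                            or row['method'] != method
                            or row['transformed']
                            or (row['prototype'] == row['family']) != same_family):
                        continue
                    if not any(is_duplicate(row, kept) for kept in unique):
                        unique.append(row)
    return unique


def make_row(prototype, family, method, transformed, E_coh, a):
    """Build a cubic example row (made-up data for this sketch)."""
    return dict(prototype=prototype, family=family, method=method,
                transformed=transformed, E_coh=E_coh, a=a, b=a, c=a,
                alpha=90.0, beta=90.0, gamma=90.0)


rows = [
    make_row('A1--Cu--fcc', 'A1--Cu--fcc', 'static', False, -3.54, 3.615),
    make_row('A1--Cu--fcc', 'mp-30', 'box', False, -3.54, 3.615),
    make_row('A4--C--dc', 'oqmd-1215500', 'box', False, -2.502544, 5.361341),
    make_row('A2--W--bcc', 'A2--W--bcc', 'dynamic', True, -3.0, 2.9),
]

kept = unique_crystals(rows)
print([row['family'] for row in kept])
```

Here the second fcc row duplicates the first and the transformed row is excluded, so two rows remain.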

### Add info to PotentialProperties records

This generates the XML records that the Interatomic Potentials Repository uses to automatically build web content (done elsewhere).

In [45]:
for implementation_key in np.unique(results.potential_LAMMPS_key):
    imp_results = results[results.potential_LAMMPS_key == implementation_key]
    imp_unique = unique[unique.potential_LAMMPS_key == implementation_key]
    potential_key = imp_results.iloc[0].potential_key
    potential_id = imp_results.iloc[0].potential_id
    implementation_id = imp_results.iloc[0].potential_LAMMPS_id
    
    record_name = 'properties.' + implementation_id
    try:
        record = database.get_record(name=record_name, style='PotentialProperties')
    except:
        new = True
        content = DM()
        content['per-potential-properties'] = DM()
        content['per-potential-properties']['potential'] = DM()
        content['per-potential-properties']['potential']['key'] = potential_key
        content['per-potential-properties']['potential']['id'] = potential_id
        content['per-potential-properties']['implementation'] = DM()
        content['per-potential-properties']['implementation']['key'] = implementation_key
        content['per-potential-properties']['implementation']['id'] = implementation_id
    else:
        content = DM(record.content)
        new = False
    
    content['per-potential-properties']['crystal-structure'] = model = DM()
    
    # Build prototype-ref-set elements
    for composition in np.unique(imp_results.composition):
        comp_results = imp_results[imp_results.composition == composition]
        for prototype in np.unique(comp_results.prototype):
            proto_results = comp_results[comp_results.prototype == prototype]
            refs = []
            for family in np.unique(proto_results.family):
                if family != prototype:
                    refs.append(family)
            if len(refs) > 0:
                proto_ref_set = DM()
                proto_ref_set['composition'] = composition 
                proto_ref_set['prototype'] = prototype
                for ref in refs:
                    proto_ref_set.append('ref', ref)
                model.append('prototype-ref-set', proto_ref_set)    
    
    # Build crystal elements
    for series in imp_unique.sort_values(['composition', 'E_coh']).itertuples():
        crystal = DM()
        crystal['composition'] = series.composition
        crystal['prototype'] = series.prototype
        crystal['method'] = series.method
        crystal['cohesive-energy'] = DM([('value', '%#.4f'%series.E_coh), ('unit', 'eV')])
        crystal['a'] = DM([('value', '%#.4f'%series.a), ('unit', 'angstrom')])
        crystal['b'] = DM([('value', '%#.4f'%series.b), ('unit', 'angstrom')])
        crystal['c'] = DM([('value', '%#.4f'%series.c), ('unit', 'angstrom')])
        crystal['alpha'] = DM([('value', '%#.1f'%series.alpha), ('unit', 'degree')])
        crystal['beta'] = DM([('value', '%#.1f'%series.beta), ('unit', 'degree')])
        crystal['gamma'] = DM([('value', '%#.1f'%series.gamma), ('unit', 'degree')])
        model.append('crystal', crystal)
        
    if new:
        database.add_record(name=record_name, style='PotentialProperties', content=content.xml())
    else:
        database.update_record(record=record, content=content.xml())