# Pure Phase SymPy Code Generation

A simple notebook to generate the Phase interface for pure phases, which we treat here as solution phases with a single endmember, so that the total Gibbs free energy for a pure phase is just

$$
  G = n\mu(T,P)
$$

where $\mu$ is the chemical potential of the single endmember at $T,P$, and $n$ is the number of moles of the endmember.
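
As a short worked consequence of this form (standard thermodynamic identities, not something specific to this notebook), the entropy and volume of a pure phase follow from derivatives of $\mu$ alone and scale linearly with $n$:

$$
  S = -\left(\frac{\partial G}{\partial T}\right)_{P,n} = -n\,\frac{\partial \mu}{\partial T},
  \qquad
  V = \left(\frac{\partial G}{\partial P}\right)_{T,n} = n\,\frac{\partial \mu}{\partial P}
$$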

In [None]:
import os,sys
import pandas as pd
import numpy as np
import sympy as sym
import hashlib
import time
sym.init_printing()

import molmass as mm

Required ENKI packages

In [None]:
from thermocodegen.coder import coder

### let's set up some directory names for clarity

In [None]:
HOME_DIR = os.path.abspath(os.curdir)
SPUD_DIR = HOME_DIR+'/../phases'

# create the output directory if it does not already exist
os.makedirs(SPUD_DIR, exist_ok=True)

Set a reference string for this Notebook

In [None]:
reference = 'Thermocodegen-v0.6/share/thermocodegen/examples/Systems/fo_h20/notebooks/Generate_phases.ipynb'

## Pure Phase Coder model
This notebook uses `SimpleSolnModel` from `coder` to wrap a single endmember
as a phase. For example, we will create an Olivine phase that is pure Forsterite, using
`Forsterite_berman` as the endmember.

## Number of solution components
A pure phase has only one component

In [None]:
c = 1

## Create a simple solution model
... with the specified number of endmember thermodynamic components

In [None]:
model = coder.SimpleSolnModel.from_type(nc=c)

## Retrieve primary compositional variables
- $n$ is a vector of mole numbers of each component
- $n_T$ is the total number of moles in the solution

### and construct a derived mole fraction variable
- $X$ is a vector of mole fractions of components in the system

In [None]:
n = model.n
nT = model.nT
X = n/nT
n, nT, X
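
For a pure phase the mole fraction vector is trivially $X = [1]$, since $n_T = n$. The following is a minimal numeric sketch of the same definition in plain Python; `mole_fractions` is a hypothetical helper used only for illustration and is not part of `coder`.

```python
def mole_fractions(moles):
    """Return X_i = n_i / n_T for a list of mole numbers."""
    total = sum(moles)  # n_T, the total number of moles
    if total <= 0:
        raise ValueError("total moles must be positive")
    return [n_i / total for n_i in moles]

pure = mole_fractions([2.5])          # single endmember: X is always [1.0]
binary = mole_fractions([1.0, 3.0])   # two components, for comparison
```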

## Retrieve the temperature, pressure, and standard state chemical potentials
- $T$ is temperature in K
- $P$ is pressure in bars
- $\mu$ is the vector of standard state chemical potentials in J/mol

In [None]:
T = model.get_symbol_for_t()
P = model.get_symbol_for_p()
mu = model.mu
T,P,mu

## Define the standard state contribution to solution properties

In [None]:
G_ss = (n.transpose()*mu)[0]
G_ss

## Define the Gibbs free energy of the Phase

In [None]:
G = G_ss
G
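
To see what this expression amounts to for one component, here is a standalone SymPy sketch. It assumes stand-in symbols (`n0`, `mu0`) in place of the objects returned by `model.n` and `model.mu`, so it only mirrors the algebra and does not use the `coder` API.

```python
import sympy as sym

# Stand-ins for the coder symbols (assumed names, for illustration only)
n0, T, P = sym.symbols('n_0 T P', positive=True)
mu0 = sym.Function('mu_0')(T, P)   # chemical potential of the single endmember

n = sym.Matrix([n0])               # mole-number vector with one entry
mu = sym.Matrix([mu0])             # chemical-potential vector with one entry

# n^T * mu is a 1x1 matrix; [0] extracts the scalar Gibbs free energy
G = (n.transpose() * mu)[0]

# Differentiating with respect to n recovers the chemical potential
dG_dn = sym.diff(G, n0)
```

With a single component the product reduces to $n_0\,\mu_0(T,P)$, and its derivative with respect to $n_0$ is $\mu_0(T,P)$, as expected from $G = n\mu(T,P)$.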

## Add the Gibbs free energy of solution to the model

In [None]:
model.add_potential_to_model('G',G)

### let's inspect the dictionary and unset parameters

In [None]:
model.model_dict

## Create dataframe for selected Pure phases

Here we will consider:
* an Olivine phase that is pure Forsterite
* a serpentine phase that is pure Chrysotile (named `Chrysotile` in the phase dictionary below)
* a pure phase Brucite
* a pure phase Water

In [None]:
pure_phase_info = pd.read_csv('data/thermoengine_pure_phases.csv')
soln_phase_info = pd.read_csv('data/thermoengine_soln_phases.csv')

In [None]:
pure_phase_info.head()

In [None]:
pure_phase_info[pure_phase_info['Name'].isin(['Serpentine'])]

In [None]:
soln_phase_info[soln_phase_info['Name'].isin(['Serpentine'])]
soln_phase_info

### add a useful little function for extracting a field for a given phase 

Looks in both pure phases and solution phases and does a bit of error checking

In [None]:
def get_field(name,field):
    """Return `field` for phase `name`, checking pure phases first, then solution phases."""
    value = None
    try:
        value = pure_phase_info.loc[pure_phase_info['Name']==name, field].values[0]
    except IndexError:
        # not a pure phase: fall through to the solution phases
        pass
    if value is not None:
        return value
    try:
        value = soln_phase_info.loc[soln_phase_info['Name']==name, field].values[0]
    except IndexError:
        # the original message never filled in the phase name
        print('Warning: phase name {} can\'t be found in database'.format(name))
    except KeyError as e:
        print(e)
    return value
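
The fall-through lookup can be sketched without pandas as below. The two record lists are small hypothetical stand-ins for the CSV tables (the abbreviations are made up for illustration), and `lookup_field` is an illustrative helper, not part of the notebook.

```python
def lookup_field(name, field, tables):
    """Return `field` for the first record named `name`, searching tables in order."""
    for table in tables:
        for record in table:
            if record.get('Name') == name:
                return record[field]  # raises KeyError if the column is missing
    print("Warning: phase name {} can't be found in database".format(name))
    return None

# Hypothetical stand-ins for the pure-phase and solution-phase tables
pure_table = [{'Name': 'Forsterite', 'Abbrev': 'Fo', 'Formula': 'Mg2SiO4'}]
soln_table = [{'Name': 'Olivine', 'Abbrev': 'Ol', 'Formula': None}]

fo_formula = lookup_field('Forsterite', 'Formula', [pure_table, soln_table])
ol_abbrev = lookup_field('Olivine', 'Abbrev', [pure_table, soln_table])
missing = lookup_field('Unobtainium', 'Abbrev', [pure_table, soln_table])
```

Searching the pure-phase table first means a name present in both tables resolves to the pure-phase entry, which matches the order used in `get_field`.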

### Choose phase names and endmembers and put in a dictionary

In [None]:
phase_dict = dict(Olivine = 'Forsterite_berman',
                  Chrysotile = 'Chrysotile_berman',
                  Brucite = 'Brucite_berman',
                  Water = 'SWIM_water')

Extract abbreviations and formulas from the databases

In [None]:
names = list(phase_dict.keys())
endmembers = list(phase_dict.values())
abbrevs = [ get_field(name,'Abbrev') for name in names ]
formulas = [ get_field(name.replace('_berman',''),'Formula') for name in endmembers ]

In [None]:
print(names)
print(endmembers)
print(abbrevs)
print(formulas)

Fix up the water abbreviation and formula

In [None]:
abbrevs[-1] = 'H2O'
formulas[-1] = 'H2O'

Create a dataframe from the lists and clean up assorted fields

In [None]:
df = pd.DataFrame.from_dict(dict(name=names,abbrev=abbrevs,endmembers=endmembers,formula=formulas))
df

In [None]:
## let's try to convert the formula string into the proper coder version
form = df.iloc[-1].formula
print(form)


In [None]:
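def get_formula_info(string):
    """Build the coder formula and conversion strings for a pure-phase formula."""
    form = mm.Formula(string)
    comp = form.composition()
    # element symbols, reordered to follow their first appearance in the input string
    elements = np.array([ c[0] for c in comp ])
    index = np.argsort(np.array([ string.find(e) for e in elements]))
    elements = elements[index]
    # each element becomes 'El[El]' in the coder formula string
    formula_string = ''.join([ '{}[{}]'.format(e,e) for e in elements])
    # moles of endmember = moles of the first element in comp divided by its count
    conversion_string = [ '[0] = [{}]/{}'.format(comp[0][0],float(comp[0][1])) ]
    return dict(formula_string=formula_string,conversion_string=conversion_string)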

In [None]:
print(get_formula_info('H2O'))
form = 'Mg(OH)2'
print(get_formula_info(form))
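
For readers without `molmass`, the same two strings can be sketched with the standard library alone. This is an illustrative alternative under stated assumptions, not the notebook's method: `count_atoms` and `formula_info` are hypothetical helpers, the parser handles only element symbols, integer counts, and balanced parentheses, and it normalizes by the first element as written in the formula rather than by the first entry of the `molmass` composition.

```python
import re

def count_atoms(formula):
    """Count atoms per element, keyed in order of first appearance."""
    tokens = re.findall(r'[A-Z][a-z]?|\d+|[()]', formula)
    stack = [{}]  # one dict per open parenthesis level
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.isdigit():
            raise ValueError('unexpected multiplier in {!r}'.format(formula))
        # an integer directly after an element or ')' is its multiplier
        mult = 1
        if tok != '(' and i + 1 < len(tokens) and tokens[i + 1].isdigit():
            mult = int(tokens[i + 1])
            i += 1
        if tok == '(':
            stack.append({})
        elif tok == ')':
            group = stack.pop()
            for el, cnt in group.items():
                stack[-1][el] = stack[-1].get(el, 0) + cnt * mult
        else:
            stack[-1][tok] = stack[-1].get(tok, 0) + mult
        i += 1
    return stack[0]

def formula_info(formula):
    """Return coder-style formula and conversion strings (sketch)."""
    counts = count_atoms(formula)
    formula_string = ''.join('{0}[{0}]'.format(el) for el in counts)
    first = next(iter(counts))  # first element as written in the formula
    conversion_string = ['[0] = [{}]/{}'.format(first, float(counts[first]))]
    return dict(formula_string=formula_string, conversion_string=conversion_string)

water = formula_info('H2O')
brucite = formula_info('Mg(OH)2')
```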

## Loop over phases and dump spud-files

Unfortunately, this version will probably mangle the phase formulas, because I'm not sure how to properly convert the actual pure phase formula into the solution formula.

In [None]:
values_dict = model.get_values()
values_dict['reference']=reference
values_dict['test_string']= [ '[0] > 0.0' ]
values_dict

### Write out spud files for all pure phases

In [None]:
for i, row in df.iterrows():
    row_dict = row.to_dict()
    formula = row_dict.pop('formula')
    values_dict.update(row_dict)
    values_dict.update(get_formula_info(formula))
    print(values_dict['conversion_string'])
    values_dict['endmembers'] = [ values_dict['endmembers'] ]
    print('Writing {}: {}'.format(values_dict['name'],formula))
    model.set_values(values_dict)
    model.to_xml(path=SPUD_DIR)
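
The loop above reuses and mutates a single `values_dict` across phases. A possible variation, sketched here with plain dictionaries and a hypothetical helper `build_phase_values`, is to build a fresh dictionary per phase so that no state carries over between iterations. The sample values are placeholders, not database entries.

```python
import copy

def build_phase_values(base_values, row, formula_info):
    """Merge shared values, per-phase fields, and formula strings into a new dict."""
    values = copy.deepcopy(base_values)            # leave the shared template untouched
    values.update(row)                             # name, abbrev, endmembers
    values.update(formula_info)                    # formula_string, conversion_string
    values['endmembers'] = [values['endmembers']]  # coder expects a list of endmembers
    return values

# Placeholder inputs for illustration only
base = {'reference': 'example', 'test_string': ['[0] > 0.0']}
row = {'name': 'Brucite', 'abbrev': 'Br', 'endmembers': 'Brucite_berman'}
info = {'formula_string': 'Mg[Mg]O[O]H[H]', 'conversion_string': ['[0] = [Mg]/1.0']}

phase_values = build_phase_values(base, row, info)
```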