-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Implement writing of ChemSage DAT files #422
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Max, this is looking great! Thanks so much for working on this. Besides the specific comments, in general this PR is going to need several tests. Any DAT files you're able to contribute to the test suite would be highly appreciated. In particular I think we'll need tests for:
- DAT read/write/read (round-trip) Database equality comparison, for a few cases
- TDB->DAT conversion, where supported (can we achieve Database object equality on round-trip conversion?)
@@ -12,6 +12,9 @@ | |||
from symengine import S, log, Piecewise, And | |||
from pycalphad import Database, variables as v | |||
from .grammar import parse_chemical_formula | |||
import datetime | |||
import getpass | |||
from sympy import simplify, piecewise_fold, expand |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pycalphad doesn't depend on sympy
anymore, only on symengine
. I'm still reading so I'll see if I can advise on how to eliminate the dependency.
solution_phase_species = [] | ||
|
||
# DAT *always* includes ideal gas phase (gas_ideal) in header (in first solution phase position), even if not used | ||
solution_phases.insert(0,'GAS_IDEAL') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this logic still work if we actually have an ideal gas phase in the Database? What if it's a GAS
phase with interaction parameters, pressure dependence, etc.? This is one of the cases that needs a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly works with an ideal gas in. Frankly I don't yet understand other gas models well enough yet, I should probably read up on the options in both the TDB and DAT formats before deciding how to handle.
In my experience with other TDB -> DAT converters, they often break on pressure-dependent terms. This might be a tricky spot.
pycalphad/io/cs_dat.py
Outdated
species = [] | ||
for i in range(len(constituents[0])): | ||
for j in range(i,len(constituents[0])): | ||
for k in range(len(constituents[1])): | ||
for l in range(k,len(constituents[1])): | ||
species.append(f'{constituents[0][i]},{constituents[0][j]}/{constituents[1][k]},{constituents[1][l]}') | ||
solution_phase_species.append(species) | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bocklund Please check this during your review
def iterative_substitution(param,symbols): | ||
# Iteratively subsitutes a list of symbols into an equation | ||
# Doesn't check for circular dependence | ||
subs_param = param | ||
requires_substitution = True | ||
while requires_substitution: | ||
requires_substitution = False | ||
param_symbols = [s.name for s in subs_param.free_symbols] | ||
# Check if any of the symbols in symbols are still in the parameter | ||
common_symbols = [s for s in symbols if s in param_symbols] | ||
if common_symbols: | ||
# Do another round of substitutions if so | ||
subs_param = piecewise_fold(simplify(subs_param.subs(symbols))) | ||
requires_substitution = True | ||
return subs_param |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can replace this function with Model.symbol_replace
and eliminate the sympy dependency:
# Maximum number of levels deep we check for symbols that are functions of
# other symbols
_MAX_PARAM_NESTING = 32
def symbol_replace(obj, symbols):
"""
Substitute values of symbols into 'obj'.
Parameters
----------
obj : SymEngine object
symbols : dict mapping symengine.Symbol to SymEngine object
Returns
-------
SymEngine object
"""
try:
# Need to do more substitutions to catch symbols that are functions
# of other symbols
for iteration in range(_MAX_PARAM_NESTING):
obj = obj.xreplace(symbols)
undefs = [x for x in obj.free_symbols if not isinstance(x, v.StateVariable)]
if len(undefs) == 0:
break
except AttributeError:
# Can't use xreplace on a float
pass
return obj
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The calls to piecewise_fold
and simplify
here are crucial to getting the equations formatted into simple intervals that can be written to a DAT. I tried switching to symbol_replace
as you highlighted, but it returns a nested piecewise equation that can't be used.
For more details of what needs to happen, you might look at my recent question on StackExchange.
def simplify_reference_gibbs(equation,symbols): | ||
# Determine equation type and number of intervals | ||
# Is this monstrosity necessary? Yes, it would seem so. | ||
simplified_parameter = iterative_substitution(equation,symbols) | ||
gibbs_equation = expand(simplify(simplify(simplified_parameter))).args | ||
if not gibbs_equation: | ||
# If all ranges have 0 value, simplify returns empty set, so revert | ||
simplified_parameter = equation.args | ||
gibbs_equation = [] | ||
for i in range(int(len(simplified_parameter)/2)): | ||
gibbs_equation.append((simplified_parameter[i*2],simplified_parameter[i*2+1])) | ||
return gibbs_equation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's probably another way to accomplish the goal here, but I don't fully understand the purpose of this function yet. I'd like to avoid reintroducing sympy as a dependency as much as possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a great function. Basically I ran all these lines together in a couple of places, so I just grouped them into a function. All it is attempting to do is get the Gibbs reference energy equation into the same format whether it originated from a TDB (with nested piecewise) or a DAT (with just simple coefficients).
Piecewise equations have to be simplified as far as possible, but then I needed expand
to get coefficients of simple powers of T
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After playing around with it, I'll add that simplified_parameter.simplify()
doesn't yield the same results as simplify(simplified_parameter)
(and the difference results in an error).
@maxposchmann sorry for my delay, this is still high on my list, but it's been a busy month or two for me. I'm hoping to get back to properly reviewing and trying to help avoid re-introducing SymPy soon. |
Adds support for writing of ChemSage DAT files with IDMX, QKTO, RKMP, SUBL, SUBG, SUBQ phase models, plus magnetic variants of RKMP and SUBL.
Notwithstanding open issues on DAT parsing, DAT files can be read and an equivalent DAT produced for cases I've tested. These include the databases
Kaye_NobleMetals.dat
,FeCuCbase.dat
, andFeTiVO.dat
available on the Thermochimica repo.Reasonable DAT files have been produced by converting TDBs
Al-Mg_Zhong.tdb
,Al-Cu-Y.tdb
, andalfe_sei.TDB
from theexamples
directory.I've tried to be verbose in comments, but the code could probably be factored better. Input on this is desired.
This is intended to close #413, no doubt with many enhancements requiring new issues in the future.