ChemSage DAT support: Constituent parsing bug #419

maxposchmann · 2022-06-10T21:26:39Z

The constituents of a multi-sublattice phase listed in a DAT can be chemically complex or contain information about the charge state of the constituent, or really anything else at all. See here for an example:

 Fe[3+]                   Fe[2+]                   V2O3
 Ti[4+]                   Ti[3+]
 O

Fe[3+] contains a number that tells you about the charge state, while V2O3 contains numbers that tell you about the stoichiometry of the constituent.

It appears that currently, pycalphad counts on the square brackets being there around the charges. So things work fine for Fe[3+], but break down if one has Fe3+ instead, as it reads this as 3 irons. Due to this issue, incorrect results will be computed for many DAT files.

However, you cannot count on this convention being followed in DAT files, and in fact it rarely is. Further, one really ought not to be using the constituent names for anything. They are there for readability only, and follow no rules at all. Renaming all five constituents to Bob should result in no changes to the calculation.

There is in fact no way at all to determine the stoichiometry of individual constituents in an MQMQA phase from a DAT file. Fortunately, these aren't necessary as the quadruplet stoichiometries can be determined from the stoichiometries of the endmembers, the coordinations, and the arrays that link the endmembers to the constituents.

The text was updated successfully, but these errors were encountered:

bocklund · 2022-06-13T19:49:56Z

There may be some bugs hiding, having a reproducible example would be helpful if one is found.

Species objects have a name and a constituents attributes. The name is supposed to be arbitrary and the constituents are what matters. As long as pycalphad gets the constituents correctly, everything else should just work. Species also have the feature of being capable of parsing the constituents if constructed from only a name if the constituents are not given, but I don't remember exactly how this is used in practice, except as a user-facing shortcut to make v.Species("CU") work as expected.

The intent of pycalphad.io.cs_dat.Endmember is to construct Species objects using the pure element constituents according to the stoichiometry in the DAT, not to rely on the names in the models.

IIRC, I think the parsed names of Species (e.g. Species("Bob")) do indeed get added to the Database object, but that should not affect anything in practice. If that is what's happening, we should probably avoid doing that.

maxposchmann · 2022-06-13T20:29:58Z

Try out the database I linked above, you will see issues with the constituents immediately.

Here's an example:
species.name = FE[3+]+3.0
species.constituents = {'FE': 1.0, '[': 3.0, ']': 1.0}

Or, if I delete the []:
species.name = FE3++3.0
species.constituents = {'FE': 3.0}

So both of those are wrong in different ways. The first shouldn't consider [ or ] as elements, and the second gets the Fe stoichiometry wrong.

I've not had time to see where this goes wrong, but since all that I changed in the DAT was the constituent name, it certainly seems that more information is being extracted from that name than is safe.

bocklund · 2022-06-13T20:49:56Z

Yeah that's, definitely not correct for the species, but do you know if it's affecting the correctness for any phases or models? For example, the species should be similarly incorrect for the Shishin Fe-Sb-O-S database, but we have some explicit tests in the test suite with that compares pycalphad energies to Thermochimica energies.

maxposchmann · 2022-06-14T00:42:14Z

I set up a calculation as follows:

db = Database('FeTiVO.dat')
my_phases = ['SLAGBSOLN']
comps = ['TI', 'FE', 'O']

calc_result = calculate(db, comps, 'SLAGBSOLN', P=101325, T=500)
print(calc_result)

With the database as it is originally, my output is:

pycalphad.core.errors.ConditionError: None of the passed phases (['SLAGBSOLN']) are active. List of possible phases: ['BROK1SOLN', 'FE(S)', 'FE2O3_HIGH-PRESSURE-(S2)', 'FE2O3_HIGH-PRESSURE-(S3)', 'FE2O5TI_FERRIC_PSEUDO(S)', 'FE3O4_HIGH-PRESSURE-(S3)', 'FE3O4_HIGH-PRESSURE-(S4)', 'FEO_WUSTITE(S)', 'FE_BCC(S)', 'GAS_IDEAL', 'HEMASOLN', 'O2(S)', 'RUTILESOLN', 'TI(S)', 'TI2O3_SOLID-A(S)', 'TI2O3_SOLID-B(S2)', 'TIO2_ANATASE(S2)', 'TIO2_RUTILE(S)', 'TI_SOLID_ALPHA(S)'].

With the brackets removed from Fe[3+] only the output is:

<xarray.Dataset>
Dimensions:    (N: 1, P: 1, T: 1, points: 1, component: 3, internal_dof: 1)
Coordinates:
  * component  (component) <U2 'FE' 'O' 'TI'
  * N          (N) float64 1.0
  * P          (P) float64 1.013e+05
  * T          (T) float64 500.0
Dimensions without coordinates: points, internal_dof
Data variables:
    X          (N, P, T, points, component) float64 0.4 0.6 0.0
    Phase      (N, P, T, points) <U9 'SLAGBSOLN'
    Y          (N, P, T, points, internal_dof) float64 1.0
    GM         (N, P, T, points) float64 -1.643e+05
Attributes:
    phase_indices:  {'SLAGBSOLN': slice(0, 1, None)}

With all brackets removed the output is:

<xarray.Dataset>
Dimensions:    (N: 1, P: 1, T: 1, points: 108010, component: 3, internal_dof: 10)
Coordinates:
  * component  (component) <U2 'FE' 'O' 'TI'
  * N          (N) float64 1.0
  * P          (P) float64 1.013e+05
  * T          (T) float64 500.0
Dimensions without coordinates: points, internal_dof
Data variables:
    X          (N, P, T, points, component) float64 0.5 0.5 ... 0.5546 0.09745
    Phase      (N, P, T, points) <U9 'SLAGBSOLN' 'SLAGBSOLN' ... 'SLAGBSOLN'
    Y          (N, P, T, points, internal_dof) float64 1.0 1e-14 ... 0.02869
    GM         (N, P, T, points) float64 -1.384e+05 -1.497e+05 ... -1.877e+05
Attributes:
    phase_indices:  {'SLAGBSOLN': slice(0, 108010, None)}

I want to emphasize that none of these databases is more or less correct than the others.

About the tests: they have the quadruplet concentrations set directly, so it is believable that some error modes are sidestepped.

bocklund mentioned this issue Aug 7, 2022

Duplicate Species name error when parsing FactSage DAT into Database #425

Open

richardotis mentioned this issue Aug 7, 2022

FIX: ChemSage DAT: Case sensitivity of compound names in endmembers #426

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ChemSage DAT support: Constituent parsing bug #419

ChemSage DAT support: Constituent parsing bug #419

maxposchmann commented Jun 10, 2022

bocklund commented Jun 13, 2022

maxposchmann commented Jun 13, 2022

bocklund commented Jun 13, 2022

maxposchmann commented Jun 14, 2022

ChemSage DAT support: Constituent parsing bug #419

ChemSage DAT support: Constituent parsing bug #419

Comments

maxposchmann commented Jun 10, 2022

bocklund commented Jun 13, 2022

maxposchmann commented Jun 13, 2022

bocklund commented Jun 13, 2022

maxposchmann commented Jun 14, 2022