Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add support for the modified quasichemical model in the quadruplet approximation #389

Merged
merged 165 commits into from
Mar 22, 2022

Conversation

bocklund
Copy link
Collaborator

@bocklund bocklund commented Dec 20, 2021

What's in this PR:

  • Adds a Database() reader for the ChemSage .dat format. It supports reading the most common models that I have seen in the wild and have acquired databases for. Currently supported models include the modified quasichemical model in the quadruplet approximation (MQMQA), CEF models with Redlich-Kister-Muggianu mixing, stoichiometric phases (cast to stoichiometric CEF representations). Not yet supported are CEF models with the generalized Kohler-Toop mixing, real gases, aqueous phases, and others not mentioned previously. A ChemSage .dat writer is not implemented here.
  • Add pycalphad.models subpackage for holding future models
  • Implement the MQMQA in pycalphad.models.model_mqmqa
  • Add support QKTO mixing to Model ("QKTO" is a CEF model that uses generalized polynomial mixing with Kohler or Kohler-Toop extrapolation)
  • Add pytest fixture machinery to load database files from pycalphad.tests.databases
  • Slight modification of the check_energy test helper, where we test on the GM property, rather than ast. IMO the AST is not guaranteed to be a part of the Model API (hopefully we formalize this at some point with a protocol). In-fact the MQMQA has all contributions in terms of moles of formula such that G=ast; GM=G/normalization instead of GM=ast; G=GM*normalization, as in the Model class. The former should make for smaller, easier to debug and faster to compile functions.

What's left for this PR:

  • Parser
    • Do another pass through to try to clear out the remaining TODO items
    • SUBQ zeta support in the pairs
    • Cleanup nomenclature of the MQMQA to better match up with ModelMQMQA implementation (i.e. "Endmembers" vs. "Pairs")
  • ModelMQMQA
    • Adding tests for anion mixing and reciprocal quadruplets
    • Testing and supporting MQMQA models with different "zeta" and SUBQ type
    • Consolidate/clean up the tinydb constituent_array tests

Open questions, looking for feedback:

  • Is there any more work that needs to go in to the DAT reader for this PR (that can't be deferred)?
  • changes in pycalphad.io.grammar
  • generally, the approach of having mangled quadruplet species for site fractions -- In the MQMQA cation and anion species names are mangled (by adding the charge) for MQMQA to be able to disambiguate quadruplets that use the same pure element of different valence states (e.g. Fe+2 and Fe+3) when they are put into a fictitious quadruplet species. Without the name mangling a quadruplet v.Species object would not be able to differentiate Fe+2Fe+2XX, Fe+2Fe+3XX, and Fe+3Fe+3XX quadruplets.
  • In the DAT parser, I write some code to workaround assumptions in the Database.add_parameter method that assumes that the parameter form will be CEF-like, as we have been using. I could use feedback on whether we should modify the add_parameter API or continue using this workaround for now.
  • I removed TDB read/write support for the older style of MQMQA parameters, so there's no officially supported way to get the MQMQA model out. Do we need basic XML or alt-database support for this PR?

Things I'd want to defer:

  • Implementing other models (e.g. generalized polynomial Kohler-Toop in CEF models some QKTO support added) (these are the key missing aspects in coverage of the parser: other models and representations of the energy: heat capacity coefficients, magnetic parameters, and other models (real gas, Pitzer, etc.)
    - Any conflicts with SymEngine readiness in the parser or model (ENH: Use SymEngine everywhere #376) (symengine support added)

bocklund and others added 30 commits June 7, 2021 13:23
… model regarding ternary interaction parameters
… well as be able to have Pycalphad write and read tdb files based on the types. The types has been added to the model_hints
…h I have separately confirmed to work in a separate cell in Jupyter. Believe that there is some database formatting issue in the test files that is causing the error. Will work to fix that in the future.
…lculation and was able to get an actual value instead of nan
…ading Gas phase and SUBL phases. Fixed bugs in the excess energy function. Added new tests to the test equilibrium
… and fixed calculate.py so that it recognizes MQMQA
@richardotis
Copy link
Collaborator

Re: _database_add_parameter static method: Can we change Database.add_parameter to accept an additional dictionary as a keyword argument? Would that eliminate the need for a custom method?

@richardotis
Copy link
Collaborator

Re: name mangling, it's not ideal from a semantic perspective, but the alternative would require the Species object and a lot of dependent code to be reworked. I think this approach is the least-bad solution.

Copy link
Collaborator

@richardotis richardotis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My review is complete. Pending your responses to the remaining items, this can be merged. The coverage issue is a minor concern but appears to mostly be due to implementation stubs for additional DAT parsing support, which is okay to defer.

@bocklund
Copy link
Collaborator Author

bocklund commented Mar 21, 2022

Re: _database_add_parameter static method: Can we change Database.add_parameter to accept an additional dictionary as a keyword argument? Would that eliminate the need for a custom method?

I just had it accept kwargs instead (in 35e6c43), which seemed logical based on how the function works otherwise. This change is kind of nice because it allows for users to develop their own parameter dictionaries for their own models that may need additional custom metadata.

If the kwargs approach is acceptable, we should consider changing the API of add_parameter in the future so that the arguments match the names of the arguments in the parameter dictionaries (e.g. param_order -> parameter_order). And we should also consider thinking about which arguments are required and which are not.

Re: name mangling, it's not ideal from a semantic perspective, but the alternative would require the Species object and a lot of dependent code to be reworked. I think this approach is the least-bad solution.

Agreed. It's mostly a cosmetic issue for the string representations, so it should be possible to revisit without significant breakage.

pycalphad/io/cs_dat.py Outdated Show resolved Hide resolved
@bocklund
Copy link
Collaborator Author

@richardotis I think everything is addressed and this is ready to merge. Thanks for your review!

Huge thanks to @jorgepazsoldanpalma for leading the initial implementation and to @maxposchmann for all the guidance you provided along the way!

@richardotis
Copy link
Collaborator

I gave it one more look. Once again, excellent work by the team. Please merge when ready. The "WIP" marker in the PR title can be removed.

@bocklund bocklund changed the title WIP: Add support for the modified quasichemical model in the quadruplet approximation ENH: Add support for the modified quasichemical model in the quadruplet approximation Mar 22, 2022
@bocklund
Copy link
Collaborator Author

Sorry, I made a few small changes to the TokenParser if you want to take one last look at the recent changes.

Now there is support for parsing tokens files line-by-line. The line-by-line parsing is completely transparent to TokenParser callers, but allows us to raise more informative error messages when there's a parsing failure. We're not able to do any detailed explanations, but it will at least now explain to users which line we were on when the parsing error was encountered. For example, some output looks like this:

pycalphad.io.cs_dat.TokenParserError: Error at line number 1805: ("invalid literal for int() with base 10: 'G'",) for line:
     G   1   6  17  17   0   1   0   0

In the database in question, there should be a 3 between lines 1804 and 180-5

   3                                                                          # line 1799
 G   1   6  17  17   0   0   0   0                                            # line 1800
     0.000000000   1.00     0.000000000   1.00     0.000000000   1.00         # line 1801
     0.000000000   0.00     0.000000000   0.00     0.000000000   0.00         # line 1802
   0   0 -2639.40000     0.921000000         0.000000000     0.000000000      # line 1803
     0.000000000     0.000000000                                              # line 1804
 G   1   6  17  17   0   1   0   0                                            # line 1805
     0.000000000   1.00     0.000000000   1.00     0.000000000   1.00         # line 1806
     0.000000000   0.00     0.000000000   0.00     0.000000000   0.00         # line 1807
   0   0  50.0000000         0.000000000     0.000000000     0.000000000      # line 1808
     0.000000000     0.000000000                                              # line 1809
   3                                                                          # line 1810
 G   6  13  17  17   0   0   0   0                                            # line 1811
     0.000000000   1.00     0.000000000   1.00     0.000000000   1.00         # line 1812
     0.000000000   0.00     0.000000000   0.00     0.000000000   0.00         # line 1813
   0   0 -12087.0000     -5.00000000         0.000000000     0.000000000      # line 1814
     0.000000000     0.000000000                                              # line 1815

It will still probably require someone experienced with DAT files to understand what causes the error, but hopefully it makes our lives easier as developers to be able to track down where errors are happening and if it's a pycalphad parsing issue or a real issue with the database.

@bocklund bocklund merged commit 2ea3c67 into pycalphad:develop Mar 22, 2022
Path to 1.0 automation moved this from In progress to Done Mar 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

4 participants