ENH: General charge-constrained phases support #386

richardotis · 2021-11-20T21:37:35Z

This PR is based on a branch of work by @HUISUN24 - I was able to build on her implementation, while taking advantage of some quantities pycalphad's minimizer was already calculating to reduce the amount of new code.

With these changes, pycalphad will be able to compute equilibria involving phases with charged species in one or more sublattices (type code I in TDB syntax). This is in addition to long-standing support for the two-sublattice ionic liquid model (gh-161), which did not require explicit charge balance.

For global minimization of charge-constrained phases, we follow the approach of Sundman et al. in that we sample linear combinations of the neutral, constructed "pseudo-endmembers." We augment these points with a technique called hit-and-run (HR) sampling. It has been around for decades, but I found the clearest description in a recent paper, "Novel Matrix Hit and Run for Sampling Polytopes and Its GPU Implementation" by Corte and Montiel, 2021 (https://arxiv.org/abs/2104.07097), which explains the technique along the way to explaining a more sophisticated extension (which I have not implemented here).

The key idea is to start from a feasible solution to the under-determined system of equations defined by the constraints. Then I use the null space projection matrix from a QR decomposition to take steps in the null space ("tangent space") of the constraints. The intuition here is that any step taken in the null space, starting from a feasible point, is guaranteed to remain feasible with respect to the equality constraints.
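To make the null-space argument concrete, here is a minimal pure-Python sketch. It is not pycalphad's implementation: the phase (A+1, B+2, C+3)1(D-2, VA)1, the helper names, and the starting point are all hypothetical, and Gram-Schmidt on the constraint rows stands in for the QR decomposition.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormal_rows(rows, tol=1e-12):
    """Gram-Schmidt on the constraint rows (plays the role of Q in a QR of A^T)."""
    basis = []
    for row in rows:
        v = list(row)
        for q in basis:
            c = dot(v, q)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = dot(v, v) ** 0.5
        if norm > tol:  # drop linearly dependent constraints
            basis.append([vi / norm for vi in v])
    return basis

def project_to_null_space(d, basis):
    """Subtract every component of d that lies along a constraint row."""
    for q in basis:
        c = dot(d, q)
        d = [di - c * qi for di, qi in zip(d, q)]
    return d

# Site fractions y = [y_A, y_B, y_C, y_D, y_VA] for the hypothetical phase.
A = [
    [1.0, 1.0, 1.0, 0.0, 0.0],   # first sublattice sums to 1
    [0.0, 0.0, 0.0, 1.0, 1.0],   # second sublattice sums to 1
    [1.0, 2.0, 3.0, -2.0, 0.0],  # net charge is zero
]
b = [1.0, 1.0, 0.0]
x0 = [0.4, 0.3, 0.3, 0.95, 0.05]  # a feasible starting point (assumed)

rng = random.Random(0)
basis = orthonormal_rows(A)
direction = project_to_null_space([rng.gauss(0.0, 1.0) for _ in x0], basis)
x1 = [xi + 0.01 * di for xi, di in zip(x0, direction)]

# Residual of the equality constraints after the step
residual = [dot(row, x1) - bi for row, bi in zip(A, b)]
print(residual)  # all entries are zero to within round-off
```

The step only preserves the equality constraints; keeping the site fractions non-negative is a separate question of how far to move along the direction.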

We can then choose a random unit vector as a direction toward another feasible point. What size step should we take? Now we have to recall our inequality constraints, specifically the non-negativity constraint on the site fractions. We apply the algorithm from HR sampling to compute minimum and maximum step sizes that still stay within the feasible space, for the given direction, and we make a uniform-random choice between those two values. Now we can take a step to a new feasible point, and repeat the algorithm until we have the desired number of points. Because this is a Markov (memoryless) process, our point sample will converge to the uniform distribution on the convex polytope defined by our constraints.
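The loop described above can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the code in this PR: the function names, the hypothetical phase (A+1, B+2, C+3)1(D-2, VA)1, and the starting point are made up for the example, and only the non-negativity bounds are treated as inequality constraints.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormal_rows(rows, tol=1e-12):
    """Gram-Schmidt on the constraint rows; a stand-in for the QR step."""
    basis = []
    for row in rows:
        v = list(row)
        for q in basis:
            c = dot(v, q)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm > tol:
            basis.append([vi / norm for vi in v])
    return basis

def random_null_space_direction(basis, n, rng):
    """Random unit vector that leaves every equality constraint unchanged."""
    d = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for q in basis:
        c = dot(d, q)
        d = [di - c * qi for di, qi in zip(d, q)]
    norm = math.sqrt(dot(d, d))
    return [di / norm for di in d]

def step_bounds(x, d, tol=1e-12):
    """Interval [t_min, t_max] on which x + t*d stays non-negative."""
    t_min, t_max = -math.inf, math.inf
    for xi, di in zip(x, d):
        if di > tol:      # component grows with t: bounds t from below
            t_min = max(t_min, -xi / di)
        elif di < -tol:   # component shrinks with t: bounds t from above
            t_max = min(t_max, -xi / di)
    return t_min, t_max

def hit_and_run(A, x0, num_points, seed=0):
    rng = random.Random(seed)
    basis = orthonormal_rows(A)
    x = list(x0)
    samples = []
    for _ in range(num_points):
        d = random_null_space_direction(basis, len(x), rng)
        t_min, t_max = step_bounds(x, d)
        t = rng.uniform(t_min, t_max)  # uniform choice of step size
        x = [xi + t * di for xi, di in zip(x, d)]
        samples.append(x)
    return samples

# y = [y_A, y_B, y_C, y_D, y_VA]; rows: two site balances and charge balance
A = [
    [1.0, 1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 1.0],
    [1.0, 2.0, 3.0, -2.0, 0.0],
]
b = [1.0, 1.0, 0.0]
samples = hit_and_run(A, x0=[0.4, 0.3, 0.3, 0.95, 0.05], num_points=500)
```

Because each new point depends only on the previous one, the first few samples are still correlated with the starting point; how many points are needed for a good approximation of the uniform distribution is not something this sketch tries to answer.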

There is a lot of overlap between this feature and pseudo-binary oxide systems. During testing we have seen pycalphad's minimizer become poorly behaved when doing certain calculations in oxide pseudo-binary systems. One example is in this PR: test_eq_charge_halite. This happens because the chemical potential of oxygen can be independent of the system composition, and so the driving force (used to determine convergence) becomes ill-defined. There are multiple approaches to dealing with this, such as the use of virtual phases [1], but here we defer the issue for future work by changing the oxygen condition in test_eq_charge_halite to be in terms of the chemical potential instead of the mole fraction. We also include two other tests to exercise the capability for multi-phase cases, including a case with a miscibility gap.

References
[1] Shobu, Kazuhisa, and Tatsuo Tabaru. "Development of new equilibrium calculation software: CaTCalc." Materials transactions 46.6 (2005): 1175-1179.

Other items

Documentation example
The documentation examples have been regenerated if the Jupyter notebooks in examples/ have changed. (To regenerate the documentation examples, run jupyter nbconvert --to rst --output-dir=docs/examples examples/*.ipynb from the top-level directory.)

@HUISUN24

codecov · 2021-11-20T21:45:44Z

Codecov Report

Merging #386 (fe590f5) into develop (58fec40) will increase coverage by 0.33%.
The diff coverage is 97.33%.

@@             Coverage Diff             @@
##           develop     #386      +/-   ##
===========================================
+ Coverage    89.98%   90.32%   +0.33%     
===========================================
  Files           44       44              
  Lines         4383     4567     +184     
===========================================
+ Hits          3944     4125     +181     
- Misses         439      442       +3

| Impacted Files | Coverage Δ | |
|---|---|---|
| pycalphad/core/calculate.py | `93.54% <95.60%> (+0.87%)` | ⬆️ |
| pycalphad/model.py | `92.01% <100.00%> (+0.18%)` | ⬆️ |
| pycalphad/tests/datasets.py | `100.00% <100.00%> (ø)` | |
| pycalphad/tests/test_calculate.py | `100.00% <100.00%> (ø)` | |
| pycalphad/tests/test_equilibrium.py | `97.94% <100.00%> (+0.24%)` | ⬆️ |
| pycalphad/io/grammar.py | `100.00% <0.00%> (+5.26%)` | ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

bocklund · 2021-11-21T18:29:37Z

One reason to like explicitly constructing neutral combinations of endmember pairs is that you are guaranteed to get them all, which is not true with the approach of masking grid points.

Consider the model (Al+3,Va)1(Cl-1)1. None of the endmembers are neutral, so none of them make it past the neutral_mask. There is exactly one correct set of site fractions, [1/3, 2/3, 1], so grid sampling followed by masking probably won't work either: our regular fixed_grid at a reasonable point density probably wouldn't contain a point that meets the ALLOWED_CHARGE threshold. Here's an example that currently fails.

```python
from pycalphad import Database, calculate

# Minimal database: one phase, (AL+3,VA)1(CL-1)1, with no neutral endmember
tdb = """
ELEMENT Al FCC_A1 0 0 0 !
ELEMENT CL GAS 0 0 0 !
ELEMENT VA VACUUM 0 0 0 !

SPECIES AL+3 AL1/+3 !
SPECIES CL-1 CL1/-1 !

PHASE ALCL3 % 2 1 1 !
CONSTITUENT ALCL3 : AL+3, VA : CL-1 : !

PARAMETER G(ALCL3,AL+3:CL-1;0) 1 -100000; 10000 N !
PARAMETER G(ALCL3,VA:CL-1;0) 1 0; 10000 N !
"""
dbf = Database(tdb)
calc_res = calculate(dbf, ['AL', 'CL', 'VA'], ['ALCL3'], N=1, P=101325, T=300)
# Fails if sampling produced no charge-neutral points for the phase
assert calc_res.points.size > 0
```
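For reference, the single neutral point quoted above can be recovered by explicitly pairing oppositely charged endmembers. The sketch below is only an illustration of that idea, not the pycalphad implementation: the function names are hypothetical, and it assumes integer charges and integer site ratios so that exact fractions can be used.

```python
from fractions import Fraction
from itertools import combinations, product

def site_fractions(endmember, sublattices):
    """One-hot site fractions, ordered by sublattice and then constituent."""
    return [Fraction(int(sp == chosen))
            for chosen, subl in zip(endmember, sublattices) for sp in subl]

def net_charge(endmember, charges, site_ratios):
    return sum(r * charges[sp] for r, sp in zip(site_ratios, endmember))

def neutral_pseudo_endmembers(sublattices, charges, site_ratios):
    """Neutral endmembers plus neutral mixtures of oppositely charged pairs."""
    ems = list(product(*sublattices))
    q = [net_charge(em, charges, site_ratios) for em in ems]
    points = [site_fractions(em, sublattices) for em, qi in zip(ems, q) if qi == 0]
    for (em1, q1), (em2, q2) in combinations(zip(ems, q), 2):
        if q1 * q2 < 0:  # only opposite signs can cancel
            w = Fraction(abs(q2), abs(q1) + abs(q2))  # weight on em1
            y1 = site_fractions(em1, sublattices)
            y2 = site_fractions(em2, sublattices)
            points.append([w * a + (1 - w) * c for a, c in zip(y1, y2)])
    return points

points = neutral_pseudo_endmembers(
    sublattices=[["AL+3", "VA"], ["CL-1"]],
    charges={"AL+3": 3, "VA": 0, "CL-1": -1},
    site_ratios=[1, 1],
)
print(points)  # [[Fraction(1, 3), Fraction(2, 3), Fraction(1, 1)]]
```

The endmembers AL+3:CL-1 and VA:CL-1 carry charges +2 and -1, so a 1/3 to 2/3 mixture is neutral, which matches the site fractions [1/3, 2/3, 1].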

richardotis · 2021-12-03T23:40:17Z

I've pushed a sketch of a general approach for uniformly sampling site fractions subject to linear constraints. It uses a technique called hit-and-run (HR) sampling. It has been around for decades, but I found the clearest description in a recent paper, "Novel Matrix Hit and Run for Sampling Polytopes and Its GPU Implementation" by Corte and Montiel, 2021 (https://arxiv.org/abs/2104.07097), which explains the technique along the way to explaining a more sophisticated extension (which I have not implemented here).

The key idea is you start from a feasible solution to the under-determined system of equations defined by the constraints. ~~In this case, I chose the minimum-norm solution (which you can obtain by the pseudo-inverse method or QR decomposition)~~ (see comment below). Then I use the null space projection matrix from the QR to take steps in the null space ("tangent space") of the constraints. The intuition here is that any step taken in the null space, starting from a feasible point, is guaranteed to be feasible with respect to the equality constraints.

We can then choose a random unit vector as a direction toward another feasible point. What size step should we take? Now we have to recall our inequality constraints, specifically the non-negativity constraint on the site fractions. We apply the algorithm from HR sampling to compute minimum and maximum step sizes that still stay within the feasible space, for the given direction, and we make a uniform-random choice between those two values. Now we can take a step to a new feasible point, and repeat the algorithm until we have the desired number of points. Because this is a Markov (memoryless) process, our point sample will converge to the uniform distribution on the convex polytope defined by our constraints. There are practical issues with applying HR sampling to ill-shaped spaces ("getting stuck"), but I don't think they apply to the types of constraints we're likely to use.

There's still a bit of debugging code in there, and I'm not sure how best to factor the sampling logic, but it seems to work. I'm also not sure about performance yet, though it seems to work well enough for moderate-sized systems. I have not verified that the default point density setting approximates a uniform distribution well.

richardotis · 2021-12-18T22:53:48Z

It turns out that the minimum-norm solution is not always feasible with respect to our non-negativity bounds. The pseudo-endmember construction approach developed by @bocklund does guarantee a feasible solution (if one exists), though it does not generalize easily to other types of constraints. For those, we would need to look at Chebyshev center calculations for the initial point, similar to what is done in https://github.com/DavidWalz/polytope-sampling
One of the new tests has an infeasible minimum-norm solution, so it should guard against regressions in this part of the sampling.
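A small worked case shows how the minimum-norm solution can violate the bounds. The phase (A+1, B+4)1(C-1, VA)1 below is hypothetical and is not the system used in the new test. The sketch relies on the fact that any solution of the constraints splits into a row-space part and a null-space part, and the minimum-norm solution is the one whose null-space part is zero.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormal_rows(rows, tol=1e-12):
    """Gram-Schmidt on the constraint rows (orthonormal basis of the row space)."""
    basis = []
    for row in rows:
        v = list(row)
        for q in basis:
            c = dot(v, q)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = dot(v, v) ** 0.5
        if norm > tol:
            basis.append([vi / norm for vi in v])
    return basis

def minimum_norm_solution(A, x_particular):
    """Project any solution of A x = b onto the row space of A."""
    x_mn = [0.0] * len(x_particular)
    for q in orthonormal_rows(A):
        c = dot(x_particular, q)
        x_mn = [xi + c * qi for xi, qi in zip(x_mn, q)]
    return x_mn

# y = [y_A, y_B, y_C, y_VA]
A = [
    [1.0, 1.0, 0.0, 0.0],   # first sublattice sums to 1
    [0.0, 0.0, 1.0, 1.0],   # second sublattice sums to 1
    [1.0, 4.0, -1.0, 0.0],  # net charge is zero
]
b = [1.0, 1.0, 0.0]
x_feasible = [1.0, 0.0, 1.0, 0.0]  # the only neutral, non-negative point

x_mn = minimum_norm_solution(A, x_feasible)
print([round(v, 6) for v in x_mn])  # approximately [1.1, -0.1, 0.7, 0.3]
```

The result satisfies all three equality constraints but has a negative site fraction for B+4, so it cannot be used as a starting point for the sampler.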

bocklund · 2022-01-31T23:42:32Z

Documentation builds are failing when trying to do the pip install --no-build-isolation --editable . step, which appears to be an upstream issue in setuptools: pypa/setuptools#3063

ENH: General charge-constrained phases support, with tests. Draft implementation by @HUISUN24

5748e12

richardotis added the enhancement label Nov 20, 2021

FIX: calculate: Enforce memory contiguity of phase composition array

0d001b8

richardotis added 3 commits December 3, 2021 13:52

ENH: calculate: sketch of hit-and-run sampler for linearly-constrained spaces

c60d1ea

FIX: Check for infeasible phase and fixed degrees of freedom

921c93e

TST: add tests for complex charged species

e7d3d42

richardotis added 2 commits December 5, 2021 10:09

Use constructed pseudo-endmembers along with hit-and-run sampling

6cc324b

Fixes for the HR sampler getting stuck, and choose better initial point

78e4da6

richardotis added 4 commits December 14, 2021 15:42

FIX: Include mass residual in convergence criteria

bb04ffd

DOC: ChargedPhases: add example

b326a85

DOC/FIX: Fixup examples and confirm working

55bad23

DOC: Rebuild RST from ipynb examples

1fd9f98

richardotis requested a review from bocklund December 21, 2021 23:31

bocklund requested changes Jan 31, 2022

richardotis and others added 3 commits January 31, 2022 08:39

Merge branch 'develop' into ionic-solids

716c5c1

MAINT: No need to calculate total_site_ratios for constraint

a626c01

Co-authored-by: Brandon Bocklund <brandonbocklund@gmail.com>

Merge branch 'ionic-solids' of https://github.com/richardotis/pycalphad into ionic-solids

704076f

FIX/BLD: Temporary workaround for pypa/setuptools#3063

fe590f5

richardotis requested a review from bocklund February 1, 2022 22:52

bocklund approved these changes Feb 2, 2022

richardotis merged commit 8199960 into pycalphad:develop Feb 2, 2022
