Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: General charge-constrained phases support #386

Merged
merged 15 commits into from Feb 2, 2022

Conversation

richardotis
Copy link
Collaborator

@richardotis richardotis commented Nov 20, 2021

This PR is based on a branch of work by @HUISUN24 - I was able to build on her implementation, while taking advantage of some quantities pycalphad's minimizer was already calculating to reduce the amount of new code.

With these changes, pycalphad will be able to compute equilibria involving phases with charged species in one or more sublattices (type code I in TDB syntax). This is in addition to long-standing support for the two-sublattice ionic liquid model (gh-161), which did not require explicit charge balance.

For global minimization of charge-constrained phases, we follow the approach of Sundman et al in that we sample linear combinations of the neutral, constructed "pseudo-endmembers." We augment these points with a technique called hit-and-run (HR) sampling, which has been around for decades but I found the clearest description in a recent paper, "Novel Matrix Hit and Run for Sampling Polytopes and Its GPU Implementation" by Corte and Mantiel, 2021 (https://arxiv.org/abs/2104.07097), which explains the technique along the way to explaining a more sophisticated extension (which I have not implemented here).

The key idea is you start from a feasible solution to the under-determined system of equations defined by the constraints. Then I use the null space projection matrix from the QR to take steps in the null space ("tangent space") of the constraints. The intuition here is that any step taken in the null space, starting from a feasible point, is guaranteed to be feasible with respect to the equality constraints.

We can then choose a random unit vector as a direction toward another feasible point. What size step should we take? Now we have to recall our inequality constraints, specifically the non-negativity constraint on the site fractions. We apply the algorithm from HR sampling to compute minimum and maximum step sizes that still stay within the feasible space, for the given direction, and we make a uniform-random choice between those two values. Now we can take a step to a new feasible point, and repeat the algorithm until we have the desired number of points. Because this is a Markov (memoryless) process, our point sample will converge to the uniform distribution on the convex polytope defined by our constraints.

There is a lot of overlap with this feature and pseudo-binary oxide systems. During testing we have seen pycalphad's minimizer become poorly behaved when doing certain calculations in oxide pseudo-binary systems. One example is in this PR: test_eq_charge_halite. This happens because the chemical potential of oxygen can be independent of the system composition, and so the driving force (used to determine convergence) becomes ill-defined. There are multiple approaches to dealing with this, such as the use of virtual phases [1], but here we defer the issue for future work by changing the oxygen condition in test_eq_charge_halite to be in terms of the chemical potential, instead of the mole fraction. We also include two other tests to exercise the capability for multi-phases cases, including a case with a miscibility gap.

References
[1] Shobu, Kazuhisa, and Tatsuo Tabaru. "Development of new equilibrium calculation software: CaTCalc." Materials transactions 46.6 (2005): 1175-1179.

Other items

  • Documentation example
  • The documentation examples have been regenerated if the Jupyter notebooks in the examples/ have changed. To regenerate the documentation examples, run jupyter nbconvert --to rst --output-dir=docs/examples examples/*.ipynb from the top level directory)

@codecov
Copy link

codecov bot commented Nov 20, 2021

Codecov Report

Merging #386 (fe590f5) into develop (58fec40) will increase coverage by 0.33%.
The diff coverage is 97.33%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #386      +/-   ##
===========================================
+ Coverage    89.98%   90.32%   +0.33%     
===========================================
  Files           44       44              
  Lines         4383     4567     +184     
===========================================
+ Hits          3944     4125     +181     
- Misses         439      442       +3     
Impacted Files Coverage Δ
pycalphad/core/calculate.py 93.54% <95.60%> (+0.87%) ⬆️
pycalphad/model.py 92.01% <100.00%> (+0.18%) ⬆️
pycalphad/tests/datasets.py 100.00% <100.00%> (ø)
pycalphad/tests/test_calculate.py 100.00% <100.00%> (ø)
pycalphad/tests/test_equilibrium.py 97.94% <100.00%> (+0.24%) ⬆️
pycalphad/io/grammar.py 100.00% <0.00%> (+5.26%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 58fec40...fe590f5. Read the comment docs.

@bocklund
Copy link
Collaborator

One reason to like explicitly constructing neutral combinations of endmember pairs is that you are guaranteed to get them all, which is not true with the approach of masking grid points.

Consider the model (Al+3,Va)1(Cl-1)1. None of the endmembers are neutral, so no endmembers make it past the neutral_mask. There is exactly one correct set of site fractions: [1/3, 2/3, 1], so grid sampling and then masking probably won't work either because our regular fixed_grid on a reasonable point density probably wouldn't meet the ALLOWED_CHARGE threshold. Here's an example that currently fails.

from pycalphad import Database, calculate
tdb = """
ELEMENT Al FCC_A1 0 0 0 !
ELEMENT CL GAS 0 0 0 !
ELEMENT VA VACUUM 0 0 0 !

SPECIES AL+3 AL1/+3 !
SPECIES CL-1 CL1/-1 !

PHASE ALCL3 % 2 1 1 !
CONSTITUENT ALCL3 : AL+3, VA : CL-1 : !

PARAMETER G(ALCL3,AL+3:CL-1;0) 1 -100000; 10000 N !
PARAMETER G(ALCL3,VA:CL-1;0) 1 0; 10000 N !
"""
dbf = Database(tdb)
calc_res = calculate(dbf, ['AL', 'CL', 'VA'], ['ALCL3'], N=1, P=101325, T=300)
assert calc_res.points.size > 0

@richardotis
Copy link
Collaborator Author

richardotis commented Dec 3, 2021

I've pushed a sketch of a general approach for uniformly sampling site fractions subject to linear constraints. It uses a technique called hit-and-run (HR) sampling, which has been around for decades but I found the clearest description in a recent paper, "Novel Matrix Hit and Run for Sampling Polytopes and Its GPU Implementation" by Corte and Mantiel, 2021 (https://arxiv.org/abs/2104.07097), which explains the technique along the way to explaining a more sophisticated extension (which I have not implemented here).

The key idea is you start from a feasible solution to the under-determined system of equations defined by the constraints. In this case, I chose the minimum-norm solution (which you can obtain by the pseudo-inverse method or QR decomposition) (see comment below). Then I use the null space projection matrix from the QR to take steps in the null space ("tangent space") of the constraints. The intuition here is that any step taken in the null space, starting from a feasible point, is guaranteed to be feasible with respect to the equality constraints.

We can then choose a random unit vector as a direction toward another feasible point. What size step should we take? Now we have to recall our inequality constraints, specifically the non-negativity constraint on the site fractions. We apply the algorithm from HR sampling to compute minimum and maximum step sizes that still stay within the feasible space, for the given direction, and we make a uniform-random choice between those two values. Now we can take a step to a new feasible point, and repeat the algorithm until we have the desired number of points. Because this is a Markov (memoryless) process, our point sample will converge to the uniform distribution on the convex polytope defined by our constraints. There are practical issues with applying HR sampling to ill-shaped spaces ("getting stuck"), but I don't think they apply to the types of constraints we're likely to use.

There's still a bit of debugging code in there, and I'm not sure about how to factor all the sampling logic specifically, but it seems to work. I'm also not sure on performance yet, though it seems to work well enough for moderate sized systems. I have not verified that the default point density setting approximates a uniform distribution well.

@richardotis richardotis added this to In progress in Path to 1.0 Dec 13, 2021
@richardotis
Copy link
Collaborator Author

richardotis commented Dec 18, 2021

The minimum-norm solution is not always feasible with respect to our non-negativity bounds, it turns out. The pseudo-endmember construction approach developed by @bocklund does guarantee a feasible solution (if one exists), though it does not generalize easily to other types of constraints. For those, we would need to look at Chebyshev center calculations for the initial point, similar to what is done in https://github.com/DavidWalz/polytope-sampling
One of the new tests has an infeasible minimum-norm solution, so we should avoid regressions related to sampling in that way.

pycalphad/core/calculate.py Show resolved Hide resolved
pycalphad/model.py Outdated Show resolved Hide resolved
@bocklund
Copy link
Collaborator

Documentation builds are failing when trying to do the pip install --no-build-isolation --editable . step, which appears to be an upstream issue in setuptools: pypa/setuptools#3063

@richardotis richardotis merged commit 8199960 into pycalphad:develop Feb 2, 2022
Path to 1.0 automation moved this from In progress to Done Feb 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

None yet

2 participants