ENH: Binary phase diagram mapping #209
This improves on the efficiency of the brute force equilibrium calculations that are implemented now for binary phase diagrams.
There are two main sources for the performance bump compared to the brute force method, where every broadcasted condition is calculated:
Therefore, the most significant speedups are going to come from systems that have wide single-phase or two-phase regions. The slowest will have many small two-phase regions. Assuming convex hull calculations are much faster than equilibrium calculations, this should never be slower than the brute force method.
~Since equilibrium calculations are looped, there is some small performance bump on the table to run
There are some performance improvements and optimizations in this branch, with help from @richardotis.
All images from the new binplot_map.
Timings do not include the JIT time and are dependent on hardware. Callables are not pre-built and passed for these.
19s for binplot_map
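For anyone wanting to reproduce timings that exclude JIT time, one common approach is to make a single untimed warm-up call before measuring. The helper below is only a sketch of that approach: `time_excluding_warmup` is a hypothetical name, not part of pycalphad, and it assumes the function being timed does its compilation or caching on the first call.

```python
import time


def time_excluding_warmup(func, *args, repeats=3, **kwargs):
    """Return the best wall-clock time (seconds) over `repeats` timed calls.

    Hypothetical helper: one untimed call is made first so that any
    first-call cost (JIT compilation, caching) is not measured.
    """
    func(*args, **kwargs)  # warm-up call, deliberately not timed
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best
```

Taking the minimum over a few repeats reduces noise from other processes; the numbers remain hardware dependent.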
The main idea of the algorithm is the following:
Two assumptions are made in several places in this approach:
The T-X assumption is fairly core to the code here. If we wanted to do isothermal calculations, e.g. isothermal ternaries, that would require a new mapping calculation function and significant changes to the ZPFBoundarySets object (maybe a new object would be more appropriate?).
We could get rid of the binary assumption and enable plotting of isopleths (including pseudo-binaries!). This would require
Most of the plotting code (ZPFBoundarySets, TwoPhaseRegions) is relatively decoupled from the binary assumption, and it's just a matter of creating the CompSet2D objects in the mapping function to be in terms of the phases and compositions projected into the reaction coordinate.
Can we improve test coverage for
See #209 for a complete discussion of this PR. Core changes are as follows:

* Implementation of a mapping algorithm for binary phase diagrams
** Significant performance improvement due to the algorithm with no regressions. Visually, this should produce exactly the same results as the current brute force method. It's just faster.
** ZPFBoundarySets, BinaryCompset, and CompsetPair objects introduced to facilitate storing and transforming the composition sets corresponding to the zero phase boundaries that are calculated here.
* dask is now removed from pycalphad. It was rarely being used by anyone and according to #101, it was not able to be used successfully by default. This closes #101 and updates the parallelism section in the FAQ. Also makes debugging and the code simpler since we don't pass through a confusing dask redirection layer or delayed objects. Drops dask and dill from requirements.
* Creating and accessing xarray datasets has been a performance issue for quite some time. This creates a thin interop layer, `LightDataset`, which creates the NumPy arrays directly and has a similar access structure (without the `sel`, `loc`, and virtually all other xarray methods). pycalphad now uses LightDataset objects internally, which greatly avoids overhead in tight-looped operations. The LightDataset preserves the inputs from construction and can be coerced to xarray Datasets through a method call, which is the default for calculate and equilibrium (users see no external change), but an option exists for those functions to return LightDataset objects for tight loops.
* Removes pickle hack code which is no longer necessary for supported parallelism and Python versions.
* Regex performance optimization for chemical formula parsing (improve Species parsing performance)
* Fix LaTeX printing bug in pycalphad variables objects (@richardotis)
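
To illustrate the thin interop layer idea described for `LightDataset`, here is a minimal sketch of a container that stores plain NumPy arrays, exposes them by attribute or key, keeps the constructor inputs, and only builds an xarray Dataset on request. This is not pycalphad's implementation: the class name `ThinDataset`, the method name `to_xarray`, and the variable names in the usage lines are all assumptions made for illustration.

```python
import numpy as np


class ThinDataset:
    """Hypothetical LightDataset-like container (illustrative only)."""

    def __init__(self, data_vars, coords=None, attrs=None):
        # Keep the constructor inputs so an xarray Dataset can be built later.
        # data_vars maps name -> (dims, values), as xarray.Dataset accepts.
        self.data_vars = dict(data_vars)
        self.coords = {} if coords is None else dict(coords)
        self.attrs = {} if attrs is None else dict(attrs)
        # Store bare NumPy arrays for cheap access in tight loops.
        self._arrays = {name: np.asarray(values)
                        for name, (dims, values) in self.data_vars.items()}
        self._arrays.update({name: np.asarray(values)
                             for name, values in self.coords.items()})

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self.__dict__["_arrays"][name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, name):
        return self._arrays[name]

    def to_xarray(self):
        # Imported lazily so the fast path never pays for xarray.
        import xarray as xr
        return xr.Dataset(self.data_vars, coords=self.coords, attrs=self.attrs)


# Usage with made-up variable names and values.
ds = ThinDataset(
    data_vars={"GM": (("T", "X_B"), np.zeros((2, 3)))},
    coords={"T": [300.0, 400.0], "X_B": [0.1, 0.5, 0.9]},
)
ds.GM[0, 1] = -1000.0  # plain NumPy indexing, no label-based selection
```

The design trade-off is the one the commit message describes: label-based selection (`sel`, `loc`) is given up in exchange for avoiding xarray overhead inside loops, while the conversion method keeps the user-facing return type unchanged.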