ENH: Optimize calculate #175
For a set of 30M points (x16 DOF = 500M array elements),
When working with a large (8 sublattice) binary system, which results in points arrays of approximately 500M elements at a point density of 1000, calling calculate was very slow.
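As a rough check on the sizes quoted above, the arithmetic can be sketched as follows. The 16 DOF figure is read here as 8 sublattices times 2 components, and the 8-byte element size assumes float64 storage; both are assumptions, not something the thread states.

```python
# Back-of-the-envelope size of the points array described above.
n_points = 30_000_000          # "30M points" from the description
n_sublattices = 8              # large 8-sublattice phase
n_components = 2               # binary system: A and B
dof_per_point = n_sublattices * n_components   # ASSUMPTION: 8 x 2 = 16 DOF

n_elements = n_points * dof_per_point
bytes_per_element = 8          # ASSUMPTION: float64 storage
size_gb = n_elements * bytes_per_element / 1e9

print(f"{n_elements:,} elements, about {size_gb:.2f} GB per full copy")
```

Under these assumptions each full temporary copy of the array costs several gigabytes, which is why avoiding allocations matters at this scale.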
This PR introduces some optimizations to speed up calculate.
For reference, here is the script that was run:
from pycalphad import calculate, Database
dbf = Database('/Users/brandon/Projects/notebooks/pycalphad/AB8SL_v4.TDB')
from pycalphad import Model
mod = Model(dbf, ['A', 'B'], 'FCC')  # save time initializing models on each call
# warm the cache
calc_res = calculate(dbf, ['A', 'B'], 'FCC', pdens=1000, model=mod)
%timeit calc_res = calculate(dbf, ['A', 'B'], 'FCC', pdens=1000, model=mod)
Running this on develop master gives:
The main changes are
Except for the additional overhead of the if statement checking the array shapes, this should be cheaper across the board, even for small points arrays, and should allow us to handle systems with more sublattices and more components more gracefully.
For the record:
After this optimization, assuming a precompiled Model that doesn't have to be initialized in
Of that time, 92.3% is spent in
So in order to be more performant for bigger systems we need to (IMO) do the following.
Thanks for this work. If you want to eliminate some more array allocations, I would also look at
My reasoning for not looking into padding here was that I think
Will passing in the view still work?
There's a codegen'd Cython wrapper over the raw function which I think will check the dof array shape at call time and do the pointer arithmetic, so all the array indexing operations should safely ignore any
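To illustrate the view question in plain NumPy (not the generated Cython wrapper itself, whose behavior is only described from memory in this thread), here is a minimal sketch. The array names and the padded layout are hypothetical. It shows that slicing off trailing columns yields a view that shares memory with its parent and carries its own strides, so stride-aware indexing never touches the excluded columns.

```python
import numpy as np

# Hypothetical padded buffer: 5 points, 4 real DOF plus 2 padding columns.
padded = np.arange(30, dtype=np.float64).reshape(5, 6)

# Slicing returns a view: no data is copied.
dof = padded[:, :4]
assert np.shares_memory(dof, padded)

# The row stride still spans all 6 columns of the parent buffer,
# so the view is not C-contiguous.
assert dof.strides == (6 * 8, 8)
assert not dof.flags['C_CONTIGUOUS']

# Stride-aware operations see only the 4 real columns of each row.
row_sums = dof.sum(axis=1)

# Code that requires contiguous memory would force a copy instead.
contig = np.ascontiguousarray(dof)
assert not np.shares_memory(contig, padded)
```

Whether the real wrapper accepts a non-contiguous view without copying depends on how its argument is declared, so this only shows the NumPy side of the question.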
We should probably also consider Cythonization of
Merging this. For future reference, line profiling has pointed out the following hotspots that could be optimized later (listed in order of time, with times in seconds as an order-of-magnitude gauge). Even after these changes,