Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: Refactor build_callables #214

Merged
merged 6 commits into from May 17, 2019

Conversation

Projects
None yet
2 participants
@bocklund
Copy link
Collaborator

commented May 15, 2019

The following changes are made:

  1. build_callables now becomes build_callables and build_phase_records.
    • Both require that correctly instantated model dicts, i.e. {'PHASE_NAME': Model()}
    • build_callables should be called by users and just builds the callable functions and JIT compiles them. It returns a dict of {'output': {'callables': {'PHASE_NAME': AutowrapFunc()}, 'massfuncs': {'PHASE_NAME': AutowrapFunc()}, ...}}. These callables are passed to equilibrium and calculate.
    • Calculate and equilibrium should be calling build_phase_records, which optionally take callables. If callables are provided for the output, they are not recomputed (if users pass callables, they are responsible for determine if they are compatible). If no callables are passed, compatible ones will be built automatically.
    • equilibrium should not be building callables or phase records for other outputs. It just passes along the user callables (if they are provided) and calculate handles building the phase records for the particular outputs in _eqcalculate.
    • It is the responsibility of the caller of build_callables to know whether gradients and Hessians are required. This means two cases can exist. Either the user calls build_callables (the user is responsible) or build_callables is called by build_phase_records (calculate/equilibrium are responsible for propagating the correct settings). For calculate, no gradients or Hessians are built. For equilibrium, gradients are always built, and Hessians are built if any of the conditions are instances of the classes in pycalphad.variables.CONDITIONS_REQUIRING_HESSIANS.
  2. This also refactors all the code where state variables and models were retrieved from the result of the old build_callables. Getting these additional outputs violates single the responsibility principle. A convenience function for getting state variables from models and or conditions now exists for calculate, equilibrium, and build_callables/build_phase_records to use.
  3. The following code in the test test_eq_build_callables_with_parameters was removed, because this is not valid - parameters are not added automatically through build callables because PhaseRecords are no longer built in build_callables.
    # Check that passing callables should skip the build phase, but use the values from 'VV0000' saved in callables
    eq_res = equilibrium(dbf, comps, phases, conds, callables=callables)
    np.testing.assert_allclose(eq_res.GM.values.squeeze(), 20000.0)
  1. v.N now gets specified automatically in calculate. Previously, it was automatically specified in equilibrium, but not calculate. When calculate was called from equilibrium that there was an extra dimension for v.N that is not present when calculate alone was called. A user-friendly alternative to the ConditionError error message is written if v.N is not in the equilibrium or calculate conditions. This would cause the callables to give incorrect answers in either calculate or equilibrium if called separately. All callables should be built with the v.N statevariable (while v.N=1 is a required condition internally) so that the behavior is constistent with both. Users building callables by hand should be aware that they may need to add v.N to the callables by the additional_statevars argument of build_callables. test_issue116 was updated to reflect that N is added automatically. As much as possible, we should try to progressively expose pycalphad variables objects to the end user - IMO we state variables should be part of a state variables dict to calculate.
@bocklund

This comment has been minimized.

Copy link
Collaborator Author

commented May 15, 2019

I think the main ideas are in place following the discussion from #208. I am going to take another look at #208, #212, and #207 to see if there's any tests that need to be added.

@bocklund bocklund added this to the 0.8 milestone May 15, 2019

@richardotis

This comment has been minimized.

Copy link
Collaborator

commented May 15, 2019

Broad, forward-thinking question: In a world where we want to read phase records from binary files on disk ("compiled databases"), does this new design make it easy to add support for that?

@bocklund

This comment has been minimized.

Copy link
Collaborator Author

commented May 16, 2019

That's the main idea, yeah.

Currently, PhaseRecord objects require conditions that are specific to the equilibrium problem being solved, so they should be built and compiled JIT. On the other hand, the callables compiled by build_callables can be used in any problem that retains the same state variables, so it makes sense to split them up.

In the future, we could also break out the ability to build constraints into it's own function, but that's relatively easy to extract with this design.

@bocklund

This comment has been minimized.

Copy link
Collaborator Author

commented May 16, 2019

Regarding the code in #208,

This

from pycalphad import Database, Model, calculate, variables as v
import numpy as np
import matplotlib.pyplot as plt

# Calculation Settings
dbf = Database('tdbs/alfe_sei.TDB')
comps = ['AL', 'FE', 'VA']
phase_name = 'B2_BCC'
output = 'CPM'
endmember = [[0.5, 0.5, 0.5, 0.5, 1.0]]
temp_range = [300, 1000]

mod = Model(dbf, comps, phase_name)
contributions = mod.models.copy()
results = {}

for contribution, value in contributions.items():
    mod.models.clear()
    mod.models[contribution] = value
    result = calculate(dbf, comps, phase_name,
                       model=mod, output=output, points=endmember, T=(temp_range[0], temp_range[1], 5), P=1e5)
    if (getattr(result, output) == 0).all():
        continue
    results[contribution] = getattr(result, output)
# plot
cumulative = np.zeros_like(results['ref'].squeeze())
for contrib, arr in results.items():
    a = arr.squeeze()
    plt.fill_between(np.arange(temp_range[0], temp_range[1], 5), cumulative, a+cumulative, label=contrib)
    cumulative += a
plt.legend()

Produces this:

cpm-contribs

because the Model instance gets sent to instantiate_models, which calls unpack_kwarg(Model_instance, Model), creating a dict {'B2_BCC': Model_instance}.

However, this can cause an issue if there are multiple phases and a Model instance, since the default_dict created will return the instances. So instantiate_models now raises a friendly error if len(phases) > 1 and isinstance(model, Model).

@bocklund

This comment has been minimized.

Copy link
Collaborator Author

commented May 16, 2019

#207 and #212 both have to do with the fact that callables did not have the right set of state variables.

There was a subtle bug, related to but not causing those issues, where calculate did not automatically add the N state variable, but equilibrium did implicitly add the N state variable, which was passed to calculate. This means that a set of callables that works with equilibrium did not work with calculate unless the N state variable was added to calculate calls (which equilibrium did under the hood, so equilibrium was internally consistent). This PR fixes that bug by always making sure N is in the state variables of calculate implicitly, like in equilibrium.

There's also the direct issue of users creating incompatible callables. Since N is implicitly added to every calculation, T is in about every Model model and pressure is the other condition specified, we give a warning if a users build_callables does not build callables with the N, P, T state variables and we instruct users to add the N, P, T state variables through the additional_statevars argument of build_callables. We don't raise in case there's a legitimate region to not use N, P, T.

Tests were added for #208, but I don't think these issues warrant any tests, so this PR should close #207, #208 and #212 and is ready for review

@bocklund bocklund requested a review from richardotis May 16, 2019

@richardotis
Copy link
Collaborator

left a comment

Looks great. Really impressed with the separation-of-concerns work here. I think it's going to save us a lot of work in the future.

Show resolved Hide resolved pycalphad/core/utils.py

@bocklund bocklund merged commit 0799bb3 into develop May 17, 2019

0 of 5 checks passed

continuous-integration/travis-ci/pr The Travis CI build could not complete due to an error
Details
continuous-integration/travis-ci/push The Travis CI build could not complete due to an error
Details
continuous-integration/appveyor/branch AppVeyor build failed
Details
continuous-integration/appveyor/pr AppVeyor build failed
Details
coverage/coveralls Coverage decreased (-0.2%) to 88.581%
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.