New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: Refactor callables creation in equilibrium() and calculate() #192

Merged
merged 19 commits into from Nov 13, 2018

Conversation

Projects
None yet
2 participants
@richardotis
Collaborator

richardotis commented Oct 29, 2018

Fixes gh-189.

  • Creates a new codegen subdirectory; move some existing codegen functionality there
  • New build_callables function

FIX: equilibrium/calculate: Properly filter phases that cannot be built with build_callables()
MAINT: Remove Hessian callable support and clean up PhaseRecord pickling
FIX: solver: Tweak to attempt fix for stochastic failure in test_eq_issue43_chempots_misc_gap
BLD: appveyor: Build fixes
BLD: New implicit dependency on numpy>=1.13 from xarray
MAINT: dask>=0.18 compatibility
MAINT: core.utils: Add get_pure_elements from ESPEI
MAINT: codegen: Move sympydiff_utils and custom_autowrap to separate subdirectory

@richardotis richardotis added this to the 0.7.1 milestone Oct 29, 2018

@richardotis richardotis changed the title from WIP/MAINT: Refactor callables creation in equilibrium() and calculate() to MAINT: Refactor callables creation in equilibrium() and calculate() Nov 1, 2018

@richardotis richardotis requested a review from bocklund Nov 1, 2018

@richardotis

This comment has been minimized.

Collaborator

richardotis commented Nov 1, 2018

One possible TODO might be to take the callables nested dict object and turn it into a real class, to make it a little more self-documenting

@bocklund

This comment has been minimized.

Collaborator

bocklund commented Nov 2, 2018

How do I use this to pass different parameter sets without rebuilding callables?

I would think that passing different parameter sets to equilibrium when passing callables would result in the callables not being rebuilt (the parameters are in the callables symbolically), but the new parameters get built in the new phase records.

What am I doing wrong here?

DO_ASSERTS = False

from pycalphad.tests.datasets import ALFE_TDB
from pycalphad import Database, equilibrium, variables as v
from pycalphad.codegen.callables import build_callables
comps = ["AL"]
TDB = """
ELEMENT AL   FCC_A1                    26.981539   4577.296    28.3215!
FUNCTION VV0000     298.15   5000;              6000.00 N !

PHASE FCC_A1  %  1 1 !
   CONSTITUENT FCC_A1  :AL:  !
   PARAMETER G(FCC_A1,AL;0)      298.15   +VV0000#;       2900 N !
"""
dbf = Database(TDB)
phases = ['FCC_A1']
conds = {v.P: 101325, v.T: 500}

print('Check that current database should work as expected...')
eq_res = equilibrium(dbf, comps, phases, conds)
if DO_ASSERTS: assert eq_res.GM.values.squeeze() == 5000.0
if not DO_ASSERTS: print(eq_res.GM.values.squeeze() == 5000.0)


print('Check that overriding parameters works...')
eq_res = equilibrium(dbf, comps, phases, conds, parameters={'VV0000': 10000})
if DO_ASSERTS: assert eq_res.GM.values.squeeze() == 10000.0
if not DO_ASSERTS: print(eq_res.GM.values.squeeze() == 10000.0)

# build callables with a parameter of 20000.0
callables = build_callables(dbf, comps, phases, parameters={'VV0000': 20000})

print("Check that passing callables should skip the build phase, but use the values from 'VV0000' in the tdb...")
eq_res = equilibrium(dbf, comps, phases, conds, callables=callables)
if DO_ASSERTS: assert eq_res.GM.values.squeeze() == 5000.0
if not DO_ASSERTS: print(eq_res.GM.values.squeeze() == 5000.0)

print("Check that passing callables should skip the build phase, but use the values from 'VV0000' as passed in parameters...")
eq_res = equilibrium(dbf, comps, phases, conds, callables=callables, parameters={'VV0000': 10000})
if DO_ASSERTS: assert eq_res.GM.values.squeeze() == 10000.0
if not DO_ASSERTS: print(eq_res.GM.values.squeeze() == 10000.0)

from sympy import Symbol
print("Check that passing callables should skip the build phase, but use the values from Symbol('VV0000') as passed in parameters...")
eq_res = equilibrium(dbf, comps, phases, conds, callables=callables, parameters={Symbol('VV0000'): 10000})
if DO_ASSERTS: assert eq_res.GM.values.squeeze() == 10000.0
if not DO_ASSERTS: print(eq_res.GM.values.squeeze() == 10000.0)
Check that current database should work as expected...
True
Check that overriding parameters works...
True
Check that passing callables should skip the build phase, but use the values from 'VV0000' in the tdb...
False
Check that passing callables should skip the build phase, but use the values from 'VV0000' as passed in parameters...
False
Check that passing callables should skip the build phase, but use the values from Symbol('VV0000') as passed in parameters...
False
@richardotis

This comment has been minimized.

Collaborator

richardotis commented Nov 11, 2018

Thank you for this. I am now working to resolve a state issue with build_callables and parameters, but we need to discuss some of it before the fix can be completed.

Check that passing callables should skip the build phase, but use the values from 'VV0000' in the tdb
I am not sure that the behavior here is wrong but let's discuss it. callables now keeps a dict of PhaseRecords by default, and there the specified parameter values are stored. If I call build_callables(dbf, comps, phases, parameters={'VV0000': 20000}), and use that object to do calculations, I expect the value of VV0000 to be treated as 20000, right?

@bocklund

This comment has been minimized.

Collaborator

bocklund commented Nov 12, 2018

The ideal case (IMO) is that build_callables should be able to resolve parameters just in time without triggering a recompile. The resolution order should be

  1. Any parameters passed to calculate or equilibrium
  2. The symbols in the Database

Are there any considerations that make this not possible? E.g. runtime performance?

@richardotis

This comment has been minimized.

Collaborator

richardotis commented Nov 12, 2018

The ideal case (IMO) is that build_callables should be able to resolve parameters just in time without triggering a recompile.

I think that's how things work now (as of 52fe6c9 at least). We only disagree on the resolution order. This is how it looks as implemented in 52fe6c9:

  1. Any parameters passed to calculate or equilibrium
  2. The parameter values passed to build_callables

PhaseRecords don't know which symbols (or even which Database) their parameters correspond to, so if we wanted to change the default parameter values, we'd need to create additional state somewhere. But using the values passed into build_callables seem like a natural default to me; otherwise, what's the point of passing in default values at all?

Alternatively, we could add support for passing in a list of symbols to the parameters kwarg (instead of a dict), and choose defaults based on the values in the Database -- however, entries in Database.symbols are not guaranteed to be single, real values; in general, they may be functions. The logic for making this work would get messy. I think a good workaround for this functionality would be to choose your default values in build_callables to be the same as those in Database.symbols.

@bocklund

This comment has been minimized.

Collaborator

bocklund commented Nov 12, 2018

I agree with your resolution order and I think your current solution is fine. Any looking into the Database in build_callables would just be a convenience to passing them programmatically or by hand.

I have this working for ZPF error in ESPEI, so there are no blockers for my review of this PR.

Show resolved Hide resolved pycalphad/model.py Outdated
@bocklund

This comment has been minimized.

Collaborator

bocklund commented Nov 13, 2018

Your recent changes to the build callables seem to work, but there's a couple chances for you to shoot yourself in the foot.

  1. If you instantiate your models before passing them to build_callables, then they won't have the symbolic parameters popped (unless you explicitly pass parameter symbols in the Model(dbf, comps, phases, parameters=my_symbols_list), but the catch is now you have to pass in parameters to your calculate or equilibrium, otherwise it gets evaluated to zero.

I think this is the current behavior:

I. If no Model instance is passed to build_callables (e.g. the class Model or a subclass are passed), then things are replaced as expected.
II. If Model instances with Model(..., parameters=None) or Model(..., parameter=<dict>), the values are essentially hardcoded and any parameters passed to calculate, equilibrium and build_callables don't take effect
III. If the Model(..., parameters=<list>), then build_callables works as intended

These cases are demonstrated in the script at the end.

  1. If callables are passed to equilibrium or calculate, anything passed to the model kwarg will be disregarded. Perhaps we should warn the user.
DO_ASSERTS = False

from pycalphad.tests.datasets import ALFE_TDB
from pycalphad import Database, calculate, Model, variables as v
from pycalphad.codegen.callables import build_callables
import numpy as np
from sympy import Symbol
comps = ["AL", "NI"]
OUTPUT = 'HM_MIX'

TDB = """
ELEMENT AL   FCC_A1                    26.981539   4577.296    28.3215!
ELEMENT NI   FCC_A1                    0   0    0!
FUNCTION VV0000     298.15   5000;              6000.00 N !
FUNCTION VV0001     298.15   10;              6000.00 N !

PHASE FCC_A1  %  1 1 !
   CONSTITUENT FCC_A1  :AL,NI:  !
   PARAMETER G(FCC_A1,AL;0)      298.15   -10000;       2900 N !
   PARAMETER G(FCC_A1,NI;0)      298.15   -10000;       2900 N !
   PARAMETER L(FCC_A1,AL,NI;0)      298.15   4*VV0000#+4*VV0001#*T;       2900 N !
"""
dbf = Database(TDB)
phases = ['FCC_A1']
conds = {v.P: 101325, v.T: 500}

print('Case 0: (no callables built) preinstantiating the Model without passing parameters substitutes correctly in calculate...')
models = {'FCC_A1': Model(dbf, comps, phases[0])}

calc_res = calculate(dbf, comps, phases, conds, output=OUTPUT, points=[[0.5, 0.5]], parameters={'VV0000': 10000})
if DO_ASSERTS: assert calc_res.HM_MIX.values.squeeze() == 10000.0
if not DO_ASSERTS: print(np.isclose(calc_res.HM_MIX.values.squeeze(), 10000.0))
print("Parameter value we got: ", calc_res.HM_MIX.values.squeeze())


print('Case 1: preinstantiating the Model without passing parameters substitutes correctly in build_callables...')
models = {'FCC_A1': Model(dbf, comps, phases[0])}

# build callables with a parameter of 20000.0
callables = build_callables(dbf, comps, phases, model=models, parameters={'VV0000': 20000}, output=OUTPUT, build_gradients=False)

calc_res = calculate(dbf, comps, phases, conds, output=OUTPUT, points=[[0.5, 0.5]], callables=callables, parameters={'VV0000': 10000})
if DO_ASSERTS: assert calc_res.HM_MIX.values.squeeze() == 10000.0
if not DO_ASSERTS: print(np.isclose(calc_res.HM_MIX.values.squeeze(), 10000.0))
print("Parameter value we got: ", calc_res.HM_MIX.values.squeeze())


print('Case 2: preinstantiating the Model with popping the parameters we\'ll substitute are substituted correctly in build_callables...')
models = {'FCC_A1': Model(dbf, comps, phases[0], parameters=[Symbol('VV0000')])}

# build callables with a parameter of 20000.0
callables = build_callables(dbf, comps, phases, model=models, parameters={'VV0000': 20000}, output=OUTPUT, build_gradients=False)

calc_res = calculate(dbf, comps, phases, conds, output=OUTPUT, points=[[0.5, 0.5]], callables=callables, parameters={'VV0000': 10000})
if DO_ASSERTS: assert calc_res.HM_MIX.values.squeeze() == 10000.0
if not DO_ASSERTS: print(np.isclose(calc_res.HM_MIX.values.squeeze(), 10000.0))
print("Parameter value we got: ", calc_res.HM_MIX.values.squeeze())

print('If no parameters are called, we adopt what was in the Database (or in build_callables, depending on the decided resultion order)...')
# What actually happens is that if we don't pass any parameters, the value just goes to zero instead of taking on the parameter in the database.
# Possible way to shoot yourself in the foot, probably not intended behavior
calc_res = calculate(dbf, comps, phases, conds, output=OUTPUT, points=[[0.5, 0.5]], callables=callables)
if DO_ASSERTS: assert calc_res.HM_MIX.values.squeeze() == 5000.0
if not DO_ASSERTS: print(np.isclose(calc_res.HM_MIX.values.squeeze(), 5000.0))
print("Parameter value we got: ", calc_res.HM_MIX.values.squeeze())


print('Case 3: preinstantiating the Model with parameter dict substities correctly...')
models = {'FCC_A1': Model(dbf, comps, phases[0], parameters={'VV0000': 8000})}

# build callables with a parameter of 20000.0
callables = build_callables(dbf, comps, phases, model=models, parameters={'VV0000': 20000}, output=OUTPUT, build_gradients=False)

print('Override the parameters with calculate works...')
calc_res = calculate(dbf, comps, phases, conds, output=OUTPUT, points=[[0.5, 0.5]], callables=callables, parameters={'VV0000': 10000})
if DO_ASSERTS: assert calc_res.HM_MIX.values.squeeze() == 10000.0
if not DO_ASSERTS: print(np.isclose(calc_res.HM_MIX.values.squeeze(), 10000.0))
print("Parameter value we got: ", calc_res.HM_MIX.values.squeeze())

print('With no parameters passed, we adopt what was passed to the Model...')
calc_res = calculate(dbf, comps, phases, conds, output=OUTPUT, points=[[0.5, 0.5]], callables=callables)
if DO_ASSERTS: assert calc_res.HM_MIX.values.squeeze() == 8000.0
if not DO_ASSERTS: print(np.isclose(calc_res.HM_MIX.values.squeeze(), 8000.0))
print("Parameter value we got: ", calc_res.HM_MIX.values.squeeze())

With output

Case 0: (no callables built) preinstantiating the Model without passing parameters substitutes correctly in calculate...
True
Parameter value we got:  10000.0
Case 1: preinstantiating the Model without passing parameters substitutes correctly in build_callables...
False
Parameter value we got:  5000.0
Case 2: preinstantiating the Model with popping the parameters we'll substitute are substituted correctly in build_callables...
True
Parameter value we got:  10000.0
If no parameters are called, we adopt what was in the Database (or in build_callables, depending on the decided resultion order)...
False
Parameter value we got:  20000.0
Case 3: preinstantiating the Model with parameter dict substities correctly...
Override the parameters with calculate works...
False
Parameter value we got:  8000.0
With no parameters passed, we adopt what was passed to the Model...
True
Parameter value we got:  8000.0
@bocklund

I don't have many comments. This looks great.

I wish calculate were this clear when I started looking at this codebase. equilibrium looks a lot better too, but there's still some more room to factor things out.

You can try to fix the corner case of Model instances and build_callables if you want, but I don't consider it blocking. It's a point of confusion, but probably a less trodden code path (and it's improved over the previous case that required some magic deletion of symbols in the Database).

Show resolved Hide resolved pycalphad/codegen/callables.py
Show resolved Hide resolved pycalphad/codegen/callables.py Outdated
@richardotis

This comment has been minimized.

Collaborator

richardotis commented Nov 13, 2018

It seems like the easiest way to resolve the footguns with Model instances is to remove the ability to pass your own Model instances at all, that is, only allow the class/subclass itself to be passed and require build_callables to do the instantiation. Would doing that break any important functionality?

@bocklund

This comment has been minimized.

Collaborator

bocklund commented Nov 13, 2018

It potentially breaks stuff like

mod =  Model(dbf, comps, phase_name, parameters=[sympy.Symbol(param)])
mod.models['idmix'] = 0

or other hacky stuff you want to do with Models after you have instantiated them or any other special things you might want to do on an instance.

@richardotis

This comment has been minimized.

Collaborator

richardotis commented Nov 13, 2018

Alright, I'll leave that alone for now then.

@richardotis

This comment has been minimized.

Collaborator

richardotis commented Nov 13, 2018

@bocklund If you want to do one more run against ESPEI's test suite, I think this is the version I will merge once tests pass.

@bocklund

This comment has been minimized.

Collaborator

bocklund commented Nov 13, 2018

ESPEI works. Looks good to me.

@richardotis richardotis merged commit 9a2a907 into develop Nov 13, 2018

5 checks passed

continuous-integration/appveyor/branch AppVeyor build succeeded
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.4%) to 88.161%
Details

@richardotis richardotis referenced this pull request Nov 13, 2018

Closed

AppVeyor CI issues #185

@richardotis richardotis deleted the refactor_callables branch Nov 14, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment