
Serialisation of models #3397

Merged (32 commits) on Nov 28, 2023
Conversation

Contributor

@pipliggins pipliggins commented Oct 2, 2023

Description

Creates a serialisation method for writing PyBaMM models to JSON. Models can be written to/read-in from JSON by saving the discretised model properties. Variables, meshes and geometry can be optionally saved if users wish to use PyBaMM's plotting functionality when a model is read back in and solved.

A Serialise class is created to contain the methods, including custom JSON encoders for PyBaMM objects which can iterate through expression trees.
Symbols and meshes have to_json and from_json functions added to enable PyBaMM objects to be serialised using JSON.
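The encoder pattern described above can be sketched in plain Python. This is a minimal illustration of custom JSON encoding for a tree of objects, not PyBaMM's actual Serialise implementation; the `Node` class and its methods here are hypothetical stand-ins for PyBaMM symbols.

```python
import json

class Node:
    """Hypothetical stand-in for a PyBaMM Symbol in an expression tree."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def to_json(self):
        # Each node serialises its own attributes.
        return {"name": self.name}

class TreeEncoder(json.JSONEncoder):
    """Custom encoder that recurses through an expression tree."""
    def default(self, obj):
        if isinstance(obj, Node):
            entry = obj.to_json()
            # Recurse into children so the whole tree is serialised.
            entry["children"] = [self.default(c) for c in obj.children]
            return entry
        return super().default(obj)

tree = Node("plus", children=[Node("x"), Node("y")])
encoded = json.dumps(tree, cls=TreeEncoder)
```

Reading the model back in would then walk the decoded dictionaries and rebuild the corresponding objects, which is roughly the role of the `_from_json` class methods discussed below.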

Fixes #2787

Type of change

Please add a line in the relevant section of CHANGELOG.md to document the change (include PR #) - note reverse order of PR #s. If necessary, also add to the list of breaking changes.

  • New feature (non-breaking change which adds functionality)
  • Optimization (back-end change that speeds up the code)
  • Bug fix (non-breaking change which fixes an issue)

Key checklist:

  • No style issues: $ pre-commit run (or $ nox -s pre-commit) (see CONTRIBUTING.md for how to set this up to run automatically when committing locally, in just two lines of code)
  • All tests pass: $ python run-tests.py --all (or $ nox -s tests)
  • The documentation builds: $ python run-tests.py --doctest (or $ nox -s doctests)

You can run integration tests, unit tests, and doctests together at once, using $ python run-tests.py --quick (or $ nox -s quick).

Further checks:

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

Create to_json() functions in corresponding classes
Working basic de/serialisation
Creates _from_json() functionality
Stores save_/load_model functions.
Currently working for default models.

Add errors, make accessible from Simulation
Option to save mesh, variables and geometry

Draft notebook written
Added warning if variables are not provided and try to plot
Put warning in for BaseModel - atm requires more model information
to re-create the model.
allow interpolant to be serialised
fix concatenation with debug mode
switch msmr warnings
allow BaseModel to run without rhs
Add to_from_json test for Events
add to_json tests for meshes
All but 3 int. tests passing, high accuracy diff failures
update integration test to pass at lower accuracy
Remove outputs from example notebook

@pipliggins pipliggins linked an issue Oct 2, 2023 that may be closed by this pull request
Contributor

@martinjrobins martinjrobins left a comment


Thanks @pipliggins, this is great work to serialise/deserialise such large and complicated classes! I've only got up to pybamm/models/event.py in the review and need to stop for now, so I'll add this part as a comment and finish the review later. I also notice there is a failing test to fix, but it looks like a small thing.

Review comments on: pybamm/expression_tree/array.py, pybamm/expression_tree/unary_operators.py, pybamm/meshes/one_dimensional_submeshes.py
Contributor

@martinjrobins martinjrobins left a comment


Finished going through this now, sorry for the delay! Looks excellent @pipliggins; happy to merge once you've dealt with the minor comments below.

Review comments on: pybamm/plotting/quick_plot.py, tests/unit/test_expression_tree/test_interpolant.py, tests/unit/test_expression_tree/test_unary_operators.py
* Add pybamm version to JSON file
* Re-word missing variable message
* Refactor unary_operator _from_json()
@codecov

codecov bot commented Oct 20, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (5322895) 99.58% compared to head (df35b91) 99.59%.
Report is 21 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3397      +/-   ##
===========================================
+ Coverage    99.58%   99.59%   +0.01%     
===========================================
  Files          256      257       +1     
  Lines        20131    20639     +508     
===========================================
+ Hits         20048    20556     +508     
  Misses          83       83              


Contributor

@martinjrobins martinjrobins left a comment


Thanks @pipliggins, all looks good except for a few added lines that aren't covered by tests; see the codecov report.

@brosaplanella
Sponsor Member

The failing mac tests are unrelated, so feel free to review and merge if you think it is ready.

Contributor

@martinjrobins martinjrobins left a comment


Looks great, thanks @pipliggins. Happy to merge once the changelog has been added to.

@TomTranter
Contributor

@pipliggins @martinjrobins thanks very much for working on this. I think it will make PyBaMM much more transferable. I checked out the branch, though, and ran into a couple of issues when trying to couple a loaded model with an experiment. All the examples just solve over a time span, assuming a fixed current.

This code fails

import pybamm
model = pybamm.lithium_ion.SPM()
experiment = pybamm.Experiment(["Discharge at 1C for 1 hour"])
sim = pybamm.Simulation(model, experiment=experiment)
sim.solve()
sim.save_model("spm_experiment", mesh=False, variables=False)

with

NotImplementedError:
PyBaMM can only serialise a discretised model.
Ensure the model has been built (e.g. run solve()) before saving.

and this code fails

# create the model
spm_model = pybamm.lithium_ion.SPM()
# set up and discretise ready to solve
geometry = spm_model.default_geometry
param = spm_model.default_parameter_values
param.process_model(spm_model)
param.process_geometry(geometry)
mesh = pybamm.Mesh(geometry, spm_model.default_submesh_types, spm_model.default_var_pts)
disc = pybamm.Discretisation(mesh, spm_model.default_spatial_methods)
disc.process_model(spm_model)
# Serialise the spm_model, providing the variables and the mesh
spm_model.save_model("example_model", variables=spm_model.variables, mesh=mesh)
# read back in
new_spm_model = pybamm.load_model("example_model.json")
experiment = pybamm.Experiment(["Discharge at 1C for 1 hour"])
new_spm_sim = pybamm.Simulation(new_spm_model, experiment=experiment)
# plot the solution
new_spm_sim.solve()
new_spm_sim.plot()

with
AttributeError: 'DomainConcatenation' object has no attribute '_full_mesh'

Is there a plan to implement this, or does it require experiments to be serialised? It would be a shame if you couldn't re-use the model with different simulations.

@pipliggins
Contributor Author

@TomTranter thanks for raising this! Models coupled to experiments aren't a workflow we'd considered yet, and given that a single experiment can hold multiple models, this will require some work and thought around how best to structure the serialised file(s), particularly for import.

Having spoken to @martinjrobins, the plan is to merge the current branch (I've added an explicit warning that serialising models coupled to experiments is not yet supported), close this pull request and create a new one focused on expanding the basic model serialisation established here to support the Experiment class.

@TomTranter
Contributor

I can see why saving a model once it's coupled to a simulation doesn't work without extra development, but the second example I posted (just saving and loading a model before putting it into a sim with an experiment) feels like it should work. It's a bit confusing having the sim save the model, actually. I appreciate why it's been done that way: a sim changes its associated objects, and Simulation probably contains too much logic rather than simply being a way to group those components together.

Member

@Saransh-cpp Saransh-cpp left a comment


Chiming in to review the type hints 🙂

Edit: this can be ignored, given that the main motive of this PR was not to add type hints

@@ -57,6 +57,30 @@ def __init__(
name, domain=domain, auxiliary_domains=auxiliary_domains, domains=domains
)

@classmethod
def _from_json(cls, snippet: dict):

Given that PyBaMM supports Python 3.8+, each file using the newer type hints must import

from __future__ import annotations

at the top to ensure backward compatibility. This can also be automated using the isort rules in ruff.

@@ -244,6 +256,25 @@ class SpecificFunction(Function):
def __init__(self, function, child):
super().__init__(function, child)

@classmethod
def _from_json(cls, function: Callable, snippet: dict):

The type should be narrowed down if possible - Callable[[..., ..., ...], ...]. Similarly, the dict type should also be narrowed down - dict[..., ...].

@rtimms
Contributor

rtimms commented Nov 24, 2023

Thanks @pipliggins, this is a great addition!

My thoughts on serialising models for experiments...

The issue with trying to use a serialised model with an experiment is that the model changes depending on what the operating mode is -- you need to add an extra algebraic constraint to specify power or voltage (at least in the way we have formulated the model). This currently happens in Simulation.build_for_experiment, which creates the dict op_conds_to_built_models. For a serialised model to be used with any future experiment we would need to create the current/voltage/power control versions of the model at the point of serialisation, so they could all be accessed in the future. Not sure what the API for this would look like - Simulation expects to receive a single model, so there'd have to be some new function that sets up a simulation using e.g. a dict mapping the operating mode to the correct model.

An alternative approach would be to create a version of the model where the control is always implemented as an extra algebraic constraint, and then have an option in the simulation that uses this single model instead of creating a new model for each distinct operating mode. The control function would be something like (I_app - I_var)*I_switch + ... with similar switches for voltage and power, and everything would be controlled via input parameters. The drawback of this option is that it would make all models DAEs, but maybe that's not a huge deal in an experiment anyway. Would need to do some solver testing to see how performance is affected.
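The switch idea in this second option can be sketched numerically. Everything below is illustrative: the residual follows the (I_app - I_var)*I_switch pattern from the comment above, but the names and the use of plain floats (rather than PyBaMM expressions and input parameters) are assumptions.

```python
def control_residual(I_var, V_var, P_var, inputs):
    """Single algebraic constraint covering all operating modes.

    Exactly one *_switch input is 1, selecting whether the constraint
    enforces the current, voltage, or power target. Sketch only; a real
    PyBaMM formulation would build this from symbolic input parameters.
    """
    return (
        (inputs["I_app"] - I_var) * inputs["I_switch"]
        + (inputs["V_app"] - V_var) * inputs["V_switch"]
        + (inputs["P_app"] - P_var) * inputs["P_switch"]
    )

# Current control: the residual vanishes only when I_var hits the target,
# and the voltage/power terms drop out because their switches are 0.
inputs = {"I_app": 5.0, "I_switch": 1,
          "V_app": 4.2, "V_switch": 0,
          "P_app": 10.0, "P_switch": 0}
```

Switching operating mode then only requires changing the input dictionary, not rebuilding the model, which is what makes this single-model formulation attractive for serialisation.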

I don't think either of these options are ideal.

As mentioned in #3530, the logic for setting up experiments in the Simulation class is very complex, so perhaps an overhaul of that is in order before we decide how to handle serialising models for use with experiments.

And, of course, this will all break if we allow more generic experiment control in the future (#3530).

"# do the example\n",
"dfn_model = pybamm.lithium_ion.DFN()\n",
"dfn_sim = pybamm.Simulation(dfn_model)\n",
"dfn_sim.solve([0, 3600])\n",
Contributor


This is a very minor comment, but I wonder if the example (and the warning) should encourage the use of sim.build rather than sim.solve? I have a feeling many users aren't aware you can even call sim.build, but it seems like you might want to create a model and save it without actually running a simulation.

I can imagine a workflow where people end up just doing a dummy solve for a short time before saving a model as they think they need to solve it first.

@pipliggins
Contributor Author

pipliggins commented Nov 24, 2023

> Chiming in to review the type hints 🙂
>
> Edit: this can be ignored, given that the main motive of this PR was not to add type hints

Thanks @Saransh-cpp - I'll leave these for this PR but as I'm doing type hints elsewhere for #3497 this is useful!

@TomTranter
Contributor

> @rtimms: An alternative approach would be to create a version of the model where the control is always implemented as an extra algebraic constraint, and then have an option in the simulation that uses this single model instead of creating a new model for each distinct operating mode. [...]

I like option 2 better. The file size for a DFN with variables is already 13MB. Having lots of these will quickly add up

@pipliggins
Contributor Author

Thanks for your thoughts @TomTranter @rtimms - I would tend to agree that something like option 2 would be better. The size/number of files that could be written out if we ended up trying to serialise every model permutation would get very unwieldy pretty fast. The downside is that it probably has a higher overhead to implement.

As already discussed, the issue with going between a single model and an experiment is that the structure of the built model changes. The reason for forcing the model to be discretised before serialising is ultimately to reduce the complexity of what is written out and to make it more portable to other solvers; the downside is that a serialised model cannot be edited (for example, to add experimental constraints) once it's read back in, as the original RHS, algebraic equations, etc. are lost. The with/without-experiment options also follow quite different flows through the Simulation class to set up for solving. If there are already plans afoot to reorganise this, possibly associated with #3530, my suggestion would be to put a pin in integrating the Experiment options with serialisation until those changes have been made (hopefully with an eye on making the with/without-experiment workflows more integrated), and in the meantime merge the current PR as a serialisation V1.

@martinjrobins what do you think?

@martinjrobins
Contributor

I don't think we should do serialisation of model + experiment in this PR. The scope here was to serialise a single already-built model, which is complicated enough, and there clearly needs to be more discussion on how exactly we are going to support serialisation of model + experiment.

It would be great if we could reformulate the models such that we could have a single model for all experiment steps! That would certainly make it simpler if the solve time wasn't affected too much. Definitely something to try out.

I take @TomTranter's point about the sim being the object that saves the model. There is a BaseModel.save_model function, which is the natural place for it, but since users generally interact with PyBaMM models via Simulation we wanted to add one there as well (it just calls save_model on the built model).
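The delegation being described might look roughly like the following sketch. The class bodies are invented toy stand-ins for illustration, not PyBaMM's actual code; only the error message mirrors the real one quoted earlier in the thread.

```python
class BaseModel:
    """Toy stand-in for pybamm.BaseModel."""
    def __init__(self, name):
        self.name = name
        self.is_discretised = False

    def save_model(self, filename):
        # Mirrors the real restriction: only built models can be saved.
        if not self.is_discretised:
            raise NotImplementedError(
                "PyBaMM can only serialise a discretised model."
            )
        return filename + ".json"

class Simulation:
    """Toy stand-in for pybamm.Simulation."""
    def __init__(self, model):
        self._built_model = model

    def build(self):
        # In real PyBaMM, build() parameterises and discretises the model.
        self._built_model.is_discretised = True

    def save_model(self, filename):
        # Thin wrapper: the built model remains the object that
        # serialises itself.
        return self._built_model.save_model(filename)
```

This keeps the serialisation logic in one place (the model) while letting users who only ever touch Simulation still call save_model.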

@rtimms @TomTranter, should we move the discussion on serialisation of model + experiment to a separate issue? We can then merge this in and use it as a basis for future work.

@martinjrobins martinjrobins merged commit 25b1e75 into pybamm-team:develop Nov 28, 2023
35 checks passed
Development

Successfully merging this pull request may close these issues.

proposal for serialisation of models
6 participants