Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON model import and export #469

Merged
merged 12 commits into from Nov 13, 2019
Merged

JSON model import and export #469

merged 12 commits into from Nov 13, 2019

Conversation

@alubbock
Copy link
Member

alubbock commented Oct 19, 2019

Add JSON model import and export capabilities. JSON
is a text-serializable format, which is good for storing
models within other objects, or for purposes like HTTP
APIs.

SimulationResult.save() now saves models in JSON
rather than pickle.

See discussion on #467 which prompted the need for
a durable and serializable model storage format.

alubbock added 2 commits Oct 18, 2019
Add JSON model import and export capabilities. JSON
is a text-serializable format, which is good for storing
models within other objects, or for purposes like HTTP
APIs.
Replaces pickling the model
@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Oct 19, 2019

Codecov Report

❗️ No coverage uploaded for pull request base (master@cf08a76). Click here to learn what that means.
The diff coverage is 89.4%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master     #469   +/-   ##
=========================================
  Coverage          ?   80.08%           
=========================================
  Files             ?      101           
  Lines             ?    10343           
  Branches          ?        0           
=========================================
  Hits              ?     8283           
  Misses            ?     2060           
  Partials          ?        0
Impacted Files Coverage Δ
pysb/export/__init__.py 96% <ø> (ø)
pysb/annotation.py 93.33% <100%> (ø)
pysb/tests/test_exporters.py 90.47% <100%> (ø)
pysb/simulator/base.py 88.8% <50%> (ø)
pysb/importers/json.py 93.57% <93.57%> (ø)
pysb/export/json.py 98.48% <98.48%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cf08a76...e4d8e8f. Read the comment docs.

@bgyori

This comment has been minimized.

Copy link
Contributor

bgyori commented Oct 21, 2019

This looks promising! One of the issues though, is that often, one of the main rationales behind pickling a PySB model is to save the result of having run bng.generate_equations on the model (which for certain models can be quite slow). The model can then be quickly reloaded and simulated without having to regenerate the reaction network. This is not immediately possible with the JSON approach, though perhaps could be optionally handled by encoding the equations in JSON as well.

alubbock added 3 commits Oct 21, 2019
Applies to both JSON export and import. Useful for
caching rxn net to avoid network re-generation.
@alubbock

This comment has been minimized.

Copy link
Member Author

alubbock commented Oct 22, 2019

@bgyori No problem, I've added an option for this: JsonExporter(model).export(include_netgen=True)

Copy link
Member

jmuhlich left a comment

This is impressive, and quickly turned around!

# restore the 'model' weakrefs on all components
self.__dict__.update(state)
# Set "tags" attribute for older, pickled models
self.__dict__.setdefault('tags', ComponentSet())

This comment has been minimized.

Copy link
@jmuhlich

jmuhlich Nov 8, 2019

Member

Is this the only thing we would have to patch to support all potential old pickles that might be out there in saved SimulationResult hdf5 files, or are there other attributes we've added since? We should either fix everything or fix nothing, but I fear fixing everything might get out of hand.

This won't address the use case that brought up #467 anyway because they are doing explicit pickling/unpickling not saving SimulationResults.

This comment has been minimized.

Copy link
@alubbock

alubbock Nov 8, 2019

Author Member

This is the only change required to unpickle models or load HDF5 files, at least as far back as PySB v1.6.0 where the HDF5 format was introduced.

For the standalone pickle cases, the following snippet works. Let me know if you think I should remove the underscore prefix and/or document this somehwere explicitly.

from pysb.simulator.base import _patch_model_setstate

with open('earm_1_0_pysb1.6.pkl', 'rb') as f:
    with _patch_model_setstate():
        pickle.load(f)
# Check network generation and single-step integrator
if model.name not in ('pysb.examples.tutorial_b',
'pysb.examples.tutorial_c'):
ScipyOdeSimulator(m)

This comment has been minimized.

Copy link
@jmuhlich

jmuhlich Nov 8, 2019

Member

This just initializes the integrator. Based on the comment on line 112, did you mean to actually run it a bit?

This comment has been minimized.

Copy link
@alubbock

alubbock Nov 8, 2019

Author Member

The integrator RHS is run for t=0 during initialization, at least when using Cython (line 219 of scipyode.py). But I'll change this to enforce Cython, using compiler='cython' kwarg.



class PySBJSONEncoder(json.JSONEncoder):
FORMAT = 1

This comment has been minimized.

Copy link
@jmuhlich

jmuhlich Nov 8, 2019

Member

Can you write some documentation, perthaps on this class and/or the decoder, about what we need to think about as Model and the components gain new attributes in order to maintain proper compatibility here or bump the version number?

Also, to be a bit pedantic, is "format" the right name? I would think "version" or even "protocol" (to mimic pickle) would be more apt.

This comment has been minimized.

Copy link
@alubbock

alubbock Nov 8, 2019

Author Member

I've switched to protocol, rather than format.

I've added some documentation to cover the basics of the format. The protocol number should get bumped when new features are added which affect model simulation, or prevent it being loaded by the decoder. I don't think it's a bad thing if this happens, so long as the decoder maintains backwards compatibility.

# Check network generation and single-step integrator
if model.name not in ('pysb.examples.tutorial_b',
'pysb.examples.tutorial_c'):
ScipyOdeSimulator(m)

This comment has been minimized.

Copy link
@jmuhlich

jmuhlich Nov 8, 2019

Member

This is a reasonable smoke test to make sure the model isn't broken, but it doesn't demonstrate we got the same model back. We ought to actually check that the model is the same, but that's probably infeasible on every example unless we leverage some other exporter (and hope we don't end up with parallel bugs in both exporters). The first thing that comes to mind is to run each model and its re-imported version through the pysb_flat exporter and compare the strings. Of course the pysb_flat exporter doesn't have a round-trip fidelity test either, does it?

Or maybe create a minimal model that exercises every feature once, and explicitly check all the components before and after re-import? That might be a big job, though.

This comment has been minimized.

Copy link
@alubbock

alubbock Nov 8, 2019

Author Member

Indeed - I used PySBFlatExporter to manually check things were working as expected, but symbols within expressions don't appear in a deterministic order. I agree the best solution is probably to create a toy model that uses every feature once and check equality manually.

This comment has been minimized.

Copy link
@alubbock

alubbock Nov 12, 2019

Author Member

I ended up using check_model_against_component_list on all of the example models after round-tripping to JSON, which seems to do the job.

alubbock added 5 commits Nov 8, 2019
This is to avoid e.g. component names getting converted
from strings to unicode. This is merely a test artifact, and
doesn't seem worth fixing given we're dropping Python 2 support
in under 2 months' time.
Component.__init__(self, name, _export)

@property
def value(self):

This comment has been minimized.

Copy link
@jmuhlich

jmuhlich Nov 13, 2019

Member

What's this new getter/setter for? I'm guessing round-trip testing exposed a code path where the normalization of .value to a float was getting bypassed?

This comment has been minimized.

Copy link
@alubbock

alubbock Nov 13, 2019

Author Member

Yes, in the earm_1_3.py model where .value is used with integers, but these values were getting converted to floats after round-tripped to JSON.

@@ -39,7 +39,7 @@


Monomer('EGF', ['R'])
Monomer('EGFR', ['L','CR1','Y1068'], {'Y1068':('U','P')})
Monomer('EGFR', ['L','CR1','Y1068'], {'Y1068': ['U', 'P']})

This comment has been minimized.

Copy link
@jmuhlich

jmuhlich Nov 13, 2019

Member

Maybe Monomer should just normalize sites and site_states.values() to list, but that can be done outside of this PR.

This comment has been minimized.

Copy link
@alubbock

alubbock Nov 13, 2019

Author Member

Agree, we should normalize in core.py.

@@ -112,7 +114,12 @@ def check_convert(model, format):
# Check network generation and single-step integrator

This comment has been minimized.

Copy link
@jmuhlich

jmuhlich Nov 13, 2019

Member

As you pointed out in your previous comment, we actually just need to evaluate the RHS once which is different from stepping the integrator. This comment would be more accurate as something like "Check network generation and force RHS evaluation".

This comment has been minimized.

Copy link
@alubbock

alubbock Nov 13, 2019

Author Member

Comment fixed as requested

@alubbock alubbock merged commit 54ff9e4 into pysb:master Nov 13, 2019
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@alubbock alubbock deleted the alubbock:json branch Nov 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.