Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run TestPythonReproducibility #24

Open
3 of 4 tasks
MichaelClerx opened this issue Feb 18, 2019 · 21 comments
Open
3 of 4 tasks

Run TestPythonReproducibility #24

MichaelClerx opened this issue Feb 18, 2019 · 21 comments
Labels
enhancement New features, incremental improvements

Comments

@MichaelClerx
Copy link
Contributor

MichaelClerx commented Feb 18, 2019

  • Update to pytest
  • Make pep8 compliant
  • Look at multiprocessing stuff, still required?
  • Set up cronjob to run this
@MichaelClerx MichaelClerx added the enhancement New features, incremental improvements label Apr 12, 2019
@MichaelClerx
Copy link
Contributor Author

As discussed in #181 this will need a cron-job to run.

@jonc125 do you think it will last longer than an hour? Will travis time out?

@jonc125
Copy link
Contributor

jonc125 commented May 4, 2020

That's an interesting question :) I'm not sure! We can tweak...

@MichaelClerx
Copy link
Contributor Author

With PINTS I think we ran into a 1 hour limit, but not sure if that goes away if you have a fancy account.
Otherwise I think GitHub actions is offering more free time at the moment

@MichaelClerx
Copy link
Contributor Author

(Do you have a fancy account?)

@jonc125
Copy link
Contributor

jonc125 commented May 4, 2020

Not on this organisation. We can always cut the test down, split into two, or look at other CI providers, once we see how long it takes.

@jonc125
Copy link
Contributor

jonc125 commented Oct 20, 2020

This test can now run locally, and I have done so (with 24 cores!) on scrambler, using #217.

FC_LONG_TESTS=1 pytest test/test_python_reproducibility.py -n 24

Currently 24 model/protocol combinations fail unexpectedly:

FAILED test/test_python_reproducibility.py::test_fc_experiment[GraphState-li_mouse_2010] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[GraphState-aslanidi_atrial_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[ExtracellularPotassiumVariation-mahajan_shiferaw_2008] - AssertionEr...
FAILED test/test_python_reproducibility.py::test_fc_experiment[ExtracellularPotassiumVariation-li_mouse_2010] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-aslanidi_atrial_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[IKr_block-ohara_rudy_2011_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-mahajan_shiferaw_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-decker_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[S1S2-difrancesco_noble_model_1985]
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-ten_tusscher_model_2006_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-carro_2011_epi]
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-aslanidi_atrial_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-fink_noble_giles_model_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-decker_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-aslanidi_Purkinje_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-carro_2011_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-decker_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-grandi_pasqualini_bers_2010_ss] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner-mahajan_shiferaw_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner4Hz-mahajan_shiferaw_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-mahajan_shiferaw_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner-shannon_wang_puglisi_weber_bers_2004] - AssertionE...
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner4Hz-difrancesco_noble_model_1985] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner4Hz-carro_2011_epi] - AssertionError:
====================================== 24 failed, 155 passed, 17 xfailed in 4794.38s (1:19:54) =======================================

While a couple are 'failing' because they produce output when there's no reference available, the rest are numerical differences in the output which we should investigate. I suspect the first port of call would be to replicate the work @MauriceHendrix did in chaste-codegen to stop sympy being over-enthusiastic with equation transformations. @mirams, would you be happy for Maurice to help on this?

@MauriceHendrix
Copy link
Contributor

MauriceHendrix commented Oct 21, 2020

I tried to use evaluate=False context but under some of the operations it doesn't stop evaluation annoyingly.

What I have done instead is a bit hacky but it seems to work well:

The following is in my __init.py__

from cellmlmanip.parser import Transpiler
from ._math_functions import (abs_, acos_, cos_, exp_, sin_, sqrt_, )

Transpiler.set_mathml_handler('exp', exp_)
Transpiler.set_mathml_handler('abs', abs_)
Transpiler.set_mathml_handler('acos', acos_)
Transpiler.set_mathml_handler('cos', cos_)
Transpiler.set_mathml_handler('sqrt', sqrt_)
Transpiler.set_mathml_handler('sin', sin_)

Then in my printer I've replaced those to make it print the correct C++ function names.

_function_names = {
        'abs_': 'fabs',
        'acos_': 'acos',
        'cos_': 'cos',
        'exp_': 'exp',
        'sqrt_': 'sqrt',
        'sin_': 'sin',
        ...
       }

The functions themselves are defined as follows. I've included the first differential as otherwise it won't generate correct jacobians.
Also note the MATH_FUNC_SYMPY_MAPPING I use this when I need to check the sign of a stimulus equation, being careful not to change the equation in the model that gets ultimately printed.

from sympy import (
    Abs,
    Function,
    acos,
    cos,
    exp,
    sign,
    sin,
    sqrt,
)


class RealFunction(Function):
    def _eval_is_real(self):
        return self.args[0].is_real


class exp_(RealFunction):

    def fdiff(self, argindex=1):
        """
        Returns the first derivative of this function.
        """
        assert argindex == 1
        return self


class abs_(RealFunction):

    def fdiff(self, argindex=1):
        """
        Returns the first derivative of this function.
        """
        assert argindex == 1
        return sign(self.args[0])


class acos_(RealFunction):

    def fdiff(self, argindex=1):
        """
        Returns the first derivative of this function.
        """
        assert argindex == 1
        return -1 / sqrt_(1 - self.args[0]**2)


class cos_(RealFunction):

    def fdiff(self, argindex=1):
        """
        Returns the first derivative of this function.
        """
        assert argindex == 1
        return -sin_(self.args[0])


class sqrt_(RealFunction):

    def fdiff(self, argindex=1):
        """
        Returns the first derivative of this function.
        """
        assert argindex == 1
        return 1 / (2 * sqrt_(self.args[0]))


class sin_(RealFunction):

    def fdiff(self, argindex=1):
        """
        Returns the first derivative of this function.
        """
        assert argindex == 1
        return (cos_(self.args[0]))


# MATH_FUNC_SYMPY_MAPPING provides a mapping from our specified math functions back to sympy versions.
# This can be used to put sympy function into an expression or evaluation. e.g. `expr.subs(MATH_FUNC_SYMPY_MAPPING)`.
MATH_FUNC_SYMPY_MAPPING = {abs_: Abs, acos_: acos, cos_: cos, exp_: exp, sin_: sin, sqrt_: sqrt}

@mirams
Copy link

mirams commented Oct 29, 2020

Happy for Maurice to help applying the Chaste-codegen tweaks to the weblab-cg (where exactly did the weblab-cg code end up - is it all in cellmlmanip?)

@jonc125
Copy link
Contributor

jonc125 commented Nov 5, 2020

We already had the fake handling of exp, albeit with the older style that just invented a name rather than having a class with derivative support, since we don't use Jacobians in WL. Does the class form make any other improvement @MauriceHendrix ?

I added old-style prevention for the other functions Maurice listed in the fix-reproducibility branch, and we're at 22 failures:

FAILED test/test_python_reproducibility.py::test_fc_experiment[GraphState-li_mouse_2010] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[GraphState-aslanidi_atrial_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[ExtracellularPotassiumVariation-mahajan_shiferaw_2008] - AssertionEr...
FAILED test/test_python_reproducibility.py::test_fc_experiment[ExtracellularPotassiumVariation-li_mouse_2010] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-aslanidi_atrial_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-fink_noble_giles_model_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-ten_tusscher_model_2006_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-aslanidi_atrial_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-decker_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-decker_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-grandi_pasqualini_bers_2010_ss] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-carro_2011_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[S1S2-difrancesco_noble_model_1985]
FAILED test/test_python_reproducibility.py::test_fc_experiment[IKr_block-ohara_rudy_2011_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-decker_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-carro_2011_epi]
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner4Hz-difrancesco_noble_model_1985] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner4Hz-carro_2011_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-aslanidi_Purkinje_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner4Hz-mahajan_shiferaw_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner-shannon_wang_puglisi_weber_bers_2004] - AssertionE...
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner-mahajan_shiferaw_2008] - AssertionError:
====================================== 22 failed, 157 passed, 17 xfailed in 5356.86s (1:29:16) =======================================

So slight improvement but clearly something else going on.

Were there any other tricks you used @MauriceHendrix ?

@MauriceHendrix
Copy link
Contributor

@jonc125 no I don't think so I mainly had to introduce the classes it to get it to make Jacobians correctly.

@MauriceHendrix
Copy link
Contributor

@jonc125 @MichaelClerx looking at the actual test I'm getting really confused as I can't see where the test_fc_experiment is?
I'm assuming you're tried math.isclose or similar on numeric values?

@jonc125
Copy link
Contributor

jonc125 commented Nov 6, 2020

Hi @MauriceHendrix the main test is https://github.com/ModellingWebLab/weblab-fc/blob/fix-reproducibility/test/test_python_reproducibility.py (starting line 59 after all the decorators!). The results checking uses np.testing.assert_allclose, which used to pass with pycml. But the implementation is spread around the fc package, e.g. code generation which writes the .pyx code, protocol.py which compiles it, etc.

@MauriceHendrix
Copy link
Contributor

Hhm very ood a search didn't find that function earlier anyway ...
hhm if you're already doing that and the sticking names in then I'm not sure ...
Are you using any of numpy to calculate anything besides all_close? We had a smilar issue with another project using numpy

@jonc125
Copy link
Contributor

jonc125 commented Nov 10, 2020

With #227 we're now down to just 1 failure! ExtracellularPotassiumVariation-li_mouse_2010

This is the code I used to compare results by eye, having run

FC_OUTPUT_FOLDER=`pwd`/testoutput FC_LONG_TESTS=1 pytest test/test_python_reproducibility.py -n 24 | tee output.log
# Or for a single combination:
FC_OUTPUT_FOLDER=`pwd`/testoutput FC_LONG_TESTS=1 pytest test/test_python_reproducibility.py -k 'Extra and li_mouse' -s
import matplotlib.pyplot as plt
import numpy as np

def load1d(model, proto, output):
    from fc.data_loading import load
    ref = load(f'test/output/real/{model}/{proto}/outputs_{output}.csv').array
    actual = np.loadtxt(f'testoutput/{model}/{proto}/outputs_{output}.csv', dtype=float)
    return ref, actual

def load2d(model, proto, output):
    ref = np.loadtxt(f'test/output/real/{model}/{proto}/outputs_{output}.csv', dtype=float, delimiter=',', ndmin=2, unpack=True)
    actual = np.loadtxt(f'testoutput/{model}/{proto}/outputs_{output}.csv', dtype=float, delimiter=',', ndmin=2, unpack=True)
    return ref, actual

def compare(ref, actual):
    if ref.ndim == 1:
        fig, ax = plt.subplots()
        ax.plot(ref, label='ref')
        ax.plot(actual, label='new')
        ax.legend()
        fig.show()
    elif ref.ndim == 2:
        for i in range(ref.shape[0]):
            fig, ax = plt.subplots()
            ax.plot(ref[i, :], label='ref')
            ax.plot(actual[i, :], label='new')
            ax.legend()
            fig.show()

compare(*load2d('li_mouse_2010', 'ExtracellularPotassiumVariation', 'detailed_voltage'))

Interestingly, on scrambler there's only a difference in the voltage trace at step 1 (2 mM) whereas on my Mac it's also very different for step 8 (9 mM). So seems to be rather sensitive to the library versions etc.

  • 2mM EPV step 1
  • 9 mM EPV step 8

@mirams
Copy link

mirams commented Nov 10, 2020

I suspect what looks dramatic in the second is a 2:1 pacing which the ref happens to hit on the start of the period 2 orbit and the new happens to hit in the middle.

@mirams mirams closed this as completed Nov 10, 2020
@mirams mirams reopened this Nov 10, 2020
@mirams
Copy link

mirams commented Nov 10, 2020

didnt mean to close it, typing on phone!

@jonc125
Copy link
Contributor

jonc125 commented Nov 10, 2020

By the way, the apparently different initial conditions in the second plot are because this is just the last second of simulated time after the model has been run to steady state. Given it's supposedly a steady-state picture I'm a bit puzzled by the reference result! It's definitely an outlier compared to the other concentrations, but does still happen on scrambler...

Would you be happy for us to just update the reference solution for this final case @mirams ? Or should we investigate further?

@mirams
Copy link

mirams commented Nov 17, 2020

If we can plot two paces and just check they overlay when one is shifted by 1000ms, then fine to replace I think.

@jonc125
Copy link
Contributor

jonc125 commented Nov 18, 2020

It doesn't look like alternans in the 2mM case:

EPV step 1 - 2 paces

@mirams
Copy link

mirams commented Nov 18, 2020

Hmmm, can you play around in Myokit with this model and setting @MichaelClerx - if this is something of a sodium channel bifurcation we just happen to be hitting at 2mM I guess that is fine (i.e. if 2.01 or 1.99mM look like these?). Otherwise worth investigating the difference here.

@MichaelClerx
Copy link
Contributor Author

OK, will investigate! But probably not this week while I wait for finger to recover / learn better typing habits 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New features, incremental improvements
Projects
None yet
Development

No branches or pull requests

4 participants