Run TestPythonReproducibility #24

MichaelClerx · 2019-02-18T14:28:04Z

Update to pytest
Make pep8 compliant
Look at multiprocessing stuff, still required?
Set up cronjob to run this

MichaelClerx · 2020-05-04T13:08:52Z

As discussed in #181 this will need a cron-job to run.

@jonc125 do you think it will last longer than an hour? Will travis time out?

jonc125 · 2020-05-04T13:12:26Z

That's an interesting question :) I'm not sure! We can tweak...

MichaelClerx · 2020-05-04T13:14:40Z

With PINTS I think we ran into a 1 hour limit, but not sure if that goes away if you have a fancy account.
Otherwise I think GitHub actions is offering more free time at the moment

MichaelClerx · 2020-05-04T13:14:49Z

(Do you have a fancy account?)

jonc125 · 2020-05-04T13:24:03Z

Not on this organisation. We can always cut the test down, split into two, or look at other CI providers, once we see how long it takes.

jonc125 · 2020-10-20T15:58:33Z

This test can now run locally, and I have done so (with 24 cores!) on scrambler, using #217.

FC_LONG_TESTS=1 pytest test/test_python_reproducibility.py -n 24

Currently 24 model/protocol combinations fail unexpectedly:

FAILED test/test_python_reproducibility.py::test_fc_experiment[GraphState-li_mouse_2010] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[GraphState-aslanidi_atrial_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[ExtracellularPotassiumVariation-mahajan_shiferaw_2008] - AssertionEr...
FAILED test/test_python_reproducibility.py::test_fc_experiment[ExtracellularPotassiumVariation-li_mouse_2010] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-aslanidi_atrial_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[IKr_block-ohara_rudy_2011_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-mahajan_shiferaw_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-decker_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[S1S2-difrancesco_noble_model_1985]
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-ten_tusscher_model_2006_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-carro_2011_epi]
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-aslanidi_atrial_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-fink_noble_giles_model_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-decker_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-aslanidi_Purkinje_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-carro_2011_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-decker_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-grandi_pasqualini_bers_2010_ss] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner-mahajan_shiferaw_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner4Hz-mahajan_shiferaw_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-mahajan_shiferaw_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner-shannon_wang_puglisi_weber_bers_2004] - AssertionE...
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner4Hz-difrancesco_noble_model_1985] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner4Hz-carro_2011_epi] - AssertionError:
====================================== 24 failed, 155 passed, 17 xfailed in 4794.38s (1:19:54) =======================================

While a couple are 'failing' because they produce output when there's no reference available, the rest are numerical differences in the output which we should investigate. I suspect the first port of call would be to replicate the work @MauriceHendrix did in chaste-codegen to stop sympy being over-enthusiastic with equation transformations. @mirams, would you be happy for Maurice to help on this?

MauriceHendrix · 2020-10-21T09:20:49Z

I tried to use evaluate=False context but under some of the operations it doesn't stop evaluation annoyingly.

What I have done instead is a bit hacky but it seems to work well:

The following is in my __init.py__

from cellmlmanip.parser import Transpiler
from ._math_functions import (abs_, acos_, cos_, exp_, sin_, sqrt_, )

Transpiler.set_mathml_handler('exp', exp_)
Transpiler.set_mathml_handler('abs', abs_)
Transpiler.set_mathml_handler('acos', acos_)
Transpiler.set_mathml_handler('cos', cos_)
Transpiler.set_mathml_handler('sqrt', sqrt_)
Transpiler.set_mathml_handler('sin', sin_)

Then in my printer I've replaced those to make it print the correct C++ function names.

_function_names = {
        'abs_': 'fabs',
        'acos_': 'acos',
        'cos_': 'cos',
        'exp_': 'exp',
        'sqrt_': 'sqrt',
        'sin_': 'sin',
        ...
       }

The functions themselves are defined as follows. I've included the first differential as otherwise it won't generate correct jacobians.
Also note the MATH_FUNC_SYMPY_MAPPING I use this when I need to check the sign of a stimulus equation, being careful not to change the equation in the model that gets ultimately printed.

from sympy import (
    Abs,
    Function,
    acos,
    cos,
    exp,
    sign,
    sin,
    sqrt,
)


class RealFunction(Function):
    def _eval_is_real(self):
        return self.args[0].is_real


class exp_(RealFunction):

    def fdiff(self, argindex=1):
        """
        Returns the first derivative of this function.
        """
        assert argindex == 1
        return self


class abs_(RealFunction):

    def fdiff(self, argindex=1):
        """
        Returns the first derivative of this function.
        """
        assert argindex == 1
        return sign(self.args[0])


class acos_(RealFunction):

    def fdiff(self, argindex=1):
        """
        Returns the first derivative of this function.
        """
        assert argindex == 1
        return -1 / sqrt_(1 - self.args[0]**2)


class cos_(RealFunction):

    def fdiff(self, argindex=1):
        """
        Returns the first derivative of this function.
        """
        assert argindex == 1
        return -sin_(self.args[0])


class sqrt_(RealFunction):

    def fdiff(self, argindex=1):
        """
        Returns the first derivative of this function.
        """
        assert argindex == 1
        return 1 / (2 * sqrt_(self.args[0]))


class sin_(RealFunction):

    def fdiff(self, argindex=1):
        """
        Returns the first derivative of this function.
        """
        assert argindex == 1
        return (cos_(self.args[0]))


# MATH_FUNC_SYMPY_MAPPING provides a mapping from our specified math functions back to sympy versions.
# This can be used to put sympy function into an expression or evaluation. e.g. `expr.subs(MATH_FUNC_SYMPY_MAPPING)`.
MATH_FUNC_SYMPY_MAPPING = {abs_: Abs, acos_: acos, cos_: cos, exp_: exp, sin_: sin, sqrt_: sqrt}

mirams · 2020-10-29T09:53:06Z

Happy for Maurice to help applying the Chaste-codegen tweaks to the weblab-cg (where exactly did the weblab-cg code end up - is it all in cellmlmanip?)

jonc125 · 2020-11-05T17:39:49Z

We already had the fake handling of exp, albeit with the older style that just invented a name rather than having a class with derivative support, since we don't use Jacobians in WL. Does the class form make any other improvement @MauriceHendrix ?

I added old-style prevention for the other functions Maurice listed in the fix-reproducibility branch, and we're at 22 failures:

FAILED test/test_python_reproducibility.py::test_fc_experiment[GraphState-li_mouse_2010] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[GraphState-aslanidi_atrial_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[ExtracellularPotassiumVariation-mahajan_shiferaw_2008] - AssertionEr...
FAILED test/test_python_reproducibility.py::test_fc_experiment[ExtracellularPotassiumVariation-li_mouse_2010] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-aslanidi_atrial_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-fink_noble_giles_model_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-ten_tusscher_model_2006_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-aslanidi_atrial_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[INa_block-decker_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-decker_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-grandi_pasqualini_bers_2010_ss] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-carro_2011_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[S1S2-difrancesco_noble_model_1985]
FAILED test/test_python_reproducibility.py::test_fc_experiment[IKr_block-ohara_rudy_2011_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-decker_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[RyR_block-carro_2011_epi]
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner4Hz-difrancesco_noble_model_1985] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner4Hz-carro_2011_epi] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[NCX_block-aslanidi_Purkinje_model_2009] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner4Hz-mahajan_shiferaw_2008] - AssertionError:
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner-shannon_wang_puglisi_weber_bers_2004] - AssertionE...
FAILED test/test_python_reproducibility.py::test_fc_experiment[SteadyStateRunner-mahajan_shiferaw_2008] - AssertionError:
====================================== 22 failed, 157 passed, 17 xfailed in 5356.86s (1:29:16) =======================================

So slight improvement but clearly something else going on.

Were there any other tricks you used @MauriceHendrix ?

MauriceHendrix · 2020-11-06T10:44:55Z

@jonc125 no I don't think so I mainly had to introduce the classes it to get it to make Jacobians correctly.

MauriceHendrix · 2020-11-06T14:41:56Z

@jonc125 @MichaelClerx looking at the actual test I'm getting really confused as I can't see where the test_fc_experiment is?
I'm assuming you're tried math.isclose or similar on numeric values?

jonc125 · 2020-11-06T15:14:01Z

Hi @MauriceHendrix the main test is https://github.com/ModellingWebLab/weblab-fc/blob/fix-reproducibility/test/test_python_reproducibility.py (starting line 59 after all the decorators!). The results checking uses np.testing.assert_allclose, which used to pass with pycml. But the implementation is spread around the fc package, e.g. code generation which writes the .pyx code, protocol.py which compiles it, etc.

MauriceHendrix · 2020-11-06T16:16:52Z

Hhm very ood a search didn't find that function earlier anyway ...
hhm if you're already doing that and the sticking names in then I'm not sure ...
Are you using any of numpy to calculate anything besides all_close? We had a smilar issue with another project using numpy

jonc125 · 2020-11-10T19:28:58Z

With #227 we're now down to just 1 failure! ExtracellularPotassiumVariation-li_mouse_2010

This is the code I used to compare results by eye, having run

FC_OUTPUT_FOLDER=`pwd`/testoutput FC_LONG_TESTS=1 pytest test/test_python_reproducibility.py -n 24 | tee output.log
# Or for a single combination:
FC_OUTPUT_FOLDER=`pwd`/testoutput FC_LONG_TESTS=1 pytest test/test_python_reproducibility.py -k 'Extra and li_mouse' -s

import matplotlib.pyplot as plt
import numpy as np

def load1d(model, proto, output):
    from fc.data_loading import load
    ref = load(f'test/output/real/{model}/{proto}/outputs_{output}.csv').array
    actual = np.loadtxt(f'testoutput/{model}/{proto}/outputs_{output}.csv', dtype=float)
    return ref, actual

def load2d(model, proto, output):
    ref = np.loadtxt(f'test/output/real/{model}/{proto}/outputs_{output}.csv', dtype=float, delimiter=',', ndmin=2, unpack=True)
    actual = np.loadtxt(f'testoutput/{model}/{proto}/outputs_{output}.csv', dtype=float, delimiter=',', ndmin=2, unpack=True)
    return ref, actual

def compare(ref, actual):
    if ref.ndim == 1:
        fig, ax = plt.subplots()
        ax.plot(ref, label='ref')
        ax.plot(actual, label='new')
        ax.legend()
        fig.show()
    elif ref.ndim == 2:
        for i in range(ref.shape[0]):
            fig, ax = plt.subplots()
            ax.plot(ref[i, :], label='ref')
            ax.plot(actual[i, :], label='new')
            ax.legend()
            fig.show()

compare(*load2d('li_mouse_2010', 'ExtracellularPotassiumVariation', 'detailed_voltage'))

Interestingly, on scrambler there's only a difference in the voltage trace at step 1 (2 mM) whereas on my Mac it's also very different for step 8 (9 mM). So seems to be rather sensitive to the library versions etc.

2mM
9 mM

mirams · 2020-11-10T19:50:38Z

I suspect what looks dramatic in the second is a 2:1 pacing which the ref happens to hit on the start of the period 2 orbit and the new happens to hit in the middle.

mirams · 2020-11-10T19:51:57Z

didnt mean to close it, typing on phone!

jonc125 · 2020-11-10T21:28:36Z

By the way, the apparently different initial conditions in the second plot are because this is just the last second of simulated time after the model has been run to steady state. Given it's supposedly a steady-state picture I'm a bit puzzled by the reference result! It's definitely an outlier compared to the other concentrations, but does still happen on scrambler...

Would you be happy for us to just update the reference solution for this final case @mirams ? Or should we investigate further?

mirams · 2020-11-17T09:23:22Z

If we can plot two paces and just check they overlay when one is shifted by 1000ms, then fine to replace I think.

jonc125 · 2020-11-18T15:48:14Z

It doesn't look like alternans in the 2mM case:

mirams · 2020-11-18T16:01:45Z

Hmmm, can you play around in Myokit with this model and setting @MichaelClerx - if this is something of a sodium channel bifurcation we just happen to be hitting at 2mM I guess that is fine (i.e. if 2.01 or 1.99mM look like these?). Otherwise worth investigating the difference here.

MichaelClerx · 2020-11-18T16:09:09Z

OK, will investigate! But probably not this week while I wait for finger to recover / learn better typing habits 😅

MichaelClerx added the enhancement New features, incremental improvements label Apr 12, 2019

This was referenced May 4, 2020

Make TestPythonReproducibilty work again #161

Closed

Re-enable or get rid of long tests #181

Closed

This was referenced May 5, 2020

Trac: Extensively regression test C++ and Python implementations on multiple protocols #201

Closed

Remove unused test files #204

Closed

MichaelClerx mentioned this issue Jul 13, 2020

Epic 1: WebLab code gen with cellmlmanip ModellingWebLab/project_issues#57

Closed

jonc125 mentioned this issue Oct 20, 2020

Epic 3: Update WebLab deploy to use weblab-fc ModellingWebLab/project_issues#59

Open

9 tasks

MichaelClerx mentioned this issue Oct 27, 2020

Trying out long tests #182

Closed

mirams closed this as completed Nov 10, 2020

mirams reopened this Nov 10, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Run TestPythonReproducibility #24

Run TestPythonReproducibility #24

MichaelClerx commented Feb 18, 2019 •

edited by jonc125

Loading

MichaelClerx commented May 4, 2020

jonc125 commented May 4, 2020

MichaelClerx commented May 4, 2020

MichaelClerx commented May 4, 2020

jonc125 commented May 4, 2020

jonc125 commented Oct 20, 2020 •

edited

Loading

MauriceHendrix commented Oct 21, 2020 •

edited

Loading

mirams commented Oct 29, 2020

jonc125 commented Nov 5, 2020

MauriceHendrix commented Nov 6, 2020

MauriceHendrix commented Nov 6, 2020

jonc125 commented Nov 6, 2020

MauriceHendrix commented Nov 6, 2020

jonc125 commented Nov 10, 2020

mirams commented Nov 10, 2020

mirams commented Nov 10, 2020

jonc125 commented Nov 10, 2020

mirams commented Nov 17, 2020

jonc125 commented Nov 18, 2020

mirams commented Nov 18, 2020

MichaelClerx commented Nov 18, 2020

Run TestPythonReproducibility #24

Run TestPythonReproducibility #24

Comments

MichaelClerx commented Feb 18, 2019 • edited by jonc125 Loading

MichaelClerx commented May 4, 2020

jonc125 commented May 4, 2020

MichaelClerx commented May 4, 2020

MichaelClerx commented May 4, 2020

jonc125 commented May 4, 2020

jonc125 commented Oct 20, 2020 • edited Loading

MauriceHendrix commented Oct 21, 2020 • edited Loading

mirams commented Oct 29, 2020

jonc125 commented Nov 5, 2020

MauriceHendrix commented Nov 6, 2020

MauriceHendrix commented Nov 6, 2020

jonc125 commented Nov 6, 2020

MauriceHendrix commented Nov 6, 2020

jonc125 commented Nov 10, 2020

mirams commented Nov 10, 2020

mirams commented Nov 10, 2020

jonc125 commented Nov 10, 2020

mirams commented Nov 17, 2020

jonc125 commented Nov 18, 2020

mirams commented Nov 18, 2020

MichaelClerx commented Nov 18, 2020

MichaelClerx commented Feb 18, 2019 •

edited by jonc125

Loading

jonc125 commented Oct 20, 2020 •

edited

Loading

MauriceHendrix commented Oct 21, 2020 •

edited

Loading