TST: TestMICE.test_combine, test_corrpsd_threshold[0], test_mixedlm failing on Debian unstable #7911

Open
rebecca-palmer opened this issue Nov 27, 2021 · 2 comments

@rebecca-palmer (Contributor)

Describe the bug

In Debian unstable, TestMICE.test_combine, test_corrpsd_threshold[0] and test_mixedlm are failing. (This log is from statsmodels 0.12.2, but 0.13.1 has the same errors; I haven't tried current main.)

As the output of test_corrpsd_threshold is so close to 0, and the results of TestMICE.test_combine and test_mixedlm depend substantially on the np.random state, I suspect that this is a rounding issue rather than a real incorrect-results bug, but I don't have proof of that.
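For the corrpsd case, the margin involved can be seen by rerunning the failing assertion outside the test suite. A minimal sketch, lifted directly from the test itself (corr_nearest is the statsmodels function under test):

import numpy as np
from statsmodels.stats.correlation_tools import corr_nearest

# Same 3x3 correlation matrix as test_corrpsd_threshold
x = np.array([[1, -0.9, -0.9], [-0.9, 1, -0.9], [-0.9, -0.9, 1]])
y = corr_nearest(x, n_fact=100, threshold=0)

# The test asserts the smallest eigenvalue equals the threshold (0) within
# atol=1e-15; on the failing systems it comes out as ~1.05e-15, i.e. a few
# ULPs past the tolerance.
print(np.linalg.eigvalsh(y)[0])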

=================================== FAILURES ===================================
____________________________ TestMICE.test_combine _____________________________

self = <statsmodels.imputation.tests.test_mice.TestMICE object at 0x7f946cec9070>

    @pytest.mark.slow
    def test_combine(self):
    
        np.random.seed(3897)
        x1 = np.random.normal(size=300)
        x2 = np.random.normal(size=300)
        y = x1 + x2 + np.random.normal(size=300)
        x1[0:100] = np.nan
        x2[250:] = np.nan
        df = pd.DataFrame({"x1": x1, "x2": x2, "y": y})
        idata = mice.MICEData(df)
        mi = mice.MICE("y ~ x1 + x2", sm.OLS, idata, n_skip=20)
        result = mi.fit(10, 20)
    
        fmi = np.asarray([0.1778143, 0.11057262, 0.29626521])
>       assert_allclose(result.frac_miss_info, fmi, atol=1e-5)
E       AssertionError:
E       Not equal to tolerance rtol=1e-07, atol=1e-05
E
E       Mismatched elements: 3 / 3 (100%)
E       Max absolute difference: 0.17686937
E       Max relative difference: 1.59957657
E        x: array([0.230217, 0.287442, 0.322124])
E        y: array([0.177814, 0.110573, 0.296265])

/usr/lib/python3/dist-packages/statsmodels/imputation/tests/test_mice.py:366: AssertionError
__________________________ test_corrpsd_threshold[0] ___________________________

threshold = 0

    @pytest.mark.parametrize('threshold', [0, 1e-15, 1e-10, 1e-6])
    def test_corrpsd_threshold(threshold):
        x = np.array([[1, -0.9, -0.9], [-0.9, 1, -0.9], [-0.9, -0.9, 1]])
    
        y = corr_nearest(x, n_fact=100, threshold=threshold)
        evals = np.linalg.eigvalsh(y)
>       assert_allclose(evals[0], threshold, rtol=1e-6, atol=1e-15)
E       AssertionError:
E       Not equal to tolerance rtol=1e-06, atol=1e-15
E
E       Mismatched elements: 1 / 1 (100%)
E       Max absolute difference: 1.05471187e-15
E       Max relative difference: inf
E        x: array(1.054712e-15)
E        y: array(0)

/usr/lib/python3/dist-packages/statsmodels/stats/tests/test_corrpsd.py:196: AssertionError
_________________________________ test_mixedlm _________________________________

    def test_mixedlm():
    
        np.random.seed(3424)
    
        n = 200
    
        # The exposure (not time varying)
        x = np.random.normal(size=n)
        xv = np.outer(x, np.ones(3))
    
        # The mediator (with random intercept)
        mx = np.asarray([4., 4, 1])
        mx /= np.sqrt(np.sum(mx**2))
        med = mx[0] * np.outer(x, np.ones(3))
        med += mx[1] * np.outer(np.random.normal(size=n), np.ones(3))
        med += mx[2] * np.random.normal(size=(n, 3))
    
        # The outcome (exposure and mediator effects)
        ey = np.outer(x, np.r_[0, 0.5, 1]) + med
    
        # Random structure of the outcome (random intercept and slope)
        ex = np.asarray([5., 2, 2])
        ex /= np.sqrt(np.sum(ex**2))
        e = ex[0] * np.outer(np.random.normal(size=n), np.ones(3))
        e += ex[1] * np.outer(np.random.normal(size=n), np.r_[-1, 0, 1])
        e += ex[2] * np.random.normal(size=(n, 3))
        y = ey + e
    
        # Group membership
        idx = np.outer(np.arange(n), np.ones(3))
    
        # Time
        tim = np.outer(np.ones(n), np.r_[-1, 0, 1])
    
        df = pd.DataFrame({"y": y.flatten(), "x": xv.flatten(),
                           "id": idx.flatten(), "time": tim.flatten(),
                           "med": med.flatten()})
    
        mediator_model = sm.MixedLM.from_formula("med ~ x", groups="id", data=df)
        outcome_model = sm.MixedLM.from_formula("y ~ med + x", groups="id", data=df)
        me = Mediation(outcome_model, mediator_model, "x", "med")
        mr = me.fit(n_rep=2)
        st = mr.summary()
        pm = st.loc["Prop. mediated (average)", "Estimate"]
>       assert_allclose(pm, 0.52, rtol=1e-2, atol=1e-2)
E       AssertionError:
E       Not equal to tolerance rtol=0.01, atol=0.01
E
E       Mismatched elements: 1 / 1 (100%)
E       Max absolute difference: 0.01958632
E       Max relative difference: 0.03766599
E        x: array(0.539586)
E        y: array(0.52)

/usr/lib/python3/dist-packages/statsmodels/stats/tests/test_mediation.py:214: AssertionError

Code Sample, a copy-pastable example if possible

The statsmodels test suite.

Expected Output

The tests should pass.

Output of import statsmodels.api as sm; sm.show_versions()

The problem started when Debian upgraded from libblas3/liblapack3 3.9 to 3.10.

Python 3.9, numpy 1.19, scipy 1.7, matplotlib 3.3, pandas 1.1.
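Since the regression coincided with the BLAS/LAPACK upgrade, it may help to record which BLAS/LAPACK build NumPy is using on an affected machine. A minimal check via NumPy's public np.show_config() (output format varies by NumPy version):

import numpy as np

# Prints the BLAS/LAPACK libraries NumPy was configured with at build time.
np.show_config()

If threadpoolctl happens to be installed, python -m threadpoolctl -i numpy additionally reports the BLAS actually loaded at runtime, which can differ from the build-time configuration on distributions with switchable BLAS alternatives (as Debian has).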

@josef-pkt (Member)

The fix for corrpsd is most likely #3716 or something like that.

I don't know the code for the other two well enough to guess how fragile they are or why they broke.

@ArchangeGabriel

FWIW, I see the same three failures on an up-to-date Arch Linux with 0.13.2:

____________________________ TestMICE.test_combine _____________________________

self = <statsmodels.imputation.tests.test_mice.TestMICE object at 0x7f21c0ea6800>

    @pytest.mark.slow
    def test_combine(self):
    
        np.random.seed(3897)
        x1 = np.random.normal(size=300)
        x2 = np.random.normal(size=300)
        y = x1 + x2 + np.random.normal(size=300)
        x1[0:100] = np.nan
        x2[250:] = np.nan
        df = pd.DataFrame({"x1": x1, "x2": x2, "y": y})
        idata = mice.MICEData(df)
        mi = mice.MICE("y ~ x1 + x2", sm.OLS, idata, n_skip=20)
        result = mi.fit(10, 20)
    
        fmi = np.asarray([0.1778143, 0.11057262, 0.29626521])
>       assert_allclose(result.frac_miss_info, fmi, atol=1e-5)
E       AssertionError: 
E       Not equal to tolerance rtol=1e-07, atol=1e-05
E       
E       Mismatched elements: 3 / 3 (100%)
E       Max absolute difference: 0.17686937
E       Max relative difference: 1.59957657
E        x: array([0.230217, 0.287442, 0.322124])
E        y: array([0.177814, 0.110573, 0.296265])

statsmodels/imputation/tests/test_mice.py:366: AssertionError
__________________________ test_corrpsd_threshold[0] ___________________________

threshold = 0

    @pytest.mark.parametrize('threshold', [0, 1e-15, 1e-10, 1e-6])
    def test_corrpsd_threshold(threshold):
        x = np.array([[1, -0.9, -0.9], [-0.9, 1, -0.9], [-0.9, -0.9, 1]])
    
        y = corr_nearest(x, n_fact=100, threshold=threshold)
        evals = np.linalg.eigvalsh(y)
>       assert_allclose(evals[0], threshold, rtol=1e-6, atol=1e-15)
E       AssertionError: 
E       Not equal to tolerance rtol=1e-06, atol=1e-15
E       
E       Mismatched elements: 1 / 1 (100%)
E       Max absolute difference: 1.05471187e-15
E       Max relative difference: inf
E        x: array(1.054712e-15)
E        y: array(0)

statsmodels/stats/tests/test_corrpsd.py:196: AssertionError
_________________________________ test_mixedlm _________________________________

    def test_mixedlm():
    
        np.random.seed(3424)
    
        n = 200
    
        # The exposure (not time varying)
        x = np.random.normal(size=n)
        xv = np.outer(x, np.ones(3))
    
        # The mediator (with random intercept)
        mx = np.asarray([4., 4, 1])
        mx /= np.sqrt(np.sum(mx**2))
        med = mx[0] * np.outer(x, np.ones(3))
        med += mx[1] * np.outer(np.random.normal(size=n), np.ones(3))
        med += mx[2] * np.random.normal(size=(n, 3))
    
        # The outcome (exposure and mediator effects)
        ey = np.outer(x, np.r_[0, 0.5, 1]) + med
    
        # Random structure of the outcome (random intercept and slope)
        ex = np.asarray([5., 2, 2])
        ex /= np.sqrt(np.sum(ex**2))
        e = ex[0] * np.outer(np.random.normal(size=n), np.ones(3))
        e += ex[1] * np.outer(np.random.normal(size=n), np.r_[-1, 0, 1])
        e += ex[2] * np.random.normal(size=(n, 3))
        y = ey + e
    
        # Group membership
        idx = np.outer(np.arange(n), np.ones(3))
    
        # Time
        tim = np.outer(np.ones(n), np.r_[-1, 0, 1])
    
        df = pd.DataFrame({"y": y.flatten(), "x": xv.flatten(),
                           "id": idx.flatten(), "time": tim.flatten(),
                           "med": med.flatten()})
    
        mediator_model = sm.MixedLM.from_formula("med ~ x", groups="id", data=df)
        outcome_model = sm.MixedLM.from_formula("y ~ med + x", groups="id", data=df)
        me = Mediation(outcome_model, mediator_model, "x", "med")
        mr = me.fit(n_rep=2)
        st = mr.summary()
        pm = st.loc["Prop. mediated (average)", "Estimate"]
>       assert_allclose(pm, 0.52, rtol=1e-2, atol=1e-2)
E       AssertionError: 
E       Not equal to tolerance rtol=0.01, atol=0.01
E       
E       Mismatched elements: 1 / 1 (100%)
E       Max absolute difference: 0.01958632
E       Max relative difference: 0.03766599
E        x: array(0.539586)
E        y: array(0.52)

statsmodels/stats/tests/test_mediation.py:214: AssertionError

The corrpsd tests are emitting a warning:

stats/tests/test_corrpsd.py::TestCovPSD::test_cov_nearest
stats/tests/test_corrpsd.py::TestCorrPSD1::test_nearest
stats/tests/test_corrpsd.py::test_corrpsd_threshold[0]
stats/tests/test_corrpsd.py::test_corrpsd_threshold[1e-15]
stats/tests/test_corrpsd.py::test_corrpsd_threshold[1e-10]
stats/tests/test_corrpsd.py::test_corrpsd_threshold[1e-06]
  /build/python-statsmodels/src/statsmodels-0.13.2/build/lib.linux-x86_64-3.10/statsmodels/stats/correlation_tools.py:90: IterationLimitWarning: 
  Maximum iteration reached.
  
    warnings.warn(iteration_limit_doc, IterationLimitWarning)

But I have no idea if that’s related.

No warnings for the two others (but 285 warnings in total).
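In case it helps triage: one quick way to check whether the IterationLimitWarning matters for the corrpsd failures is to raise the iteration cap and see whether the smallest eigenvalue moves. A sketch using corr_nearest's existing n_fact parameter (which scales the maximum number of iterations):

import numpy as np
from statsmodels.stats.correlation_tools import corr_nearest

x = np.array([[1, -0.9, -0.9], [-0.9, 1, -0.9], [-0.9, -0.9, 1]])
for n_fact in (100, 1000, 10000):
    y = corr_nearest(x, n_fact=n_fact, threshold=0)
    # If the algorithm has converged, the smallest eigenvalue should be
    # essentially independent of n_fact.
    print(n_fact, np.linalg.eigvalsh(y)[0])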
