BUG: Bambi logistic regression fails on M1 silicon #6415

tarmojuristo · 2022-12-27T16:50:55Z

Describe the issue:

Running a logistic regression in Bambi with default settings fails. However, if chains=1 is set in model.fit() then everything works fine.

Reproduceable code example:

import bambi as bmb
import numpy as np

data = bmb.load_data("adults")

age_mean = np.mean(data["age"])
age_std = np.std(data["age"])

data["age"] = (data["age"] - age_mean) / age_std

model = bmb.Model("income['>50K'] ~ sex + age", data, family="bernoulli")
fitted = model.fit(draws=1000)

Error message:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [Intercept, sex, age]
Traceback (most recent call last):000 00:00<? Sampling 4 chains, 0 divergences]
  File "/Users/tarmo/Desktop/SALK/streamlit-model/example.py", line 12, in <module>
    fitted = model.fit(draws=1000)
  File "/Users/tarmo/mambaforge/envs/pymc/lib/python3.10/site-packages/bambi/models.py", line 271, in fit
    return self.backend.run(
  File "/Users/tarmo/mambaforge/envs/pymc/lib/python3.10/site-packages/bambi/backend/pymc.py", line 98, in run
    result = self._run_mcmc(
  File "/Users/tarmo/mambaforge/envs/pymc/lib/python3.10/site-packages/bambi/backend/pymc.py", line 289, in _run_mcmc
    idata = pm.sample(
  File "/Users/tarmo/mambaforge/envs/pymc/lib/python3.10/site-packages/pymc/sampling/mcmc.py", line 529, in sample
    mtrace = _mp_sample(**sample_args, **parallel_args)
  File "/Users/tarmo/mambaforge/envs/pymc/lib/python3.10/site-packages/pymc/sampling/mcmc.py", line 1019, in _mp_sample
    for draw in sampler:
  File "/Users/tarmo/mambaforge/envs/pymc/lib/python3.10/site-packages/pymc/sampling/parallel.py", line 449, in __iter__
    draw = ProcessAdapter.recv_draw(self._active)
  File "/Users/tarmo/mambaforge/envs/pymc/lib/python3.10/site-packages/pymc/sampling/parallel.py", line 320, in recv_draw
    msg = ready[0].recv()
  File "/Users/tarmo/mambaforge/envs/pymc/lib/python3.10/multiprocessing/connection.py", line 255, in recv
    buf = self._recv_bytes()
  File "/Users/tarmo/mambaforge/envs/pymc/lib/python3.10/multiprocessing/connection.py", line 419, in _recv_bytes
    buf = self._recv(4)
  File "/Users/tarmo/mambaforge/envs/pymc/lib/python3.10/multiprocessing/connection.py", line 388, in _recv
    raise EOFError
EOFError

PyMC version information:

PyMC 5.0.1
PyTensor 2.8.11
Python 3.10
MacOS Ventura 13.0.1 on M1
mambaforge

Context for the issue:

No response

The text was updated successfully, but these errors were encountered:

twiecki · 2022-12-27T18:09:40Z

Does a super simple model

with pm.Model():
    x = pm.Normal("x")
    pm.sample()

also produce that error?

tarmojuristo · 2022-12-28T14:36:31Z

no, this works fine. also the bambi t-test example samples without problems.

tomicapretto · 2022-12-29T18:09:45Z

@tarmojuristo could you try with this example?

import numpy as np
import pymc as pm

size = 100
rng = np.random.default_rng(1234)
x = rng.normal(size=size)
g = rng.choice(list("ABC"), size=size)
y = rng.integers(0, 1, size=size, endpoint=True)

g_levels, g_idxs = np.unique(g, return_inverse=True)
coords = {"group": g_levels}

with pm.Model(coords=coords) as model:
    coef_x = pm.Normal("x")
    coef_g = pm.Normal("g", dims="group")
    p = pm.math.softmax(coef_x + coef_g[g_idxs])
    pm.Bernoulli("y", p=p, observed=y)
    idata = pm.sample()

tarmojuristo · 2022-12-30T13:39:16Z

all good: Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 1 seconds.

tomicapretto · 2022-12-30T16:43:39Z

If you agree, I can give you a couple of examples you can try until we find the broken part.

What if you do

import bambi as bmb
import numpy as np
import pymc as pm

data = bmb.load_data("adults")

age_mean = np.mean(data["age"])
age_std = np.std(data["age"])

data["age"] = (data["age"] - age_mean) / age_std

model = bmb.Model("income['>50K'] ~ sex + age", data, family="bernoulli")
model.build()
with model.backend.model:
    idata = pm.sample()

tarmojuristo · 2022-12-30T16:47:08Z

this one fails with the same error

tomicapretto · 2022-12-30T16:55:46Z

Ok! Later today I'll give another chunk of PyMC code to test. I'll try to reproduce the PyMC model created by Bambi more closely

michaelosthege · 2023-01-09T13:55:06Z

Make sure to avoid model code on the 0th indent level. This can lead to recursion problems with multiprocessing depending on the OS-dependent fork/spawn setting.

Just wrap everything in a simple run() function:

def run():
    # do stuff


if __name__ == "__main__":
    run()

tarmojuristo · 2023-01-12T13:21:46Z

Make sure to avoid model code on the 0th indent level. This can lead to recursion problems with multiprocessing depending on the OS-dependent fork/spawn setting.

Just wrap everything in a simple run() function:
def run():
    # do stuff


if __name__ == "__main__":
    run()

Thanks, this is a useful tip - definitely avoids some trouble on my intel-based macbook - but on M1 the above example still fails, even when wrapped into a function.

ivanmkc · 2023-01-31T09:21:38Z

+1, I get this error as well when using a Jupyter notebook.

tomicapretto · 2023-02-01T12:32:56Z

import bambi as bmb
import pandas as pd
import numpy as np
import pymc as pm
import pytensor.tensor as pt

from formulae import design_matrices

data = bmb.load_data("adults")

age_mean = np.mean(data["age"])
age_std = np.std(data["age"])

data["age"] = (data["age"] - age_mean) / age_std

def logit(x):
    """Logit function that ensures result is in (0, 1)"""
    eps = np.finfo(float).eps
    result = pt.sigmoid(x)
    result = pt.switch(pt.eq(result, 0), eps, result)
    result = pt.switch(pt.eq(result, 1), 1 - eps, result)
    return result

dm = design_matrices("income['>50K'] ~ sex + age", data)
y = np.array(dm.response)
X = np.array(dm.common)

coords = {"sex_dim": ["Male"]}
with pm.Model(coords=coords) as pm_model:
    eta = 0
    intercept = pm.Normal("intercept", mu=0, sigma=4.34)
    sex = pt.atleast_1d(pm.Normal("sex", mu=0, sigma=5.31, dims="sex_dim"))
    age = pt.atleast_1d(pm.Normal("age", mu=0, sigma=2.5, shape=None))
    
    # Concatenate parameters
    coefs = pt.concatenate([sex, age])
    
    # Create linear predictor
    eta += intercept 
    eta += pt.dot(X[:, 1:], coefs) # drop first column because that's for the intercept

    # Use inverse of link function
    p = logit(eta)

    pm.Bernoulli("income", p=p, observed=y)

This resembles much more closely what's happening under the hood. Could you run it and let me know if it works or fails?

pm.model_to_graphviz(pm_model)

tarmojuristo · 2023-02-01T14:00:45Z

When I sample this one, I once again get the EOFError.

tomicapretto · 2023-02-01T14:17:20Z

Now I think we can try "removing" parts that may be conflicting. I would start with the dims for sex or the replacing the custom logit function with pt.sigmoid

tarmojuristo · 2023-02-03T15:04:40Z

pt.sigmoid(eta) gives the same EOF error.

I am not quite sure how should I remove dims here though.

tomicapretto · 2023-02-03T17:02:15Z

Try

    sex = pm.Normal("sex", mu=0, sigma=5.31)
    age = pm.Normal("age", mu=0, sigma=2.5)
    coefs = pt.stack([sex, age])

and

    sex = pt.atleast_1d(pm.Normal("sex", mu=0, sigma=5.31))
    age = pt.atleast_1d(pm.Normal("age", mu=0, sigma=2.5))
    coefs = pt.concatenate([sex, age])

davipatti · 2023-04-25T20:20:41Z

(Potentially related / helpful)

I was getting an identical EOFError whilst using pymc 5.1.1 directly to implement a survival model. For testing I had generated a dataset with 10,000 observations. Reducing the number of observations to 1,000 resolved the EOFError and allowed me to sample from the model.

More info here: https://discourse.pymc.io/t/dataset-size-dependent-eoferror/11977

tomicapretto · 2023-09-19T10:53:39Z

@tarmojuristo is this still a problem?

jvparidon · 2023-09-22T15:02:58Z

This is still happening on M1 chipsets for me. As far as I can tell it's an out-of-memory issue when using multiprocessing to sample from multiple chains. Sampling using only one core doesn't trigger the issue (because no multiprocessing) and reducing the size of the dataset often solves it (because there is sufficient memory then). It's weird though, these datasets are not large enough to expect memory limitations (sampling these same models on an Intel chipset with four cores and less memory works just fine).

tomicapretto · 2023-09-22T15:29:46Z

@jvparidon does it happen to you in the following cases too?

Using a Python script, put the sampling statement inside the following condition

if __name__ == "__main__":
    with model:
        idata = pm.sample()

Using a Jupyter Notebook?

jvparidon · 2023-09-22T15:57:01Z

Neither of those fixes the problem. What does work is passing mp_ctx='forkserver' as an argument to the .fit(). That changes the multiprocessing context for pm.sample() from fork to forkserver. Unfortunately I'm not familiar enough with the multiprocessing internals to know why that fixes things or if there are any risks to changing the multiprocessing context.

tomicapretto · 2023-09-22T16:49:50Z

I'm sorry but I'm not familiar with that either. Do you think we could write that model in PyMC and, if the issue persists, we open an issue in the PyMC repository?

jvparidon · 2023-09-22T16:56:05Z

I'll try to create a minimal working example in PyMC this weekend and do a little poking around to see if switching to forkserver for the multiprocessing context breaks anything obvious. I'll make sure to loop you in if I open an issue or PR.

tomicapretto · 2023-09-22T17:01:49Z

@jvparidon if you have the bambi model and some data that can be used, I could do it for you :)

jvparidon · 2023-09-22T17:08:00Z

Unfortunately I can't share either of those, but I'm sure I can create a synthetic dataset and model when I have a minute.

tomicapretto · 2023-09-22T18:27:23Z

I'm sorry but I'm not familiar with that either. Do you think we could write that model in PyMC and, if the issue persists, we open an issue in the PyMC repository?

Just realize we're in the PyMC repo 😅

jvparidon · 2023-09-22T18:36:38Z

Yeah, as it turns out it is a PyMC/multiprocessing issue rather than a Bambi issue so that's convenient.

EDIT: After some testing, the above seems to be incorrect. It is an issue with the default multiprocessing context in pm.sample but as far as I can tell it only crashes Bambi models, not PyMC models. See bambinos/bambi#700 for a reprex.

tarmojuristo added the bug label Dec 27, 2022

michaelosthege added the macOS macOS related label Jan 9, 2023

jvparidon mentioned this issue Sep 23, 2023

EOFError bambinos/bambi#700

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: Bambi logistic regression fails on M1 silicon #6415

BUG: Bambi logistic regression fails on M1 silicon #6415

tarmojuristo commented Dec 27, 2022

twiecki commented Dec 27, 2022

tarmojuristo commented Dec 28, 2022

tomicapretto commented Dec 29, 2022

tarmojuristo commented Dec 30, 2022

tomicapretto commented Dec 30, 2022

tarmojuristo commented Dec 30, 2022

tomicapretto commented Dec 30, 2022

michaelosthege commented Jan 9, 2023

tarmojuristo commented Jan 12, 2023

ivanmkc commented Jan 31, 2023

tomicapretto commented Feb 1, 2023

tarmojuristo commented Feb 1, 2023 •

edited

tomicapretto commented Feb 1, 2023

tarmojuristo commented Feb 3, 2023

tomicapretto commented Feb 3, 2023

davipatti commented Apr 25, 2023 •

edited

tomicapretto commented Sep 19, 2023

jvparidon commented Sep 22, 2023

tomicapretto commented Sep 22, 2023

jvparidon commented Sep 22, 2023

tomicapretto commented Sep 22, 2023

jvparidon commented Sep 22, 2023

tomicapretto commented Sep 22, 2023

jvparidon commented Sep 22, 2023

tomicapretto commented Sep 22, 2023

jvparidon commented Sep 22, 2023 •

edited

BUG: Bambi logistic regression fails on M1 silicon #6415

BUG: Bambi logistic regression fails on M1 silicon #6415

Comments

tarmojuristo commented Dec 27, 2022

Describe the issue:

Reproduceable code example:

Error message:

PyMC version information:

Context for the issue:

twiecki commented Dec 27, 2022

tarmojuristo commented Dec 28, 2022

tomicapretto commented Dec 29, 2022

tarmojuristo commented Dec 30, 2022

tomicapretto commented Dec 30, 2022

tarmojuristo commented Dec 30, 2022

tomicapretto commented Dec 30, 2022

michaelosthege commented Jan 9, 2023

tarmojuristo commented Jan 12, 2023

ivanmkc commented Jan 31, 2023

tomicapretto commented Feb 1, 2023

tarmojuristo commented Feb 1, 2023 • edited

tomicapretto commented Feb 1, 2023

tarmojuristo commented Feb 3, 2023

tomicapretto commented Feb 3, 2023

davipatti commented Apr 25, 2023 • edited

tomicapretto commented Sep 19, 2023

jvparidon commented Sep 22, 2023

tomicapretto commented Sep 22, 2023

jvparidon commented Sep 22, 2023

tomicapretto commented Sep 22, 2023

jvparidon commented Sep 22, 2023

tomicapretto commented Sep 22, 2023

jvparidon commented Sep 22, 2023

tomicapretto commented Sep 22, 2023

jvparidon commented Sep 22, 2023 • edited

tarmojuristo commented Feb 1, 2023 •

edited

davipatti commented Apr 25, 2023 •

edited

jvparidon commented Sep 22, 2023 •

edited