TypeError from Theano when multiprocessing: numpy.array not aligned #2640
If I try to sample with Metropolis in parallel (njobs>1), Theano throws an error, indicating that the numpy array is not aligned.
Code that produces the error
```python
import numpy as np
import pymc3 as pm
import theano.tensor as tt

# create data
n = 1000000
X = np.concatenate((np.ones((n, 1)), np.random.rand(n, 1)), axis=1)
y = np.random.normal(loc=X.dot(np.array([5, -2])), scale=1)

model = pm.Model()
with model:
    # one coefficient per column of X, so that tt.dot(X, beta) has shape (n,)
    beta = pm.Normal('beta', mu=0, sd=1e3, shape=X.shape[1])
    mu = tt.dot(X, beta)
    y_est = pm.Normal('y_est', mu=mu, observed=y)
    step = pm.Metropolis()
    trace = pm.sample(1, tune=0, step=step, njobs=2)
```
The same code runs fine, if
The Error message:
After that message, the process seems dead and I terminate it manually.
Versions and main components
The cutoff is at n=125000, all smaller values work. At exactly this point
So it looks like this problem appears when an array is larger than a million bytes and the first theano op that uses this array requires an aligned array. I guess the easiest workaround would be to just disable the memory mapping in joblib, by passing
That's a bit strange, but I don't think this is about how good your computer is...
A short script to test the alignment:
```python
import joblib
import numpy as np


def func(x):
    # Runs in the worker process: show how the array arrived there.
    print(x.flags)
    print(type(x))


if __name__ == '__main__':
    # One float64 more than 1024**2 bytes
    n = 1024 ** 2 // 8 + 1
    x = np.random.randn(n)
    jobs = joblib.Parallel(2)
    print(x.nbytes)
    res = jobs([joblib.delayed(func)(x), joblib.delayed(func)(x)])
```
Could you use this script to figure out at which point you get a
This whole thing kind of sounds like a joblib bug to me. I don't see a good reason why the memmap shouldn't be aligned...
```
1048584
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : False
  ALIGNED : False
  UPDATEIFCOPY : False
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : False
  ALIGNED : False
  UPDATEIFCOPY : False
<class 'numpy.core.memmap.memmap'>
<class 'numpy.core.memmap.memmap'>
```
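An `ALIGNED : False` memmap like the one reported here can be reproduced without joblib. The following is a minimal sketch under one assumption that the thread does not confirm: the array data starts at a byte offset in the backing file that is not a multiple of the item size, for example because a header precedes it. The helper name `memmap_is_aligned` and the file name `dump.bin` are made up for illustration.

```python
import os
import tempfile

import numpy as np


def memmap_is_aligned(header_len, n=1024):
    """Map n float64 values stored after a header of header_len bytes
    and report the ALIGNED flag of the resulting memmap.

    Hypothetical helper for illustration; not part of joblib or pymc3.
    """
    data = np.arange(n, dtype=np.float64)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "dump.bin")
        with open(path, "wb") as f:
            f.write(b"\0" * header_len)  # stand-in for a file header
            f.write(data.tobytes())
        # Read-only mapping that starts right after the header.
        mm = np.memmap(path, dtype=np.float64, mode="r",
                       offset=header_len, shape=(n,))
        aligned = bool(mm.flags.aligned)
        # The values are still readable even when the data is unaligned.
        assert np.array_equal(mm, data)
        del mm  # release the mapping before the directory is removed
    return aligned


if __name__ == "__main__":
    print("3-byte header:", memmap_is_aligned(3))
    print("8-byte header:", memmap_is_aligned(8))
```

NumPy itself reads such an array without complaint. The failure only shows up in consumers that insist on aligned input, which is what Theano appears to do in this issue.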
I reported this to joblib.
@lorentzenchr If you need this quickly, applying this patch should avoid the issue until this is fixed properly:
```diff
diff --git a/pymc3/sampling.py b/pymc3/sampling.py
index 338a054c..0fb9f678 100644
--- a/pymc3/sampling.py
+++ b/pymc3/sampling.py
@@ -546,11 +546,13 @@ def _mp_sample(**kwargs):
     chains = list(range(chain, chain + njobs))
     pbars = [kwargs.pop('progressbar')] + [False] * (njobs - 1)
-    traces = Parallel(n_jobs=njobs)(delayed(_sample)(chain=chains[i],
-                                                     progressbar=pbars[i],
-                                                     random_seed=rseed[i],
-                                                     start=start_vals[i],
-                                                     **kwargs) for i in range(njobs))
+    traces = Parallel(n_jobs=njobs, mmap_mode=None)(
+        delayed(_sample)(chain=chains[i],
+                         progressbar=pbars[i],
+                         random_seed=rseed[i],
+                         start=start_vals[i],
+                         **kwargs)
+        for i in range(njobs))
     return merge_traces(traces)
```
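If you control the worker function yourself, as in the short test script earlier in this thread, another option might be to copy the incoming array into aligned memory inside the worker. This is only a sketch of that idea, not something proposed in this issue. It does not help with pymc3 directly, because there the arrays are consumed inside `_sample`. The helper name `ensure_aligned` is hypothetical; `np.require` is the NumPy call doing the work.

```python
import numpy as np


def ensure_aligned(x):
    """Return x if it is already aligned, otherwise an aligned copy.

    Hypothetical helper; np.require copies only when a requested
    flag is not already satisfied.
    """
    return np.require(x, requirements=["ALIGNED"])


# Build a deliberately misaligned float64 array for demonstration:
# skip one byte of a uint8 buffer, then reinterpret the rest as float64.
raw = np.zeros(8 * 16 + 1, dtype=np.uint8)
misaligned = raw[1:].view(np.float64)
misaligned[:] = np.arange(16)

fixed = ensure_aligned(misaligned)

if __name__ == "__main__":
    print("before:", misaligned.flags.aligned)
    print("after: ", fixed.flags.aligned)
```

The trade-off is an extra in-memory copy per worker, which gives up the memory savings that memory mapping is meant to provide.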