Running multiple chains causes RecursionError #879

Closed
fonnesbeck opened this issue Nov 25, 2015 · 78 comments
@fonnesbeck
Member

Setting the njobs parameter to run multiple chains results in an error:

---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
<ipython-input-59-548e16bedce3> in <module>()
      6 
      7 
----> 8     trace = sample(5000, njobs=2)

/Users/fonnescj/Github/pymc3/pymc3/sampling.py in sample(draws, step, start, trace, chain, njobs, tune, progressbar, model, random_seed)
    153         sample_args = [draws, step, start, trace, chain,
    154                        tune, progressbar, model, random_seed]
--> 155     return sample_func(*sample_args)
    156 
    157 

/Users/fonnescj/Github/pymc3/pymc3/sampling.py in _mp_sample(njobs, args)
    274 def _mp_sample(njobs, args):
    275     p = mp.Pool(njobs)
--> 276     traces = p.map(argsample, args)
    277     p.close()
    278     return merge_traces(traces)

/Users/fonnescj/anaconda3/lib/python3.5/multiprocessing/pool.py in map(self, func, iterable, chunksize)
    258         in a list that is returned.
    259         '''
--> 260         return self._map_async(func, iterable, mapstar, chunksize).get()
    261 
    262     def starmap(self, func, iterable, chunksize=None):

/Users/fonnescj/anaconda3/lib/python3.5/multiprocessing/pool.py in get(self, timeout)
    606             return self._value
    607         else:
--> 608             raise self._value
    609 
    610     def _set(self, i, obj):

/Users/fonnescj/anaconda3/lib/python3.5/multiprocessing/pool.py in _handle_tasks(taskqueue, put, outqueue, pool, cache)
    383                         break
    384                     try:
--> 385                         put(task)
    386                     except Exception as e:
    387                         job, ind = task[:2]

/Users/fonnescj/anaconda3/lib/python3.5/multiprocessing/connection.py in send(self, obj)
    204         self._check_closed()
    205         self._check_writable()
--> 206         self._send_bytes(ForkingPickler.dumps(obj))
    207 
    208     def recv_bytes(self, maxlength=None):

/Users/fonnescj/anaconda3/lib/python3.5/multiprocessing/reduction.py in dumps(cls, obj, protocol)
     48     def dumps(cls, obj, protocol=None):
     49         buf = io.BytesIO()
---> 50         cls(buf, protocol).dump(obj)
     51         return buf.getbuffer()
     52 

RecursionError: maximum recursion depth exceeded
@twiecki
Member

twiecki commented Nov 27, 2015

Ran into the same issue.

@twiecki
Member

twiecki commented Nov 27, 2015

It seems to work for simpler models, but the stochastic volatility model runs with njobs=2 and breaks with njobs=4. So odd.

@twiecki
Member

twiecki commented Dec 17, 2015

Can you check whether e873d6d fixes it?

@fonnesbeck
Member Author

Well, I get a different error, so that's progress.

MaybeEncodingError: Error sending result: '[<MultiTrace: 1 chains, 40000 iterations, 9 variables>]'. Reason: 'error("'i' format requires -2147483648 <= number <= 2147483647",)'

@twiecki
Member

twiecki commented Dec 17, 2015

And what a specific error it is. MaybeEncodingErrorMaybeNot

@fonnesbeck
Member Author

Yeah, that seemed odd -- creating an Exception subclass for an error that you're not totally sure about.

@twiecki
Member

twiecki commented Dec 17, 2015

Anyway, it looks like we might be passing an object where an int is expected?

@hvasbath
Contributor

hvasbath commented Feb 8, 2016

You can somewhat hack around this with sys.setrecursionlimit(2000), but that only works up to a certain number of parameters. With my latest model of around 450 parameters it doesn't help. As I really need the parallel implementation to work (otherwise my model has to run for months), I would like to look into this. Can you point me to some code lines where I could start looking, as I am not yet familiar with the code base? Thank you!
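
For reference, the workaround looks roughly like this (a minimal sketch; the limit is whatever your model needs, and model stands for any already-built PyMC3 model):

import sys
import pymc3 as pm

# raise the interpreter's recursion ceiling (the default is usually 1000)
# before the model gets pickled for the worker processes
sys.setrecursionlimit(2000)

with model:
    trace = pm.sample(5000, njobs=4)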

@hvasbath
Contributor

hvasbath commented Feb 8, 2016

With the increased recursion limit and the latest commit from twiecki above (e873d6d), I get the error below. It keeps running while doing nothing. Does anybody have advice on where I could start investigating?

Exception in thread Thread-14:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 504, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 319, in _handle_tasks
put(task)
SystemError: NULL result without error in PyObject_Call

@fonnesbeck
Member Author

All of the multiprocessing business for PyMC3 is in the sampling module. It's a pretty basic mapping of processes over the elements of a multiprocessing Pool. We might want to explore using ipyparallel for parallel processing.

@twiecki
Member

twiecki commented Feb 8, 2016

I have also considered switching. The issue is that currently you can't launch processes internally (see ipython/ipyparallel#22 for a plan to change that).

@fonnesbeck
Member Author

That should not be a deal-breaker. Forcing the user to spin up ipcluster is not particularly onerous, especially if you are working in Jupyter, where it is just a tab in the interface. I think it's a small price to pay for more robust parallelism, and if it gets automated in the future, all the better.

@datnamer

datnamer commented Feb 8, 2016

What about Dask?

@fonnesbeck
Member Author

Would Dask be effective here? I could see it if we were applying the same algorithm to subsets of a dataset, but a set of parallel chains executes over the entire dataset in each chain. So it's not clear how Dask's collections would be beneficial. That said, it may be useful if we ever implement expectation propagation, which does subdivide the data.

@datnamer

datnamer commented Feb 8, 2016

Dask imperative plus the multiprocessing scheduler can schedule the chains without needing a specific collection to chunk the data.
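
Something like this minimal sketch, using dask.delayed (the current name of the imperative API), where sample_one_chain is a hypothetical stand-in for PyMC3's per-chain sampler:

from dask import compute, delayed

def sample_one_chain(chain_id, draws):
    # hypothetical: run one MCMC chain to completion and return its trace
    ...

# one lazy task per chain; nothing executes until compute()
tasks = [delayed(sample_one_chain)(i, 5000) for i in range(4)]
traces = compute(*tasks, scheduler="processes")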

But this is out of my depth.

Maybe @mrocklin can chime in.

@twiecki
Member

twiecki commented Feb 8, 2016

I don't think Dask, although awesome, can be leveraged here.

@mrocklin

mrocklin commented Feb 8, 2016

If someone can briefly describe the problem I'd be happy to chime in if there is potential overlap. The dask schedulers are useful well outside the common use case of big chunked arrays. If you're considering technologies like multiprocessing or ipyparallel it's possible that one of the dask schedulers could be relevant.

@fonnesbeck
Member Author

@mrocklin Matt, this is Monte Carlo sampling for Bayesian statistical modeling. It's an embarrassingly parallel task that just simulates Markov chains using the same model on the same dataset, then uses the sampled chains (the output of the algorithm) for inference. We are currently using the multiprocessing module for this, but are contemplating a move to something more robust.

@mrocklin

mrocklin commented Feb 8, 2016

Something non-trivial must be going on to cause multiprocessing to hang.

Looking at the traceback it seems like you might be trying to send something that pickle doesn't like? Historically I've gotten around this by pre-serializing everything with dill or cloudpickle before I hand things off to multiprocessing. This is what dask.multiprocessing.get does.

If this is what is going on, then the pathos library would probably be a decent drop-in replacement for you all. It's a multiprocessing clone that uses dill.

But really, I'm just guessing at the problem that you're trying to solve and so am probably out of my depth here. Happy to help if I can. Best of luck.

@fonnesbeck
Member Author

Thanks, Matt. Unfortunately pathos appears not to support Python 3 yet, so I will look at explicitly passing everything to dill.

@mrocklin

I write a function like the following:

import dill

def apply(serialized_func, serialized_args, serialized_kwargs):
    func = dill.loads(serialized_func)
    args = dill.loads(serialized_args)
    kwargs = dill.loads(serialized_kwargs)
    return func(*args, **kwargs)

And then I dill.dumps my func, args, and kwargs ahead of time and call them with the apply function remotely. Something like the following (starmap, so that each task gets its three serialized pieces; kwargs are empty here):

pool.starmap(apply, [(dill.dumps(func), dill.dumps(args), dill.dumps({}))
                     for args in sequence])

<self serving> Or, you can always just use dask.multiprocessing.get, where this work is already done. </self serving>

@fonnesbeck
Member Author

I might have found a solution using Joblib, but will give this a shot if that doesn't work. Thanks again.

@mrocklin

Oh great. That's much simpler.

@tyarkoni
Contributor

I don't think this solves the problem, unfortunately... On the joblib branch, with njobs=4 and a pretty big model, I still get a max recursion exceeded exception (see below). On inspection, it looks like Joblib uses multiprocessing as its default backend, so I guess that makes sense. I tried switching to the threading backend, but that failed with a different set of errors.

Traceback (most recent call last):
  File "run_wm.py", line 53, in <module>
    run_model(40)
  File "run_wm.py", line 40, in run_model
    trace = model.run(samples=250, verbose=True, find_map=False, njobs=4)
  File "/Users/tal/Dropbox/Projects/RandomStimuli/code/pymcwrap/pymcwrap/model.py", line 383, in run
    samples, start=start, step=step, progressbar=verbose, njobs=njobs)
  File "/usr/local/lib/python3.5/site-packages/pymc3-3.0-py3.5.egg/pymc3/sampling.py", line 146, in sample
    return sample_func(**sample_args)
  File "/usr/local/lib/python3.5/site-packages/pymc3-3.0-py3.5.egg/pymc3/sampling.py", line 272, in _mp_sample
    **kwargs) for i in range(njobs))
  File "/usr/local/lib/python3.5/site-packages/joblib-0.9.4-py3.5.egg/joblib/parallel.py", line 810, in __call__
    self.retrieve()
  File "/usr/local/lib/python3.5/site-packages/joblib-0.9.4-py3.5.egg/joblib/parallel.py", line 727, in retrieve
    self._output.extend(job.get())
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/multiprocessing/pool.py", line 608, in get
    raise self._value
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/multiprocessing/pool.py", line 385, in _handle_tasks
    put(task)
  File "/usr/local/lib/python3.5/site-packages/joblib-0.9.4-py3.5.egg/joblib/pool.py", line 368, in send
    CustomizablePickler(buffer, self._reducers).dump(obj)
RecursionError: maximum recursion depth exceeded

@fonnesbeck
Member Author

It was worth a shot. I will try flavoring it with a little dill.

@fonnesbeck
Member Author

Actually, joblib serializes the arguments for us, so that's not the solution. Increasing the recursion limit helps (as @eigenblutwurst notes), which I have now done inside sample. This problem may resurface with bigger models, but it works well on the rugby analytics example (which I have modified) using 4 cores.
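
Roughly, the parallel path is now shaped like this (a sketch, not the actual code; _sample is a stand-in for the per-chain worker and the limit is illustrative):

import sys
from joblib import Parallel, delayed

def _mp_sample(njobs, draws, **kwargs):
    # pickling the deeply nested Theano graph is what overflows the stack,
    # so raise the ceiling before fanning the chains out to workers
    sys.setrecursionlimit(10000)
    return Parallel(n_jobs=njobs)(
        delayed(_sample)(draws, chain=i, **kwargs) for i in range(njobs))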

@twiecki
Member

twiecki commented Feb 11, 2016

joblib/joblib#240

@grburgess

I just updated to the latest Theano 0.8 and pymc3, and this problem has disappeared for me.
Strange thing though: while I installed it manually with setup.py install, it still complained that it wanted Theano 0.7. The install seemed to go OK though.

@hvasbath
Contributor

Yes, for me it also wants to install Theano 0.7 although I have the dev version, which is somewhat annoying. I simply disabled the requirement in the setup script, although there must be a nicer way.

@twiecki
Member

twiecki commented Mar 18, 2016

It's trying to pull 0.7 when you run pymc3's setup.py?

@hvasbath
Contributor

Yes it does.

@grburgess

Yes, it seemed to install fine and use Theano 0.8, but it was rather confusing.

@hvasbath
Contributor

I have to abort it, because when I let it install, my import uses the 0.7 version instead of the dev version. They made so many improvements in the current dev version that it is really important to use it.

@twiecki
Member

twiecki commented Mar 18, 2016

f9de16e should fix that.

@hvasbath
Contributor

Ah great thx!

@grburgess

Fixed it. thanks!

@springcoil
Contributor

Is it time to close this?

@grburgess

I haven't done extensive testing, but on some high dimensional problems that originally threw the recursion error, the problem has disappeared. So perhaps for now it is solved. :)

@twiecki
Member

twiecki commented Mar 18, 2016

That sounds amazing. I'll close it but feel free to reopen if the problem persists with master pymc3 and theano.

@twiecki twiecki closed this as completed Mar 18, 2016
@jonsedar
Contributor

Thanks for the recent bugfixes, guys. The updates to the build dependencies also mean I'm now running theano 0.8.0rc1, and either or both changes seem to have raised the threshold at which I was hitting recursion errors.

EDIT: Okay, well - that does seem to have fixed it. I think I have a different bug though:

  1. for a sufficiently complex model, the first time I create and sample it using njobs > 1, the processes start (I'm viewing in htop) and then they die without throwing an error
  2. If I re-run the sampling then the processes seem to run fine.

I assume the difference in 2 is that the model is already cached. It's tricky to replicate though, a bit of a Heisenbug!

@hvasbath
Contributor

I also still get my segmentation faults, even when creating all the Text backends in advance...

@vivek-hari

vivek-hari commented Apr 19, 2016

Oh, really? Even with the latest pymc3 version I am getting the same error with njobs=2.

multiprocessing.pool.MaybeEncodingError: Error sending result: '[<MultiTrace: 1 chains, 10 iterations, 2106 variables>]'. Reason: 'RuntimeError('maximum recursion depth exceeded',)'

    trace = pm.sample(n_samples, step=step_func, start=start, njobs=n_chains, progressbar=False)
  File "/home/user/.local/lib/python2.7/site-packages/pymc3/sampling.py", line 150, in sample
    return sample_func(**sample_args)
  File "/home/user/.local/lib/python2.7/site-packages/pymc3/sampling.py", line 282, in _mp_sample
    **kwargs) for i in range(njobs))
  File "/home/user/.local/lib/python2.7/site-packages/joblib/parallel.py", line 810, in __call__
    self.retrieve()
  File "/home/user/.local/lib/python2.7/site-packages/joblib/parallel.py", line 727, in retrieve
    self._output.extend(job.get())
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 558, in get
    raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: '[<MultiTrace: 1 chains, 10 iterations, 2106 variables>]'. Reason: 'RuntimeError('maximum recursion depth exceeded',)'

I have pymc3-3.0, numpy-1.11.0, Theano-0.8.1, scipy-0.17.0 installed.
Anyone else facing the same issue in the latest version of pymc3?

@fonnesbeck
Member Author

By "latest pymc3 version" do you mean that you installed it from GitHub master? That is,

pip install -U git+https://github.com/pymc-devs/pymc3.git

@vivek-hari

I installed using

pip install --process-dependency-links git+https://github.com/pymc-devs/pymc3

@fonnesbeck
Member Author

Make sure you use the -U flag or it may not update. I have not had this error since we closed this issue, so my first guess is that your update did not stick.

@vivek-hari

Oh, thank you so much for your quick response. I'll update using the -U flag and get back to you. Thanks again!

@vivek-hari

Sorry @fonnesbeck, installing pymc3 with -U also leads to the same error.
I even removed all the packages (pymc3, numpy, scipy, theano) from my machine and tried a fresh installation of pymc3 using pip install -U git+https://github.com/pymc-devs/pymc3.git. It also ended up in RuntimeError('maximum recursion depth exceeded',).

I have
Python 2.7.6,
pymc3-3.0,
matplotlib-1.5.1, joblib-0.9.4, numpy-1.11.0, pandas-0.18.0, patsy-0.4.1, pydot_ng-1.0.0, pyparsing-2.1.1, scipy-0.17.0,
Theano-0.8.1
installed on my machine.

The nvidia-smi toolkit gives the following details:
NVIDIA-SMI 346.96, Driver Version: 346.96, 4 GPUs (0, 1, 2, 3).

My .theanorc config is:

[global]
device = gpu
floatX = float32
assert_no_cpu_op = warn
[cuda]
root = /usr/local/cuda
[nvcc]
fastmath = True
[pycuda]
init = True

Is there anything else to be done?

@twiecki
Member

twiecki commented Apr 21, 2016

Perhaps the GPU utilization is at fault? Have you tried with CPU?


@vivek-hari

Thanks @twiecki, I will try with CPU and post my updates.

@vivek-hari

Setting device=cpu in .theanorc also raises RuntimeError('maximum recursion depth exceeded',).

@vivek-hari

Below is the snippet I am trying to execute.

import pymc3 as pm
import theano.tensor as T
import pandas

def tinvlogit(x):
    return T.exp(x) / (1 + T.exp(x))

pandas_df = pandas.read_csv("data.csv")

x_col1 = pandas_df['col1']
x_col2 = pandas_df['col2']
x_col3 = pandas_df['col3']
n_col3 = len(pandas_df['col3'].unique())

with pm.Model() as model:
        b_0 = pm.Normal('b_0', mu=0, sd=100)
        b_col1 = pm.Normal('b_col1', mu=0, sd=100)
        b_col2 = pm.Normal('b_col2', mu=0, sd=100)
        sigma_col3 = pm.HalfNormal('sigma_col3', sd=100)
        b_col3 = pm.Normal('b_col3', mu=0, sd=sigma_col3, shape=n_col3)

        for i in range(0, len(pandas_df)):
            p = pm.Deterministic('p', T.maximum(0, T.minimum(1, tinvlogit(
                b_0 + b_col1 * x_col1.at[i] + b_col2 * x_col2.at[i] + b_col3[x_col3.at[i]))))

        y = pm.Bernoulli('y', p, observed=pandas_df.y)

        start = pm.find_MAP()

        step_func = pm.NUTS()

        trace = pm.sample(5000, step=step_func, start=start, njobs=2, progressbar=True)

pm.sample fails with RuntimeError('maximum recursion depth exceeded')

pandas_df is a pandas DataFrame with columns col1 (decimal), col2 (decimal), col3 (integer between 1 and 10), and y (0 or 1), and it has 50,000 rows.

@hvasbath
Contributor

You get the recursion error because your graph will be very long: your loop runs 50k times, each time adding all the nodes again. Although I don't really get the purpose of your model, I have the feeling you could vectorize it and get rid of the loop. The RVs have a shape parameter with which you can simply create vectors the length of your data frame.
The way you do it now, p will always be overwritten and only the last row of your dataframe will go into the cost. Or am I missing something?
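
An untested sketch of the vectorized version (T.nnet.sigmoid replaces the hand-rolled tinvlogit, which also makes the clipping unnecessary since a sigmoid is already bounded in (0, 1); as in your loop, col3 is assumed to hold valid zero-based indices into b_col3):

import pymc3 as pm
import theano.tensor as T
import pandas

pandas_df = pandas.read_csv("data.csv")

x_col1 = pandas_df['col1'].values
x_col2 = pandas_df['col2'].values
x_col3 = pandas_df['col3'].values
n_col3 = len(pandas_df['col3'].unique())

with pm.Model() as model:
    b_0 = pm.Normal('b_0', mu=0, sd=100)
    b_col1 = pm.Normal('b_col1', mu=0, sd=100)
    b_col2 = pm.Normal('b_col2', mu=0, sd=100)
    sigma_col3 = pm.HalfNormal('sigma_col3', sd=100)
    b_col3 = pm.Normal('b_col3', mu=0, sd=sigma_col3, shape=n_col3)

    # one vectorized expression over all 50k rows instead of a Python loop;
    # the fancy index b_col3[x_col3] replaces the per-row lookup
    p = pm.Deterministic('p', T.nnet.sigmoid(
        b_0 + b_col1 * x_col1 + b_col2 * x_col2 + b_col3[x_col3]))

    y = pm.Bernoulli('y', p, observed=pandas_df.y.values)

    trace = pm.sample(5000, njobs=2)

With a graph this small, the pickling depth stays tiny, so njobs > 1 should no longer hit the recursion limit either.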
