
error while attempting to bind on address, address already in use #315

Closed · segasai opened this issue Jul 19, 2021 · 5 comments

segasai commented Jul 19, 2021

When running multiple pystan instances in parallel (specifically, I'm calling
.log_prob() directly on subsets of the data in parallel),
an error is raised by httpstan's port allocation:

in getl_one()
     36 
     37 def getl_one(i, x):
---> 38     return si.Ms[i].log_prob(x)
     39 
     40 

~/pyenv38/lib/python3.8/site-packages/stan/model.py in log_prob()
    395                 return resp.json()["log_prob"]
    396 
--> 397         return asyncio.run(go())
    398 
    399     def grad_log_prob(self, unconstrained_parameters: Sequence[float]) -> float:

/usr/lib/python3.8/asyncio/runners.py in run()
     42         if debug is not None:
     43             loop.set_debug(debug)
---> 44         return loop.run_until_complete(main)
     45     finally:
     46         try:

/usr/lib/python3.8/asyncio/base_events.py in run_until_complete()
    614             raise RuntimeError('Event loop stopped before Future completed.')
    615 
--> 616         return future.result()
    617 
    618     def stop(self):

~/pyenv38/lib/python3.8/site-packages/stan/model.py in go()
    389 
    390         async def go():
--> 391             async with stan.common.HttpstanClient() as client:
    392                 resp = await client.post(f"/{self.model_name}/log_prob", json=payload)
    393                 if resp.status != 200:

~/pyenv38/lib/python3.8/site-packages/stan/common.py in __aenter__()
     34         host, port = "127.0.0.1", unused_tcp_port()
     35         site = aiohttp.web.TCPSite(self.runner, host, port)
---> 36         await site.start()
     37         self.session = aiohttp.ClientSession()
     38         self.base_url = f"http://{host}:{port}/v1"

~/pyenv38/lib/python3.8/site-packages/aiohttp/web_runner.py in start()
    119         server = self._runner.server
    120         assert server is not None
--> 121         self._server = await loop.create_server(
    122             server,
    123             self._host,

/usr/lib/python3.8/asyncio/base_events.py in create_server()
   1461                         sock.bind(sa)
   1462                     except OSError as err:
-> 1463                         raise OSError(err.errno, 'error while attempting '
   1464                                       'to bind on address %r: %s'
   1465                                       % (sa, err.strerror.lower())) from None

OSError: [Errno 98] error while attempting to bind on address ('127.0.0.1', 35943): address already in use

Basically, if I understand correctly, the port is returned by unused_tcp_port(), but it could
easily be stolen by a parallel httpstan instance before this process gets to bind it.
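For context, helpers like unused_tcp_port() usually work by binding to port 0, reading back the OS-assigned port, and closing the socket again. A minimal sketch of that pattern (assuming httpstan's helper does roughly this; I haven't quoted its exact source):

import socket

def unused_tcp_port() -> int:
    # Bind to port 0 so the OS assigns a currently free port, then release
    # it immediately. The race window is between this close and the later
    # bind inside aiohttp: any other process can claim the port in between.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]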

Describe your system

Linux, Ubuntu 20.04, 64bit, gcc-9.3

Steps/Code to Reproduce

The following code does nothing useful, but it triggers the bug:

import multiprocessing as mp

import numpy as np
import stan

schools_code = """
data {
  int N;
  vector[N] x;
}
parameters {
  real mu;                          // population mean
}
model {
  target += normal_lpdf(x | mu, 1); // log-likelihood
}
"""


class si:
    # Holder so the built model is inherited by forked worker processes.
    M = None


def func(x):
    # Each call spins up a short-lived httpstan server to evaluate log_prob.
    return si.M.log_prob(x)


if __name__ == '__main__':
    N = 10000
    x = np.random.normal(size=N)
    data = {'x': x, 'N': N}
    si.M = stan.build(schools_code, data=data)
    pool = mp.Pool(36)
    res = []
    for i in range(100000):
        res.append(pool.apply_async(func, ([1], )))
    for r in res:
        r.get()
segasai added the bug label Jul 19, 2021
riddell-stan (Contributor) commented

Thanks for the report.

I'm not sure anyone intended to support the use of pystan in this manner. The provided code is a little fishy -- you could simply run func sequentially, right?

It's difficult to predict the order of execution when using multiprocessing. Perhaps you could arrange things to run in entirely different processes (i.e., without using multiprocessing). I suspect port allocation would work correctly.

segasai (Author) commented Jul 19, 2021

The reason I run things in parallel is that my likelihood calculation is very slow, so I've decided to split the data, use the many cores I have to distribute the likelihood/gradient calculations, and do the HMC myself using littlemcmc (map_rect is too much pain to use).

Regarding multiprocessing: on Linux it uses processes, not threads. In my actual problem I have 36 datasets, which I run over a pool of 36 processes and then accumulate the gradients/likelihoods. The order does not matter for me.
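For what it's worth, a minimal sketch of that setup (si.Ms, shard_log_prob, and total_log_prob are my illustrative names, not pystan API; si.Ms holds one built model per data shard):

def shard_log_prob(args):
    i, theta = args
    # Evaluate the log-density of shard i at the same parameter values.
    return si.Ms[i].log_prob(theta)


def total_log_prob(theta, pool, n_shards):
    # Shards are independent, so their log-likelihoods simply add up.
    parts = pool.map(shard_log_prob, [(i, theta) for i in range(n_shards)])
    return sum(parts)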

segasai (Author) commented Jul 19, 2021

As a temporary stopgap fix I've just put this in stan/common.py:

        last_err = None
        for _ in range(10):
            try:
                host, port = "127.0.0.1", unused_tcp_port()
                site = aiohttp.web.TCPSite(self.runner, host, port)
                await site.start()
                break
            except OSError as err:
                last_err = err  # another process grabbed the port; retry
        else:
            raise last_err  # all 10 attempts lost the race

so it'd try 10 times before bailing out, but that's just a hack.
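An alternative that avoids the race entirely would be to skip unused_tcp_port() and pass port 0 to TCPSite, letting the OS assign a free port atomically at bind time. A minimal sketch, assuming aiohttp >= 3.3 (for the runner.addresses property):

        host = "127.0.0.1"
        # Port 0 asks the OS for any free port; the assignment happens at
        # bind time, so no other process can steal it in between.
        site = aiohttp.web.TCPSite(self.runner, host, 0)
        await site.start()
        port = self.runner.addresses[0][1]  # read back the assigned port
        self.base_url = f"http://{host}:{port}/v1"

Since the port is never released between allocation and use, there is nothing to retry.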

riddell-stan (Contributor) commented Jul 19, 2021 via email

riddell-stan removed the bug label Aug 13, 2021
stale bot commented Nov 25, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Nov 25, 2021
stale bot closed this as completed Apr 16, 2022