ValueError: filedescriptor out of range in select() #48

Closed
tuxes3 opened this issue Dec 19, 2019 · 3 comments

tuxes3 commented Dec 19, 2019

I run a pool.map over 300 million entries. It works when run locally, but on the Linux cluster it fails with this exception:

Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.6/dist-packages/pebble/pool/process.py", line 385, in worker_process
    for task in worker_get_next_task(channel, params.max_tasks):
  File "/usr/local/lib/python3.6/dist-packages/pebble/pool/process.py", line 400, in worker_get_next_task
    yield fetch_task(channel)
  File "/usr/local/lib/python3.6/dist-packages/pebble/pool/process.py", line 404, in fetch_task
    while channel.poll():
  File "/usr/local/lib/python3.6/dist-packages/pebble/pool/channel.py", line 46, in unix_poll
    return bool(select.select([self.reader], [], [], timeout)[0])
ValueError: filedescriptor out of range in select()

Is it possible to replace the select.select() call in channel.py with select.poll()?

Thanks for any help with fixing this error.
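For illustration, here is a minimal sketch of what a poll-based readiness check could look like. It is not Pebble's code: `poll_readable` is a hypothetical helper, and the pipe only stands in for the channel's reader. Unlike `select.select()`, `select.poll()` does not build a fixed-size descriptor set, so large descriptor numbers are not a problem. It is available on most Unix systems but not on Windows.

```python
import os
import select


def poll_readable(fd, timeout=None):
    """Return True if fd has data to read within `timeout` seconds.

    Hypothetical helper, not part of Pebble. Unix only.
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    # poll() expects milliseconds; None means block until ready.
    timeout_ms = None if timeout is None else int(timeout * 1000)
    return bool(poller.poll(timeout_ms))


# A pipe stands in for the worker's channel.
reader, writer = os.pipe()
before = poll_readable(reader, timeout=0)  # nothing written yet
os.write(writer, b"task")
after = poll_readable(reader, timeout=0)   # data is waiting
os.close(reader)
os.close(writer)
```

Note the unit change: `select.select()` takes its timeout in seconds, while `poll()` takes milliseconds, so a drop-in replacement has to convert.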

noxdafox (Owner) commented

IIRC, select was chosen for its portability. I can indeed use poll on those OSes which support it, but please clarify two things for me: how many child processes are you running, and what are the specs of your server/workstation?

tuxes3 (Author) commented Dec 19, 2019

I run the following code snippet. The entries variable is an array containing 300 million entries.

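# On Python 3.6 this is a different class from the built-in TimeoutError,
# so it must be imported for the except clause below to match.
from concurrent.futures import TimeoutError

from pebble import ProcessPool


def process(entry):
    ...  # per-entry work elided in the original post
    return 1


with ProcessPool() as pool:
    # 5 minutes max per entry
    future = pool.map(process, entries, timeout=300)
    iterator = future.result()
    while True:
        try:
            next(iterator)
        except StopIteration:
            break
        except TimeoutError:
            # The entry exceeded the timeout; move on to the next one.
            print("skipping")
log("finished\n")  # log() is not defined in the snippet as posted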

Specs:
MemTotal: 1585220900 kB
CPU: 128x Intel(R) Xeon(R) CPU E7-8860 v3 @ 2.20GHz

noxdafox added a commit that referenced this issue Dec 20, 2019
Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>
noxdafox (Owner) commented

Issue resolved in release 4.4.1.
