Conversation

@wjszlachta-man
Contributor

What changes were proposed in this pull request?

On glibc-based Linux systems, select() can only monitor file descriptors whose numbers are less than FD_SETSIZE (1024).

This is an unreasonably low limit for many modern applications.

This PR replaces select.select() with select.poll() when running on a POSIX OS.
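
A minimal sketch of the approach (illustrative only, not the exact PR code):

import select

def wait_readable(rfile, timeout_secs=1):
    # select.select() raises ValueError once any monitored fd is >= FD_SETSIZE;
    # select.poll() accepts arbitrary fd numbers.
    if hasattr(select, "poll"):  # poll() is available on POSIX systems
        poller = select.poll()
        poller.register(rfile, select.POLLIN)
        # Note: poll() takes its timeout in milliseconds, select() in seconds.
        return bool(poller.poll(timeout_secs * 1000))
    r, _, _ = select.select([rfile], [], [], timeout_secs)
    return bool(r)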

Why are the changes needed?

When running via pyspark, we frequently observe:

Exception occurred during processing of request from ('127.0.0.1', 46334)
Traceback (most recent call last):
  File "/usr/lib/python3.11/socketserver.py", line 317, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib/python3.11/socketserver.py", line 348, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib/python3.11/socketserver.py", line 361, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python3.11/socketserver.py", line 755, in __init__
    self.handle()
  File "/usr/lib/python3.11/site-packages/pyspark/accumulators.py", line 293, in handle
    poll(authenticate_and_accum_updates)
  File "/usr/lib/python3.11/site-packages/pyspark/accumulators.py", line 266, in poll
    r, _, _ = select.select([self.rfile], [], [], 1)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: filedescriptor out of range in select()

On POSIX systems poll() should be used instead of select().
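
A small POSIX-only repro sketch (assumes the hard fd limit permits raising the soft limit above 1100):

import os
import resource
import select
import socket

# The soft fd limit is often 1024; raise it so we can create a high fd number.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (min(4096, hard), hard))

sock = socket.socket()
high_fd = os.dup2(sock.fileno(), 1100)  # any fd number >= FD_SETSIZE will do
try:
    select.select([high_fd], [], [], 0)
except ValueError as exc:
    print(exc)  # filedescriptor out of range in select()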

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing unit tests. We have also been running this change (combined with py4j/py4j#560) on our YARN cluster (Linux) since April 2025.

Was this patch authored or co-authored using generative AI tooling?

No

@wjszlachta-man
Contributor Author

This is identical to #50774 (which was never reviewed and closed by the bot), but rebased against current master.

@wjszlachta-man
Contributor Author

@HyukjinKwon is this something you could maybe review (in combination with py4j/py4j#560)?

We needed to implement this change to allow us to run 1000+ executors without running into the "filedescriptor out of range in select()" error.

@HyukjinKwon
Member

Can we have an environment variable to fall back?

@wjszlachta-man
Contributor Author

@HyukjinKwon you can now use the PYSPARK_FORCE_SELECT environment variable to fall back to select.select().
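
Illustrative gating only (not the exact patch):

import os
import select

# Prefer poll() where it exists, unless PYSPARK_FORCE_SELECT forces the
# legacy select() path.
use_poll = hasattr(select, "poll") and "PYSPARK_FORCE_SELECT" not in os.environ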

@wjszlachta-man
Contributor Author

@HyukjinKwon @gaogaotiantian - I see you now merged #53388

Can you merge a similar change into python/pyspark/accumulators.py, as per this PR?

@wjszlachta-man
Contributor Author

wjszlachta-man commented Dec 9, 2025

See traceback:

Exception occurred during processing of request from ('127.0.0.1', 46334)
Traceback (most recent call last):
  File "/usr/lib/python3.11/socketserver.py", line 317, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib/python3.11/socketserver.py", line 348, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib/python3.11/socketserver.py", line 361, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python3.11/socketserver.py", line 755, in __init__
    self.handle()
  File "/usr/lib/python3.11/site-packages/pyspark/accumulators.py", line 293, in handle
    poll(authenticate_and_accum_updates)
  File "/usr/lib/python3.11/site-packages/pyspark/accumulators.py", line 266, in poll
    r, _, _ = select.select([self.rfile], [], [], 1)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: filedescriptor out of range in select()

@gaogaotiantian
Contributor

Could you make sure the CI passes? I think it's a linter issue. Also, it has some conflicts now.

@wjszlachta-man
Contributor Author

Sure - let me rebase... (although the conflict is because of #53388 - not sure why the duplicate?)

Can we have an environment variable to fall back?

Do you still want an environment variable to fall back to? I see you don't have one in #53388.

@gaogaotiantian
Contributor

Sure - let me rebase... (although the conflict is because of #53388 - not sure why the duplicate?)

That would be my fault. I did not realize this PR existed when I fixed the worker. Whether we should have an envvar fallback is a decision for @HyukjinKwon. Personally, I think it's okay to just replace the old mechanism. We would need a whole config path for the fallback to work. With the heavily exercised CI, I think we can validate this local-scope change pretty well.

@wjszlachta-man
Contributor Author

Just updated the branch - this should fix the conflict, and overall it is similar to your changes in python/pyspark/daemon.py.

Removed the fallback - if we have one, it should cover both daemon.py and accumulators.py (personally I think it's redundant and we should always use poll() if available).

As per your PR, I updated the poll() timeout to 1000 - I missed that it is in milliseconds (unlike select()) in my original commit 👍
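
The unit difference, as a standalone sketch (not the PR code):

import select
import socket

rsock, wsock = socket.socketpair()
timeout_secs = 1

# select() takes its timeout in seconds (floats allowed)...
r, _, _ = select.select([rsock], [], [], timeout_secs)

# ...while poll() takes it in milliseconds.
poller = select.poll()
poller.register(rsock, select.POLLIN)
events = poller.poll(timeout_secs * 1000)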

@HyukjinKwon
Member

Does this LGTM, @gaogaotiantian?

@wjszlachta-man
Contributor Author

wjszlachta-man commented Dec 10, 2025

@gaogaotiantian let me know what you think about the latest commit - it will check for errors in both accumulators.py and daemon.py.

I also removed the try/except around:

try:
    ready_fds = select.select([0, listen_sock], [], [], 1)[0]
except select.error as ex:
    if ex[0] == EINTR:
        continue
    else:
        raise

in daemon.py, as this is old code only needed for Python <3.5 (see https://peps.python.org/pep-0475/ - from Python 3.5 onward, select.select() automatically retries the system call on EINTR).

Considering python_requires=">=3.9" as of Spark>=4, it should be safe to remove.
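
For reference, PEP 475 means a bare call now suffices (a sketch of why the loop is dead code, not the final daemon.py, which now uses poll()):

import select

# Python >= 3.5 retries select.select() automatically when the underlying
# system call is interrupted by a signal (EINTR), recomputing the timeout.
# (listen_sock as in the quoted daemon.py snippet.)
ready_fds = select.select([0, listen_sock], [], [], 1)[0]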

@wjszlachta-man
Contributor Author

@gaogaotiantian happy to merge in this form?

Let me know if it's OK, and I can propagate similar changes to py4j/py4j#560.

@gaogaotiantian
Contributor

Yes, I think similar changes should be done for py4j.

Comment on lines +187 to +188
# Could be POLLERR or POLLNVAL (select would raise in this case).
raise PySparkRuntimeError(f"Polling error - event {event} on fd {fd}")
Contributor

Does this introduce any behavior change? What's the original behavior? Can we improve this error message to make it more user-friendly?

Contributor

Hmm, the original behavior is probably to raise a Python built-in exception. To be honest, if we need to raise this exception, the situation is pretty bad - it would be an unexpected networking issue. I don't think we have coverage here.

Anyway, because this is a Python exception raised from the worker side, the driver will always see a PythonException. The traceback might be different - but this is already a super rare situation and I don't think users will be relying on this.

On the other hand, yes, there's room for improvement in the exception message.

Contributor Author

@wjszlachta-man wjszlachta-man Dec 16, 2025

The original behaviour would be select() raising OSError in situations where the poll() events POLLERR/POLLNVAL now raise PySparkRuntimeError.

I agree users shouldn't rely on that behaviour, so I thought using PySparkRuntimeError here makes the most sense.
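
To illustrate the behavioural difference, a standalone sketch (not the PR code):

import select
import socket

rsock, wsock = socket.socketpair()
poller = select.poll()
poller.register(rsock, select.POLLIN)
wsock.send(b"x")  # make rsock readable so poll() returns an event

# select() raises OSError when handed a bad fd; poll() instead reports
# POLLERR/POLLNVAL as events that the caller must check explicitly.
for fd, event in poller.poll(1000):
    if event & (select.POLLERR | select.POLLNVAL):
        raise RuntimeError(f"Polling error - event {event} on fd {fd}")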

Contributor Author

@allisonwang-db agree with @gaogaotiantian that the exception here should be very rare and due to networking issues - can you think of a better-worded error message?

Contributor

@allisonwang-db allisonwang-db Dec 17, 2025

Most PySparkRuntimeErrors are user-facing exceptions with proper error classes and actionable error messages. If this is a rare low-level system issue, it's better to keep the original exception's (OSError) error message. WDYT?

Contributor Author

Personally, I wouldn't make it OSError, as this is not a system call error (which is why I went for RuntimeError instead).

But honestly I'm not too fussed and am happy to change it to whatever you or the Spark committers find appropriate, as your feel for the codebase is better than mine (thanks for reviewing it!).

My focus here is to merge the fix that removes the dependency on select.select(), as this is something we have been patching internally for quite some time now to allow our researchers to run with a large number of executors. The problem is even worse if you load a lot of shared libraries (for example via ctypes), which consume fds below 1024, so you can hit FD_SETSIZE with fewer than 1000 executors.

Contributor

I don't like OSError either. I think the general rule is to make all known exceptions a Spark error. We have the ability to add an error class in the future. Again, on the driver side this is just a PythonException - also with no errorClass, because that's not supported yet.

@HyukjinKwon
Member

Merged to master.
