Bugfix: Don't close socket in the WSGI thread, delegate it back to the main thread! #377

Merged
mmerickel merged 1 commit into master from bugfix/select-closed-socket-race on May 25, 2022

Conversation

bertjwregeer
Member

There's a potential issue whereby a thread attempts to send data, but the socket is closed on the other end, and the thread shuts the socket down.

However, this may happen while the main thread is collecting the file descriptors to pass to select(), which will then fail because that file descriptor is no longer valid.
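
The failure mode can be reproduced in isolation. This is a minimal sketch assuming a POSIX platform, not waitress code: the function name `select_on_closed_fd` is hypothetical, and the `close()` call stands in for an app thread closing a socket whose integer descriptor the main loop has already collected.

```python
import errno
import select
import socket


def select_on_closed_fd():
    """Return the errno raised when select() is handed a closed descriptor."""
    a, b = socket.socketpair()
    fd = a.fileno()  # the integer the main loop would have collected
    a.close()        # stands in for the app thread closing the socket
    try:
        select.select([fd], [], [], 0)
    except OSError as exc:
        return exc.errno
    finally:
        b.close()
    return None


if __name__ == "__main__":
    # On POSIX systems this is expected to be errno.EBADF.
    print(select_on_closed_fd() == errno.EBADF)
```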

Instead, we loop until we have called select() successfully, while checking to see if the file descriptors have changed since we started the poll.

If the map of file descriptors has not changed, and yet we get an errno indicating a bad file descriptor, we will just re-raise the error and kill the main loop.
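
The retry logic described in this comment could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in this PR: `poll_with_retry`, `socket_map` (a dict keyed by integer file descriptor), and the injectable `select_fn` parameter are all hypothetical names chosen for the example.

```python
import errno
import select


def poll_with_retry(socket_map, timeout=0.0, select_fn=select.select):
    """Call select() until it succeeds or fails for an unexplained reason.

    socket_map is a hypothetical dict mapping integer fds to channel objects.
    """
    while True:
        fds = list(socket_map)  # snapshot of the descriptors we poll
        try:
            return select_fn(fds, [], [], timeout)
        except OSError as exc:
            if exc.errno != errno.EBADF:
                raise
            if set(socket_map) == set(fds):
                # The map did not change, so the bad descriptor is
                # unexplained: re-raise and let the main loop die.
                raise
            # The map changed while we were polling: rebuild and retry.
```

Comparing the snapshot against the live map is what separates the benign case (another thread removed a descriptor) from a genuine error.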

Closes #374

@ale-rt

ale-rt commented May 4, 2022

I was also hit by #374.
Running this branch the problem appears to be solved.
I did not notice any drawbacks.


@ale-rt ale-rt left a comment


LGTM, I added some really minor remarks

src/waitress/wasyncore.py (outdated, resolved)
src/waitress/wasyncore.py (outdated, resolved)
@bertjwregeer
Member Author

It unfortunately does have drawbacks in that there are still race conditions in the code. I've got a better fix, I just haven't had time to test/validate/implement it.

I'm sorry you were hit by this too :-(

@ale-rt

ale-rt commented May 5, 2022

> I'm sorry you were hit by this too :-(

Are you kidding 🤣?! Thanks for the great work you are doing!

This solves a race condition that may occur when the main thread loops
over the open sockets and then calls select() after an app thread has
already called close() on one of those sockets.
@bertjwregeer bertjwregeer force-pushed the bugfix/select-closed-socket-race branch from 729170f to c7a3d7e Compare May 25, 2022 02:27
@bertjwregeer
Member Author

This change has now been updated to the following:

  1. The main thread can call close() on a socket, but an app thread can't.
  2. The app thread will pull the trigger (because no data was flushed), waking up the main thread to call close().

This solves the race condition and allows select() to function as before.
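
The delegation described above can be sketched as follows. This is a simplified illustration, not the waitress implementation: the `Channel` class, its `will_close` flag, `app_write_failed`, `main_loop_once`, and the socketpair-based trigger are all assumptions made for the example.

```python
import select
import socket
import threading


class Channel:
    """Hypothetical channel: only the main loop may close its socket."""

    def __init__(self, sock, trigger_w):
        self.sock = sock
        self.trigger_w = trigger_w
        self.will_close = False
        self.closed = False

    def app_write_failed(self):
        # Runs in an app thread: flag the channel, then wake the main loop.
        # Note that the socket itself is never closed here.
        self.will_close = True
        self.trigger_w.send(b"x")

    def handle_close(self):
        # Runs in the main thread only.
        self.sock.close()
        self.closed = True


def main_loop_once(channels, trigger_r, timeout=5.0):
    """One main-loop iteration: select(), then close flagged channels."""
    readers = [trigger_r] + [c.sock for c in channels if not c.closed]
    readable, _, _ = select.select(readers, [], [], timeout)
    if trigger_r in readable:
        trigger_r.recv(4096)  # drain the wake-up bytes
    for chan in channels:
        if chan.will_close and not chan.closed:
            chan.handle_close()


def demo():
    trigger_r, trigger_w = socket.socketpair()
    client, server = socket.socketpair()
    chan = Channel(server, trigger_w)
    worker = threading.Thread(target=chan.app_write_failed)
    worker.start()
    main_loop_once([chan], trigger_r)  # woken by the trigger, closes the socket
    worker.join()
    closed = chan.closed
    for s in (trigger_r, trigger_w, client):
        s.close()
    return closed
```

Because every close() happens in the same thread that builds the descriptor list for select(), a descriptor cannot become invalid between collection and polling.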

@bertjwregeer bertjwregeer marked this pull request as ready for review May 25, 2022 02:35
@mmerickel mmerickel merged commit 4f6789b into master May 25, 2022
27 checks passed
@mmerickel mmerickel deleted the bugfix/select-closed-socket-race branch May 25, 2022 03:07
@bertjwregeer bertjwregeer changed the title from "Bugfix: Retry if a thread closes a socket before we select() on it" to "Bugfix: Don't close socket in the WSGI thread, delegate it back to the main thread!" May 30, 2022
Development

Successfully merging this pull request may close these issues.

Possible race condition leading to the main loop dying?
3 participants