Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KQueueEventLoop won't unregister active channels reusing a file descriptor #9149

Merged
merged 1 commit into from May 22, 2019

Conversation

@normanmaurer
Copy link
Member

commented May 14, 2019

Motivation:

The current KQueueEventLoop implementation does not process concurrent domain socket channel registration/unregistration in the order they actual
happen since unregistration are delated by an event loop task scheduling. When a domain socket is closed, it's file descriptor might be reused
quickly and therefore trigger a new channel registration using the same descriptor.

Consequently the KQueueEventLoop#add(AbstractKQueueChannel) method will overwrite the current inactive channels having the same descriptor
and the delayed KQueueEventLoop#remove(AbstractKQueueChannel) will remove the active channel that replaced the inactive one.

As active channels are registered, events for this file descriptor won't be processed anymore and the channels will never be closed.

Modifications:

  • Change the logic of KQueueEventLoop#remove(AbstractKQueueChannel) channels so it will check channels equality prior removal.

Result:

KQueueEventLoop won't remove anymore active channels reusing a file descriptor.

@normanmaurer

This comment has been minimized.

Copy link
Member Author

commented May 14, 2019

@vietj PTAL... This contains your change + some small adjustments to also fix the case when we update the filters for the wrong Fd.

@normanmaurer

This comment has been minimized.

Copy link
Member Author

commented May 14, 2019

If this looks ok to you @vietj I will just squash both commits and give you the attribution :)

@normanmaurer normanmaurer added this to the 4.1.37.Final milestone May 14, 2019

@normanmaurer

This comment has been minimized.

Copy link
Member Author

commented May 14, 2019

Proper fix for #9114 and #9113

@normanmaurer

This comment has been minimized.

Copy link
Member Author

commented May 14, 2019

@netty-bot test this please

@johnou

johnou approved these changes May 14, 2019

@njhill

njhill approved these changes May 14, 2019

@vietj

This comment has been minimized.

Copy link
Contributor

commented May 15, 2019

it looks good to me @normanmaurer

@normanmaurer normanmaurer force-pushed the kqueue_race branch from a579e51 to d823664 May 15, 2019

@normanmaurer

This comment has been minimized.

Copy link
Member Author

commented May 16, 2019

@vietj PTAL again as I adjusted it even more :)

@vietj

This comment has been minimized.

Copy link
Contributor

commented May 16, 2019

LGTM @normanmaurer good job!

@normanmaurer

This comment has been minimized.

Copy link
Member Author

commented May 22, 2019

Alright let me merge this one...

KQueueEventLoop | EpollEventLoop may incorrectly update registration …
…when FD is reused.

Motivation:

The current KQueueEventLoop implementation does not process concurrent domain socket channel registration/unregistration in the order they actual
happen since unregistration are delated by an event loop task scheduling. When a domain socket is closed, it's file descriptor might be reused
quickly and therefore trigger a new channel registration using the same descriptor.

Consequently the KQueueEventLoop#add(AbstractKQueueChannel) method will overwrite the current inactive channels having the same descriptor
and the delayed KQueueEventLoop#remove(AbstractKQueueChannel) will remove the active channel that replaced the inactive one.

As active channels are registered, events for this file descriptor won't be processed anymore and the channels will never be closed.

The same problem can also happen in EpollEventLoop. Beside this we also may never remove the AbstractEpollChannel from the internal map
when it is unregistered which will prevent it from be GC'ed

Modifications:

- Change logic of native KQueue and Epoll implementations to ensure we correctly handle the case of FD reuse
- Only try to update kevent / epoll if the Channel is still open (as otherwise it will be handled by kqueue / epoll itself)
- Correctly remove AbstractEpollChannel from internal map in all cases
- Make implementation of closeAll() consistent for Epoll and KQueueEventLoop

Result:

KQueue and Epoll native transports correctly handle FD reuse

Co-authored-by: Norman Maurer <norman_maurer@apple.com>

@normanmaurer normanmaurer force-pushed the kqueue_race branch from d823664 to 5b14051 May 22, 2019

@normanmaurer normanmaurer merged commit 604f9f7 into 4.1 May 22, 2019

@normanmaurer normanmaurer deleted the kqueue_race branch May 22, 2019

@normanmaurer

This comment has been minimized.

Copy link
Member Author

commented May 22, 2019

Cherry-picked as e348bd9

astei added a commit to VelocityPowered/Velocity that referenced this pull request Jul 2, 2019

astei added a commit to VelocityPowered/Velocity that referenced this pull request Jul 2, 2019

@astei

This comment has been minimized.

Copy link

commented Jul 2, 2019

This was quite interesting, as this appears to also fix a related TCP issue I had in my project (which I worked around by deferring to a future iteration of the event loop). I suspected it was a Netty bug but wasn't entirely sure. I tested my project with 4.1.37.Final and removed my workaround, and this worked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.