Race condition in eventloop #36
Conversation
Hi, thanks for opening the issue. From what I understand, the actual issue is that …
Have you tried enabling trace logging? If so, do you see …
Yes this is accurate. Due to how Geyser works if you delay sending the packet until after the channelActive it'll crash the client because compression is turned on at the same time that packet is sent. Ideally the incoming datagrams are buffered until channelActive or we modify the API for ChannelFuture access and enable compression only after that packet is placed in the outbound buffer.
I tried trace logging in org.cloudburst, io.netty, and org.geyser, and the only traces look like this (slightly different due to my own debugging lines):
Nothing relevant follows until you press CANCEL in the Minecraft client. (edit: redacted PII)
Hi @romanalexander, can you please test whether you are able to reproduce this issue with …
👍
bump netty due to CVE-2022-41881. Changes in cloudburst libs:
- uses netty 4.1.101.Final
- bumps netty-transport-raknet, which fixes CloudburstMC/Network#36, an issue where some connection attempts did not work

* bump cloudburst's netty-transport-raknet
* bump cloudburst protocol/codec/connection
bump netty due to CVE-2022-41881. Changes in cloudburst libs:
- uses netty 4.1.101.Final
- bumps netty-transport-raknet, which fixes CloudburstMC/Network#36, an issue where some connection attempts did not work
I have Geyser running inside a k8s pod (containerd runtime). There's been a persistent bug where clients can't connect about half the time, in a weird pattern like:
I isolated it down to RakServerOnlineInitialHandler.channelRead0: the channel.eventLoop() task responsible for RakChildChannel.setActive is somehow running too slowly. When this happens, datagrams are read while the channel is still inactive, which causes any UpstreamSession.sendPacketImmediately to silently(!!) fail to write the packet out. The failure is triggered by RequestNetworkSettingsPacket, which tries to immediately flush a NetworkSettingsPacket that will never actually be sent to the underlying socket. I think the best solution is to hold all EncapsulatedPackets in a buffer until the channel becomes active. (Also, I have no idea why the eventLoop is behaving like this; maybe the guest OS is stuck with low-resolution timers.)
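The buffering idea can be sketched in plain Java. This is a hypothetical, dependency-free illustration, not the actual RakNet classes: the real fix would queue EncapsulatedPackets inside the channel implementation and drain them when channelActive fires.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch: a channel that queues outbound packets while
// inactive and flushes them once setActive() is called, instead of
// silently dropping writes that arrive before activation.
final class BufferingChannel {
    private final Queue<String> pending = new ArrayDeque<>();
    private final List<String> wire = new ArrayList<>(); // stands in for the socket
    private boolean active = false;

    void write(String packet) {
        if (!active) {
            pending.add(packet); // buffer until the channel becomes active
        } else {
            wire.add(packet);
        }
    }

    void setActive() {
        active = true;
        String p;
        while ((p = pending.poll()) != null) {
            wire.add(p); // drain FIFO so packet ordering is preserved
        }
    }

    List<String> sentPackets() {
        return wire;
    }
}

public class Main {
    public static void main(String[] args) {
        BufferingChannel ch = new BufferingChannel();
        ch.write("NetworkSettingsPacket"); // arrives before the channel is active
        ch.setActive();                    // race resolves; buffered packet flushes
        ch.write("LoginPacket");
        System.out.println(ch.sentPackets()); // [NetworkSettingsPacket, LoginPacket]
    }
}
```

The key property is that early writes are never lost and ordering is preserved, so the client still receives NetworkSettingsPacket before anything sent after activation.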
You can create a reproducible test case with:
I attached this as a PR with example solution code that fixes the problem for me. Obviously this isn't an ideal solution; hopefully someone better versed in Netty can write a cleaner one.
Additional observations:
- Reproduced on graalvm-ce-java17-linux-amd64-22.2.0
- Occurs with both TransportMethod.EPOLL and TransportMethod.NIO, with the same behavior