Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EngineIo v3 support #22

Closed
sleeyax opened this issue Jun 17, 2023 · 13 comments · Fixed by #24
Closed

Add EngineIo v3 support #22

sleeyax opened this issue Jun 17, 2023 · 13 comments · Fixed by #24
Labels
enhancement New feature or request

Comments

@sleeyax
Copy link
Contributor

sleeyax commented Jun 17, 2023

Created this issue as a way to keep track of this feature.

According to the Engine.IO protocol docs, changes from V3 to V4 are fairly small in scope.

@sleeyax
Copy link
Contributor Author

sleeyax commented Jun 17, 2023

I'd be willing to try and work on (a part of) this feature. I took some time exploring the codebase and got some general pointers in mind to accomplish this. I'll post an update soon.

@Totodore
Copy link
Owner

Nice ! Just, note that it is not only about EngineIO v3 but also SocketIo v3/v4.

You can use the official tests suites for engine io & socket io to verify the correctness of your implementations.

@Totodore Totodore changed the title Add SocketIo v3 support Add EngineIo v3 support Jun 17, 2023
@Totodore
Copy link
Owner

I made a branch for this feature so you can gradually make pull requests without breaking the main branch.

@Totodore Totodore added the enhancement New feature or request label Jun 22, 2023
@sleeyax sleeyax linked a pull request Jun 25, 2023 that will close this issue
3 tasks
@sleeyax
Copy link
Contributor Author

sleeyax commented Jun 25, 2023

While working on the reverse ping/pong mechanism I hit an unrelated roadblock and I'm not sure how to continue... Spent plenty of hours investigating the issue but unfortunately I haven't managed to figure out the exact cause yet, so I could use some help. I'll try to explain the issue.

You can find a reproducible example here: https://github.com/sleeyax/socketioxide-issue-22.

Basically, from what I gathered, the server gets stuck waiting for packets during request polling phase (see [sid=LunryQxmFHQ] polling request in log below) in https://github.com/sleeyax/socketioxide/blob/51da550fe5be799f37903cfc1871cc8663afe65d/engineioxide/src/engine.rs#L139.

The client seems to be waiting for the polling connection to close (see we are currently polling - waiting to pause in log below) in https://github.com/socketio/engine.io-client/blob/7aad0d6da1ee9941002b44d0b6700e0b71d21151/lib/transports/polling.js#L88.

Both are waiting for a condition that is never met. Because of that, the server never receives the required upgrade packet during the handshake in https://github.com/sleeyax/socketioxide/blob/51da550fe5be799f37903cfc1871cc8663afe65d/engineioxide/src/engine.rs#L466 but the client thinks the websocket connection has been opened and eventually times out when it doesn't get a pong reply back from the ping that was sent.

Client logs:

socket.io-client:url parse http://localhost:3000 +0ms
  socket.io-client new io instance for http://localhost:3000 +0ms
  socket.io-client:manager readyState closed +0ms
  socket.io-client:manager opening http://localhost:3000 +0ms
  engine.io-client:socket creating transport "polling" +0ms
  engine.io-client:polling polling +0ms
  engine.io-client:polling-xhr xhr poll +0ms
  engine.io-client:polling-xhr xhr open GET: http://localhost:3000/socket.io/?EIO=3&transport=polling&t=OZolP1m&b64=1 +0ms
  engine.io-client:polling-xhr xhr data null +0ms
  engine.io-client:socket setting transport polling +3ms
  socket.io-client:manager connect attempt will timeout after 20000 +4ms
  engine.io-client:polling polling got data 108:0{"sid":"LunryQxmFHQ","upgrades":["websocket"],"pingInterval":25000,"pingTimeout":20000,"maxPayload":100000} +7ms
  engine.io-client:socket socket receive: type "open", data "{"sid":"LunryQxmFHQ","upgrades":["websocket"],"pingInterval":25000,"pingTimeout":20000,"maxPayload":100000}" +4ms
  engine.io-client:socket socket open +0ms
  socket.io-client:manager open +5ms
  socket.io-client:manager cleanup +0ms
  socket.io-client:socket transport is open - connecting +0ms
  engine.io-client:socket starting upgrade probes +1ms
  engine.io-client:socket probing transport "websocket" +0ms
  engine.io-client:socket creating transport "websocket" +0ms
  engine.io-client:polling polling +3ms
  engine.io-client:polling-xhr xhr poll +10ms
  engine.io-client:polling-xhr xhr open GET: http://localhost:3000/socket.io/?EIO=3&transport=polling&t=OZolP1w&b64=1&sid=LunryQxmFHQ +0ms
  engine.io-client:polling-xhr xhr data null +0ms
  engine.io-client:socket probe transport "websocket" opened +6ms
  engine.io-client:polling polling got data 1:6 +5ms
  engine.io-client:socket socket receive: type "noop", data "undefined" +1ms
  engine.io-client:polling polling +0ms
  engine.io-client:polling-xhr xhr poll +5ms
  engine.io-client:polling-xhr xhr open GET: http://localhost:3000/socket.io/?EIO=3&transport=polling&t=OZolP1_&b64=1&sid=LunryQxmFHQ +0ms
  engine.io-client:polling-xhr xhr data null +0ms
  engine.io-client:socket probe transport "websocket" pong +1ms
  engine.io-client:socket pausing current transport "polling" +0ms
  engine.io-client:polling we are currently polling - waiting to pause +1ms

Server logs:

2023-06-25T13:28:05.863994Z  INFO socketioxide_issue_22: 34: Starting server on 127.0.0.1:3000
2023-06-25T13:28:09.589918Z DEBUG hyper::proto::h1::io: 207: parsed 4 headers
2023-06-25T13:28:09.589977Z DEBUG hyper::proto::h1::conn: 222: incoming body is empty
2023-06-25T13:28:09.590175Z DEBUG engineioxide::sid_generator: 11: Generating new sid: LunryQxmFHQ
2023-06-25T13:28:09.590296Z DEBUG socketioxide::client: 119: eio socket connect LunryQxmFHQ
2023-06-25T13:28:09.590418Z DEBUG hyper::proto::h1::io: 320: flushed 230 bytes
2023-06-25T13:28:09.590461Z DEBUG engineioxide::socket: 195: [sid=LunryQxmFHQ] heartbeat receiver routine started
2023-06-25T13:28:09.596880Z DEBUG hyper::proto::h1::io: 207: parsed 6 headers
2023-06-25T13:28:09.596883Z DEBUG hyper::proto::h1::io: 207: parsed 4 headers
2023-06-25T13:28:09.596931Z DEBUG hyper::proto::h1::conn: 222: incoming body is empty
2023-06-25T13:28:09.596935Z DEBUG hyper::proto::h1::conn: 222: incoming body is empty
2023-06-25T13:28:09.597168Z DEBUG engineioxide::engine: 121: [sid=LunryQxmFHQ] polling request
2023-06-25T13:28:09.597412Z DEBUG hyper::proto::h1::io: 320: flushed 166 bytes
2023-06-25T13:28:09.597499Z DEBUG engineioxide::engine: 317: [sid=LunryQxmFHQ] websocket connection upgrade
2023-06-25T13:28:09.597530Z DEBUG engineioxide::socket: 151: [sid=LunryQxmFHQ] sending packet: Noop
2023-06-25T13:28:09.597677Z DEBUG hyper::proto::h1::io: 320: flushed 119 bytes
2023-06-25T13:28:09.601253Z DEBUG hyper::proto::h1::io: 207: parsed 4 headers
2023-06-25T13:28:09.601295Z DEBUG hyper::proto::h1::conn: 222: incoming body is empty
2023-06-25T13:28:09.601458Z DEBUG engineioxide::engine: 121: [sid=LunryQxmFHQ] polling request

Another weird thing I don't quite understand is that client side it never actually calls pause, even though the pollComplete event should have been fired: https://github.com/socketio/engine.io-client/blob/7aad0d6da1ee9941002b44d0b6700e0b71d21151/lib/transports/polling.js#L88-L93

And to top it all off, the connection sometimes manages to get out of the polling state for no sensible reason if you manually restart the client like 5 to 10 times. Sounds like a race condition?

Client logs when it randomly works:

 [...]
  engine.io-client:polling we are currently polling - waiting to pause +4ms
  engine.io-client:polling polling got data 1:6 +0ms
  engine.io-client:socket socket receive: type "noop", data "undefined" +0ms
  engine.io-client:polling pre-pause polling complete +0ms
  engine.io-client:polling paused +0ms
  engine.io-client:socket changing transport and sending upgrade packet +0ms
  engine.io-client:socket setting transport websocket +1ms
  engine.io-client:socket clearing existing transport polling +0ms
  engine.io-client:polling ignoring poll - transport state "paused" +1ms
  engine.io-client:socket writing ping packet - expecting pong within 20000ms +25s
  engine.io-client:socket flushing 1 packets in socket +1ms
  engine.io-client:socket socket receive: type "close", data ":3" +2ms

Server logs when it randomly works:

[...]
2023-06-25T14:00:25.317598Z DEBUG engineioxide::engine: 121: [sid=sivJnpUEwNo] polling request
2023-06-25T14:00:25.317703Z DEBUG hyper::proto::h1::io: 320: flushed 166 bytes
2023-06-25T14:00:25.317742Z DEBUG engineioxide::engine: 317: [sid=sivJnpUEwNo] websocket connection upgrade
2023-06-25T14:00:25.317760Z DEBUG engineioxide::socket: 151: [sid=sivJnpUEwNo] sending packet: Noop
2023-06-25T14:00:25.317831Z DEBUG hyper::proto::h1::io: 320: flushed 119 bytes
2023-06-25T14:00:25.320306Z DEBUG engineioxide::engine: 478: [sid=sivJnpUEwNo] ws upgraded successful
2023-06-25T14:00:25.320406Z DEBUG socketioxide::client: 119: eio socket connect sivJnpUEwNo
2023-06-25T14:00:50.334082Z DEBUG engineioxide::socket: 198: [sid=sivJnpUEwNo] ping received, sending pong
2023-06-25T14:00:50.334380Z DEBUG engineioxide::engine: 364: [sid=sivJnpUEwNo] sent packet

Any help or commits to solve this problem would be greatly appreciated! click here for a reproducible example

@Totodore
Copy link
Owner

Totodore commented Jun 25, 2023

I'll check this ASAP, you can also try to check if this issue happens with other client implementations (rust,java,...)

@Totodore
Copy link
Owner

So I take a look, and it seems to come from here: the upgrade mechanism in v4 starts by sending a Noop packet to any pending polling request :
When I test you're reproductible example I have this result :

  engine.io-client:socket socket receive: type "open", data "{"sid":"VmVZUuXPCBM","upgrades":["websocket"],"pingInterval":25000,"pingTimeout":20000,"maxPayload":100000}" +8ms
  engine.io-client:socket socket open +0ms
  socket.io-client:manager open +8ms
  socket.io-client:manager cleanup +1ms
  socket.io-client:socket transport is open - connecting +0ms
  engine.io-client:socket starting upgrade probes +2ms
  engine.io-client:socket probing transport "websocket" +0ms
  engine.io-client:socket creating transport "websocket" +0ms
  engine.io-client:polling polling +6ms
  engine.io-client:polling-xhr xhr poll +17ms
  engine.io-client:polling-xhr xhr open GET: http://127.0.0.1:3000/socket.io/?EIO=3&transport=polling&t=OZq-98F&b64=1&sid=VmVZUuXPCBM +0ms
  engine.io-client:polling-xhr xhr data null +0ms
  engine.io-client:socket probe transport "websocket" opened +11ms
  engine.io-client:socket probe transport "websocket" pong +1ms
  engine.io-client:socket pausing current transport "polling" +1ms
  engine.io-client:polling we are currently polling - waiting to pause +10ms
  engine.io-client:polling polling got data 6 +0ms
  engine.io-client:socket socket receive: type "error", data "parser error" +1ms
  engine.io-client:socket socket error {"code":"parser error"} +0ms
  socket.io-client:manager error Error: server error
    at Socket.onPacket (C:\Users\prevo\source\repos\sandbox\socketioxide-issue-22\node_modules\engine.io-client\lib\socket.js:455:19)
    at XHR.<anonymous> (C:\Users\prevo\source\repos\sandbox\socketioxide-issue-22\node_modules\engine.io-client\lib\socket.js:278:10)
    at Emitter.emit (C:\Users\prevo\source\repos\sandbox\socketioxide-issue-22\node_modules\component-emitter\index.js:145:20)
    at Transport.onPacket (C:\Users\prevo\source\repos\sandbox\socketioxide-issue-22\node_modules\engine.io-client\lib\transport.js:149:8)
    at callback (C:\Users\prevo\source\repos\sandbox\socketioxide-issue-22\node_modules\engine.io-client\lib\transports\polling.js:144:10)
    at exports.decodePayload (C:\Users\prevo\source\repos\sandbox\socketioxide-issue-22\node_modules\engine.io-parser\lib\index.js:322:12)
    at Polling.onData (C:\Users\prevo\source\repos\sandbox\socketioxide-issue-22\node_modules\engine.io-client\lib\transports\polling.js:148:10)
    at Request.<anonymous> (C:\Users\prevo\source\repos\sandbox\socketioxide-issue-22\node_modules\engine.io-client\lib\transports\polling-xhr.js:129:10)
    at Emitter.emit (C:\Users\prevo\source\repos\sandbox\socketioxide-issue-22\node_modules\component-emitter\index.js:145:20)
    at Request.onData (C:\Users\prevo\source\repos\sandbox\socketioxide-issue-22\node_modules\engine.io-client\lib\transports\polling-xhr.js:303:8) {
  code: 'parser error'
} +15ms
  engine.io-client:socket socket close with reason: "transport error" +3ms

And specifically this :

engine.io-client:polling polling got data 6 +0ms
engine.io-client:socket socket receive: type "error", data "parser error" +1ms

(6 correspond to a Noop Packet).

So the upgrade mechanism is radically different according to this :

A connection always starts with polling (either XHR or JSONP). WebSocket gets tested on the side by sending a probe. If the probe is responded from the server, an upgrade packet is sent.

To ensure no messages are lost, the upgrade packet will only be sent once all the buffers of the existing transport are flushed and the transport is considered paused.

It introduces the notion of pause which doesn't exists in v4. So instead of using a NOOP packet to close the pending polling connection at the beggining of the upgrade process we keep it open until the 2probe 3probe are exchanged. By adapting the ws_upgrade_handshake you can fix this issue :

    async fn ws_upgrade_handshake(
        &self,
        sid: Sid,
        ws: &mut WebSocketStream<Upgraded>,
        protocol_version: ProtocolVersion,
    ) -> Result<(), Error> {
        let socket = self.get_socket(sid).unwrap();

        // send a NOOP packet to any pending polling request so it closes gracefully
        if protocol_version == ProtocolVersion::V4 {
            socket.send(Packet::Noop)?;
        }

        // Fetch the next packet from the ws stream, it should be a PingUpgrade packet
        let msg = match ws.next().await {
            Some(Ok(Message::Text(d))) => d,
            _ => Err(Error::UpgradeError)?,
        };
        match Packet::try_from(msg)? {
            Packet::PingUpgrade => {
                // Respond with a PongUpgrade packet
                ws.send(Message::Text(Packet::PongUpgrade.try_into()?))
                    .await?;
            }
            p => Err(Error::BadPacket(p))?,
        };

        // send a NOOP packet to any pending polling request so it closes gracefully
        // V3 protocol introduce _paused_ polling transport which require to close 
        // the polling request **after** the ping/pong handshake
        if protocol_version == ProtocolVersion::V3 {
            socket.send(Packet::Noop)?;
        }

        // Fetch the next packet from the ws stream, it should be an Upgrade packet
        let msg = match ws.next().await {
            Some(Ok(Message::Text(d))) => d,
            Some(Ok(Message::Close(_))) => {
                debug!("ws stream closed before upgrade");
                Err(Error::UpgradeError)?
            },
            _ => {
                debug!("unexpected ws message before upgrade");
                Err(Error::UpgradeError)?
            },
        };
        match Packet::try_from(msg)? {
            Packet::Upgrade => debug!("[sid={sid}] ws upgraded successful"),
            p => Err(Error::BadPacket(p))?,
        };

        // wait for any polling connection to finish by waiting for the socket to be unlocked
        let _ = socket.internal_rx.lock().await;
        socket.upgrade_to_websocket();
        Ok(())
    }

@sleeyax
Copy link
Contributor Author

sleeyax commented Jun 26, 2023

Ohhh I see. I actually tried only sending the noop packet when dealing with v3, but I was missing the pause part (i.e
sending Noop after PongUpgrade). Thank you, I'll test this out when I get home!

@sleeyax
Copy link
Contributor Author

sleeyax commented Jun 26, 2023

Awesome, it's working well now :)

2023-06-26T19:36:19.190904Z DEBUG engineioxide::socket: 198: [sid=Evh38gAH1tE] ping received, sending pong
2023-06-26T19:36:19.191193Z DEBUG engineioxide::engine: 361: [sid=Evh38gAH1tE] sent packet
2023-06-26T19:36:44.215047Z DEBUG engineioxide::socket: 198: [sid=Evh38gAH1tE] ping received, sending pong
2023-06-26T19:36:44.215165Z DEBUG engineioxide::engine: 361: [sid=Evh38gAH1tE] sent packet
[...]

Just the connect and disconnect events are never firing client side, but I guess there's a logical explanation for that too.

@sleeyax
Copy link
Contributor Author

sleeyax commented Jun 27, 2023

So, it appears only the connect event isn't firing when I use the official socket io client version 2.3.1. It oddly does get fired with the V4 protocol (latest client version), but not V3. I can't find proper documentation about this though, v4#packet-types and v3#packet-types look the same to me.

I think there's an undocumented difference between both implementations. In V4, it looks like they changed the client to send the CONNECT event/packet whereas V3 expects the server to do so. I guess I could hack this server-side send of the CONNECT packet into https://github.com/sleeyax/socketioxide/blob/51da550fe5be799f37903cfc1871cc8663afe65d/socketioxide/src/client.rs#L119 (only if the protocol is V3).

Not sure if this is the right approach, maybe I'm missing something.

What do you think @Totodore ? I'd like to hear your input again before I make any changes.

@Totodore
Copy link
Owner

Totodore commented Jun 27, 2023

I don't understand if you are talking about socket.io or engine.io.

Currently you are only implementing engine.io ?

If it is a socket.io issue, let's just work on it once the engine.io V3 impl is done.

If it is an engine.io issue I'll try to check what could be the problem.

The main difference between V4 & V5 socket.io is the fact that clients do not send a connect packet by default (the server assumes that you are connected on / by default)

@sleeyax
Copy link
Contributor Author

sleeyax commented Jun 27, 2023

Hmm yeah it's socket.io I suppose. Both socket.io and engine.io implementations go hand in hand, it's kinda confusing to me to distinguish between the two sometimes.

@sleeyax
Copy link
Contributor Author

sleeyax commented Jun 28, 2023

I'll move on with the engine.io implementation first then. I'll need socket.io V3 support afterwards though so I can use this library for a project which only supports V3 socket.io clients. I've also been testing my current changes with said client, which is how I stumbled upon the issue above.

@Totodore
Copy link
Owner

Sure !

To separate EngineIo & socketio just only use the engineioxide crate with the engineioxide examples/e2e and the engine.io client

Totodore added a commit that referenced this issue Jul 2, 2023
Feature: implement EngineIO V3 (closes #22)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants