Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace TCP with WebSockets (when feature gate enabled) #2059

Merged
merged 2 commits into from
Feb 27, 2024

Conversation

Restioson
Copy link
Contributor

@Restioson Restioson commented Feb 19, 2024

This PR replaces all instances of TCP with WebSockets when the ln_net_native_ws or ln_net_wasm_ws features are enabled. This is step 1 for 10101 WASM 馃帀

The new, default, ln_net_tcp carries out the old behaviour. When it is not enabled, the node will not listen for incoming TCP connections. The coordinator supports both TCP and WS connections. WS connections simply connect to its HTTP endpoint at / with a Websocket upgrade header.

There is a potential breaking protocol change in how we handle pings/pongs with the orderbook. I have swapped uses of tungstenite in transitive dependencies of native to the tokio_tungstenite_wasm crate, which is a facade over tungstenite and WASM websockets (depending on target). Since WASM websockets can't directly intercept Ping/Pong frames, instead we use Text messages with the content "ping" and "pong". Additionally, we do the same in the BitMex protocol (which supports this natively as well).

The coordinator's axum websocket implementation is not perfect as it kind of 'lies' to rust-lightning about when the data gets sent exactly. See the SocketDescriptor send_data implementation comment for more info. FWIW, this is how Mutiny does it too so hopefully it's okay. This also applies to the ln_net_native_ws implementation, but this is only intended for testing anyway and should be removed later down the road.

Because of the large potential impact, I've requested a review from everyone.

To do:

  • Support WSS (web socket secure) Probably not needed as we don't use this currently (rust-lightning encrypts everything anyway)
  • Fix wasm implementation

@Restioson Restioson marked this pull request as ready for review February 19, 2024 11:11
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is largely a copy of lightning-net-tokio, but that is specific to a PeerManager with a lighting_net_tokio::SocketDescriptor in its type args. This one is for a PeerManager with a DynamicSocketDescriptor, thus supporting websockets and TCP in the same build (needed for the coordinator).

It is a shame that we have to fork it into our own repo though.

Comment on lines +158 to +161
// This isn't so great as we should be waiting for this to be sent before returning the
// amount of data sent (which implies that the send operation is done). This is so that the
// backpressure stuff works properly
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just calling this out as I did in the description so it's easier to find

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any idea how we could tackle this limitation? Can we just add another channel so that the other task can let us know when the data has been sent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little more complicated than that. If we don't send all the data when calling send_data then there is a function we need to call of the peer manager which results in send_data being called again. At first glance you'd think you can just make the 2nd a no-op, but there could be more data that's waiting to be sent by then. Therefore we need to keep track of how much we promised to send and how much extra must be sent.

Definitely possible to do this, just more work

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining! As discussed, we don't have to change this until we run into problems.

@Restioson
Copy link
Contributor Author

Ah, it seems that the WASM websocket approach won't work due to the Send bound in BDK.

@Restioson Restioson force-pushed the replace-tcp-with-ws branch 5 times, most recently from 36858c8 to 5c1eb2c Compare February 23, 2024 07:31
Copy link
Contributor

@holzeis holzeis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice 馃憤

Copy link
Contributor

@luckysori luckysori left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, although I'm not the best reviewer of networking code.

It would be cool if we could get rid of some of those allows.

Comment on lines +181 to +182
#[allow(dead_code)]
listen_address: SocketAddr, // Irrelevant when using websockets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

鉂揅an we feature-flag this instead or is it too much of a pain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but I thought that perhaps we may want to use it somewhere else (?). If not I can definitely get rid of it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big deal, but I think this can be safely removed (from the version that doesn't use it).

crates/ln-dlc-node/src/node/mod.rs Show resolved Hide resolved
crates/ln-dlc-node/Cargo.toml Show resolved Hide resolved
crates/ln-dlc-node/src/networking.rs Outdated Show resolved Hide resolved
Comment on lines +26 to +51
if peer.is_ws {
debug!("Connecting over WS");

#[cfg(not(feature = "ln_net_ws"))]
let ws: Option<future::Either<future::Ready<()>, _>> =
panic!("Cannot connect outbound over WS when ln_net_ws is not enabled");

#[cfg(feature = "ln_net_ws")]
let ws = tungstenite::connect_outbound(peer_manager, peer)
.await
.map(future::Either::Left);

ws
} else {
debug!("Connecting over TCP");

#[cfg(not(feature = "ln_net_tcp"))]
let tcp: Option<future::Either<_, future::Ready<()>>> =
panic!("Cannot connect outbound over TCP when ln_net_tcp is not enabled");

#[cfg(feature = "ln_net_tcp")]
let tcp = tcp::connect_outbound(peer_manager, peer.pubkey, peer.address)
.await
.map(future::Either::Right);

tcp
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

馃敡 I think you should be able to get rid of all of the allows if you use the cfg! macro. You could handle all the valid combinations explicitly and panic for all the invalid cases in a single else branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can use the cfg! macro here, as that resolves to a runtime bool. Hence, we'd end up with a call to tcp::connect_outbound still in the final code, which would not exist if the ln_net_tcp feature were not enabled.

I tried with the cfg_if crate but it doesn't support expression returns so regardless it's kind of the same.

However, one way we may be able to solve it is by explicit return statements sprinkled around, if you'd prefer?

Copy link
Contributor Author

@Restioson Restioson Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this

    if peer.is_ws {
        debug!("Connecting over WS");

        #[cfg(not(feature = "ln_net_ws"))]
        return panic!("Cannot connect outbound over WS when ln_net_ws is not enabled");

        #[cfg(feature = "ln_net_ws")]
        tungstenite::connect_outbound(peer_manager, peer)
            .await
            .map(future::Either::Left)
    } else {
        debug!("Connecting over TCP");

        #[cfg(not(feature = "ln_net_tcp"))]
        return panic!("Cannot connect outbound over TCP when ln_net_tcp is not enabled");

        #[cfg(feature = "ln_net_tcp")]
        tcp::connect_outbound(peer_manager, peer.pubkey, peer.address)
            .await
            .map(future::Either::Right)
    }

which is pleasing but doesn't give Rust enough info to infer the type of Either - which we can't know unless we know the features that are enabled, which is a bit of a pain.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks for giving it a go! We can leave it. If I find time after we merge this I will play around with it and report back.

crates/ln-dlc-node/src/networking/tungstenite.rs Outdated Show resolved Hide resolved
crates/ln-dlc-node/src/networking/tungstenite.rs Outdated Show resolved Hide resolved
Comment on lines +158 to +161
// This isn't so great as we should be waiting for this to be sent before returning the
// amount of data sent (which implies that the send operation is done). This is so that the
// backpressure stuff works properly
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any idea how we could tackle this limitation? Can we just add another channel so that the other task can let us know when the data has been sent?

This means that we need to replace Ping/Pong messages with 'ping'/'pong' as browser JS WS can't interact with Ping/Pong directly
@Restioson
Copy link
Contributor Author

@luckysori just checking if we are ready for merge on this?

@luckysori
Copy link
Contributor

@luckysori just checking if we are ready for merge on this?

Sure thing! I will add it to the merge queue.

@luckysori luckysori added this pull request to the merge queue Feb 27, 2024
Merged via the queue into main with commit d1f4a89 Feb 27, 2024
20 checks passed
@luckysori luckysori deleted the replace-tcp-with-ws branch February 27, 2024 12:05
@Restioson Restioson mentioned this pull request Mar 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants