
fix(server): Update quic server with compio. #2070

Merged
numinnex merged 16 commits into apache:io_uring_tpc from jadireddi:1935_fix_quic_-compio
Aug 20, 2025

Conversation

@jadireddi
Contributor

@jadireddi jadireddi commented Aug 2, 2025

Fixes: #1935

Migrates the QUIC listener and sender from tokio-based quinn to compio-quic to align with the project's io_uring migration strategy. This change ensures QUIC transport works with the compio runtime.

@jadireddi jadireddi mentioned this pull request Aug 2, 2025
Comment thread core/server/src/shard/system/messages.rs Outdated
Comment thread core/server/src/shard/system/messages.rs Outdated
Comment thread core/server/src/streaming/diagnostics/metrics.rs Outdated
Comment thread core/server/src/server_error.rs Outdated
Comment thread core/server/Cargo.toml Outdated
Comment thread core/server/src/quic/listener.rs
Comment thread core/server/src/quic/listener.rs Outdated
@numinnex
Contributor

numinnex commented Aug 6, 2025

Other than the one comment I left, LGTM.

Can you also run a benchmark using the QUIC protocol to check whether it works?
All you have to do is run the server in one terminal with `cargo r -r --bin iggy-server` and run the benchmark in another with `cargo r -r --bin iggy-bench pp quic`.

@jadireddi
Contributor Author

jadireddi commented Aug 6, 2025

> Other than the one comment I left, LGTM.
>
> Can you also run a benchmark using the QUIC protocol to check whether it works? All you have to do is run the server in one terminal with `cargo r -r --bin iggy-server` and run the benchmark in another with `cargo r -r --bin iggy-bench pp quic`.

I was able to run `cargo r -r --bin iggy-server`.

But when trying `cargo r -r --bin iggy-bench pp quic`, it complained about the file `local_data/runtime/current_config.toml`, which I don't have locally.

Please find attachment for the logs:
logs.txt

@numinnex
Contributor

numinnex commented Aug 6, 2025

Did you run that benchmark while the server was running? On what port and address is the QUIC listener established? Can you try providing that port and address to the benchmark? You do it as follows: `cargo r -r --bin iggy-bench pp quic --server-address QUIC_ADDRESS`; by default it's 127.0.0.1:8080.

@jadireddi
Contributor Author

@numinnex I have tested the QUIC protocol and benchmarked it as well. It's looking good now. Can you please help with the review?

@numinnex
Contributor

numinnex commented Aug 11, 2025

@jadireddi
When trying to run the server from your branch, I am getting the following error:

Failed to install crypto provider: CryptoProvider { cipher_suites: [TLS13_AES_256_GCM_SHA384, TLS13_AES_128_GCM_SHA256, TLS13_CHACHA20_POLY1305_SHA256, TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256], kx_groups: [X25519, secp256r1, secp384r1], signature_verification_algorithms: WebPkiSupportedAlgorithms { all: [ .. ], mapping: [ECDSA_NISTP384_SHA384, ECDSA_NISTP256_SHA256, ED25519, RSA_PSS_SHA512, RSA_PSS_SHA384, RSA_PSS_SHA256, RSA_PKCS1_SHA512, RSA_PKCS1_SHA384, RSA_PKCS1_SHA256] }, secure_random: Ring, key_provider: Ring }

Did you get a similar issue? Maybe there is a conflict between the rustls providers for QUIC and HTTP.

@jadireddi
Contributor Author

@numinnex I didn't encounter the above issue.

@numinnex
Contributor

That's weird; maybe something is wrong with the configuration. I'll try pinging somebody from the core team to see whether they get the same error.

@hubcio
Contributor

hubcio commented Aug 11, 2025

Same issue for me:
(screenshot of the error attached)

@jadireddi
Contributor Author

Sorry to bother you. Would it be feasible to run it again after pulling the latest commit? Somehow this was not reproducing on my local machine.

@jadireddi
Contributor Author

jadireddi commented Aug 11, 2025

Partial logs:
(screenshot attached)

@numinnex
Contributor

I've run the server and noticed one thing: you don't set the SO_REUSEPORT flag on the UDP socket that you pass to the QUIC endpoint, so the program fails to create a QUIC endpoint on multiple shards. I don't see an option to set the SO_REUSEPORT flag on the Socket that you're providing, but from looking around the web I see that SO_REUSEPORT does exist for UDP.

Could you investigate that?

On top of that, I think the current approach with LISTENERS_COUNT does not really work well with the current architecture (we are spawning 10 workers per shard, which means we get 10×N workers). Since the Endpoint from quic wraps all of its internal state, including the UDP socket, in an Arc, the endpoint could be shared between different shards with each shard spawning one worker. But investigate the SO_REUSEPORT flag for UDP first.

@jadireddi
Contributor Author

@numinnex Fixed: multiple shards now share the same UDP port, with one worker per shard.

I am able to bring up the QUIC server for multiple shards successfully. The client connected via QUIC, the connection was accepted on shard 5, and login was processed successfully on shard 5.

(screenshots attached)

After the client establishes a connection to the server's shard-5 successfully, I see the panic errors below on the other shards.

thread 'shard-2' panicked at core/server/src/shard/system/users.rs:465:14:
At this point session for 2461787274, should exist.
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

thread 'shard-3' panicked at core/server/src/shard/system/users.rs:465:14:
At this point session for 2461787274, should exist.

thread 'shard-8' panicked at core/server/src/shard/system/users.rs:465:14:
At this point session for 2461787274, should exist.

thread 'shard-6' panicked at core/server/src/shard/system/users.rs:465:14:
At this point session for 2461787274, should exist.

thread 'shard-1' panicked at core/server/src/shard/system/users.rs:465:14:
At this point session for 2461787274, should exist.

thread 'shard-0' panicked at core/server/src/shard/system/users.rs:465:14:
At this point session for 2461787274, should exist.

thread 'shard-7' panicked at core/server/src/shard/system/users.rs:465:14:
At this point session for 2461787274, should exist.

thread 'shard-9' panicked at core/server/src/shard/system/users.rs:465:14:
At this point session for 2461787274, should exist.

thread 'shard-4' panicked at core/server/src/shard/system/users.rs:465:14:
At this point session for 2461787274, should exist.

thread 'main' panicked at core/server/src/main.rs:302:23:
Failed to join shard thread: Any { .. }

@numinnex
Contributor

Hmm, interesting. When I tried to run it, I got an error that the port is already in use.

For the error that you are getting, you need to broadcast the session in the handle_stream method. Check out the tcp_listener, lines 171-178.

@jadireddi
Contributor Author

jadireddi commented Aug 13, 2025

@numinnex We no longer see any panics on the shards.
Can you check whether another process is using port 8080, and could you try running this again on your local machine?

numinnex previously approved these changes Aug 19, 2025
Comment thread core/server/src/quic/quic_sender.rs
Comment thread core/server/src/quic/listener.rs Outdated
@numinnex numinnex self-requested a review August 19, 2025 09:39
@numinnex numinnex dismissed their stale review August 19, 2025 09:39

Two more comments to resolve

@numinnex numinnex merged commit 518fa21 into apache:io_uring_tpc Aug 20, 2025