Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make mixnode handle bodies coming from the same source concurrently #372

Merged
merged 2 commits into from
Sep 6, 2023

Conversation

youngjoon-lee
Copy link
Contributor

@youngjoon-lee youngjoon-lee commented Sep 5, 2023

So far, the mixnode has spawned a new async task whenever it accepts a new inbound connection from mixclients or other mixnodes. It's good for handling multiple mixclients and other mixnodes concurrently.

But, it's not enough for better performance. Mixnode also needs to spawn a new async task whenever it reads a 'body' from a inbound connection, so that it can handle bodies concurrently which comes from the same source. This is especiallly important if we set delay to Sphinx packets.

Previously, we hadn’t haven this issue since we had created one-time connections and closed them immediately. But with conn pool added recently, we have this issue.

This PR makes the mixnet.rs test much faster: 20s -> 3s

I'm opening this PR based on the PR #370 because I need #370 to make CI passed. It seems #370 is not very related to the performance issue, but it still should be done since I think we have logical issues with regard to locking.

@youngjoon-lee youngjoon-lee added this to the Mixnet milestone Sep 5, 2023
@youngjoon-lee youngjoon-lee self-assigned this Sep 5, 2023
@al8n al8n mentioned this pull request Sep 5, 2023
Comment on lines +92 to +100
let pool = pool.clone();
let private_key = PrivateKey::from(private_key);
let client_tx = client_tx.clone();

tokio::spawn(async move {
if let Err(e) = Self::handle_body(body, &pool, &private_key, &client_tx).await {
tracing::error!("failed to handle body: {e}");
}
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can potentially means that packets are delivered out of order to the destination, can this be a problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Packets (messages) can be delivered out of order. I think it's okay. The mix destination can reconstruct and broadcast messages even if packets are delivered out of order (first-reconstructed, first-broadcasted). Acutally, the current mixnet implementaion also doesn't guarantee the order of messages.

@zeegomo
Copy link
Contributor

zeegomo commented Sep 5, 2023

It seems #370 is not very related to the performance issue, but it still should be done since I think we have logical issues with regard to locking.

What issues?

@al8n
Copy link
Contributor

al8n commented Sep 5, 2023

It seems #370 is not very related to the performance issue, but it still should be done since I think we have logical issues with regard to locking.

What issues?

The issue seems that using sync version Mutex blocks the async runtime.

EDIT: but we do not actually find the real reason, just found if we remove sync version Mutex, then CI pass.

@zeegomo
Copy link
Contributor

zeegomo commented Sep 5, 2023

but we do not actually find the real reason, just found if we remove sync version Mutex, then CI pass.

The real reason seems #373

@zeegomo zeegomo changed the base branch from mixnet-pool-performance to fix-guard-await September 5, 2023 16:05
@zeegomo zeegomo force-pushed the mixnet-pool-performance-yjlee branch from dfcb77d to 65df004 Compare September 5, 2023 16:07
@youngjoon-lee
Copy link
Contributor Author

but we do not actually find the real reason, just found if we remove sync version Mutex, then CI pass.

The real reason seems #373

Yes. This is the reason.

@youngjoon-lee youngjoon-lee linked an issue Sep 6, 2023 that may be closed by this pull request
@codecov-commenter
Copy link

Codecov Report

Patch has no changes to coverable lines.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the GitHub App Integration for your organization. Read more.

📢 Thoughts on this report? Let us know!.

@youngjoon-lee youngjoon-lee changed the base branch from fix-guard-await to mixnet September 6, 2023 08:16
@youngjoon-lee youngjoon-lee changed the base branch from mixnet to fix-guard-await September 6, 2023 08:16
@youngjoon-lee youngjoon-lee merged commit 9a54f09 into fix-guard-await Sep 6, 2023
11 checks passed
@youngjoon-lee youngjoon-lee deleted the mixnet-pool-performance-yjlee branch September 6, 2023 08:16
zeegomo added a commit that referenced this pull request Sep 6, 2023
* Fix MutexGuard across await

Holding a MutexGuard across an await point is not a good idea.
Removing that solves the issues we had with the mixnet test

* Make mixnode handle bodies coming from the same source concurrently (#372)

---------

Co-authored-by: Youngjoon Lee <taxihighway@gmail.com>
youngjoon-lee added a commit that referenced this pull request Sep 14, 2023
* Add `mixnode` and `mixnet-client` crate (#302)

* Add `mixnode` binary (#317)

* Integrate mixnet with libp2p network backend (#318)

* Fix #312: proper delays (#321)

* proper delays

* add missing duration param

* tiny fix: compilation error caused by `rand` 0.8 -> 0.7

* use `get_available_port()` for mixnet integration tests (#333)

* add missing comments

* Overwatch mixnet node (#339)

* Add mixnet service and overwatch app

* remove #[tokio::main]

---------

Co-authored-by: Youngjoon Lee <taxihighway@gmail.com>

* fix tests for the overwatch mixnode (#342)

* fix panic when corner case happen in RandomDelayIter (#335)

* Use `log` service for `mixnode` bin (#341)

* Use `wire` for MixnetMessage in libp2p (#347)

* Prevent tmixnet tests from running forever (#363)

* Use random delay when sending msgs to mixnet (#362)

* fix a minor compilation error caused by the latest master

* Fix run output fd (#343)

* add a connection pool

* Exp backoff (#332)

* move mixnet listening into separate task

* add exponential retry for insufficient peers in libp2p

* fix logging

* Fix MutexGuard across await (#373)

* Fix MutexGuard across await

Holding a MutexGuard across an await point is not a good idea.
Removing that solves the issues we had with the mixnet test

* Make mixnode handle bodies coming from the same source concurrently (#372)

---------

Co-authored-by: Youngjoon Lee <taxihighway@gmail.com>

* Move wait at network startup (#338)

We now wait after the call to 'subscribe' to give the network
the time to register peers in the mesh before starting to
publish messages

* Remove unused functions from mixnet connpool (#374)

* Mixnet benchmark (#375)

* merge fixes

* add `connection_pool_size` field to `config.yaml`

* Simplify mixnet topology (#393)

* Simplify bytes and duration range ser/de (#394)

* optimize bytes serde and duration serde

---------

Co-authored-by: Al Liu <scygliu1@gmail.com>
Co-authored-by: Daniel Sanchez <sanchez.quiros.daniel@gmail.com>
Co-authored-by: Giacomo Pasini <Zeegomo@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Connection reuse between MixnetNodes
4 participants