Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix panic when corner case happen in RandomDelayIter #335

Merged
merged 11 commits into from
Aug 30, 2023

Conversation

al8n
Copy link
Contributor

@al8n al8n commented Aug 29, 2023

No description provided.

@al8n al8n added enhancement New feature or request mixnet labels Aug 29, 2023
@al8n al8n added this to the Mixnet milestone Aug 29, 2023
@al8n al8n self-assigned this Aug 29, 2023

#[test]
fn test_random_delay_iter_zero_total_time() {
const TOTAL_DELAYS: u64 = 3;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This const could be global shared if you use for all tests.

@@ -133,21 +133,60 @@ where
return None;
}

// corner case: 0 average delay
if self.avg_delay == 0 {
self.remaining_delays -= 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can set remaining delays at line 135 and avoid putting it in every branch

// Calculate bounds to avoid extreme values
let lower_bound = (self.avg_delay as f64 * 0.5) as u64;
let upper_bound = (self.avg_delay as f64 * 1.5)
.min((self.remaining_time - self.remaining_delays + 1) as f64)
.min((self.remaining_time as f64 - self.remaining_delays as f64) + 1.0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure I understand this formula

Copy link
Contributor Author

@al8n al8n Aug 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The line .min((self.remaining_time - self.remaining_delays + 1) as f64) is included to ensure that upper_bound never exceeds the maximum possible delay that can be returned without causing future delays to be zero or negative.

self.remaining_time is the total time still available for all remaining delays, and self.remaining_delays is the number of remaining delays, then the expression (self.remaining_time - self.remaining_delays + 1) calculates the maximum time we can allocate for the current delay without making future delays invalid (i.e., zero or negative).

For example, consider a case where we have remaining_time = 100 and remaining_delays = 10. If we use up 95 time units for the current delay, then we would be left with only 5-time units to distribute among the remaining 9 delays, which is not possible without using zero or negative delays.

Here's how the logic works:

  • self.remaining_time is the remaining time that needs to be allocated across all remaining delays.
  • self.remaining_delays is the number of remaining delays.

The maximum we could use for the current delay to be still able to allocate at least 1 unit of time for each of the remaining delays is (self.remaining_time - self.remaining_delays + 1).

The min function is used to choose the smaller of the two computed upper bounds (1.5 * avg_delay and (self.remaining_time - self.remaining_delays + 1)) to make sure that the program doesn't allocate too much time for the current delay.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, better to put this in a comment as well (it's enought to point out that we don't want future delays to be zero)

Comment on lines 159 to 164
// corner case: when lower bound is greater than upper bound
if lower_bound >= upper_bound {
self.remaining_delays -= 1;
self.remaining_time = 0;
return Some(self.remaining_time);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we set lower_bound = (self.avg_delay as f64 * 0.5).min(upper_bound) as u64; and avoid having another corner case?

mixnet/client/src/sender.rs Outdated Show resolved Hide resolved
mixnet/client/src/sender.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@youngjoon-lee youngjoon-lee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@al8n al8n requested a review from zeegomo August 29, 2023 12:54

let delay = self.rng.gen_range(lower_bound, upper_bound);
let delay = self.rng.gen_range(lower_bound, upper_bound).floor() as u64;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is upper bound included?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the upper bound is exclusive.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, what happens then if lower bound == upper bound? I think it probably panics

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, my bad, thanks!

@al8n al8n requested a review from zeegomo August 29, 2023 13:05
let delay = self.rng.gen_range(lower_bound, upper_bound);
self.remaining_time -= delay;
self.remaining_delays -= 1;
let delay = self.rng.gen_range(lower_bound, upper_bound + 1.0).floor() as u64;
Copy link
Contributor

@zeegomo zeegomo Aug 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a fan of +1, as you need to do some calculations to show that it's safe to use and it's easy to mess it up with future refactors. The new api would be better, but since we have to use an old version of rand, I suggest we use https://docs.rs/rand/0.7.3/rand/distributions/struct.Uniform.html#method.new_inclusive

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion! Thanks!

@youngjoon-lee youngjoon-lee linked an issue Aug 29, 2023 that may be closed by this pull request
as u64;
// guarantee that we don't exceed the remaining time and promise the delay we return is
// at least 1ms.
.min((self.remaining_time as f64 - self.remaining_delays as f64) + 1.0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.min((self.remaining_time as f64).saturating_sub(self.remaining_delays - 1)) and I think we can remove both the avg delay and remaining time edge cases

Copy link
Contributor Author

@al8n al8n Aug 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we remove the avg delay, then we may have extreme delays, and with higher probability delay1 > delay 2 > delay3 and etc.

Copy link
Contributor

@zeegomo zeegomo Aug 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nono, I mean just removing the edge case at line 139. If we craft the formula with a bit more care we can make it work in most cases without having to add checks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah, sorry for the misunderstanding.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think we need the saturating sub in .min((self.remaining_time as f64).saturating_sub(self.remaining_delays - 1))

as u64;
// guarantee that we don't exceed the remaining time and promise the delay we return is
// at least 1ms.
.min(self.remaining_time.saturating_sub(self.remaining_delays) as f64 + 1.0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need the +1 anymore since we decrease remaining_delays before this

@al8n al8n requested a review from zeegomo August 29, 2023 16:21
for _ in 0..TOTAL_DELAYS {
assert!(delays.next().is_some());
}
assert!(delays.next().is_none());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably check that the total wait is indeed 1, also a duration instead of a u64 would be much better because otherwise it's not clear what unit we are using


#[test]
fn test_random_delay_iter_small_total_time() {
let mut delays = RandomDelayIterator::new(rand::thread_rng(), TOTAL_DELAYS, 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in RandomDelayIterator::new(rng, total_delays, total_time), total_time should be a Duration and not a u64

@al8n al8n merged commit 82421ce into mixnet Aug 30, 2023
2 checks passed
@al8n al8n deleted the mixnet-delay-corner-case branch August 30, 2023 09:50
youngjoon-lee added a commit that referenced this pull request Sep 14, 2023
* Add `mixnode` and `mixnet-client` crate (#302)

* Add `mixnode` binary (#317)

* Integrate mixnet with libp2p network backend (#318)

* Fix #312: proper delays (#321)

* proper delays

* add missing duration param

* tiny fix: compilation error caused by `rand` 0.8 -> 0.7

* use `get_available_port()` for mixnet integration tests (#333)

* add missing comments

* Overwatch mixnet node (#339)

* Add mixnet service and overwatch app

* remove #[tokio::main]

---------

Co-authored-by: Youngjoon Lee <taxihighway@gmail.com>

* fix tests for the overwatch mixnode (#342)

* fix panic when corner case happen in RandomDelayIter (#335)

* Use `log` service for `mixnode` bin (#341)

* Use `wire` for MixnetMessage in libp2p (#347)

* Prevent tmixnet tests from running forever (#363)

* Use random delay when sending msgs to mixnet (#362)

* fix a minor compilation error caused by the latest master

* Fix run output fd (#343)

* add a connection pool

* Exp backoff (#332)

* move mixnet listening into separate task

* add exponential retry for insufficient peers in libp2p

* fix logging

* Fix MutexGuard across await (#373)

* Fix MutexGuard across await

Holding a MutexGuard across an await point is not a good idea.
Removing that solves the issues we had with the mixnet test

* Make mixnode handle bodies coming from the same source concurrently (#372)

---------

Co-authored-by: Youngjoon Lee <taxihighway@gmail.com>

* Move wait at network startup (#338)

We now wait after the call to 'subscribe' to give the network
the time to register peers in the mesh before starting to
publish messages

* Remove unused functions from mixnet connpool (#374)

* Mixnet benchmark (#375)

* merge fixes

* add `connection_pool_size` field to `config.yaml`

* Simplify mixnet topology (#393)

* Simplify bytes and duration range ser/de (#394)

* optimize bytes serde and duration serde

---------

Co-authored-by: Al Liu <scygliu1@gmail.com>
Co-authored-by: Daniel Sanchez <sanchez.quiros.daniel@gmail.com>
Co-authored-by: Giacomo Pasini <Zeegomo@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request mixnet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Set delays when building Sphinx packets
4 participants