-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix panic when corner case happen in RandomDelayIter #335
Conversation
|
||
#[test] | ||
fn test_random_delay_iter_zero_total_time() { | ||
const TOTAL_DELAYS: u64 = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This const could be global shared if you use for all tests.
mixnet/client/src/sender.rs
Outdated
@@ -133,21 +133,60 @@ where | |||
return None; | |||
} | |||
|
|||
// corner case: 0 average delay | |||
if self.avg_delay == 0 { | |||
self.remaining_delays -= 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can set remaining delays at line 135 and avoid putting it in every branch
mixnet/client/src/sender.rs
Outdated
// Calculate bounds to avoid extreme values | ||
let lower_bound = (self.avg_delay as f64 * 0.5) as u64; | ||
let upper_bound = (self.avg_delay as f64 * 1.5) | ||
.min((self.remaining_time - self.remaining_delays + 1) as f64) | ||
.min((self.remaining_time as f64 - self.remaining_delays as f64) + 1.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I understand this formula
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The line .min((self.remaining_time - self.remaining_delays + 1) as f64)
is included to ensure that upper_bound
never exceeds the maximum possible delay that can be returned without causing future delays to be zero or negative.
self.remaining_time
is the total time still available for all remaining delays, and self.remaining_delays
is the number of remaining delays, then the expression (self.remaining_time - self.remaining_delays + 1)
calculates the maximum time we can allocate for the current delay without making future delays invalid (i.e., zero or negative).
For example, consider a case where we have remaining_time = 100
and remaining_delays = 10
. If we use up 95 time units for the current delay, then we would be left with only 5-time units to distribute among the remaining 9 delays, which is not possible without using zero or negative delays.
Here's how the logic works:
self.remaining_time
is the remaining time that needs to be allocated across all remaining delays.self.remaining_delays
is the number of remaining delays.
The maximum we could use for the current delay to be still able to allocate at least 1 unit of time for each of the remaining delays is (self.remaining_time - self.remaining_delays + 1)
.
The min
function is used to choose the smaller of the two computed upper bounds (1.5 * avg_delay
and (self.remaining_time - self.remaining_delays + 1)
) to make sure that the program doesn't allocate too much time for the current delay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, better to put this in a comment as well (it's enought to point out that we don't want future delays to be zero)
mixnet/client/src/sender.rs
Outdated
// corner case: when lower bound is greater than upper bound | ||
if lower_bound >= upper_bound { | ||
self.remaining_delays -= 1; | ||
self.remaining_time = 0; | ||
return Some(self.remaining_time); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we set lower_bound = (self.avg_delay as f64 * 0.5).min(upper_bound) as u64;
and avoid having another corner case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
mixnet/client/src/sender.rs
Outdated
|
||
let delay = self.rng.gen_range(lower_bound, upper_bound); | ||
let delay = self.rng.gen_range(lower_bound, upper_bound).floor() as u64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is upper bound included?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the upper bound is exclusive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, what happens then if lower bound == upper bound? I think it probably panics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, my bad, thanks!
mixnet/client/src/sender.rs
Outdated
let delay = self.rng.gen_range(lower_bound, upper_bound); | ||
self.remaining_time -= delay; | ||
self.remaining_delays -= 1; | ||
let delay = self.rng.gen_range(lower_bound, upper_bound + 1.0).floor() as u64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a fan of +1, as you need to do some calculations to show that it's safe to use and it's easy to mess it up with future refactors. The new api would be better, but since we have to use an old version of rand, I suggest we use https://docs.rs/rand/0.7.3/rand/distributions/struct.Uniform.html#method.new_inclusive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion! Thanks!
mixnet/client/src/sender.rs
Outdated
as u64; | ||
// guarantee that we don't exceed the remaining time and promise the delay we return is | ||
// at least 1ms. | ||
.min((self.remaining_time as f64 - self.remaining_delays as f64) + 1.0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.min((self.remaining_time as f64).saturating_sub(self.remaining_delays - 1))
and I think we can remove both the avg delay and remaining time edge cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we remove the avg delay, then we may have extreme delays, and with higher probability delay1 > delay 2 > delay3 and etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nono, I mean just removing the edge case at line 139. If we craft the formula with a bit more care we can make it work in most cases without having to add checks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yeah, sorry for the misunderstanding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think we need the saturating sub in .min((self.remaining_time as f64).saturating_sub(self.remaining_delays - 1))
mixnet/client/src/sender.rs
Outdated
as u64; | ||
// guarantee that we don't exceed the remaining time and promise the delay we return is | ||
// at least 1ms. | ||
.min(self.remaining_time.saturating_sub(self.remaining_delays) as f64 + 1.0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need the +1 anymore since we decrease remaining_delays
before this
for _ in 0..TOTAL_DELAYS { | ||
assert!(delays.next().is_some()); | ||
} | ||
assert!(delays.next().is_none()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should probably check that the total wait is indeed 1, also a duration instead of a u64 would be much better because otherwise it's not clear what unit we are using
mixnet/client/src/sender.rs
Outdated
|
||
#[test] | ||
fn test_random_delay_iter_small_total_time() { | ||
let mut delays = RandomDelayIterator::new(rand::thread_rng(), TOTAL_DELAYS, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in RandomDelayIterator::new(rng, total_delays, total_time)
, total_time
should be a Duration
and not a u64
* Add `mixnode` and `mixnet-client` crate (#302) * Add `mixnode` binary (#317) * Integrate mixnet with libp2p network backend (#318) * Fix #312: proper delays (#321) * proper delays * add missing duration param * tiny fix: compilation error caused by `rand` 0.8 -> 0.7 * use `get_available_port()` for mixnet integration tests (#333) * add missing comments * Overwatch mixnet node (#339) * Add mixnet service and overwatch app * remove #[tokio::main] --------- Co-authored-by: Youngjoon Lee <taxihighway@gmail.com> * fix tests for the overwatch mixnode (#342) * fix panic when corner case happen in RandomDelayIter (#335) * Use `log` service for `mixnode` bin (#341) * Use `wire` for MixnetMessage in libp2p (#347) * Prevent tmixnet tests from running forever (#363) * Use random delay when sending msgs to mixnet (#362) * fix a minor compilation error caused by the latest master * Fix run output fd (#343) * add a connection pool * Exp backoff (#332) * move mixnet listening into separate task * add exponential retry for insufficient peers in libp2p * fix logging * Fix MutexGuard across await (#373) * Fix MutexGuard across await Holding a MutexGuard across an await point is not a good idea. Removing that solves the issues we had with the mixnet test * Make mixnode handle bodies coming from the same source concurrently (#372) --------- Co-authored-by: Youngjoon Lee <taxihighway@gmail.com> * Move wait at network startup (#338) We now wait after the call to 'subscribe' to give the network the time to register peers in the mesh before starting to publish messages * Remove unused functions from mixnet connpool (#374) * Mixnet benchmark (#375) * merge fixes * add `connection_pool_size` field to `config.yaml` * Simplify mixnet topology (#393) * Simplify bytes and duration range ser/de (#394) * optimize bytes serde and duration serde --------- Co-authored-by: Al Liu <scygliu1@gmail.com> Co-authored-by: Daniel Sanchez <sanchez.quiros.daniel@gmail.com> Co-authored-by: Giacomo Pasini <Zeegomo@users.noreply.github.com>
No description provided.