Skip to content

Make ScmTimestamp's doc test more robust #1005

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 7, 2019

Conversation

asomers
Copy link
Member

@asomers asomers commented Jan 6, 2019

The old test made assumptions about the responsiveness of the test
environment. The new test does not, and it's simpler too.

@asomers
Copy link
Member Author

asomers commented Jan 6, 2019

cc @pusateri @Wolvereness

@Wolvereness
Copy link
Contributor

You've missed one of the fundamental assumptions for whether or not this API is functional in the target environment. The point of this particular method of time-stamping is that the OS is assigning it, and it's independent of when the caller finally reads the packets.

If the test environment isn't "responsive", then the API probably doesn't work, as it's fundamentally supposed to be a method of time-stamping in that particular environment.

The previous test sends two packets, 250ms apart, and assumes the maximum system discrepancy is less than 5ms. If this treshold is problematic, then I suggest simply increasing it, or if a target system is giving responses with anywhere near the maximum 250ms discrepancy, consider the API to not be functional on that target.

@asomers
Copy link
Member Author

asomers commented Jan 6, 2019

What the old test did is it sent two packets separated by a sleep. It basically asserted that the sleep plus the 2nd sendmsg call took no more than 255ms. But there's nothing to ensure that. On a non-real-time OS, a process can be delayed at any time for any reason. If you look at the man page for nanosleep(2), you'll notice that it says "The suspension time may be longer than requested". The test failure doesn't indicate that SO_TIMESTAMP is non-functional on the target; it indicates that the target is busy.

What changed recently is that FreeBSD's CI moved from BuildBot to Cirrus-CI. Perhaps the Cirrus VMs are oversubscribed or something?

The new test is insensitive to arbitrary delays. All it does is ensure that the packet's received timestamp lies between two bounds. That assumes that the clock is monotonic, which isn't 100% true because SO_TIMESTAMP uses the real-time clock instead of the monotonic clock[1]. But I don't see a way around that.

Really, all that the test needs to do is check that some kind of time, any kind of time, is returned. Even if it's off by a year, then Nix is still probably binding the API correctly.

The test failure is because I used a function that wasn't introduced until Rust 1.27.0. I'll fix it.

[1] On FreeBSD 12 this is selectable with the SO_TS_CLOCK sockopt. But that isn't available on other platforms.

@Wolvereness
Copy link
Contributor

I don't think tests should ever expect to handle non-monotonic time. The test is also not expecting the sleep to be 250ms; it's actually timing the delay itself, and then seeing if the delay between packets is within 5ms of what was measured. So, if the system time changed, it should be accounted for unless it was between the packet send and the timestamp get.

Are you getting the output for the tests? How off is it?

@asomers
Copy link
Member Author

asomers commented Jan 6, 2019

Well, the old test was measuring a delay. But it still didn't account for the time taken up by the sendmsg calls themselves. They aren't instantaneous. Nor does the received timestamp necessarily get recorded during the sendmsg call. That's an implementation detail. It could be recorded when the data is copied from the sending socket's buffer to the receiving socket's buffer, and that could happen at any point before recvmsg returns. The discrepancy, in one failing case, was 13 ms.

@Wolvereness
Copy link
Contributor

I suggest just changing it from 5ms to 50ms then, and one extra check for the initial sleep to be at least 200ms.

@asomers
Copy link
Member Author

asomers commented Jan 6, 2019

Increasing the time limit wouldn't remove the test's sensitivity to the environment's responsiveness. It would just loosen it a bit. The new test should work even on the most unresponsive systems, even if the process gets paused for an hour. It only requires that the clock be monotonic.

The old test made assumptions about the responsiveness of the test
environment.  The new test does not, and it's simpler too.
@asomers asomers force-pushed the scm_timestamp_test branch from ac2ac4c to eba2fc2 Compare January 6, 2019 21:11
@asomers
Copy link
Member Author

asomers commented Jan 7, 2019

bors r+

bors bot added a commit that referenced this pull request Jan 7, 2019
1005: Make ScmTimestamp's doc test more robust r=asomers a=asomers

The old test made assumptions about the responsiveness of the test
environment.  The new test does not, and it's simpler too.

Co-authored-by: Alan Somers <asomers@gmail.com>
@bors
Copy link
Contributor

bors bot commented Jan 7, 2019

Build succeeded

@bors bors bot merged commit eba2fc2 into nix-rust:master Jan 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants