Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make randomness acquisition fallible (fixes #441) #443

Merged
merged 2 commits into from Mar 11, 2021

Conversation

djc
Copy link
Member

@djc djc commented Jan 23, 2021

No description provided.

Copy link
Contributor

@jsha jsha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for trying this out! What do you think about doing acquisition of session randoms higher in the stack, e.g. in ServerSession::new(), and then passing them down the stack as constructor parameters? This would still result in changes to a lot of method signatures, but it would allow a lot of constructors that currently can't return Error to stay that way. It would also nicely express the semantic of "making this thing is infallible, so long as you can feed it some random bytes."

That would still leave fallibility in the ticket generation case, though, which I don't think we can do much about (unless we are willing to generate extra randomness just in case we later have to make a ticket).

rustls/src/rand.rs Outdated Show resolved Hide resolved
@briansmith
Copy link
Contributor

This looks pretty good to me. I'm not sure why you don't like it @djc. I was expecting it to end up looking worse, and also there's room for improvement. I'll comment more in the issue.

@djc
Copy link
Member Author

djc commented Jan 25, 2021

What do you think about doing acquisition of session randoms higher in the stack, e.g. in ServerSession::new(), and then passing them down the stack as constructor parameters? This would still result in changes to a lot of method signatures, but it would allow a lot of constructors that currently can't return Error to stay that way. It would also nicely express the semantic of "making this thing is infallible, so long as you can feed it some random bytes."

Yup, will definitely do that on the next pass, I think it would help a lot.

Can we have a separate GetRandomFailed error type? This can avoid size of return value too large.

Will also include this.

@briansmith
Copy link
Contributor

briansmith commented Jan 28, 2021

Thanks for trying this out! What do you think about doing acquisition of session randoms higher in the stack, e.g. in ServerSession::new(), and then passing them down the stack as constructor parameters?

Ideally the application is able to do something like this for TCP-based TLS:

  • Syscall 1: Send SYN (client) or SYN/ACK (server)
  • Syscall 2: Acquire the random bytes it needs while waiting for SYN/ACK (client) or ClientHello (server).
  • Syscall 3: Send ACK + ClientHello (client) or ServerHello (server).

In particular, we don't want our first segment of the TCP handshake to be blocked on sending the random bytes.

FWIW, I'm planning to extend the ring API with a way to coalesce the random generation of the ECDHE private key with the generation of the hello randoms in one (system) call to the PRNG.

Note that Rustls also has this line:

let ours = ring::agreement::EphemeralPrivateKey::generate(alg, &rng).unwrap();

That unwrap() should be replaced too, as the failure would be due to PRNG failure.

So, as far as rearranging code goes, I would try to rearrange things so the hello random and the ECDHE private key are generated close to each other.

@djc djc force-pushed the fallible-random branch 2 times, most recently from d092a43 to 25e6d03 Compare January 28, 2021 21:31
rustls/src/quic.rs Outdated Show resolved Hide resolved
rustls/src/quic.rs Outdated Show resolved Hide resolved
rustls/src/server/mod.rs Outdated Show resolved Hide resolved
rustls/src/server/mod.rs Outdated Show resolved Hide resolved
@ctz
Copy link
Member

ctz commented Jan 30, 2021

One use of unwrap() in the codebase here is to avoid untestable error handling code -- which runs for the very first time in the field in hard-to-reproduce conditions. To put it another way -- how can we arrange to test that the error handling for any one of these fallible calls works properly?

@briansmith
Copy link
Contributor

One use of unwrap() in the codebase here is to avoid untestable error handling code -- which runs for the very first time in the field in hard-to-reproduce conditions. To put it another way -- how can we arrange to test that the error handling for any one of these fallible calls works properly?

AFAICT, there is no new code path introduced with this PR. The only difference is that instead of callers catching the error as a panic using std::panic::catch_unwind, they would catch it using normal Result handling. Within Rustls, we're only bubbling up the error, never trying to handle it.

On Linux, we could write tests that intercept syscalls and force the failure conditions using ptrace. BoringSSL's test suite has that. I can look into exposing that via ring for ring and other ring-based tests, including Rustls, to use. However, IMO that doesn't need to block this improvement.

@djc djc force-pushed the fallible-random branch 3 times, most recently from b505eb9 to 18c671f Compare February 18, 2021 13:58
@codecov-io
Copy link

Codecov Report

Merging #443 (25e6d03) into main (2e4b652) will decrease coverage by 0.09%.
The diff coverage is 91.39%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #443      +/-   ##
==========================================
- Coverage   96.83%   96.73%   -0.10%     
==========================================
  Files          50       50              
  Lines        8979     9133     +154     
==========================================
+ Hits         8695     8835     +140     
- Misses        284      298      +14     
Impacted Files Coverage Δ
rustls/src/error.rs 94.66% <0.00%> (-5.34%) ⬇️
rustls/src/server/mod.rs 92.41% <87.50%> (-0.58%) ⬇️
rustls/src/rand.rs 92.85% <88.88%> (-7.15%) ⬇️
rustls/src/server/tls13.rs 94.70% <90.00%> (+0.09%) ⬆️
rustls/src/ticketer.rs 90.32% <94.11%> (-1.08%) ⬇️
rustls/src/client/common.rs 100.00% <100.00%> (ø)
rustls/src/client/hs.rs 96.57% <100.00%> (-0.52%) ⬇️
rustls/src/client/mod.rs 91.61% <100.00%> (+0.09%) ⬆️
rustls/src/quic.rs 90.73% <100.00%> (+0.37%) ⬆️
rustls/src/server/common.rs 100.00% <100.00%> (ø)
... and 27 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2e4b652...18c671f. Read the comment docs.

@djc
Copy link
Member Author

djc commented Feb 27, 2021

@ctz FWIW, Brian's point here seems true to me:

AFAICT, there is no new code path introduced with this PR. The only difference is that instead of callers catching the error as a panic using std::panic::catch_unwind, they would catch it using normal Result handling. Within Rustls, we're only bubbling up the error, never trying to handle it.

The only thing happening here is propagating failures explicitly in the type system. For careful API users, all this does is make potential failures more explicit.

(I'd like some sort of pronouncement: should I continue maintaining this so you can at some point review/merge it, or is this something you definitely don't want to do?)

@ctz
Copy link
Member

ctz commented Mar 10, 2021

(I'd like some sort of pronouncement: should I continue maintaining this so you can at some point review/merge it, or is this something you definitely don't want to do?)

Sorry for dropping this. Thanks for this work -- I would indeed like to get this merged (and, really in preference to other PRs that will make it harder to merge this). Could you rebase?

@ctz ctz mentioned this pull request Mar 10, 2021
@djc
Copy link
Member Author

djc commented Mar 10, 2021

Rebased.

@ctz ctz merged commit 3d8f6c1 into rustls:main Mar 11, 2021
@ctz
Copy link
Member

ctz commented Mar 11, 2021

Hm, this seems to have regressed some bogo tests, for example TLS13-HelloRetryRequest-Client-TLS-Sync-ImplicitHandshake: see https://github.com/ctz/rustls/runs/2089338754?check_suite_focus=true#step:8:1675

I think this is due to emit_client_hello_for_retry now generating its own randoms every time, which means HelloRetryRequest gets a different one to the original ClientHello.

@djc
Copy link
Member Author

djc commented Mar 11, 2021

Ah, that's bad. Will take a look at fixing it.

@briansmith
Copy link
Contributor

PR #533 was intended to help ensure that wouldn't happen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants