Conversation

@jmpesp (Contributor) commented Nov 9, 2023

Same sort of free real estate as b960fe8! Do not decrypt while holding the downstairs lock.

Also, filter out invalid block contexts when validating read responses. This is relied upon later (when comparing different downstairs' read responses) to compare `job.read_response_hashes` against what was just returned: `job.read_response_hashes` is `Vec<Option<u64>>`, not `Vec<Vec<u64>>`, so zero or one valid block context is turned into an `Option`. Importantly, the result is not `collect`ed (except when panicking!), saving an allocation.

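A minimal sketch of the shape described above, with hypothetical types and a stand-in hash (not the actual Crucible code): each block's contexts are filtered down to the valid ones, and at most one survives as an `Option<u64>`, with nothing collected unless we are about to panic anyway.

```rust
// Hypothetical, simplified types -- the real Crucible structures differ.
struct BlockContext {
    hash: u64,
}

// Stand-in integrity hash; the real code uses a proper keyed hash.
fn compute_hash(data: &[u8]) -> u64 {
    data.iter()
        .fold(0u64, |acc, b| acc.wrapping_mul(31).wrapping_add(*b as u64))
}

// Keep only the block contexts whose hash matches the data. A block is
// expected to have zero or one valid context, so the result is an
// Option<u64>, not a Vec.
fn validate_block(data: &[u8], contexts: &[BlockContext]) -> Option<u64> {
    let computed = compute_hash(data);
    let mut valid = contexts.iter().filter(|ctx| ctx.hash == computed);

    let first = valid.next().map(|ctx| ctx.hash);

    if valid.next().is_some() {
        // Only in the failure case do we pay for an allocation, to report
        // everything we saw before panicking.
        let all: Vec<u64> = contexts.iter().map(|ctx| ctx.hash).collect();
        panic!("multiple valid block contexts: {all:?}");
    }

    first
}
```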
@jmpesp (Contributor, Author) commented Nov 9, 2023

Here are some f3 test numbers (`f3write -e 5 tmp && f3read tmp`):

main (@ 3927d8d):

Average writing speed: 134.61 MB/s
Average reading speed: 52.89 MB/s

this branch:

Average writing speed: 155.65 MB/s
Average reading speed: 71.63 MB/s

Using the same fio config as #1019 (but with a 32 GiB disk, 512b blocks, and Nexus' extent size):

main (@ 3927d8d):

Run status group 0 (all jobs):
  WRITE: bw=19.7MiB/s (20.7MB/s), 19.7MiB/s-19.7MiB/s (20.7MB/s-20.7MB/s), io=1185MiB (1242MB), run=60025-60025msec

Run status group 1 (all jobs):
  WRITE: bw=140MiB/s (146MB/s), 140MiB/s-140MiB/s (146MB/s-146MB/s), io=8424MiB (8833MB), run=60335-60335msec

Run status group 2 (all jobs):
  WRITE: bw=151MiB/s (158MB/s), 151MiB/s-151MiB/s (158MB/s-158MB/s), io=9172MiB (9618MB), run=60706-60706msec

this branch:

Run status group 0 (all jobs):
  WRITE: bw=19.0MiB/s (19.9MB/s), 19.0MiB/s-19.0MiB/s (19.9MB/s-19.9MB/s), io=1141MiB (1196MB), run=60019-60019msec

Run status group 1 (all jobs):
  WRITE: bw=137MiB/s (144MB/s), 137MiB/s-137MiB/s (144MB/s-144MB/s), io=8279MiB (8681MB), run=60257-60257msec

Run status group 2 (all jobs):
  WRITE: bw=150MiB/s (157MB/s), 150MiB/s-150MiB/s (157MB/s-157MB/s), io=9044MiB (9483MB), run=60397-60397msec

);
if job.read_response_hashes != read_response_hashes {

let mismatch = std::iter::zip(
Contributor:

This code duplicates stuff in the `ack_status == AckStatus::Acked` condition above. Perhaps it should be a function on job, e.g. `DownstairsIO::check_read_response_hashes(&self, read_data: &[ReadResponse])`?
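A rough sketch of what such a helper might look like, using simplified stand-in types and returning the mismatching indices rather than the real `ReadResponse` plumbing (the actual shape is whatever landed in 6eb6d22):

```rust
// Hypothetical, simplified stand-in for the real struct.
struct DownstairsIO {
    read_response_hashes: Vec<Option<u64>>,
}

impl DownstairsIO {
    /// Compare the hashes recorded for this job against the hashes computed
    /// from a freshly validated read, returning the indices that disagree.
    fn check_read_response_hashes(&self, hashes: &[Option<u64>]) -> Vec<usize> {
        std::iter::zip(self.read_response_hashes.iter(), hashes.iter())
            .enumerate()
            .filter_map(|(i, (expected, actual))| {
                if expected != actual {
                    Some(i)
                } else {
                    None
                }
            })
            .collect()
    }
}
```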

Contributor Author:

I like it, also added in 6eb6d22

Contributor:

I was thinking about also moving the rest of the stuff (including the logging and panic) into that function, but don't feel strongly about it.

@leftwo (Contributor) left a comment:

A high-level question: if we are doing the decryption outside the lock, do we have to worry about two reads coming back at close to the same time and then racing for who is first? My question is based on my laziness in not figuring it out for myself.

}

#[tokio::test]
#[should_panic]
Contributor:

If we should panic here, then won't the `unwrap` on `encrypt_in_place` give us a false negative?

Contributor Author:

You're right, see 707f4d9 where I rolled this back

}

#[tokio::test]
#[should_panic]
Contributor:

Same should_panic concern here: if there is any `unwrap()` before the expected panic, I think this test will pass for the wrong reasons.
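A generic illustration of this pitfall (not Crucible code): a bare `#[should_panic]` is satisfied by any panic, including one from setup, whereas pinning the expected message ties the pass to the panic we actually care about.

```rust
#[cfg(test)]
mod should_panic_pitfall {
    // A bare #[should_panic] passes no matter where the panic comes from,
    // so an early unwrap() can hide the fact that the code under test never
    // reached the failure we meant to exercise.
    #[test]
    #[should_panic]
    fn passes_for_the_wrong_reason() {
        let setup: Result<(), &str> = Err("setup failed");
        setup.unwrap(); // panics here...
        panic!("hash mismatch"); // ...so this is never reached, yet the test passes
    }

    // With an expected message, the wrong-reason case above would fail.
    #[test]
    #[should_panic(expected = "hash mismatch")]
    fn panics_for_the_right_reason() {
        panic!("hash mismatch");
    }
}
```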

Contributor Author:

Same, see 707f4d9

@jmpesp (Contributor, Author) commented Nov 10, 2023

> A high-level question: if we are doing the decryption outside the lock, do we have to worry about two reads coming back at close to the same time and then racing for who is first?

I don't think so. The decryption context doesn't hold any state that feeds forward from one decryption to the next (unlike something stateful like TLS). The Downstairs struct is still locked eventually, which serializes the modifications. If we had a race problem before this commit, that would be an existing bug that I believe we'd have seen already.
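A schematic of the pattern in question, with hypothetical names and a stand-in cipher (not the actual code path): the per-block decryption is stateless and runs on data the task owns, and the shared state is only locked afterwards to record the result, so the lock still serializes the state changes.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for the real shared state.
struct Downstairs {
    completed_reads: Vec<Vec<u8>>,
}

// Stand-in for the real AEAD decryption; it carries no state from one
// block to the next, so concurrent responses cannot interfere.
fn decrypt(mut block: Vec<u8>) -> Vec<u8> {
    for b in &mut block {
        *b ^= 0xff;
    }
    block
}

fn on_read_response(ds: &Arc<Mutex<Downstairs>>, encrypted: Vec<u8>) {
    // Do the expensive work without holding the lock...
    let decrypted = decrypt(encrypted);

    // ...then take the lock only to apply the (serialized) state change.
    let mut guard = ds.lock().unwrap();
    guard.completed_reads.push(decrypted);
}
```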

}

if !successful_hash {
if let Some(valid_hash) = valid_hash {
Contributor:

Since you are here already, could we correct the comment up at line 4111, or at least mention that
it is only for the SQL backend? Thanks!

Contributor Author:

Rgr, done in 5d46ff4

elide return statement
@jmpesp jmpesp merged commit 82fbf1e into oxidecomputer:main Nov 14, 2023
@jmpesp jmpesp deleted the do_not_decrypt_holding_locks branch November 14, 2023 16:57