Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce latency in the network stack #277

Merged

Conversation

gbryant-arm
Copy link
Contributor

@gbryant-arm gbryant-arm commented Nov 26, 2021

Reduce latency in the network stack, specifically on the runtime manager and client sides.

  • Prefix every protocol buffer (runtime manager & proxy attestation messages) with their length to allow the receiver to know when a protocol buffer is complete. This avoids the need to try to deserialize the protocol buffer until it succeeds, which is more and more time consuming as the incoming buffer grows (it takes about 100ms to deserialize a 30MB protocol buffer)
  • Turn 'TCP no delay' on to reduce latency on the Linux backend, the only one for which the server and the runtime manager communicate over TCP. This is effective when small packets are regularly sent, which is the case here. This measure decreases latency from 40ms down to <1ms

Note that a high latency between the client and server will still have a big impact on throughput, as 16KB protobuf fragments –split by TLS – are still sent between the client and runtime manager one by one.
The obvious solution is to remove HTTP on the client/server side and replace it with a plain TCP repeater.
Other ideas:

  • The client could encapsulate groups of fragments in each HTTP request to the server, which might require to configure the TLS server to handle groups of TLS records. This would have the downside of increasing memory usage on the server.
  • Remove base64 (de)serializing, which could shave off a few hundreds of microseconds here and there

All these suggestions are however out of the scope of this PR.

TODO:

  • Benchmark "no length prefix" vs "length prefix" to validate changes
  • Avoid unnecessary conversions between arrays, vectors and mutable vectors

@gbryant-arm gbryant-arm force-pushed the optimise-network-stack branch 2 times, most recently from ac454bf to b7bcc69 Compare November 27, 2021 11:12
@dominic-mulligan-arm dominic-mulligan-arm added always-refactoring A refactoring/code quality change to the Veracruz codebase enhancement New feature or request server Something related to the untrusted Veracruz servers trusted-veracruz-runtime Something related to the trusted Veracruz runtime labels Nov 29, 2021
Copy link
Member

@dominic-mulligan-arm dominic-mulligan-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this mostly looks good, just a few small comments about some things that look suspicious. To confirm, you've run rustfmt on the changes using the Git hook, right?

runtime-manager/src/managers/execution_engine_manager.rs Outdated Show resolved Hide resolved
runtime-manager/src/managers/execution_engine_manager.rs Outdated Show resolved Hide resolved
@gbryant-arm
Copy link
Contributor Author

I haven't run the code through rustfmt yet, will do

@dominic-mulligan-arm
Copy link
Member

The TZ failure seems spurious, but the Linux and IceCap failures seem to be a bug in the changeset: before calling .split_at(8) we probably need to check that the length of the incoming bytes is at least 8 elements wrong. Though, is this now an invariant that every one of these messages is preceded by its encoded length?

@dominic-mulligan-arm
Copy link
Member

(In light of that, I'm going to unapproved this until the issue is fixed.)

@dominic-mulligan-arm dominic-mulligan-arm dismissed their stale review December 1, 2021 10:22

Unapproved due to bug in code coming to light during CI runs.

@gbryant-arm
Copy link
Contributor Author

I noticed the tests failing.
There should be a way to enforce adding the length prefix to every protocol buffer, or at least every RuntimeManagerRequest message.

@dominic-mulligan-arm
Copy link
Member

@gbryant-arm what's happening with this PR?

@gbryant-arm
Copy link
Contributor Author

It's been successfully integrated to the experimental benchmarking branch but it's on hold until I find time to spend on code quality

@dominic-mulligan-arm
Copy link
Member

What's the status of this PR, @gbryant-arm? Are you going to come back to it, or is it now defunct?

@gbryant-arm
Copy link
Contributor Author

What's the status of this PR, @gbryant-arm? Are you going to come back to it, or is it now defunct?

This PR is still active. It needs some refactoring though.
I'll get back to it this week

@dominic-mulligan-arm
Copy link
Member

dominic-mulligan-arm commented Mar 8, 2022

Converting this to a draft until it's in a fit state for review...

@dominic-mulligan-arm dominic-mulligan-arm marked this pull request as draft March 8, 2022 11:46
@dominic-mulligan-arm dominic-mulligan-arm changed the title Reduce latency in the network stack WIP: reduce latency in the network stack Mar 8, 2022
…erRequest` until fully received by the runtime manager

The `RuntimeManagerRequest` protocol buffers are prefixed by their length so that the runtime manager can tell when the data transfer is complete.
This dramatically increases provisioning throughput, as the runtime manager would otherwise try to deserialize the incoming buffer for every new chunk received, taking longer and longer as the buffer grows.
…nux backend only)

This reduces latency by sending packets immediately instead of buffering them
Extend prefixing to runtime manager and proxy attestation server messages
Change API by introducing a session id to be passed to the parsing functions to
distinguish messages being received concurrently
Add documentation
@gbryant-arm gbryant-arm marked this pull request as ready for review March 21, 2022 18:54
@gbryant-arm gbryant-arm changed the title WIP: reduce latency in the network stack Reduce latency in the network stack Mar 21, 2022
transport-protocol/src/custom.rs Show resolved Hide resolved
transport-protocol/src/custom.rs Show resolved Hide resolved
transport-protocol/src/custom.rs Show resolved Hide resolved
transport-protocol/src/custom.rs Show resolved Hide resolved
transport-protocol/src/custom.rs Show resolved Hide resolved
veracruz-server/src/veracruz_server_nitro.rs Show resolved Hide resolved
@gbryant-arm
Copy link
Contributor Author

The benchmarks validated the changes. File transfer is somehow ~ 2s faster on vc-linux

@dominic-mulligan-arm
Copy link
Member

+1 reached, merging.

@dominic-mulligan-arm dominic-mulligan-arm merged commit 59121f0 into veracruz-project:main Mar 24, 2022
@gbryant-arm gbryant-arm mentioned this pull request Apr 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
always-refactoring A refactoring/code quality change to the Veracruz codebase enhancement New feature or request server Something related to the untrusted Veracruz servers trusted-veracruz-runtime Something related to the trusted Veracruz runtime
Projects
No open projects
Technical tasks
Awaiting triage
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants