Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inefficient networking #905

Closed
5 tasks
paolobarbolini opened this issue Mar 31, 2023 · 6 comments · Fixed by #1060
Closed
5 tasks

Inefficient networking #905

paolobarbolini opened this issue Mar 31, 2023 · 6 comments · Fixed by #1060
Labels
enhancement Enhancement to existing functionality

Comments

@paolobarbolini
Copy link
Contributor

paolobarbolini commented Mar 31, 2023

  • handle_flush is called in far too many places. In the current state of things I don't think there's much request pipelining going on.
  • A flush is also done when there's nothing to flush. Also the flush Interval causes a lot of useless runtime wakeups.
  • There's double buffering happening when the connection uses TLS because this BufWriter is kept in the loop despite rustls acting sort of like a BufWriter by itself
  • BufWriter might not be necessary at all if writes coming from the same command were done in a single call using write_vectored
  • There's no way to opt out of request pipelining/buffering. The best you can do is keep a low flush interval, which is inefficient.

cc @dodomorandi

@paolobarbolini paolobarbolini added the bug Confirmed reproducible bug label Mar 31, 2023
@Jarema
Copy link
Member

Jarema commented Mar 31, 2023

Hey Paolo!

First of all, thanks for that report!
We are aware of BufWriter-related inefficiencies and changes around it are in progress.

That said, there are a few important points:

  1. The biggest benefit of buffers lies in publish and context.publish. Both those methods can send very small messages (few bytes) per each Message/Command.
    Because of that, I think write_vectored is not a good idea, but at the same time, simplicity of BufWriter is not enough for NATS specific needs and it was used as first iteration that yielded good enough results.

  2. Calling handle_flush - that is again part of the work in progress, but the same point is true - we don't do that for publishes, and on the other hand, requests, having pseudo-synchronous nature are expected to return response as soon as possible, without adding delay caused by waiting in buffer for low traffic clients. That's why each request flushes the buffer.
    Not sure what you mean by it being called in far to many places.
    What we might end up doing is having a separate request method, that sends publish and deferrs awaiting response to IntoFuture struct, similar to jetstream publish. In that mode, we could also add no flush method, but not sure how useful that would be.

  3. TLS double buffer - also part of the work in progress, though requires more benchmarks and getting into the detauls of rustls buffer.

  4. Request under hood is publish/response. Opting out of buffering is not obvious.

To conclude - the rework should get rid of most of the inefficiencies, though some answers are not straightforward.
Will keep you posted about all above as it happens.

cc @caspervonb

@Jarema Jarema added enhancement Enhancement to existing functionality and removed bug Confirmed reproducible bug labels Mar 31, 2023
@caspervonb
Copy link
Collaborator

On a related tangent, reduced ping noise going over the pipeline #931

@caspervonb
Copy link
Collaborator

caspervonb commented Jul 27, 2023

There's no way to opt out of request pipelining/buffering. The best you can do is keep a low flush interval, which is inefficient.

Configurable write buffer addresses this #971, as in a 0 byte buffer would effectively disable buffering. Wasn't the main intent but free side effect.

@caspervonb
Copy link
Collaborator

There's double buffering happening when the connection uses TLS because this BufWriter is kept in the loop despite rustls acting sort of like a BufWriter by itself

Looked into this a while back and found it non-actionable, TLS passes through one handshake is established no?

@caspervonb
Copy link
Collaborator

BufWriter might not be necessary at all if writes coming from the same command were done in a single call using write_vectored

We might be able to avoid some alloc by using writev, but the main intent is to reduce syscalls. Larger buffer generally means greater throughput (depending on the tuning of send/recv buffers via sysctl).

@paolobarbolini
Copy link
Contributor Author

I'll try sending a PR in the following days to improve the flush situation, so that this can be looked into later. We can't live entirely without flushing because when putting rustls in the middle of the stream there will always be some buffering and flushing i the only way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement to existing functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants