
Enhancement/performance optimizations - phase 2 #481

Merged

Conversation

@puzpuzpuz (Contributor) commented Jun 12, 2019

Includes the following:

  • Major performance optimization - automated pipelining. Includes related config settings.
  • Major performance optimization - socket reads optimization.
  • A number of micro-optimizations that get rid of unnecessary buffer allocation and copying, reduce the amount of garbage produced on the hot path, and improve performance and memory consumption in general.
  • Reference documentation section for automated pipelining.
  • Disables Nagle's algorithm for the TCP socket by default. Includes related config settings.
  • Additional unit and integration tests that cover the new behavior.

Automated pipelining

The idea is based on the write queue implemented in the DataStax Node.js Driver for Apache Cassandra (thanks @tkountis for finding this optimization and implementing the initial PoC). This optimization provides a significant throughput improvement (+25-30%) in read scenarios. However, throughput decreases in write scenarios (-13 to -23%). In real-world scenarios (mixed, with more reads than writes) the benefit should still be significant.

The main benefit of this approach, compared with dedicated pipeline/batch operation APIs, is that there is no explicit API for client users. Library users don't need to change their application logic (and thus their source code) to benefit from automated pipelining. Whenever operations are started within the same event loop phase, the PipelinedWriter object tries to send their payloads in a single batch.

The current flush threshold for PipelinedWriter is set to 8 KB (the same as in DataStax's write queue). I've tried higher values, but saw no difference, so I kept the same default.

There is one important pitfall related to PipelinedWriter's behavior. As it concatenates buffers for multiple operations and flushes them in a single socket.write() call, the underlying runtime (network stack/kernel/OS) may send the first parts of the payload over the network earlier than later parts. So, payloads of the first operations within the batch may be sent over the network before socket.write()'s callback is executed. As a result, the response for such an operation may arrive in socket.on('data') earlier than the promise resolve() calls for the last operations in the batch. In the case of the Hazelcast Node.js client library this behavior is acceptable: an operation's write promise (the one passed into the PipelinedWriter#write method) is used only to chain an error handler (see InvocationService#invoke*).
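The batching idea described above can be sketched as follows. This is a simplified illustration, not the actual PipelinedWriter: the TinyPipelinedWriter name, the sink interface, and the eager flush once the threshold is crossed are all assumptions.

```javascript
// Writes started in the same event loop phase are queued and flushed
// as a single concatenated buffer on the next tick; once the queued
// bytes cross the threshold, the batch is flushed eagerly.
const THRESHOLD = 8 * 1024; // 8 KB, as in the PR description

class TinyPipelinedWriter {
  constructor(sink) {
    this.sink = sink;        // anything with a write(buffer) method, e.g. a socket
    this.queue = [];
    this.queuedBytes = 0;
    this.scheduled = false;
  }

  write(buffer) {
    this.queue.push(buffer);
    this.queuedBytes += buffer.length;
    if (this.queuedBytes >= THRESHOLD) {
      this.flush();                          // flush eagerly past the threshold
    } else if (!this.scheduled) {
      this.scheduled = true;
      process.nextTick(() => this.flush()); // coalesce same-phase writes
    }
  }

  flush() {
    if (this.queue.length === 0) return;
    const batch = Buffer.concat(this.queue); // one concat per batch
    this.queue = [];
    this.queuedBytes = 0;
    this.scheduled = false;
    this.sink.write(batch);                  // single socket.write() per batch
  }
}
```

Usage: two writes issued in the same phase end up in one sink.write() call, which is exactly the effect that reduces syscall overhead in read-heavy workloads.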

Socket reads optimization

This one removes buffer allocation and copying where possible in socket.on('data') event handling, including the case when the payload is received within 2+ chunks (note: by default Node uses a 64 KB TCP read chunk size). In that case, the FrameReader object caches chunks in an internal array and concatenates them only once enough data has been received.
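The chunk-caching approach can be sketched like this. TinyFrameReader and the 4-byte little-endian length prefix (covering the whole frame, header included) are illustrative assumptions, not the client's actual FrameReader or wire protocol.

```javascript
// Chunks from socket.on('data') are kept in an array; Buffer.concat
// is deferred until a complete frame is known to be available.
class TinyFrameReader {
  constructor() {
    this.chunks = [];
    this.bytes = 0;
  }

  // Feed one chunk; returns an array of complete frames (possibly empty).
  append(chunk) {
    this.chunks.push(chunk);
    this.bytes += chunk.length;
    const frames = [];
    for (;;) {
      if (this.bytes < 4) break;              // length prefix not here yet
      let head = this.chunks[0];
      if (head.length < 4) {                  // prefix spans chunks: concat once
        head = Buffer.concat(this.chunks);
        this.chunks = [head];
      }
      const frameLen = head.readUInt32LE(0);  // total frame length, prefix included
      if (this.bytes < frameLen) break;       // frame incomplete: keep caching
      const all = this.chunks.length === 1 ? this.chunks[0] : Buffer.concat(this.chunks);
      frames.push(all.subarray(0, frameLen));
      const rest = all.subarray(frameLen);    // leftover bytes of the next frame
      this.chunks = rest.length > 0 ? [rest] : [];
      this.bytes = rest.length;
    }
    return frames;
  }
}
```

The key property is that a frame split across N chunks triggers at most one Buffer.concat, instead of a reallocation-and-copy on every 'data' event.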

Benchmarks

You may see benchmark results for one of the intermediate commits within this PR here. I'm going to perform measurements for the latest commit later.

Further Optimizations

A couple of PRDs were created as a result of the work on this PR:

@puzpuzpuz puzpuzpuz force-pushed the enhancement/output-queue-optimization branch from 591534b to 4754e9f Compare June 15, 2019 06:05
@puzpuzpuz puzpuzpuz force-pushed the enhancement/output-queue-optimization branch from 7d4f3d6 to 0c2fd65 Compare June 16, 2019 18:28
@puzpuzpuz puzpuzpuz force-pushed the enhancement/output-queue-optimization branch from 11b6c09 to eda965e Compare June 17, 2019 16:54
@puzpuzpuz puzpuzpuz force-pushed the enhancement/output-queue-optimization branch from 6582e92 to 4ea7f6c Compare June 19, 2019 07:12
@puzpuzpuz (Contributor, Author) commented: verify

1 similar comment

@puzpuzpuz (Contributor, Author) commented: verify

@puzpuzpuz puzpuzpuz changed the title [WIP] Enhancement/performance optimizations - phase 2 Enhancement/performance optimizations - phase 2 Jun 19, 2019
@puzpuzpuz puzpuzpuz force-pushed the enhancement/output-queue-optimization branch from 4ea7f6c to accb6b3 Compare June 19, 2019 15:24
@tkountis left a comment

LGTM
Good work!

@mdumandag (Contributor) left a comment

Added a few minor comments. Looks good

Review threads:

  • src/invocation/ClientConnection.ts (outdated, resolved)
  • src/invocation/ClientConnection.ts (resolved)
  • test/AutoPipeliningDisabledTest.js (outdated, resolved)
  • test/AutoPipeliningDisabledTest.js (outdated, resolved)
@mdumandag (Contributor) commented: verify

@tkountis tkountis merged commit afc1a72 into hazelcast:master Jul 18, 2019
@burakcelebi burakcelebi added this to the 3.12.1 milestone Jul 18, 2019
harunalpak pushed a commit to harunalpak/hazelcast-nodejs-client that referenced this pull request Dec 8, 2022
Implementation of output queue for socket writes