Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Endpoint performance and memory pressure optimization #19

Merged

Conversation

manuelbernhardt
Copy link
Contributor

@manuelbernhardt manuelbernhardt commented Mar 30, 2020

This is a bit of a controversial change from the view point of the API, yet makes a lot of sense from the view point of performance and memory utilization for very large clusters.

The issue here is that the hostname of an Endpoint is modelled as a protobuf string type. This type carries with it the overhead of encoding to or decoding from UTF-8 every time a message is sent or received (and the field accessed).

From the view point of the algorithms in place, there's no added value in having the endpoint host data be encoded as byte array or utf-8 encoded string. It is just data, what matters is that the ordering of the endpoints can be established. Having a string only matters at the interfaces: when configuring a hostname, when sending a message to one and when printing log statements (most of which at DEBUG/TRACE level).

Yet at the moment, when adding a new endpoint to the membership ring(s), the following code runs:

  public java.lang.String getHostname() {
    java.lang.Object ref = hostname_;
    if (ref instanceof java.lang.String) {
      return (java.lang.String) ref;
    } else {
      com.google.protobuf.ByteString bs =
          (com.google.protobuf.ByteString) ref;
      java.lang.String s = bs.toStringUtf8();
      hostname_ = s;
      return s;
    }
  }

For freshly received messages containing Endpoints, this means running toStringUtf8(), which when there are many is quite expensive in terms of CPU and memory usage.

This PR does the following:

  • use bytes rather than string to encode the hostname in protobuf
  • adjust all interfaces - the Cluster APIs are actually (almost) not affected since they use the HostAndPort construct
  • use the existing underlying / existing byte array when computing the hashcode of an Endpoint in Utils.AddressComparator
  • getting rid of the mapping between Map<String, Metadata> and Map<Endpoint, Metadata> by representing the map as two lists in protobuf (keys and values)

On local tests with 1000 concurrent nodes joining, there's a 10% improvement in memory allocation and a 20% improvement in CPU usage of the stack starting at the TreeSet.add method (39% vs 58%).

Memory allocation before:

allocation-profile-before

Memory allocation after:

allocation-profile-after

CPU before:

cpu-profile-before

CPU after:

cpu-profile-after

This is a bit of a controversial change from the view point of the API, yet makes a lot of sense from the view point of performance and memory utilization for very large clusters.

The issue here is that the hostname of an Endpoint is modelled as a protobuf "string" type. This type carries with it the overhead of encoding to or decoding from UTF-8 every time a message is sent or received (and the field accessed).

From the view point of the algorithms in place, there's no added value in having the endpoint host data be encoded as byte array or utf-8 encoded string. It is just data, what matters is that the ordering of the endpoints can be established. Having a string only matters at the interfaces: when configuring a hostname, when sending a message to one and when printing log statements (most of which at DEBUG/TRACE level).

Yet at the moment, when adding a new endpoint to the membership ring(s), the following code runs:

```
  public java.lang.String getHostname() {
    java.lang.Object ref = hostname_;
    if (ref instanceof java.lang.String) {
      return (java.lang.String) ref;
    } else {
      com.google.protobuf.ByteString bs =
          (com.google.protobuf.ByteString) ref;
      java.lang.String s = bs.toStringUtf8();
      hostname_ = s;
      return s;
    }
  }

```

For freshly received messages containing Endpoints, this means running `toStringUtf8()`, which when there are many is quite expensive in terms of CPU and memory usage.

This PR does the following:

- use `bytes` rather than `string` to encode the hostname in protobuf
- adjust all interfaces - the Cluster APIs are actually (almost) not affected since they use the `HostAndPort` construct
- use the existing underlying / existing byte array when computing the hashcode of an Endpoint in `Utils.AddressComparator`
- getting rid of the mapping between `Map<String, Metadata>` and `Map<Endpoint, Metadata>` by representing the map as two lists in protobuf (keys and values)

On local tests with 1000 concurrent nodes joining, there's a 10% improvement in memory allocation and a 20% improvement in CPU usage of the stack starting at the `TreeSet.add` method (39% vs 58%).
@lalithsuresh
Copy link
Owner

@lalithsuresh lalithsuresh commented Apr 9, 2020

This is a pretty good patch. Thanks @manuelbernhardt (and apologies for taking so long for reviewing). I particularly appreciate the conversions being only at the interface boundaries, but not within the core data structures itself.

One question though: do these improvements hold for more recent versions of protobuf? The protobuf/grpc versions that we use are about two years old.

@manuelbernhardt
Copy link
Contributor Author

@manuelbernhardt manuelbernhardt commented Apr 9, 2020

Hi @lalithsuresh, I haven’t tested with more recent versions of protobuf / its code generator but there are a few reasons that I think it still would be the same in the latest version:

  • when I saw this behavior I found this thread on the topic from 2009. So I think this was possibly an issue already then and has been optimized since
  • there has to be an inherent cost related to turning an array of bytes into a proper UTF-8 encoded string). The issue manifests in this setting because there are many strings involved - even if there was a better deciding algorithm available, at scale it would still end up costing performance
  • I actually saw this before in latency sensitive applications - protobuf fields encoded as byte array where a string seemed more fitting. Now I get why this is :-)

@lalithsuresh
Copy link
Owner

@lalithsuresh lalithsuresh commented Apr 9, 2020

@manuelbernhardt thanks! Glancing quickly at [1], I don't think much has changed in how strings are handled. I'll accept this PR.

[1] https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/compiler/java/java_string_field.cc

@lalithsuresh lalithsuresh merged commit 941aeb3 into lalithsuresh:master Apr 9, 2020
@lalithsuresh
Copy link
Owner

@lalithsuresh lalithsuresh commented Apr 9, 2020

@manuelbernhardt by the way, may I ask what tool you're using to produce those graphs? :)

@manuelbernhardt
Copy link
Contributor Author

@manuelbernhardt manuelbernhardt commented Apr 11, 2020

@lalithsuresh thanks for merging this! I'm using async-profiler for profiling - it's also integrated with IntelliJ since a few releases

@lalithsuresh
Copy link
Owner

@lalithsuresh lalithsuresh commented Apr 11, 2020

@manuelbernhardt Oops. Looks like the build is failing.

Compiling 72 source files to /Users/lsuresh/code/rapid-git/rapid/target/classes
/Users/lsuresh/code/rapid-git/rapid/src/main/java/com/vrg/rapid/MembershipService.java:119: error: incompatible types: ISettings cannot be converted to Settings
        this(myAddr, cutDetection, membershipView, sharedResources, settings, messagingClient,
                                                                    ^
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some messages have been simplified; recompile with -Xdiags:verbose to get full output
1 error

@lalithsuresh
Copy link
Owner

@lalithsuresh lalithsuresh commented Apr 11, 2020

Is the Settings/ISettings change required for this patch?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants