
Conversation

carl-mastrangelo (Contributor)

No description provided.

carl-mastrangelo (Contributor, Author) commented Sep 2, 2016

Alternatives considered:

  1. Cache the transformation inside MethodDescriptor. This is difficult to expose to the transport layer, since it would mean handing out access to the mutable byte[]. It would also still cost the allocation of an AsciiString.
  2. Use a single static synchronized map in Netty to cache the transformation. This saves memory but adds synchronization.
  3. Use a Netty FastThreadLocal. Since users provide their own thread factory, we can't assume threads are the FastThreadLocalThread instances that FastThreadLocal needs for its fast path.

Option 1 is the best, but I am wary of putting a public accessor method on the API even if it is marked @Internal. We are already finding places where users are abusing our API quirks.
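
For context, the per-RPC work being discussed looks roughly like this (a sketch with assumed names, not the exact gRPC code): the Netty transport turns fullMethodName into an HTTP/2 :path value on every outbound RPC, allocating a temporary String and an AsciiString each time.

import io.netty.util.AsciiString;

class PathEncodingSketch {
  static AsciiString encodePath(String fullMethodName) {
    // "/" + name allocates an intermediate String; wrapping it in an
    // AsciiString allocates again (the object plus its backing byte[]).
    return new AsciiString("/" + fullMethodName);
  }
}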

buchgr (Contributor) commented Sep 5, 2016

Weak LGTM.

I am unsure about this change. I don't think I would make it myself. It seems very complex relative to the benefit (if there is one): a hash table lookup versus an allocation that almost certainly dies young. Hmm 🤔. If we decide not to use the thread-local approach, the allocation from prepending the / might be removed by escape analysis; to be sure, you could allocate a correctly sized byte array and prepend the / "manually", as in the sketch below.
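
A minimal sketch of that manual-prepend idea (assumes the method name is ASCII; names are illustrative):

class ManualPrependSketch {
  static byte[] encodePath(String fullMethodName) {
    // Exactly sized: one byte for '/' plus one byte per ASCII char,
    // with no intermediate String allocation.
    byte[] path = new byte[fullMethodName.length() + 1];
    path[0] = '/';
    for (int i = 0; i < fullMethodName.length(); i++) {
      path[i + 1] = (byte) fullMethodName.charAt(i);
    }
    return path;
  }
}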


import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Supplier;

carl-mastrangelo (Contributor, Author):

After thinking about this over a long bike ride, I think I am going to revert this change in favor of a cleaner solution. Sorry for the wasted review.

buchgr (Contributor):

no worries :-). Looking forward to your other solution!

pgrosu:

@carl-mastrangelo It's amazing how many insights I get during long rides as well. No review is ever a waste; rather, I look at them as a natural progression :) Though, is Supplier also in Java 6? Maybe that check might not be necessary for this part of the code.

carl-mastrangelo (Contributor, Author)

@buchgr PTAL

This solution trades startup time for fast access time. I tried to make it somewhat shade-proof by expecting io.grpc to always be a package prefix of io.grpc.netty.

@ejona86 If you like how this CL works, I think we should consider using it to unify how we access internal data across package boundaries. There are a few places today where we mark things @Internal, but that doesn't really have any teeth. This is a clean and enforceable way to set package boundaries: someone would have to resort to reflection to bypass them, which is exactly what they would have to do if the fields were actually private.

@pgrosu yes, I find them meditatively useful.

*/
@Internal
public InternalAccessor() {
  StackTraceElement[] st = new Throwable().getStackTrace();
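  // A hedged guess at how the check might continue from here (the PR's
  // actual logic may differ): require the immediate caller's class to
  // share this class's package prefix. The prefix check survives shading
  // because io.grpc.netty is relocated together with io.grpc.
  String caller = st[1].getClassName();
  String pkg = InternalAccessor.class.getName();
  pkg = pkg.substring(0, pkg.lastIndexOf('.')); // e.g. "io.grpc"
  if (!caller.startsWith(pkg)) {
    throw new IllegalStateException("Unexpected caller: " + caller);
  }
}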
buchgr (Contributor):

@carl-mastrangelo do we REALLY need to verify this? I don't know how, but I have a feeling that this might fail in some weird way on some application / build system / JVM combo :).

carl-mastrangelo (Contributor, Author):

With other internal methods, probably not, but this one is exposing mutable state.

When analyzing allocations, this showed up (with a little fire icon in YourKit), which is the motivation for this PR.

ejona86 (Member):

I think I'd be okay with just exposing the method with @Internal. If we want a stronger approach, maybe have an Internal class instead that provides accessors like this? That makes it more obvious to callers that they are doing something they probably shouldn't. Such a class doesn't actively prohibit use (although it could, with the same trick used here), but maybe doing so is too strong? A rough sketch is below.

Looking at the class name wouldn't work too well on Android, although, yes, only Netty is involved in this case.
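
A rough sketch of that Internal-class idea (hypothetical names; not gRPC's actual API):

// Hypothetical stand-in for the real class; only the package-private
// field matters for this sketch.
class MethodDescriptor {
  byte[] rawPath; // package-private: invisible outside its package
}

// Lives in the same package as MethodDescriptor, so it can reach
// package-private state; the "Internal" name makes the transgression
// obvious at every call site.
final class InternalMethodDescriptors {
  private InternalMethodDescriptors() {}

  static byte[] rawPath(MethodDescriptor descriptor) {
    return descriptor.rawPath;
  }
}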

carl-mastrangelo (Contributor, Author):

In terms of API stability, I think having a higher barrier to entry is the key. Marking classes as @Internal has worked so far, but this is the first time we are exposing something that could really break if someone started depending on it.

I also toyed with the idea of making this hard in other ways, such as masking the return type, or giving it a weird name. For example:

@Internal
public Object internalOnlyGetRawPath();

This would force consumers to cast it, and reduce the risk of autocomplete picking it up. However, our own code would have to do the same thing, so anyone looking at it might get the idea that it's okay. With the reflection guard, there is no way to misinterpret the intent.
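
Concretely, a caller would end up writing something like this (hypothetical names):

// The unavoidable cast is the deterrent: it looks wrong because it is.
byte[] rawPath = (byte[]) method.internalOnlyGetRawPath();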

The reflection guard also conveniently works for internal constructors, where it is more difficult to change the name.

Thoughts?

carl-mastrangelo (Contributor, Author):

Also, I realize this is not exactly clean, but it's easy to remove later. Would you consider letting this in, under the premise that it can be removed if it causes problems later?

ejona86 (Member):

I'm sorta okay with letting this in as-is, but I really don't see why bothering in the first place. What is so much worse about this method than, say, Metadata.serialize()? I can understand the concern, but I don't see this as a new concern. Thus caller-verification seems orthogonal to me, and I don't understand why I have to accept the caller enforcement to receive the performance optimization. I'm driven toward doing what we've done in the past: KISS and just have an @Internal getter.

I don't think we should be in the habit of adding things and removing them if they cause trouble. Instead, we add things when they have been shown to be necessary. I see trying to merge this as-is for speed of review as a false dichotomy.

buchgr (Contributor) commented Sep 6, 2016

Weak LGTM.

I feel like the construct you introduced is very ingenious and probably a good idea if we really need access across package boundaries. I am not convinced that this is the case here. It seems like this small allocation is not going to make a difference.

Would be curious to know what @ejona86 or @nmittler think.

carl-mastrangelo (Contributor, Author)

After talking with Louis, I changed the API a bit. There is no more reflection. PTAL

buchgr (Contributor) commented Sep 7, 2016

Weak LGTM :). I like this approach best so far, but it still seems to me that this allocation is not going to make a difference; at least, not a difference that justifies these changes.

carl-mastrangelo (Contributor, Author)

@buchgr

Ran the numbers; it looks like a minor nanosecond-level bump. These changes are not obviously better, but that's mainly because this PR is a testbed for how to do cross-package optimization. The numbers:

Benchmark                                      Mode      Cnt   Score   Error  Units
MethodDescriptorBenchmark.direct             sample  1179094  43.483 ± 0.610  ns/op
MethodDescriptorBenchmark.old                sample  1079285  66.210 ± 5.402  ns/op
MethodDescriptorBenchmark.transportSpecific  sample  1408070  36.800 ± 0.423  ns/op

Direct is the original reflection-based version of this PR, old is the current approach, and transportSpecific uses the AtomicReference array, sketched below.
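
My reading of the transportSpecific shape, as a hedged sketch (all names illustrative, not the merged API): the descriptor carries a small per-transport slot array, and each transport caches its pre-encoded path in its own slot, so after the first RPC the lookup is a plain array read.

import java.util.concurrent.atomic.AtomicReferenceArray;

final class TransportSlotsSketch {
  // One slot per known transport; each transport is assigned a fixed index.
  private final AtomicReferenceArray<Object> slots =
      new AtomicReferenceArray<Object>(2);

  Object get(int transportIndex) {
    return slots.get(transportIndex);
  }

  void put(int transportIndex, Object encodedPath) {
    slots.set(transportIndex, encodedPath);
  }
}

On the Netty path, the first RPC would compute the AsciiString and store it; every later RPC reads the slot with no allocation.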

I think I am going to proceed with this PR. (If the reason for hiding our internals this strongly is not clear, we can chat offline.)

carl-mastrangelo merged commit ca5a402 into grpc:master on Sep 7, 2016
carl-mastrangelo deleted the noMethodAlloc branch on September 7, 2016, 21:41
private MethodDescriptor(
MethodType type, String fullMethodName,
Marshaller<ReqT> requestMarshaller,
boolean idempotent) {
ejona86 (Member):

Indent is a bit weird here.

carl-mastrangelo (Contributor, Author):

Whoops! I'll fix it.

ejona86 (Member) commented Sep 7, 2016

LGTM.

carl-mastrangelo (Contributor, Author)

One other note: assuming that the 30ns gain is real, it happens on every outbound RPC. I think we can get about 80,000 QPS per core, which works out to roughly a 2.4ms-per-second speedup per core. Not a huge amount, but I think it would be noticeable. It is getting increasingly hard to find allocations that are short-lived garbage rather than data we actually need to keep.
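
Spelling out that arithmetic: 30 ns/RPC × 80,000 RPC/s = 2,400,000 ns/s, i.e. about 2.4 ms of CPU saved per second per core, or roughly 0.24% of a core.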

buchgr (Contributor) commented Sep 8, 2016

Well, can you show that it's noticeable? :) That's 0.2%. There is so much non-determinism that I can't even measure a 1 or 2% improvement reliably :).

carl-mastrangelo (Contributor, Author)

Well, I can't show that it is noticeable, but I think that's because the benchmarks have ~1-3% error noise in them.

To be honest, these recent CLs are more about denoising the allocation profiling. They show up there and throw the cumulative accounting off.

lock bot locked as resolved and limited conversation to collaborators on Jan 21, 2019