fix: for perf improvement using fast random #49
  return new RequestContext()
      .put(RequestContextConstants.TENANT_ID_HEADER_KEY, tenantId)
-     .put(RequestContextConstants.REQUEST_ID_HEADER_KEY, UUID.randomUUID().toString());
+     .put(RequestContextConstants.REQUEST_ID_HEADER_KEY, UuidCreator.getRandomBasedFast().toString());
Ref usage of ThreadLocalRandom:
- https://github.com/f4b6a3/uuid-creator/blob/3b6471642e95ca91901ce5efe817537aa0d1068b/src/main/java/com/github/f4b6a3/uuid/UuidCreator.java#L174
- https://github.com/f4b6a3/uuid-creator/blob/3b6471642e95ca91901ce5efe817537aa0d1068b/src/main/java/com/github/f4b6a3/uuid/UuidCreator.java#L1265
- https://github.com/f4b6a3/uuid-creator/blob/3b6471642e95ca91901ce5efe817537aa0d1068b/src/main/java/com/github/f4b6a3/uuid/factory/AbstRandomBasedFactory.java#L84
My concerns here aren't the new impl but
- Adding a new dependency to a base lib like this impacts the dependency tree of every single service
- This is just covering up an upstream problem - that we're generating request contexts inappropriately - many times for a single "request".
> Adding a new dependency to a base lib like this impacts the dependency tree of every single service

I can extract out the function from this library based on ThreadLocalRandom.

> This is just covering up an upstream problem - that we're generating request contexts inappropriately - many times for a single "request".

I didn't fully get this. Can you elaborate?
> I didn't fully get this. Can you elaborate?
The reason this came up - at least the reason I'm aware of - as an issue is because ingester is generating many request contexts for a single trace ingestion. Each time it needs a context, it generates a new one. Because this now occurs across many enrichers and many threads in parallel, we're running into these performance problems. But not only is that unnecessary, it's actually wrong - we want the same ID to be shared for ingesting the same trace in the first place.
A couple different approaches there would be to either
- update the enricher interface to accept the context, and generate it up front before we delegate the enrichment off to an async thread.
- Create a request context from the trace details - use the trace id as the "request id". This would probably be defined in the ingestion pipeline repo.
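The first approach can be sketched roughly as below. This is only an illustration under assumptions: `Context`, `enrich`, and `enrichAll` are hypothetical stand-ins, not the real RequestContext or enricher interface. The point is that the ID is generated once, before the work is handed to async threads, so every enricher sees the same request ID.

```java
import java.util.List;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

final class UpFrontContextSketch {
  // Hypothetical enricher step: it receives the context instead of creating one.
  static String enrich(String enricherName, Context context) {
    return enricherName + ":" + context.requestId;
  }

  // Builds the context once, then shares it with every async enricher.
  static List<String> enrichAll(String tenantId, List<String> enricherNames) {
    Context context = new Context(tenantId, UUID.randomUUID().toString());
    List<CompletableFuture<String>> futures =
        enricherNames.stream()
            .map(name -> CompletableFuture.supplyAsync(() -> enrich(name, context)))
            .collect(Collectors.toList());
    return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Every printed line carries the same request ID.
    enrichAll("tenant-1", List.of("a", "b", "c")).forEach(System.out::println);
  }
}

// Hypothetical stand-in for RequestContext; not the real class.
final class Context {
  final String tenantId;
  final String requestId;

  Context(String tenantId, String requestId) {
    this.tenantId = tenantId;
    this.requestId = requestId;
  }
}
```

With this shape, the cost of ID generation is paid once per trace rather than once per enricher call.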
The request-id attribute is attached to the grpc-context while making a downstream call. It also seems to be added only for RXJava calls: if someone is not using RXJava, this request-id is not attached to the call.
How is this request-id used by the downstream services? Given that it's attached only for RXJava-based calls and not for all GRPC calls, do we even need to attach this request-id?
> The reason this came up - at least the reason I'm aware of - as an issue is because ingester is generating many request contexts for a single trace ingestion. Each time it needs a context, it generates a new one. Because this now occurs across many enrichers and many threads in parallel, we're running into these performance problems. But not only is that unnecessary, it's actually wrong - we want the same ID to be shared for ingesting the same trace in the first place.
> A couple different approaches there would be to either
> - update the enricher interface to accept the context, and generate it up front before we delegate the enrichment off to an async thread.
> - Create a request context from the trace details - use the trace id as the "request id". This would probably be defined in the ingestion pipeline repo.
I agree that for tracing the processing of StructureTrace, it's important to establish a context based on unique keys or attributes that can be used to generate a trace ID. And if request_id is primarily used internally to link outgoing requests to the same trace, we can consider option (2), creating a new request context based on the trace.
However, for any other consumers of this interface, forTenantId, do we really need a request_id generated from a secure random source? Since this is primarily for tracing purposes and not for resource identification, I think it's more practical to use a fast UUID-based request_id generated with ThreadLocalRandom.
> I can extract out the function from this library based on ThreadLocalRandom.

If we don't plan to use the above library here, I am thinking of adding our own function like the one below:
public static UUID generateFastRandomUUID() {
  long mostSigBits = ThreadLocalRandom.current().nextLong();
  long leastSigBits = ThreadLocalRandom.current().nextLong();
  // Set the version to 4 (random UUID)
  mostSigBits &= 0xFFFFFFFFFFFF0FFFL;
  mostSigBits |= 0x0000000000004000L;
  // Set the variant to RFC 4122
  leastSigBits &= 0x3FFFFFFFFFFFFFFFL;
  leastSigBits |= 0x8000000000000000L;
  return new UUID(mostSigBits, leastSigBits);
}
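A quick way to sanity-check the bit masks is to ask `java.util.UUID` what it thinks of the result. The sketch below repeats the function in a self-contained class (the class name `FastUuidCheck` is made up for illustration) and prints the version and variant that the JDK reports; for the masks above these should be version 4 and variant 2 (the RFC 4122 variant).

```java
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

final class FastUuidCheck {
  // Same technique as the proposed function: random bits from ThreadLocalRandom,
  // then the version and variant fields are overwritten.
  static UUID generateFastRandomUUID() {
    ThreadLocalRandom random = ThreadLocalRandom.current();
    long mostSigBits = random.nextLong();
    long leastSigBits = random.nextLong();
    // Clear the 4 version bits, then set them to 0100 (version 4).
    mostSigBits = (mostSigBits & 0xFFFFFFFFFFFF0FFFL) | 0x0000000000004000L;
    // Clear the top 2 bits, then set them to 10 (RFC 4122 variant).
    leastSigBits = (leastSigBits & 0x3FFFFFFFFFFFFFFFL) | 0x8000000000000000L;
    return new UUID(mostSigBits, leastSigBits);
  }

  public static void main(String[] args) {
    UUID id = generateFastRandomUUID();
    System.out.println(id + " version=" + id.version() + " variant=" + id.variant());
  }
}
```

Note that ThreadLocalRandom is not cryptographically secure, so IDs produced this way are suitable for correlation only, which matches the use discussed here.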
A few stacks showing the contention at the JDK level
Codecov Report
@@             Coverage Diff              @@
##               main      #49      +/-   ##
============================================
+ Coverage     74.57%   74.88%   +0.30%
- Complexity      145      146       +1
============================================
  Files            20       21       +1
  Lines           409      418       +9
  Branches         22       22
============================================
+ Hits            305      313       +8
- Misses           85       86       +1
  Partials         19       19
This is unrelated to rxjava. RequestContext and everything it contains (tenant id, request id, auth context where applicable) is used by all of our grpc calls (a handful of which are themselves managed by rxjava). This request ID is used to correlate errors and logs across services.
The unique key is just the trace id itself though, right? Either way I think it would make sense to update the enricher interface so however we generate the request context is immaterial to the enrichers, it's just a question of level of effort. If that's painful, then having a locally consistent version that does something like
So here is where I think we may be getting too far in the weeds. I agree, it doesn't need to be secure and am certainly glad to have a thread local implementation (yours looks fine) - but I think it becomes an unnecessary optimization if we fix the request context generation itself. We're fixing a symptom of an upstream bug rather than the bug itself.
Right, this is the one coming from the context generation being inside the async threads rather than up front. Edited to remove internal details.
GRPC request context generation and implementation is the responsibility of the grpc framework itself. Why do we want to push this responsibility down to the grpc client applications and leak/duplicate this code everywhere? While I agree that trace enricher refactoring is needed for several other good reasons, it is definitely not worth doing for this one.
@kotharironak : I vote for a simple util-based implementation without a new third-party dependency for this alone.
I would still leave it there, it's just a new constructor allowing callers to bring their own ID when that makes sense.
public static RequestContext forTenantId(String tenantId) {
  return RequestContext.forTenantAndRequestIds(tenantId, UUID.randomUUID().toString()); // This would be the added api
}
There's no real duplicate code, as this only impacts clients in that they're adding the extra arg from data they already have where applicable. And that's unknowable data to the framework.
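To make the "bring your own ID" idea concrete, here is a minimal sketch. `SimpleRequestContext` is a hypothetical stand-in, not the real RequestContext, and the trace ID string is invented for illustration; only the shape of the two factory methods follows the snippet above.

```java
import java.util.UUID;

final class TraceContextSketch {
  public static void main(String[] args) {
    // A caller with no natural ID lets the context generate one.
    SimpleRequestContext generated = SimpleRequestContext.forTenantId("tenant-1");
    // A caller that already has a trace ID reuses it as the request ID.
    SimpleRequestContext fromTrace =
        SimpleRequestContext.forTenantAndRequestIds("tenant-1", "trace-abc");
    System.out.println(generated.getRequestId());
    System.out.println(fromTrace.getRequestId());
  }
}

// Hypothetical stand-in for RequestContext; not the real class.
final class SimpleRequestContext {
  private final String tenantId;
  private final String requestId;

  private SimpleRequestContext(String tenantId, String requestId) {
    this.tenantId = tenantId;
    this.requestId = requestId;
  }

  // Existing-style entry point: delegates and supplies a generated ID.
  static SimpleRequestContext forTenantId(String tenantId) {
    return forTenantAndRequestIds(tenantId, UUID.randomUUID().toString());
  }

  // Added entry point: the caller supplies an ID it already has.
  static SimpleRequestContext forTenantAndRequestIds(String tenantId, String requestId) {
    return new SimpleRequestContext(tenantId, requestId);
  }

  String getTenantId() {
    return tenantId;
  }

  String getRequestId() {
    return requestId;
  }
}
```

Existing callers of `forTenantId` are unaffected in this sketch; only callers that already hold a meaningful ID need the second method.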
We can choose how major of a refactor we want this to be. The interface can be left as is or updated, but we can't fix the actual bug here without changing the duplicated
Given that UUID.randomUUID takes a lock and causes thread contention, we should replace it with a ThreadLocalRandom-based implementation. This will help in general.
I just don't want us to lose sight of the root cause here, but otherwise no issues with that.
@laxmanchekka @mihirgt @aaron-steinfeld |
 * the default randomUUID method that relies on /dev/random. It's suitable for most random UUID
 * needs.
 */
public static UUID generateFastRandomUUID() {
- public static UUID generateFastRandomUUID() {
+ public static UUID randomUUID() {
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

public class UuidGenerator {
- public class UuidGenerator {
+ public class FastUUIDGenerator {
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

public class FastUUIDGenerator {
Does this need to be public? Doesn't need a new release, but next time you're in here can you move this to package private since it's an internal impl detail.
I initially thought of adding it to our own lib so it could be reused. But I think we can make it private; anywhere else, we can use the third-party library that is already present in most of the projects.
Description
Java's UUID.randomUUID uses SecureRandom, which, depending on the platform and JDK configuration, may rely on the system's /dev/random. That source needs a sufficient amount of entropy to generate random numbers and may block; the shared generator is also a point of thread contention under parallel load.
In the present context, we are focused on generating request IDs. Therefore, a faster UUID generator based on ThreadLocalRandom should be adequate.
For a similar use case, OpenTelemetry also generates trace IDs using ThreadLocalRandom, which serves as a reference.
As part of this PR, request-context uses the fast random generator for request IDs.