Add a request id header for all communications. #17

a-feld · 2020-07-08T15:12:46Z

Adding a request ID header makes it feasible in the future for ingest to deduplicate data being sent from the SDKs without any updates being required to the SDK clients.

mwlang · 2020-07-08T15:38:20Z

Ruby's UUID version 4 (UUID4) is not particularly fast and we recently removed its use in the Ruby agent, favoring a simpler random number generator. In high-throughput environments, generating a UUID4 can lead to file handle exhaustion since Ruby's implementation makes use of /dev/urandom

so the question here: can we get away with simple random number generations, or must it truly be UUID4 standard?

communication.md

a-feld · 2020-07-08T15:45:56Z

That's a good point @mwlang , I assumed that requests would not be high throughput (we default to batching every 5 seconds), but that's no excuse for poor performance 😄

That being said, there are ~122 bits of entropy in UUID4 according to https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random)

We can change the request ID to using a random hex number but we should probably figure out how much entropy is needed.

We'll have to look at the maximum window we'd like to deduplicate payloads in (I'm thinking 1-5 minutes), how many requests we expect in that window both now and in the future, and what the probability is for a collision. We certainly want to avoid cases where a collision would cause data loss. For the entropy required, I think having input from our ingest team would be helpful. I'll reach out to them for comment.

jkwatson · 2020-07-08T15:46:00Z

Ruby's UUID version 4 (UUID4) is not particularly fast and we recently removed its use in the Ruby agent, favoring a simpler random number generator. In high-throughput environments, generating a UUID4 can lead to file handle exhaustion since Ruby's implementation makes use of /dev/urandom

so the question here: can we get away with simple random number generations, or must it truly be UUID4 standard?

One or two UUIDs per harvest cycle doesn't seem too onerous. Is it really that bad, that you can't generate a couple every 5 seconds or so without impacting things? I would hope the cost of generating a UUID would be much, much less than the network latency. Am I wrong about that?

justinfoote · 2020-07-08T16:40:41Z

More than the performance impact of generating a UUID4 in ruby, the fact that it opens a new file descriptor to read from /dev/urandom is the bigger problem. In cases where an application is running out of file descriptors, sometimes the app-crashing stacktrace will be random number generation within the Telemetry SDK. This isn't a good look. See also.

I totally agree with Allan's proposal of requiring (or at least strongly suggesting) some minimum amount of entropy. My hunch is that 122 bits is far more than is strictly necessary, but also the cost of extra bits isn't too much. I wouldn't be opposed to suggesting something like tracestate -- 32 characters of hex representing 128 bits of random.

mwlang · 2020-07-08T17:37:37Z

One or two UUIDs per harvest cycle doesn't seem too onerous. Is it really that bad, that you can't generate a couple every 5 seconds or so without impacting things? I would hope the cost of generating a UUID would be much, much less than the network latency. Am I wrong about that?

Nope. Not wrong. Justin goes into a little more detail about our pain point. My primary goal is to avoid bringing back that pain if there's an alternative that's "good enough"(tm).

@justinfoote going with the 32 characters of hex representing 128 bits of random seems plenty good to me and would entirely side-step our earlier issues.

jkwatson · 2020-07-08T17:40:47Z

More than the performance impact of generating a UUID4 in ruby, the fact that it opens a new file descriptor to read from /dev/urandom is the bigger problem. In cases where an application is running out of file descriptors, sometimes the app-crashing stacktrace will be random number generation within the Telemetry SDK. This isn't a good look. See also.

I totally agree with Allan's proposal of requiring (or at least strongly suggesting) some minimum amount of entropy. My hunch is that 122 bits is far more than is strictly necessary, but also the cost of extra bits isn't too much. I wouldn't be opposed to suggesting something like tracestate -- 32 characters of hex representing 128 bits of random.

Ah, I get it. The problem isn't UUIDs in general, it's the SecureRandom version in ruby that requires a file descriptor to be opened. Gotcha. Sounds like this is a ruby-specific implementation detail, rather than a general problem with UUIDs.

a-feld · 2020-07-08T19:15:40Z

@mwlang @justinfoote I've updated the proposal to a 32 character hex string based on the feedback! 😄

mwlang · 2020-07-08T19:28:34Z

Ruby agent approves this change!

justinfoote

🎉

a-feld · 2020-07-10T16:25:30Z

After chatting with folks, I've changed the proposal back to uuid4. Part of this decision is that we believe the risk of collisions (which would result in data loss under the proposed usage of the request id) resulting from the use of a non-uniform random number generator is too high of a risk. Using a uuid provides greater collision resistance.

The x-request-id header is a uuid4 value that may be used by ingest in the future. See newrelic/newrelic-telemetry-sdk-specs#17 for details.

michaelgoin · 2020-07-14T15:47:35Z

dang. wish i had seen this earlier. this will be slowwwwww in Node land. up to several milliseconds occasionally to generate (possible to be even longer right after startup).

while it is only on request, we've seen attempted uses in the wild where metrics are fired off right after generating. the whole ID situation in Node is quite unfortunate. given we've likely missed the boat here, i guess we may need to get very prescriptive in documentation on forcing batching.

Add a request id header for all communications.

90f0d83

justinfoote reviewed Jul 8, 2020

View reviewed changes

communication.md Outdated Show resolved Hide resolved

Clarify request IDs for 413 responses.

2b2339e

justinfoote approved these changes Jul 8, 2020

View reviewed changes

a-feld force-pushed the request-id branch from 033f221 to 2b2339e Compare July 10, 2020 16:23

a-feld merged commit 2840953 into master Jul 10, 2020

a-feld deleted the request-id branch July 10, 2020 21:17

a-feld added a commit to newrelic/newrelic-telemetry-sdk-python that referenced this pull request Jul 13, 2020

Add x-request-id header.

3a1b436

The x-request-id header is a uuid4 value that may be used by ingest in the future. See newrelic/newrelic-telemetry-sdk-specs#17 for details.

a-feld mentioned this pull request Jul 13, 2020

Add x-request-id header to all requests. newrelic/newrelic-telemetry-sdk-python#32

Merged

a-feld added a commit to newrelic/newrelic-telemetry-sdk-python that referenced this pull request Jul 13, 2020

Add x-request-id header.

63054f5

The x-request-id header is a uuid4 value that may be used by ingest in the future. See newrelic/newrelic-telemetry-sdk-specs#17 for details.

a-feld added a commit to newrelic/newrelic-telemetry-sdk-python that referenced this pull request Jul 13, 2020

Add x-request-id header.

2b64109

The x-request-id header is a uuid4 value that may be used by ingest in the future. See newrelic/newrelic-telemetry-sdk-specs#17 for details.

a-feld added a commit to newrelic/newrelic-telemetry-sdk-python that referenced this pull request Jul 13, 2020

Add x-request-id header.

5be650a

The x-request-id header is a uuid4 value that may be used by ingest in the future. See newrelic/newrelic-telemetry-sdk-specs#17 for details.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a request id header for all communications. #17

Add a request id header for all communications. #17

a-feld commented Jul 8, 2020 •

edited

mwlang commented Jul 8, 2020

a-feld commented Jul 8, 2020

jkwatson commented Jul 8, 2020

justinfoote commented Jul 8, 2020 •

edited

mwlang commented Jul 8, 2020

jkwatson commented Jul 8, 2020

a-feld commented Jul 8, 2020

mwlang commented Jul 8, 2020

justinfoote left a comment

a-feld commented Jul 10, 2020

michaelgoin commented Jul 14, 2020

Add a request id header for all communications. #17

Add a request id header for all communications. #17

Conversation

a-feld commented Jul 8, 2020 • edited

mwlang commented Jul 8, 2020

a-feld commented Jul 8, 2020

jkwatson commented Jul 8, 2020

justinfoote commented Jul 8, 2020 • edited

mwlang commented Jul 8, 2020

jkwatson commented Jul 8, 2020

a-feld commented Jul 8, 2020

mwlang commented Jul 8, 2020

justinfoote left a comment

Choose a reason for hiding this comment

a-feld commented Jul 10, 2020

michaelgoin commented Jul 14, 2020

a-feld commented Jul 8, 2020 •

edited

justinfoote commented Jul 8, 2020 •

edited