Updating headers to use new scheme #265

MSLaguana · 2017-06-05T16:46:11Z

Request IDs are expected to follow https://github.com/lmolkova/correlation/blob/master/hierarchical_request_id.md
and https://github.com/lmolkova/correlation/blob/master/http_protocol_proposal_v1.md

This change defaults to using the new hierarchical ID format, but continues
to interoperate with the legacy format as well.

Request IDs are expected to follow https://github.com/lmolkova/correlation/blob/master/hierarchical_request_id.md and https://github.com/lmolkova/correlation/blob/master/http_protocol_proposal_v1.md This change defaults to using the new hierarchical ID format, but continues to interoperate with the legacy format as well.

KamilSzostak · 2017-06-05T17:39:43Z

AutoCollection/ClientRequests.ts

@@ -20,6 +20,8 @@ class AutoCollectClientRequests {

    public static INSTANCE: AutoCollectClientRequests;

+    private static requestNumber = 1;


Do you ever reset this counter? It is possible that uniqueParentId will not be unique. For instance, if Node server restarts and the counter goes to 1, or if there are more than one Node servers and each has a different requestNumber. Will it cause any problems?

The parentId is a hierarchical Id of the form |root.suffix. or |freshRoot., and suffix and freshRoot are randomly generated for this particular request. Assuming that parentId is unique for this particular incoming request (which is reasonable, a collision of suffix or newly created GUID is unlikely) the uniqueParentId will be unique for each outgoing request unless requestNumber is ever reset. If the node server restarts or there are multiple servers, they should all be generating different parentIds

you should not assume that parentId is indeed unique. It could not be unique because of

bug on the sender side

redirects (headers are copied as it on redirects)

some proxy that retries requests with the same headers

The spec suggests to append small e.g. 32-bits random integer (hex encoded to reduce length) for the parent id

Just realized that the implementation does exactly what spec suggests: for the incoming requests it appends random int suffix, for the outgoing requests, it appends monotonically incremented number. So the outgoing request will have unique Id.

One small note here: I've used a single counter in the whole process, while the .NET sdk seems to use one counter for each request. This means that the IDs could get large in a long-lived process.

OsvaldoRosado · 2017-06-05T18:14:51Z

AutoCollection/CorrelationContextManager.ts

        parentId = parentId || operationId;

        if (this.enabled) {
            return {
                operation: {
                    name: operationName,
                    id: operationId,
-                    parentId: parentId
+                    parentId: parentId,
+                    correlationContextHeader: correlationContextHeader


Why is this necessary here? If possible, I would like to keep this operation object free of properties not present on https://github.com/Microsoft/ApplicationInsights-dotnet/blob/de488729c28b721a758e0024ea2d4f79909ff797/src/Core/Managed/Shared/Extensibility/Implementation/OperationContext.cs so that the API surface is somewhat consistent

My understanding is that if an incoming request has a Correlation-Context header, then we must forward that value on in any outgoing requests. Where would you suggest that we store the value, if not here with the other values that have the same (whole-request) lifetime?

Is that the property bag header? It should fill CorrelationContext.customProperties. That was the placeholder object intended for this. The idea is to deserialize it into a dictionary so that its user-editable, and then re-serialize it when outgoing

We'll have to update the comment there. Today it says

Properties here can be exposed in a future SDK release via outgoing HTTP headers for correlating data cross-component.

According to my reading of https://github.com/lmolkova/correlation/blob/master/http_protocol_proposal_v1.md#correlation-context the Correlation-Context header should only be writable in the originating service, and should be read-only in all others. If that's the case, does deserializing into a dictionary still make sense? Although I see at the end of that section it does also talk about adding to it, which is a little confusing.

The spec says that the very first service may add properties to the Correlation-Context.
Any further service MUST consider properties as read only i.e. MUST NOT modify existing properties. And SHOULD NOT add new properties.
This is a guidance, but AppInsights SDK does not know which service is the first one, so you should just allow adding properties to the Correlation-Context.

If the requirement to keep properties came from upstream as read-only is too strict/hard/inefficient to implement and you would rather use simple dictionary - go ahead, make them mutable.

I think we can't deserialize the header into a map, since the spec explicitly says that keys may be duplicated, and we must not aggregate or deduplicate values. I'd prefer to keep this as an opaque string (that we potentially add , key=value to the end of), but I'd also be reasonably happy with converting it to an array of key=value strings, splitting on ', ' and potentially appending more in future.

There's no way to reconcile the API surface here while keeping it an opaque string, unless we ignore the spec requirement to support adding properties and make the entire functionality private/internal to the SDK (which is maybe fine? but it would be great to get this working if the only issue is on how the end user interacts with it).

We have a key-value store already in the public API surface designated for this use, so if we keep this as an opaque string and publicly make it available on the correlation context we now have an easy to use storage that says its for correlation but doesn't work, and a complicated to use one that does work but you have to handle formatting yourself and know you cant use any illegal characters for headers.

We can replace the already designated object perhaps with one with custom get/setters. Avoiding the edge cases about duplication shouldn't be too difficult, this was already handled in .NET https://github.com/Microsoft/ApplicationInsights-dotnet-server/blob/c64fc1d28773c229b4351e6acf7dc6788229dd31/Src/Common/HeadersUtilities.cs

I don't see how .NET is handling duplicates in that code; they essentially aggregate all duplicates together and ignore all but the first value (and when they write a value, they erase all previous instances of the key and write the new key/value to the end). That doesn't seem like it matches the spec at all.

That is strange isn't it.. The get returning only the first value seems perfectly fine (our SDK doesn't allow you to create duplicate keys, so if you have them you must have special code to read them. I don't think this is really an aggregation, nothing is being combined) That set seems totally broken though.

To strictly follow the spec (which as @lmolkova mentioned isn't totally necessary if it is too difficult) I think setting would instead work on a different data set that only you can initialize (and cannot conflict with any values that were prepopulated into the context), and before sending the header this different data set is appended on to the header string

OsvaldoRosado · 2017-06-05T18:16:26Z

AutoCollection/ServerRequestParser.ts

+
+            if (request.headers[RequestResponseHeaders.requestIdHeader]) {
+                this.parentId = request.headers[RequestResponseHeaders.requestIdHeader];
+                this.requestId = CorrelationIdManager.generateRequestId(this.parentId);


Why is the manually supplied parentId ignored in the new format?

I'm also not sure this approach is correct. I would think our request ids for putting into AI telemetry envelopes do not change format but only the id sent out in the header does. This replaces the id for the envelope with the hierarchical id if I'm reading correctly.

@SergeyKanzhelev What's the correct approach here?

I'm not sure that I understand the purpose of the code-supplied parentId. From what I understand of the spec, if an incoming request specifies a Request-Id, then this request is part of a larger tracked operation, and we should take that Id and extend it appropriately, which is what I do here. What I don't understand is how to combine an incoming Request-Id with an ambient/passed-in Request-Id. If the incoming request does not specify a Request-Id then we will go into the else case which checks for legacy headers, and if they are not found it will use the passed-in parentId.

The manually passed in parent id was to allow filling requestId from correlationContext.operation.parentId when manually calling trackRequestSync. TrackRequestSync is too late to get the value from headers (since it should be the last call in the request handler). If you do, you won't be correlated with any of the dependency telemetry that used a different parentId.

I'm also noticing the parameter is renamed here from requestid to parentid. It's a bit confusing to do that as the parameter should set requestid, we just happen to set the requestid with the value of parentid (to resolve the situation above).

EDIT: For clarity, requestId here does not mean the contents of the header Request-Id but the id field on the envelope generated in getRequestData (the plain, non-hierarchical id from the AI schema to associate children who have parentId tags matching this value)

OsvaldoRosado · 2017-06-05T18:21:28Z

AutoCollection/ServerRequestParser.ts

@@ -110,6 +106,10 @@ class ServerRequestParser extends RequestParser {
        return this.requestId;
    }

+    public getCorrelationContext() {
+        return this.correlationContext;


This is fairly confusing naming, given that this is not a CorrelationContext object

Agreed. I'm open to suggestions; perhaps getCorrelationContextHeader? It is "whatever value is specified in the correlation-context header, if any"

This could get removed entirely since it should be CorrelationContext.customProperties

Right now this getter is how we get access to the value to put it in the custom properties; all the header parsing happens in this class.

Ah yes, sorry I thought this class had access to the correlation context and could fill it directly. It would make sense since that is part of the header parsing

lmolkova · 2017-06-05T19:48:27Z

AutoCollection/ServerRequestParser.ts

+                this.requestId = CorrelationIdManager.generateRequestId(rootId || this.parentId || parentId);
+                this.correlationContext = null;
+            }
+            this.operationId = CorrelationIdManager.getRootId(this.requestId); 


what if there were nor Request-Id nor legacy headers?

If no headers are present, then this.parentId will be null, but this.requestId will be CorrelationIdManager.generateRequestId(parentId) which will generate a hierarchical ID from the passed-in identifier (if any) or from nothing, so this.requestId should always be set.

right, I thought there is the else-if branch for legacy headers. never mind

OsvaldoRosado

Please remove side-effects this has on AI schema constructs. Notably the id field in getRequestData should continue to be a plain guid. ParentId restoration logic for TrackRequestSync should be unaffected as well. In general, there should be no changes to the form of the data sent out of the SDK to the AI backend.

Edit: Cleared up offline that this id logic will actually work okay. TrackRequestSync still needs to be addressed however

Additionally adding 'Id' data to tracked Dependency events.

MSLaguana · 2017-06-06T23:44:00Z

I believe I have addressed these issues now.

MSLaguana · 2017-06-07T16:45:49Z

Looks like one of the tests failed, but only on one of the node versions. It seems that client.sendPendingData invoked its callback twice? Nothing I touched should impact that though.

OsvaldoRosado · 2017-06-07T16:57:47Z

AutoCollection/ClientRequests.ts

@@ -91,6 +91,9 @@ class AutoCollectClientRequests {

        let requestParser = new ClientRequestParser(requestOptions, request);

+        const currentContext = CorrelationContextManager.getCurrentContext();
+        const uniqueParentId = currentContext && currentContext.operation && (currentContext.operation.parentId + AutoCollectClientRequests.requestNumber++ + '.');


Nit: Isn't this a request id not a parent id?

OsvaldoRosado · 2017-06-07T16:59:19Z

AutoCollection/CorrelationContextManager.ts

+        addHeaderData(header: string): void;
+        getProperty(prop: string): string;
+        setProperty(prop: string, val: string): void;
+        serializeToHeader(): string;


I don't think anything other than get and set should be part of the public API

OsvaldoRosado · 2017-06-07T16:59:41Z

AutoCollection/CorrelationContextManager.ts

@@ -8,13 +8,17 @@ export interface CorrelationContext {
        name: string;
        id: string;
        parentId: string; // Always used for dependencies, may be ignored in favor of incoming headers for requests
-        correlationContextHeader?: string;
    };

    /** Do not store sensitive information here. 
     *  Properties here can be exposed in a future SDK release via outgoing HTTP headers for correlating data cross-component.


This comment should be updated since we now support this 😄

OsvaldoRosado

Just some minor comments to address. Looks awesome!

OsvaldoRosado · 2017-06-07T18:07:36Z

@MSLaguana Looks like a re-run of the unit tests removed the failure. Might have already been a flaky test

Also addressing other code review comments.

msftclas added the cla-not-required label Jun 5, 2017

KamilSzostak reviewed Jun 5, 2017

View reviewed changes

OsvaldoRosado reviewed Jun 5, 2017

View reviewed changes

lmolkova reviewed Jun 5, 2017

View reviewed changes

KamilSzostak approved these changes Jun 6, 2017

View reviewed changes

OsvaldoRosado suggested changes Jun 6, 2017

View reviewed changes

Addressing code review

e6da4b5

Additionally adding 'Id' data to tracked Dependency events.

OsvaldoRosado reviewed Jun 7, 2017

View reviewed changes

OsvaldoRosado approved these changes Jun 7, 2017

View reviewed changes

Splitting CustomProperties type into private and public components

7205e1d

Also addressing other code review comments.

OsvaldoRosado merged commit 4a5fa4e into microsoft:develop Jun 8, 2017

OsvaldoRosado added this to the 0.21.0 milestone Jun 8, 2017

KamilSzostak mentioned this pull request Jun 30, 2017

Update correlation headers to use new schema microsoft/ApplicationInsights-JS#478

Open

		@@ -20,6 +20,8 @@ class AutoCollectClientRequests {

		public static INSTANCE: AutoCollectClientRequests;

		private static requestNumber = 1;

Updating headers to use new scheme #265

Updating headers to use new scheme #265

Conversation

MSLaguana commented Jun 5, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lmolkova Jun 5, 2017 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

OsvaldoRosado Jun 5, 2017 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lmolkova Jun 5, 2017 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

OsvaldoRosado left a comment • edited

Choose a reason for hiding this comment

MSLaguana commented Jun 6, 2017

MSLaguana commented Jun 7, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

OsvaldoRosado left a comment

Choose a reason for hiding this comment

OsvaldoRosado commented Jun 7, 2017

lmolkova Jun 5, 2017 •

edited

OsvaldoRosado Jun 5, 2017 •

edited

lmolkova Jun 5, 2017 •

edited

OsvaldoRosado left a comment •

edited