Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[http] Fix repeated traces on Node < v9 when tracing https requests. #268

Merged

Conversation

alloy
Copy link
Contributor

@alloy alloy commented Sep 1, 2018

When your application loads both the http and https modules while running on Node < v9, the https.request function internally calls http.request and ends up being double-counted https://github.com/nodejs/node/blob/v8.11.4/lib/https.js#L239

This patch fixes that situation by marking objects as already being traced and ignoring them when encountered again.


describe('with service configuration', () => {
let config
it('should only record a request once', done => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here starts the actual new test

@@ -48,6 +48,7 @@ dev,proxyquire,MIT,Copyright 2013 Thorsten Lorenz
dev,redis,MIT,Copyright 2016 NodeRedis
dev,require-dir,MIT,2012-2015 Aseem Kishore
dev,retry,MIT,Copyright 2011 Tim Koschützki Felix Geisendörfer
dev,selfsigned,MIT,Paolo Fragomeni and Charles Bushong
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a dev dependency here, hope that’s ok

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok but it's missing Copyright <years> in front of the names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add the Copyright part because that’s implicit, but they have not put any years on that themselves. (Or at least I couldn’t find a copyright line.)

I’ll ask them in an issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alloy alloy force-pushed the fix-http-double-counting-on-node-older-than-v9 branch from e237277 to 1f990c1 Compare September 2, 2018 06:35
}
// Here we mark the `req` object, as opposed to also marking the `options` object like we do in `get`. The problem
// is that if we’d mark `options` here, it would get lost while upstream `request` derives a clone from it and
// thus would never get to the recursive invocation of `request`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code could be refactored to make the behavior clearer without the need for comments that detailed by using different code paths for the different scenarios. I tried to use the same code for both http and https but it seems it didn't necessarily make sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s definitely what I tried to do 😅

Maybe the comments are just making it harder to follow along? Or do you have more concrete suggestions of how to split them up?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioned an idea in the get/request discussion below, hopefully it makes sense :)

}
})
// Prevent from double-counting in `request`.
markAsTraced(options)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just wait until req is returned and mark it there? (same as request)

Copy link
Contributor Author

@alloy alloy Sep 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because get calls end on the req object before returning it, which is too late to inject headers.

(Misread, will leave a new comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because get() internally calls request() we need to signal to our patched version of request() that this request is already being traced so it can skip it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm it seems there is another problem with this, in that the span is created after calling request() and thus would exclude its execution time.

if we’d mark options here, it would get lost while upstream request derives a clone from it

Is this done by the plugin or by the https module?

Also, wouldn't it be possible to simply detect the protocol in the http patch, so that if it's https it skips instrumentation (since it's the handler specifically for http). Then thehttp handler could simply use the https handler and wrap it with this check.

Copy link
Contributor Author

@alloy alloy Sep 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm it seems there is another problem with this, in that the span is created after calling request() and thus would exclude its execution time.

The span is created on line 50 before invoking the actual implementation on lines 53 or 57.

if we’d mark options here, it would get lost while upstream request derives a clone from it

Is this done by the plugin or by the https module?

The https module https://github.com/nodejs/node/blob/v8.11.4/lib/https.js#L226-L237

Also, wouldn't it be possible to simply detect the protocol in the http patch, so that if it's https it skips instrumentation (since it's the handler specifically for http). Then the http handler could simply use the https handler and wrap it with this check.

Ah yes, I forgot to include this in my initial post, but my very first attempt basically did something along those lines, but it meant that:

  1. When patching the https module we also need to activate the http plugin to ensure that both would be patched. This worked fine, but led to friction when deciding what/how to unpatch.
  2. Because things work differently on Node < and >= v9, we’d need to conditionally patch some modules in some cases and not others. In and of itself this wasn’t really a big issue, because other plugins have version requirements too, it just added to the growing feeling of this not being the cleanest solution. (I had patched instrumenter.js to use the node version when a module couldn’t be located in node_modules.)

There was another hard issue I ran into but can’t remember offhand any longer, but it was probably the straw that broke the camel’s back and made me decide to take another stab at it from a different approach that would be agnostic to version specific issues, which is this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The span is created on line 50 before invoking the actual implementation on lines 53 or 57.

Sorry for the confusion I actually meant line 15

The https module https://github.com/nodejs/node/blob/v8.11.4/lib/https.js#L226-L237

Since we always convert the options to an object, the property would be copied if defined as enumerable (or with a simple assignment). The reasoning behind using Object.defineProperties was to hide very large data that we might add from console.log such as a span. Also, in this case, the options object is a copy anyway so any value assigned here will never be seen by the user.

When patching the https module we also need to activate the http plugin to ensure that both would be patched. This worked fine, but led to friction when deciding what/how to unpatch.

I don't see why this would be the case if each of them have their own individual wrappers.

Because things work differently on Node < and >= v9, we’d need to conditionally patch some modules in some cases and not others.

I would argue that I'd prefer a clear line of code to do that then unclear code and a comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion I actually meant line 15

Ah right, it should be ok because, as I understand, the request won’t actually start until end() is called.

The reasoning behind using Object.defineProperties was to hide very large data that we might add from console.log such as a span.

That’s good to know 👍

I would argue that I'd prefer a clear line of code to do that then unclear code and a comment.

Touché 😏

I’ll take another stab at that once I’m done with a patch for receiving sampling priority from the agent on top of #217.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once I’m done with a patch for receiving sampling priority from the agent

@alloy Please definitely let me know when you start working on that to avoid duplicating work as we were planning on adding that in the next few days.

Copy link
Member

@rochdev rochdev Sep 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alloy Did you make any progress on sampling priority? Otherwise that was the next item on my priority list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little bit. I’ll hit you up in Slack in a few minutes


appListener = server(app, port, () => {
// Activate a new parent span so we capture any double counting that may happen
const span = tracer.startSpan('http-test')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the two spans that are being emitted in the current release are siblings, thus the test would stop after the first one came in. Adding a parent span for them both makes sure the test takes all emitted spans into consideration.

@@ -48,6 +48,7 @@ dev,proxyquire,MIT,Copyright 2013 Thorsten Lorenz
dev,redis,MIT,Copyright 2016 NodeRedis
dev,require-dir,MIT,2012-2015 Aseem Kishore
dev,retry,MIT,Copyright 2011 Tim Koschützki Felix Geisendörfer
dev,selfsigned,MIT,Paolo Fragomeni and Charles Bushong
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok but it's missing Copyright <years> in front of the names.

@timrogers
Copy link
Contributor

I'd be keen to get this one out - we're seeing all of these double recordings of our traces 😦 : Let me know if I can do anything to help.

@alloy
Copy link
Contributor Author

alloy commented Sep 6, 2018

@timrogers FYI that we’re running this on our fork in production just fine, so that may be an option for you until this is finished and released.

@timrogers
Copy link
Contributor

@alloy Good shout! I'll do that.

*/
require('http')
}
},
unpatch
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rochdev Ok this is the simplified version that just checks the Node version and patches the http module instead of the https module. You said in Slack that you think the unpatching part is fine to be left untouched like this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hold off, I’m investigating some issues on at least Node v10.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not mistaken the signature changed in Node 10 too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fixed, I forgot that the instrumenter expects the functions to bind to it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, still appears to be an issue on Node v6. Looking into it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is that Node 10 accepts 3 parameters while Node 8 accepts only 2:

https://nodejs.org/docs/latest-v10.x/api/http.html#http_http_request_url_options_callback
https://nodejs.org/docs/latest-v8.x/api/http.html#http_http_request_options_callback

This is unrelated and can be done in another PR, but if it's an easy fix it would be great to fix all Node versions at once :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Yeah sure, I can get that done too 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

* not use externally patched versions, which is why we need to also patch `get` here separately.
*/
patch.call(this, http, 'get', tracer, config)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rochdev I tried to scour the Node release notes for when/how this changed by searching for things like ‘module’ and ‘binding’ but came up short. In any case, I tried to describe the behaviour I was able to deduce as well as possible, let me know if you’d like to amend this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably rephrase it simpler. Something like "Patch http.get as well in Node >=8 since it doesn't call http.request"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh but it does https://github.com/nodejs/node/blob/v10.10.0/lib/https.js#L284. It’s just that something changed in Node between v6 and v8 that somehow makes it so that other functions exported from a module (such as get) than the patched function (such as request) still refer to the original unpatched implementation. So get needs to explicitly be patched here to trace it.

I wasn’t able to find anything in the Node changelog, but also didn’t have a great idea of what to search for.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it does call the internal request yes, but it doesn't call http.request.

options.headers = options.headers || {}
return { uri, options, callback }
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did it this way to cut down on runtime work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the performance gain of optimizing such an already fast operation warrants increased code complexity. Did you benchmark this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. I’ll update 👍

* Below Node v9 the `https` module invokes `http.request`, which would end up counting requests twice.
* So rather then patch the `https` module, we ensure the `http` module is patched and we count only there.
*/
require('http')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If https requires http, why does it need to be additionally required here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably related to the change described here
https://github.com/DataDog/dd-trace-js/pull/268/files#r217523774. That is, even though https itself loads http it’s internally still referring to the unpatched version.

So maybe the Node change is that it makes an HTTP request very early on (before even running the user’s code?), before DataDog has the opportunity to patch anything? 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm it should be impossible that it refers to the unpatched version, unless we load http at some other point in the tracer before it's instrumented. This happened with other instrumented modules in the past. In that case the module cache should be wiped before the test. I'll take a look to try and find what could be causing this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems built-in modules are never stored in the cache, so they cannot be wiped. I'm still not 100% comfortable with this solution, but I don't see any other ways either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hear you, this type of faffing is why after going down this route the first time I figured to use an approach that did not do conditional patching. But you were right that that made the code a lot more complex.

options.headers = options.headers || {}
return { uri, options, callback }
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the performance gain of optimizing such an already fast operation warrants increased code complexity. Did you benchmark this?

* not use externally patched versions, which is why we need to also patch `get` here separately.
*/
patch.call(this, http, 'get', tracer, config)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably rephrase it simpler. Something like "Patch http.get as well in Node >=8 since it doesn't call http.request"

@rochdev rochdev merged commit 2d8f49f into DataDog:master Sep 14, 2018
@alloy alloy deleted the fix-http-double-counting-on-node-older-than-v9 branch September 17, 2018 12:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working community integrations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants