New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] "when" 3.7.7 package makes context get mixed up between concurrent HTTP requests #17

Closed
josieusa opened this Issue Dec 28, 2016 · 7 comments

Comments

Projects
None yet
3 participants
@josieusa
Collaborator

josieusa commented Dec 28, 2016

Link to strongloop/loopback#2871

Description of feature (or steps to reproduce if bug)

If using the when@3.7.7 package, the context may get lost mixed up between concurrent HTTP requests, as seen many, many times in similar issues, for example strongloop/loopback#2597 strongloop/loopback#2397. Sorry if I won't repeat myself.

Link to sample repo to reproduce issue (if bug)

Sorry, I'm going to reproduce the issue more properly using loopback-sandbox. Meanwhile, if someone wants to try to reproduce it, he/she can see this comment:
strongloop/loopback#2871 (comment)

Expected result

Actual result (if bug)

Additional information (Node.js version, LoopBack version, etc)

Important: I'm using my fork of loopback-context from PR #11 and following its updated README:
https://github.com/josieusa/loopback-context/blob/feature/cls-hooked-3/README.md
(PR 11 merged)
One note: a similar issue with async package was solved using by my PR #11 (it's tested), but the PR doesn't solve this other issue, so it's evident that when makes different things than async.
I don't know if the point in the code where the context gets lost is this one, but it may be:
https://github.com/cujojs/when/blob/3.7.7/lib/env.js#L25

@josieusa josieusa changed the title from [BUG] "when" package makes context get lost to [BUG] "when" 3.7.7 package makes context get lost Dec 29, 2016

@josieusa

This comment has been minimized.

Show comment
Hide comment
@josieusa

josieusa Jan 2, 2017

Collaborator

If I manage to include a solution to this issue in loopback-context, I'll post it here.
Maybe I could have opened this issue on continuation-local-storage instead.
In fact, the bug is likely inherited from that module (it was foreseeable).
Many issues there, in fact, happen because the context gets lost when some library (for example, a Promise library) uses a queue internally (or a connection pool, which is not the case unless you're using a DB connector, I guess).
Here is where the when package uses a queue:
https://github.com/cujojs/when/blob/3.7.7/lib/Scheduler.js#L20
Here is how to search for similar issues in continuation-local-storage:
https://github.com/othiym23/node-continuation-local-storage/issues?utf8=%E2%9C%93&q=is%3Aissue%20is%3Aopen%20queue
For example, here is a similar issue:
othiym23/node-continuation-local-storage#72#issuecomment-231448918

Collaborator

josieusa commented Jan 2, 2017

If I manage to include a solution to this issue in loopback-context, I'll post it here.
Maybe I could have opened this issue on continuation-local-storage instead.
In fact, the bug is likely inherited from that module (it was foreseeable).
Many issues there, in fact, happen because the context gets lost when some library (for example, a Promise library) uses a queue internally (or a connection pool, which is not the case unless you're using a DB connector, I guess).
Here is where the when package uses a queue:
https://github.com/cujojs/when/blob/3.7.7/lib/Scheduler.js#L20
Here is how to search for similar issues in continuation-local-storage:
https://github.com/othiym23/node-continuation-local-storage/issues?utf8=%E2%9C%93&q=is%3Aissue%20is%3Aopen%20queue
For example, here is a similar issue:
othiym23/node-continuation-local-storage#72#issuecomment-231448918

@josieusa

This comment has been minimized.

Show comment
Hide comment
@josieusa

josieusa Jan 2, 2017

Collaborator

I just found a workaround to this issue!!!
🎉
This lets you use when, though I strongly discourage to use when instead of bluebird. But it's very interesting to know.

In @vijay22sai's code here, just replace this:

promise.then(() => {
  console.log('get query: ', LoopBackContext.getCurrentContext().get('query'));
  res.status(200).send('ok');
});

with this:

const ns = LoopBackContext.createContext('loopback');
const lbc = LoopBackContext.getCurrentContext();
const get = ns.bind(lbc.get.bind(lbc));
promise.then(() => {
  console.log('get query: ', get('query'));
  res.status(200).send('ok');
});

Explanation: the promise.then() method from when package is known to lose context. (The reason is that it uses a queue internally, as many other packages with which continuation-local-storage has issues. Another issue is with connection pools, but this is not the case).

The ns.bind() method from continuation-local-storage and cls-hooked (also exported by loopback-context) fixes this, if applied on the method loopbackContext.get()which is (indirectly) called back by the bad method.

The trick is to "capture" both the LoopBackContext.getCurrentContext() result and the loopbackContext.get function (not the result), and to call ns.bind() on get (the function, not the result), before the context gets lost, i.e. before you call the "bad" method. Once the context is lost, it's lost!

A good place to do this would be at the beginning of every user-defined middleware that uses the context. But this is not a trivial task. And it's not foolproof. In fact, some third-party middleware (i.e. neither one of yours, nor one from Strongloop) could lose context. And remember: once the context is lost, it's lost, no matter who wrote the middleware that lost it.

Another trick is to write const get = ... instead of loopbackContext.get = ... or even global.get = .... The result of ns.bind() MUST be assigned to a variable in the local scope, not to a field of some other variable. I don't know why, maybe it's related to the value of the this variable when the function gets called. But the reason is not important. Just follow the rule. And of course, use this get variable; do not use the original function loopbackContext.get anymore in the code after you assigned get.

Same thing for loopbackContext.set, even if I didn't show it here.

Also, I called Function.prototype.bind() on loopbackContext.get(), just because it's an instance method. This is no surprise, if you read Function.prototype.bind()docs on MDN.

In general, if the package foo has a method bar, which loses context, and it accepts an asynchronously called callback cb, and getCurrentcontext() is called directly inside cb, you can try this.

It's good to know that a workaround exists, even in difficult cases like this! But I need to investigate this, because it's error-prone and needs a lot of care.

Maybe there is still some room for improvement?

Collaborator

josieusa commented Jan 2, 2017

I just found a workaround to this issue!!!
🎉
This lets you use when, though I strongly discourage to use when instead of bluebird. But it's very interesting to know.

In @vijay22sai's code here, just replace this:

promise.then(() => {
  console.log('get query: ', LoopBackContext.getCurrentContext().get('query'));
  res.status(200).send('ok');
});

with this:

const ns = LoopBackContext.createContext('loopback');
const lbc = LoopBackContext.getCurrentContext();
const get = ns.bind(lbc.get.bind(lbc));
promise.then(() => {
  console.log('get query: ', get('query'));
  res.status(200).send('ok');
});

Explanation: the promise.then() method from when package is known to lose context. (The reason is that it uses a queue internally, as many other packages with which continuation-local-storage has issues. Another issue is with connection pools, but this is not the case).

The ns.bind() method from continuation-local-storage and cls-hooked (also exported by loopback-context) fixes this, if applied on the method loopbackContext.get()which is (indirectly) called back by the bad method.

The trick is to "capture" both the LoopBackContext.getCurrentContext() result and the loopbackContext.get function (not the result), and to call ns.bind() on get (the function, not the result), before the context gets lost, i.e. before you call the "bad" method. Once the context is lost, it's lost!

A good place to do this would be at the beginning of every user-defined middleware that uses the context. But this is not a trivial task. And it's not foolproof. In fact, some third-party middleware (i.e. neither one of yours, nor one from Strongloop) could lose context. And remember: once the context is lost, it's lost, no matter who wrote the middleware that lost it.

Another trick is to write const get = ... instead of loopbackContext.get = ... or even global.get = .... The result of ns.bind() MUST be assigned to a variable in the local scope, not to a field of some other variable. I don't know why, maybe it's related to the value of the this variable when the function gets called. But the reason is not important. Just follow the rule. And of course, use this get variable; do not use the original function loopbackContext.get anymore in the code after you assigned get.

Same thing for loopbackContext.set, even if I didn't show it here.

Also, I called Function.prototype.bind() on loopbackContext.get(), just because it's an instance method. This is no surprise, if you read Function.prototype.bind()docs on MDN.

In general, if the package foo has a method bar, which loses context, and it accepts an asynchronously called callback cb, and getCurrentcontext() is called directly inside cb, you can try this.

It's good to know that a workaround exists, even in difficult cases like this! But I need to investigate this, because it's error-prone and needs a lot of care.

Maybe there is still some room for improvement?

@vijay22sai

This comment has been minimized.

Show comment
Hide comment
@vijay22sai

vijay22sai Jan 2, 2017

@josieusa Awesome!! It's great to know that a workaround exists for some third-party module like when which is not cls aware. I will test this approach with my code. However I will continue to use bluebird instead of when. You've done a great research on this, Thank you.

vijay22sai commented Jan 2, 2017

@josieusa Awesome!! It's great to know that a workaround exists for some third-party module like when which is not cls aware. I will test this approach with my code. However I will continue to use bluebird instead of when. You've done a great research on this, Thank you.

@josieusa

This comment has been minimized.

Show comment
Hide comment
@josieusa

josieusa Jan 3, 2017

Collaborator

I'm going to include included this plus tests in another PR.

Here is a more elegant solution (ES5-compatible). This code should be placed outside the middlewares (an "util" module could be ok):

var oldGetCurrentContext = LoopBackContext.getCurrentContext;
LoopBackContext.getCurrentContext = function() {
var context = oldGetCurrentContext();
return context ? {
get: context.bind(context.get).bind(context),
set: context.bind(context.set).bind(context),
} : null;
};

You must call LoopBackContext.getCurrentContext() only at the very first line of every user-defined middleware which uses the context:

app.get('/t', function(req, res, next) {
  var context = LoopBackContext.getCurrentContext(); // **only** at the first line of the middleware!
  ....
 // Example of cls-unaware code
 promise.then(() => {
   console.log('get query: ', context.get('query'));
   res.status(200).send('ok');
 });

Only this way, if you call a method from a "cls-unaware" package (like when) later, i.e. after the first line, no problem, since you already bound the get and set methods on the first line, in a foolproof way.

Don't forget that other middlewares/connectors that you didn't write yourself could still lose context, because of course they don't know about this workaround, and therefore they don't follow this rule. Issues happening in that code (which should be open source) should be easier to track, though.

Anyway, it's good to know that the loss of context won't happen anymore inside user code.
The worst source of bugs will be gone (if tests say so)!
🎉

Collaborator

josieusa commented Jan 3, 2017

I'm going to include included this plus tests in another PR.

Here is a more elegant solution (ES5-compatible). This code should be placed outside the middlewares (an "util" module could be ok):

var oldGetCurrentContext = LoopBackContext.getCurrentContext;
LoopBackContext.getCurrentContext = function() {
var context = oldGetCurrentContext();
return context ? {
get: context.bind(context.get).bind(context),
set: context.bind(context.set).bind(context),
} : null;
};

You must call LoopBackContext.getCurrentContext() only at the very first line of every user-defined middleware which uses the context:

app.get('/t', function(req, res, next) {
  var context = LoopBackContext.getCurrentContext(); // **only** at the first line of the middleware!
  ....
 // Example of cls-unaware code
 promise.then(() => {
   console.log('get query: ', context.get('query'));
   res.status(200).send('ok');
 });

Only this way, if you call a method from a "cls-unaware" package (like when) later, i.e. after the first line, no problem, since you already bound the get and set methods on the first line, in a foolproof way.

Don't forget that other middlewares/connectors that you didn't write yourself could still lose context, because of course they don't know about this workaround, and therefore they don't follow this rule. Issues happening in that code (which should be open source) should be easier to track, though.

Anyway, it's good to know that the loss of context won't happen anymore inside user code.
The worst source of bugs will be gone (if tests say so)!
🎉

@bajtos

This comment has been minimized.

Show comment
Hide comment
@bajtos

bajtos Jan 3, 2017

Member

I am adding a mention of this problem to project's README - see #18.

@josieusa Thank you for investigating this problem and describing possible workarounds!

To be honest, they all look a bit hacky to me and I am little bit reluctant to include them in the official documentation. I think this issue is only emphasising the fact that continuation local storage in Node.js is simply not ready for prime time yet :(

I'm going to include this plus tests in another PR.

Let's see what you come up with and discuss the details in the pull request. If there is a reasonable way how to make loopback-context more robust, then I think it makes sense to implement it, even if it does not get us to 100% reliability.

Member

bajtos commented Jan 3, 2017

I am adding a mention of this problem to project's README - see #18.

@josieusa Thank you for investigating this problem and describing possible workarounds!

To be honest, they all look a bit hacky to me and I am little bit reluctant to include them in the official documentation. I think this issue is only emphasising the fact that continuation local storage in Node.js is simply not ready for prime time yet :(

I'm going to include this plus tests in another PR.

Let's see what you come up with and discuss the details in the pull request. If there is a reasonable way how to make loopback-context more robust, then I think it makes sense to implement it, even if it does not get us to 100% reliability.

josieusa added a commit to josieusa/loopback-context that referenced this issue Jan 3, 2017

@josieusa josieusa referenced this issue Jan 3, 2017

Merged

Add bind:true option to getCurrentContext #19

3 of 3 tasks complete
@josieusa

This comment has been minimized.

Show comment
Hide comment
@josieusa

josieusa Jan 3, 2017

Collaborator

@bajtos

continuation local storage in Node.js is simply not ready for prime time yet :(

This has been discussed many times. I agree that it doesn't work out of the box. I didn't investigate enough the issues with connection pools yet, but I'll have to, sooner or later. Regarding the other issues with queues, I'm not sure that it all can't be solved (in an acceptable way, with some boilerplate) in user code; let's see how PR #19 goes.

Collaborator

josieusa commented Jan 3, 2017

@bajtos

continuation local storage in Node.js is simply not ready for prime time yet :(

This has been discussed many times. I agree that it doesn't work out of the box. I didn't investigate enough the issues with connection pools yet, but I'll have to, sooner or later. Regarding the other issues with queues, I'm not sure that it all can't be solved (in an acceptable way, with some boilerplate) in user code; let's see how PR #19 goes.

josieusa added a commit to josieusa/loopback-context that referenced this issue Jan 4, 2017

@josieusa josieusa changed the title from [BUG] "when" 3.7.7 package makes context get lost to [BUG] "when" 3.7.7 package makes context get mixed up between concurrent HTTP requests Jan 4, 2017

josieusa added a commit to josieusa/loopback-context that referenced this issue Jan 10, 2017

@josieusa

This comment has been minimized.

Show comment
Hide comment
@josieusa

josieusa Feb 23, 2017

Collaborator

I believe this is fixed... just waiting for someone to take a look at my PR
EDIT fixed in d5f977f

Collaborator

josieusa commented Feb 23, 2017

I believe this is fixed... just waiting for someone to take a look at my PR
EDIT fixed in d5f977f

josieusa added a commit that referenced this issue Feb 24, 2017

Add bind option to getCurrentContext()
Add bind:true option to getCurrentContext

Workaround issue #17 with cujojs/when v3.7.7 due to the fact
that it internally uses a mechanism of user-land queuing for
async-related operations, and possibly issues with similar
packages using queuing (mostly Promise libraries)

@josieusa josieusa closed this Feb 24, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment