-
Notifications
You must be signed in to change notification settings - Fork 29.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TypeError: Cannot read property 'enter' of undefined #30122
Comments
In 12.13.0 the line would appear to be 79: |
@emhagman Is that all the stack trace you get? Is there any chance you could figure out what type of resource the hook is called for? |
@addaleax This is the only stack trace that shows up, unfortunately. I can try to add a lot more logging and get some more context around when it happens but that is about it. Can I run |
@emhagman Can you dump async hooks output to a file (à la https://github.com/addaleax/node/blob/ee4027d5bbabe87d4446a62ec0b7b60e7af59c30/test.js#L3-L18 except not |
@addaleax Sure thing I'll get it to you as soon as I can. Thanks for the help! |
Did Github delete a comment here? Anyway; just thinking out loud… The debug log shows that there is no I’m not really sure how to best debug this further – one thing you could try is running node with |
@addaleax Sorry, I deleted the comment because I left the I'm not sure if it helps but we can anecdotally confirm (other users in that thread) that not using the |
@emhagman I’m not personally familiar with their code, although you can of course also report this as an issue to them if you haven’t already (and ideally add a link this one). |
@addaleax I have and they have deemed it an issue with Node, unfortunately. Since this has to do with GC, are there any ways to force a situation where this is more likely to occur? As of now, I have no idea how to reproduce it other than to use our app until it breaks |
@emhagman You could set the value of the |
@addaleax I have the logs with the resources. Adding
Not sure if that helps. Working on getting the dump from my container now EDIT: We use ECS Fargate on AWS so that will be impossible. I'll have to try and reproduce this locally so I can get access to the dump. |
@emhagman Alright, this makes a bit more sense now. The I’m assuming that 3d9d1ad is responsible for this … any chance you could verify that the bug was introduced in Node v12.3.0? Also, if it helps with reproducing: If i’m correct, this will mostly happen when the |
@addaleax I believe others have already reproduced this bug in 12.3.0 and mentioned reverting to 12.2.0 fixes it in the Sentry issue thread. If I can make this happen on demand I'll be able to be positive about that. What you said about the https agent makes perfect sense, we have a client that uses the same https agent with keepAlive on to the same host to try to save on connections to one of our internal services. That definitely helps with me getting closer to a reproducible bug. I'll keep digging, thanks again for the help. |
@emhagman If you’re in a position to try out patches to Node.js, could you try this one? diff --git a/lib/_http_agent.js b/lib/_http_agent.js
index dcb5ed376de8..f4d9bb7fe36a 100644
--- a/lib/_http_agent.js
+++ b/lib/_http_agent.js
@@ -44,10 +44,15 @@ const {
// ClientRequest.onSocket(). The Agent is now *strictly*
// concerned with managing a connection pool.
+const kReusedHandle = Symbol('kReusedHandle');
+
class ReusedHandle {
constructor(type, handle) {
this.type = type;
this.handle = handle;
+ // Tie the lifetime of the two objects together, mostly for the 'domain'
+ // module.
+ handle[kReusedHandle] = this;
}
}
|
As soon as I can reproduce it easily I'll try the patch 👍 It appears the library in question also uses |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
@addaleax I am still trying to reproduce it but I noticed I linked to old code. They currently use https://github.com/getsentry/sentry-javascript/blob/master/packages/node/src/transports/https.ts#L18 at Sentry. The Does that help you at all? |
I'm following this issue as well as I'm really interested in solving it, just one thing about the keepalive flag and the httpsagent. Ist the agent also responsible for regular requests in express? I'm asking because the domain module is used in the Middleware of express, and as far as I understand the sentry client is using the http agent also for transferring the results back the the sentry backend, but in this case the domain is not that relevant. It is in the middleware though ... Assuming this is correctl, the "right way" of reproducing the issue would be sending parallel requests to your server so that the agent gets reused after some calls. Not sure though,just my 2 cents. |
@simllll That particular agent is not responsible for regular requests in Here is my current code I am using to try to reproduce the issue. Using const https = require('https');
const fs = require('fs');
const util = require('util');
const crypto = require('crypto');
const async_hooks = require('async_hooks');
// async_hooks debug dump code...
function debug(...args) {
fs.writeFileSync('/dev/stdout', `ASYNC_HOOK_DEBUG: ${util.format(...args)}\n`, { flag: 'a' });
}
async_hooks.createHook({
init(id, type, triggerId, resource) {
debug('init', { id, type, triggerId, resource });
},
before(id) {
debug('before', { id });
},
after(id) {
debug('after', { id });
},
destroy(id) {
debug('destroy', { id });
},
}).enable();
const httpsAgent = new https.Agent({ keepAlive: false, maxSockets: 5, timeout: 2000 });
const httpsOpts = {
protocol: 'https:',
method: 'POST',
hostname: 'postman-echo.com',
path: '/post',
port: 443,
agent: httpsAgent
};
function imitateSentrySend(httpModule, event) {
return new Promise((resolve, reject) => {
const opts = Object.assign(httpsOpts, event.options);
const req = https.request(opts, function(res) {
res.setEncoding('utf8');
if (res.statusCode >= 200 && res.statusCode < 300) {
resolve({ status: res.statusCode });
} else {
reject(new Error(`Request failed with statusCode: ${res.statusCode}`));
}
res.on('data', () => {});
res.on('end', () => {});
});
req.on('error', reject);
req.end(JSON.stringify(event));
});
}
(async function main() {
while (true) {
const id = crypto.randomBytes(8).toString('hex');
const event = { id, msg: 'Some random error has occurred, sending to Sentry...' };
const fakeArray = Array(Math.ceil(Math.random() * 4)).fill(Math.random());
const promises = fakeArray.map(value => {
const delay = Math.round(Math.random() * 2);
const opts = delay ? { path: '/delay/' + delay, method: 'GET' } : {};
return imitateSentrySend(https, { ...event, value, options: opts });
});
await Promise.all(promises);
}
})(); |
@emhagman I think it’s the combination of both that’s making this problematic.
Yeah, I don’t think this is going to crash the way it does in your production code without using domains…
@simllll I don’t know for sure, but I wouldn’t think that it is being used for those. |
@addaleax Gotcha. Do I need to bind anything in particular or just be running this inside my own domain?
Sorry if I seem ignorant, I have never really worked with domains directly before. My basic understanding is they're used to hold context and to prevent errors from bubbling up to the main event loop. |
@emhagman I would think that that’s enough but that doesn’t really mean that I’m sure that it’ll work as a reproduction. (I’ll try to put one together myself, too – likely tomorrow though.)
Yeah, that was kind of the original idea behind them when they were introduced. |
@emhagman I'm only on my phone, but my thoughts would be to send the error (your imitatesentrysend) in the on error handler of the domain, and inside the run async method just throw a lot of errors with some context (e.g. create objects that you attach to the error object). They should get picked up by the error handler, where the agent will try to send the data somewhere. I also would think of a way to make the transfer to the backend as slow as possible so that gc can kick in while sending the data. Maybe try attach some megabytes of Data to the error object so that it will take a while ;) |
@simllll Thanks for the tips. I am probably going to take a break for a bit. This is what I currently have if anyone wants to take a stab at it. I've set it up exactly as Sentry, with the request middleware and the error middleware and used a local domain just like they do in their request middleware. It then "sends the errors" to the echo service to imitate Sentry. I've used the same With all of this, and using Run the app with
const express = require('express');
const https = require('https');
const fs = require('fs');
const util = require('util');
const crypto = require('crypto');
const async_hooks = require('async_hooks');
const domain = require('domain');
// async_hooks debug dump code...
function debug(...args) {
fs.writeFileSync('/dev/stdout', `ASYNC_HOOK_DEBUG: ${util.format(...args)}\n`, { flag: 'a' });
}
async_hooks.createHook({
init(id, type, triggerId, resource) {
debug('init', { id, type, triggerId, resource });
},
before(id) {
debug('before', { id });
},
after(id) {
debug('after', { id });
},
destroy(id) {
debug('destroy', { id });
},
}).enable();
const httpsAgent = new https.Agent({ keepAlive: false, maxSockets: 30, timeout: 2000 });
const httpsOpts = {
protocol: 'https:',
method: 'POST',
hostname: 'postman-echo.com',
path: '/post',
port: 443,
agent: httpsAgent
};
function imitateSentrySend(httpModule, event) {
return new Promise((resolve, reject) => {
const opts = Object.assign(httpsOpts, event.options);
const req = https.request(opts, function(res) {
res.setEncoding('utf8');
if (res.statusCode >= 200 && res.statusCode < 300) {
resolve({ status: res.statusCode });
} else {
reject(new Error(`Request failed with statusCode: ${res.statusCode}`));
}
res.on('data', () => {});
res.on('end', () => {});
});
req.on('error', reject);
req.end(JSON.stringify(event));
});
}
const app = express();
// Fake Sentry RequestHandler (binds domain)
app.use(function(req, res, next) {
const local = domain.create();
local.bind(req);
local.bind(res);
local.on('error', next);
local.run(() => {
next();
});
});
// Normal express route
app.get('/debug-sentry', function(req, res) {
const error = new Error('Random error');
error.randomData = crypto.randomBytes(1024 * 1024); // 1mb
throw error;
});
// Fake ErrorHandler, always sending event
app.use(function(err, req, res, next) {
if (err) {
const id = crypto.randomBytes(8).toString('hex');
const event = { id, msg: 'Some random error has occurred, sending to Sentry...' };
imitateSentrySend(https, { ...event, data: err.randomData }); // they don't await in their code either
next(err);
} else {
next();
}
});
app.listen(9999);
const http = require('http');
function r() {
return new Promise((resolve, reject) => {
const req = http.get('http://localhost:9999/debug-sentry', (resp) => resolve(resp));
req.on('error', reject);
});
}
(async function main() {
while (true) {
const promises = Array(100).fill(0).map(() => r());
try {
await Promise.all(promises);
} catch (err) {
console.log('Failed to resolve promises', err);
}
}
})(); |
Fwiw, I’m somewhat optimistic about being able to put together a reproduction myself, and then using this as a regression test. In the worst case, we could still apply something like the patch above and hope for the best, although it would of course be nice to actually confirm that it does fix a real issue. |
@addaleax Sounds good! I didn't know if you needed definitive proof to able to put it into the codebase which is why I was asking. I know I keep saying it but thanks for the help! This is the only thing preventing us from upgrading to the higher number 12.X and we'd love to be able to use the |
@saniagh Ran into this issue without Sentry. Of note, we both use |
Indeed, I've also checked to see if any module uses the I've upgraded our version of I'll be back if I get something. |
If `asyncReset()` is used to specify an alternative resource object to mark a re-used socket in the HTTP Agent implementation, store that object and keep it alive, because domains rely on GC tracking for resource objects to manage their own lifetimes, and previously that resource object might have been garbage-collected too early, leading to crashes. Fixes: nodejs#30122
@addaleax Thanks so much 💯 I appreciate you taking the time to look into and fix it. Your test code is very helpful in learning how to debug/reproduce things like this in the future. Thanks! EDIT: Any idea on how long this will take to get cut into a release for 12.X? |
@emhagman So, typically the way this works is that the PR takes at least 48 hours to land unless explicitly being fast-tracked (I don’t think this qualifies as a trivial change), plus the time to the next Current/13.x release (Tuesday according to nodejs/Release#487); and then the LTS rules says that commits need to have been released two weeks before being backported into LTS, but if there’s a strong case for doing so sooner, that’s usually not an issue for a low-risk patch like this one. |
@addaleax Got it, thanks 👍 |
Awesome job @addaleax, thank you! :) |
If `asyncReset()` is used to specify an alternative resource object to mark a re-used socket in the HTTP Agent implementation, store that object and keep it alive, because domains rely on GC tracking for resource objects to manage their own lifetimes, and previously that resource object might have been garbage-collected too early, leading to crashes. Fixes: #30122 PR-URL: #30196 Reviewed-By: Vladimir de Turckheim <vlad2t@hotmail.com> Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
If `asyncReset()` is used to specify an alternative resource object to mark a re-used socket in the HTTP Agent implementation, store that object and keep it alive, because domains rely on GC tracking for resource objects to manage their own lifetimes, and previously that resource object might have been garbage-collected too early, leading to crashes. Fixes: #30122 PR-URL: #30196 Reviewed-By: Vladimir de Turckheim <vlad2t@hotmail.com> Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
If `asyncReset()` is used to specify an alternative resource object to mark a re-used socket in the HTTP Agent implementation, store that object and keep it alive, because domains rely on GC tracking for resource objects to manage their own lifetimes, and previously that resource object might have been garbage-collected too early, leading to crashes. Fixes: #30122 PR-URL: #30196 Reviewed-By: Vladimir de Turckheim <vlad2t@hotmail.com> Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
We still see the issue with node:12.14.1 (and above), @sentry/node and express
|
We still see this issue. I don't understand enough about the node garbage collection, but this does not seem to work. Keeping a strong reference myself to the domain makes the problem go away:
|
I am also still seeing this issue. However I cannot always reproduce this result. Maybe it is a race condition or something? It occurs here: https://github.com/nodejs/node/blob/v14.18.1/lib/domain.js#L97
|
We ran into this in node 16 in our CI the other day: Specifically, it was in 16.12.0. Couldn't repro it locally, and restarting the job a few times fixed it, so agree that it's flaky. |
Can confirm that this or something like it remains in play. Same stack trace as those newer reports above, under both v17.1.0 and v16.13.0:
However, running the same script under v12.22.7 does work as expected. I can very reliably reproduce the problem on newer node versions, so that's something. I suppose I should open a new issue entirely......? |
Version: 12.13.0
Platform: Alpine 3.9 (Linux x86_64) on AWS ECS
Subsystem: domain
We use
@sentry/node
to log our errors inexpress
, which is using thedomain
module here:https://github.com/getsentry/sentry-javascript/blob/master/packages/node/src/handlers.ts#L271
So far it is extremely hard to reproduce but if we run our production server for around 10 minutes, our application will crash with the above error and no other stack trace. At first, we thought it was related to #28275 but it appears to be different as it looks like the fix for this landed around
12.8.0
and we can confirm that this happens in12.13.0
still.Versions
12.2.0
or below do not have this issue.I believe the actual line is 78, not 76 which can be seen here:
https://github.com/nodejs/node/blob/master/lib/domain.js#L78
EDIT: Thanks @richardlau, line 79 in v12.13.0 https://github.com/nodejs/node/blob/v12.13.0/lib/domain.js#L79
The text was updated successfully, but these errors were encountered: