Redis connection to *** failed - connect ECONNREFUSED #88

Closed
Prinzhorn opened this issue Dec 4, 2013 · 9 comments

Comments

@Prinzhorn

Node v0.10.22 on Heroku
Newrelic 1.0.x (I will upgrade to 1.1.x if it matters)

I'm not sure if this is an issue with New Relic itself.

It seems like the connection to my remote Redis instance dropped for about two seconds, which is not an issue in itself; maybe the network had a hiccup.

What does concern me, though, is that over the course of about two seconds I got hundreds of these log entries with basically zero load on the server, which means that something else was going crazy over this, and I assume it was newrelic.

Error: Redis connection to *** failed - connect ECONNREFUSED
at RedisClient.on_error (/app/node_modules/redis/index.js:189:24)
at Socket.<anonymous> (/app/node_modules/redis/index.js:95:14)
at Socket.EventEmitter.emit (events.js:95:17)
at net.js:441:14
at /app/node_modules/newrelic/node_modules/continuation-local-storage/node_modules/async-listener/glue.js:177:31
at process._tickDomainCallback (node.js:459:13)
at process.<anonymous> (/app/node_modules/newrelic/node_modules/continuation-local-storage/node_modules/async-listener/index.js:18:15)

@groundwater
Contributor

Hi @Prinzhorn, it's hard to tell immediately if this is caused by the newrelic module or not. Due to the way the module instruments code, it will always end up in an error stack trace somewhere.
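
To see why, consider a toy wrapper (hypothetical code, not the agent's actual mechanism): once a callback has been wrapped for instrumentation, the wrapper frame appears in every stack trace that passes through it.

// Hypothetical illustration: an instrumentation wrapper ends up in any
// stack trace produced by the callback it wraps.
function instrument(fn) {
  return function wrapped() {
    // ...an agent would record timing/context here...
    return fn.apply(this, arguments);
  };
}

setTimeout(instrument(function () {
  throw new Error('boom'); // the stack will include `wrapped`
}), 0);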

Having said that, we are definitely interested in making sure the module is not interacting with your product in a negative way, so I am not ruling anything out from the start.

If you're willing to do a little troubleshooting we may be able to hunt down what's going wrong.

  1. Does the error occur every time you deploy or start your application?
  2. Does the error disappear if you do not include the newrelic module?

We provide a detailed log of what the agent is doing, but on Heroku there is no persistent file system, so I believe we stream the log to your heroku logs output. (Confirm, @othiym23?) Try setting the log level to trace in your newrelic.js file and grabbing as much log output as possible. Feel free to send a gzipped log file to us (jacob@newrelic.com and forrest@newrelic.com), and we'll take a look.
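
For reference, a minimal sketch of what the trace setting might look like in newrelic.js (the logging block follows the agent's documented config shape; the app name and key are placeholders):

exports.config = {
  app_name    : ['My App'],
  license_key : 'license key here',
  logging     : {
    // 'trace' is the most verbose level; on Heroku this output ends up
    // in your `heroku logs` stream rather than in a file on disk.
    level : 'trace'
  }
};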

@Prinzhorn
Author

I will try to create a test case locally. I don't want to mess with the production env in any way.

But on the other hand, will newrelic work locally? I don't want the stats to be mixed with the production stats.

@groundwater
Contributor

The data generated by the newrelic module is segmented by application name and license key. If you give your local application a different name from production, it will show up as a separate application in the web UI.

The newrelic.js file is a proper Node module, so you can include JavaScript code to change the application name based on the environment, for example:

var APP_NAME;
if (process.env.NODE_ENV === 'production') {
  APP_NAME = 'My App (Production)';
} else {
  APP_NAME = 'My App (Development)';
}

exports.config = {
  app_name    : [APP_NAME],
  license_key : 'license key here'
};
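
(If nothing in your environment sets NODE_ENV, you can set it at launch, e.g. NODE_ENV=production node server.js, with server.js standing in for your entry point; locally you'd leave it unset and report under the development name.)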

@Prinzhorn
Author

Never mind, the underlying error was caused by redis/node-redis#457. I've upgraded the redis module.

@Prinzhorn
Author

I'm reopening this because I just noticed the following: in those few seconds where the Redis instance was down, New Relic says I had 14.3k rpm (not sure what the peak in this particular minute was; about 29k, I'd guess, from the response time graph). It's usually orders of magnitude less than that.

I know that this was caused by an infinite loop (I didn't exit the process in the uncaughtException handler, so I basically ended up with an error every tick). So I guess New Relic can't do anything about it, but maybe just think about it for a moment and then close this issue. Maybe it will suddenly dawn on you that there's a flaw with the New Relic reporter as well.

@Prinzhorn reopened this Dec 10, 2013
@othiym23
Contributor

New Relic isn't originating any of these requests. It's just observing them. The default node-redis error-handling strategy is to make sure that the entire response is parsed before any errors are thrown, so it next-ticks them.
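
In simplified form (an illustrative sketch, not node-redis's actual source), next-ticking an error looks like this:

var EventEmitter = require('events').EventEmitter;

// Illustrative only: surface the error on the next tick so replies
// that were already parsed can still be delivered first.
function deferError(client, err) {
  process.nextTick(function () {
    client.emit('error', err);
  });
}

var client = new EventEmitter();
client.on('error', function (err) {
  console.error('error arrives on a later tick:', err.message);
});

deferError(client, new Error('connect ECONNREFUSED'));
console.log('this logs before the error handler fires');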

I'm not 100% sure, but I think the error count metric inside New Relic may count towards the RPM figure, which would account for the request spike you saw. I'm still working out with the team responsible for that part of the application whether that's the case. If so, it's a little confusing, but it's not really a bug in the Node module, just an inconsistency between how New Relic thinks about requests and how things happen in an asynchronously concurrent environment like Node's. (Not the first area in which this has happened.)

Unless you've got a reproducible test case we can look at, there's nothing we can act on here. Either way, thanks for the information!

@Prinzhorn
Author

I should have mentioned that the error rate/count was zero during this period. That's because during the processing of the error event there was an uncaught exception inside the redis module (a TypeError, as mentioned in the linked node-redis issue redis/node-redis#457), and New Relic never got the chance to process it, I guess.

> Unless you've got a reproducible test case we can look at, there's nothing we can act on here. Either way, thanks for the information!

I can try to create one by forcing a TypeError like (void 0).foo or whatever inside an error event, if that helps. The thing is that usually a TypeError would take down the Node process, unless you provide a custom uncaughtException handler. I have since removed the custom handler and let the process die in order to prevent these inconsistent states.
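
Something along these lines (a hypothetical repro sketch; the swallowing uncaughtException handler is the anti-pattern in question, and the reconnect loop just stands in for node-redis's retry behavior):

var EventEmitter = require('events').EventEmitter;

// Anti-pattern: swallowing uncaught exceptions keeps the process alive
// in an inconsistent state instead of letting it crash and restart.
process.on('uncaughtException', function (err) {
  console.error('swallowed:', err.message); // note: no process.exit(1)
});

var fakeClient = new EventEmitter();
fakeClient.on('error', function () {
  (void 0).foo; // force a TypeError inside the 'error' handler
});

// Stand-in for a reconnect loop: each attempt emits 'error', the
// handler throws, the exception is swallowed, and the next attempt
// fires anyway -- hundreds of log entries in a couple of seconds.
var attempts = 0;
(function reconnect() {
  if (attempts++ >= 5) return;  // bounded here; unbounded in production
  setTimeout(reconnect, 0);     // schedule the next attempt first,
  fakeClient.emit('error', new Error('connect ECONNREFUSED')); // then throw
})();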

@othiym23
Contributor

If your process is caught in a tight loop of uncaughtException listeners firing, it's probably not getting data out to New Relic in the first place. I feel like I'm missing something that would help me help you figure out what's going on. Sorry! 😒

Just FYI, v1.2.0 of New Relic for Node uses a completely different, much more unobtrusive mechanism for error-handling that should deal with this kind of weirdness better. If you haven't given it a spin yet, please do!

@Prinzhorn
Author

> If your process is caught in a tight loop of uncaughtException listeners firing, it's probably not getting data out to New Relic in the first place.

I definitely did not get 14k requests within one second, so something must have been sent to New Relic.

> Just FYI, v1.2.0 of New Relic for Node uses a completely different, much more unobtrusive mechanism for error-handling that should deal with this kind of weirdness better. If you haven't given it a spin yet, please do!

I will upgrade within the next few days!
