Cleanup sockets when creating new connections #46

Merged: 16 commits into kenperkins:master on Apr 29, 2016

Conversation

dmiddlecamp
Contributor

We noticed a bug where many thousands of simultaneous connections can be leaked. This patch closes those sockets and also prevents uncatchable exceptions from being thrown when many connection attempts are made in a short period of time.

@troy
Contributor

troy commented Oct 29, 2015

@kenperkins if we can help test and/or review this, I'm at your service

@kenperkins
Owner

@dmiddlecamp can you expand on what you're seeing? I'd like to better understand.

@dmiddlecamp
Contributor Author

Sure sure!

We found an installation running this module that had spawned many thousands of connections to Papertrail, which caused us to be rate limited. Since the TLS session is created independently of the TCP session, it's possible for the module to attempt hundreds (or thousands) of simultaneous connections while it's waiting on the TLS session to be established. These sockets aren't cleaned up, and the rate limiting wasn't enforced in this scenario. This patch guarantees that the module will never open more than one socket at any given moment, and will clean up the socket in the event that the TLS session doesn't succeed.

We were also seeing uncatchable crashes when many connection attempts were made quickly, so I've also added a fix for that and a unit test to confirm.

edit: When I say "an installation", I mean all copies of all our microservices in multiple cloud environments using this module. We had... a lot of connections open to Papertrail.

edit 2: I could certainly be wrong here if I misunderstood the cause, but some socket cleanup logic felt like a pretty safe, straightforward approach either way.

Thanks!
David
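
A minimal sketch of the approach described above, for readers following the thread: at most one socket is ever in flight, and a failed connection or TLS handshake is cleaned up with destroy() while an 'error' listener is attached so the failure cannot surface as an uncatchable exception. The names below (PapertrailLikeTransport, connect) are illustrative assumptions, not the PR's actual code.

var tls = require('tls');

// Illustrative sketch only; not the actual winston-papertrail implementation.
function PapertrailLikeTransport(host, port) {
  this.host = host;
  this.port = port;
  this.socket = null;      // the single live TLS socket, if any
  this.connecting = false; // true while a handshake is in flight
}

PapertrailLikeTransport.prototype.connect = function () {
  var self = this;

  // Guard: never start a second connection while one is open or in flight.
  if (self.socket || self.connecting) return;

  self.connecting = true;
  var socket = tls.connect({ host: self.host, port: self.port }, function () {
    self.connecting = false;
    self.socket = socket; // keep the socket only once TLS is established
  });

  // Without an 'error' listener, a failed connect or handshake becomes an
  // uncaught exception; destroy() tears the socket down without re-emitting.
  socket.on('error', function () {
    self.connecting = false;
    socket.destroy();
    self.socket = null;
  });
};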

// Reconstructed diff context (the try/if shape is inferred from the braces shown):
try {
  if (self.socket) {
    self.socket.destroy();
    self.socket = null;
  }
}
catch (e) { }
@kenperkins (Owner)

So we're just catching here and swallowing the error. I'm wondering if we should emit it?

@dmiddlecamp (Contributor Author)

So the destroy() here prevents the error from being thrown, as opposed to close() or end(). The try/catch does nothing to catch that socket error.

@dmiddlecamp (Contributor Author)

We could certainly emit an error event on the class object, and at least that would be catchable. But it wouldn't be the aforementioned uncatchable socket error from TLS.

@kenperkins (Owner)

That's what I was wondering; is there value in self.emit(e)? Not being super familiar with when socket.destroy() throws, I'm speculating.
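
For context on the tradeoff being discussed, here is a hypothetical sketch of the cleanup path with both options shown; SocketHolder and cleanupSocket are invented names, not the PR's code. Note that emit() takes an event name first, so the idea behind self.emit(e) would be written as self.emit('error', e), and an 'error' event with no listener attached is itself thrown by Node.

var util = require('util');
var EventEmitter = require('events').EventEmitter;

// Hypothetical illustration of the choice being debated; not the PR's code.
function SocketHolder() {
  EventEmitter.call(this);
  this.socket = null;
}
util.inherits(SocketHolder, EventEmitter);

SocketHolder.prototype.cleanupSocket = function () {
  try {
    if (this.socket) {
      this.socket.destroy();
      this.socket = null;
    }
  } catch (e) {
    // Option A (what the PR does): swallow; there is nothing left to recover.
    // Option B (floated above): surface it to any listener on the transport:
    //   this.emit('error', e);
    // Caveat: if nothing is listening for 'error', Node throws the error,
    // which would reintroduce the kind of crash this patch is trying to avoid.
  }
};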

@dmiddlecamp
Contributor Author

Also BTW, thank you for taking the time to review, much appreciated! :)

@kenperkins
Owner

Did we decide whether or not we should emit those two exceptions?

I've typically leaned toward emitting, so that if there's something weird going on, it bubbles up to a higher level where it might still be visible. I've learned that swallowing errors in your logging is generally bad :)

@dmiddlecamp
Contributor Author

I'm okay with emitting those if you feel strongly about it, but I generally only emit errors for things where something can or should be done. In a scorched-earth cleanup scenario, what recovery steps would we take out of band, beyond disposing of the sockets and re-establishing them?

@kenperkins
Owner

I think the only thing that could be done is logging something to the console for investigation purposes. And given that I don't even know what this will throw, let's not bother for now.

If we see evidence to the contrary, we can revisit.

@matteocontrini

I tend to agree with @dmiddlecamp; I wouldn't know what to do knowing that winston-papertrail had an internal error. Would I log that the logging library had an error? :P
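
For what an application can do with an emitted error: roughly following the usage shown in the winston-papertrail README of this era (winston 1.x API), an 'error' handler can be attached to the transport so that connection failures are logged locally, or deliberately ignored, instead of crashing the process. The host, port, and messages below are placeholders.

var winston = require('winston');
require('winston-papertrail').Papertrail;

var papertrail = new winston.transports.Papertrail({
  host: 'logs.papertrailapp.com', // placeholder destination
  port: 12345                     // placeholder port
});

// Handle (or deliberately ignore) transport errors locally so a Papertrail
// outage or rate limit never brings down the application itself.
papertrail.on('error', function (err) {
  console.error('papertrail transport error:', err.message);
});

var logger = new winston.Logger({
  transports: [new winston.transports.Console(), papertrail]
});

logger.info('this still reaches the console even if Papertrail is unreachable');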

@kenperkins
Owner

So be it :)

@dcollinsf5

Hi guys!
I believe this issue impacted our production app this past week during a Papertrail outage. Could you guys shed some light on this pull request? Does it resolve the many-connections issue, and if so, why isn't it merged?

@troy
Contributor

troy commented Apr 28, 2016

@dmiddlecamp could I trouble you to pull master and merge it? We're merging this and an unrelated PR into master, then letting folks use it for a few weeks before making a release.

@dmiddlecamp
Contributor Author

Sure thing, I'll take a look

@dmiddlecamp
Contributor Author

Looks like it was just a package.json conflict?

@troy troy self-assigned this Apr 29, 2016
@troy troy merged commit 1bc8933 into kenperkins:master Apr 29, 2016
@troy
Contributor

troy commented Apr 29, 2016

If you're reading this message, we'd love it if you could pull the current master of winston-papertrail and run it. We'll wait at least two weeks to see whether any issues come up from wider use of this and one other change. If you encounter problems with master that don't occur in the current release, please open a new GitHub Issue with the transcript and reference this PR.

@troy
Contributor

troy commented Aug 24, 2016

This is live on NPM in version 1.0.3.
