Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make pling thread safe by using a connection pool and thread local storage #2

Merged
merged 4 commits into from Jul 2, 2014

Conversation

mrnugget
Copy link
Contributor

This is an attempt to make the gem thread-safe by changing the way the gateways handle their connections.

Since every gateway is in practice a singleton to which every thread has access when calling Pling.deliver race-conditions and data corruptions might arise.

Especially the long-lasting connection in the APN gateway poses a problem. But the caching of connections in C2DM and GCM gateways via instance variables is not totally thread-safe either.

This change uses a connection pool for APN, so that each thread gets its own connection (which is kept alive) as soon as it wants it.

In the C2DM and GCM gateways this implementation uses thread local storage to cache the connection object instead of an instance variable.

What do you think?

The connections to APN are the only connections in pling that are
steadily held open (Apple requires this).

Without a connection pool there might be race-conditions when using
pling in a multi-threaded context.

This adds a connection pool so that only one thread at a time can use an
open connection to APN.
@benedikt
Copy link
Contributor

I like the connection pool for APN, that certainly makes sense. 👍

I'm not entirely sure about the changes to C2DM and GCM. Are you sure those are a problem as well? As far as I know, Faraday is thread-safe and the "connection" as it is now, is basically a factory for requests, which are encapsulated in their own instances. What problem am I missing here? :)

@mrnugget
Copy link
Contributor Author

I don't think you're missing anything. I was wondering the same thing: is this necessary? The reason for my change was this: I couldn't find anything that said that Faraday is thread-safe. And: ||= is definitely not thread-safe. So I changed it to use the thread-local storage, to be on the safe side, so to say.

My other attempt was to not cache the connection at all. That would create a new object every time a request is made, but as from my understanding of the Ruby GC that shouldn't be a problem, since it's basically a local object that can be swept on every run.

I actually implemented that and then changed it to Thread.current in the last minute. That's why the specs are now failing: changed it to Thread.current, didn't rerun the tests and now they're failing. Will try to fix it.

Or do you think using no cache is the better solution?

Instead of caching the connections in the C2DM and GCM gateways in
instance variables, we use the thread local storage provided by
`Thread.current`.

The reason for that is thread safety, since gateway objects are
practically singleton objects that can be used by multiple threads at
the same time.

With this change each thread that calls `deliver` on a gateway gets a
fresh connection object and doesn't interfer with the other threads'
state.
@mrnugget
Copy link
Contributor Author

Fixed the specs and did a force push.

@benedikt
Copy link
Contributor

You're right, ||= is not thread-safe… However in this case, the worst thing that could happen is that an existing object would be replaced by another one, with the same configuration. It can't be set to nil accidentally, can it?

It's not that I'm totally opposed to Thread.current, I'm just very careful when to use it, because it's basically a global variable for the entire thread. As pling itself is not spawning threads on its own, we can't be sure what else Thread.current is used for.

Not using a cache might be an option too, yes. However it might be a problem when sending a lot of messages in a short amount of time. But then again there are probably other issues as well in this case…

One more thing that came to my mind: What about wrapping the gateway in a pool instead of the connections? This way pling could - in a way - handle the "thread safety" for all gateways, making it much simpler to develop thread-safe gateways…?

@benedikt
Copy link
Contributor

One more thing about Thread.current: While it might be a rare use-case, using Thread.current will prevent having multiple instances of the GCM and C2DM gateways with different configurations.

@mrnugget
Copy link
Contributor Author

Thinking about it some more, I agree with your concern over Thread.current, especially since we're talking about a gem here. Using Thread.current here is essentially hiding a global.

And I think the most reasonable approach, at least in my opinion would just be to not cache the Faraday connection. That's the most simple and easiest to understand solution.

Using a connection pool for each gateway could either slow sending a lot of messages down (since threads had to wait for the gateway, even if it's just a stateless HTTP connection) or cause a higher memory usage. Even though the approach is still nice and does make things easier, I'd prefer to not cache the connections in GCM/C2DM. Opinions?

@fabrik42
Copy link
Contributor

Ok, here is what I'm thinking about:

I share the concerns about Thread.current, especially because this would introduce strange and subtle behavior when initializing multiple objects with different configurations. At the moment, when initializing a second instance, you could pass another endpoint to the initializer and the existing connection would be silently dropped (including its endpoint).

The connection pool is a good idea, but I think in general you want a connection pool not to be inaccessible (wrapped inside a private method of a class), because this means you can't share the connection pool, which is kind of the point of the connection pool concept.

The current implementation allows sharing the connections over multiple threads, but not over multiple instances in the same thread. I think the right way to implement this would be to separate configuration/connection from the rest of the class, so you can have one connection pool by configuration und reuse the pool in other instances - similar to the way ActiveRecord does it.

But I'm not sure if this is beyond this PR and we should tackle the problem another time.

Please note that the TravisCI builds fail, because REE and 1.8.7 are not supported by celluloid (anymore?). Also the RBX build fails, but for other reasons I guess.

I think we can drop support for REE and 1.8.7. Not sure what to do about RBX as I don't use it.

@fabrik42
Copy link
Contributor

So I'd say:

  • Go with the connection pool
  • Don't use Thread.current, but recreate the connection every time to make it thread safe
  • Drop support for REE and 1.8.7
  • Because I don't know anyone that uses this lib with Rubinius: Drop support for RBX as well. If someone needs it we can re-add it later (but I doubt it).

@benedikt
Copy link
Contributor

Looks like we agree about Thread.current then. While I still think that there's not really an issue with using the instance variable and ||=, I'm fine with simply dropping the connection entirely.

I don't agree with implementing the connection pool as a public method. The idea is to have multiple connections available, so that the gateway can use different connections for different threads. There's no reason why anything else but the gateway should use those connections.

About the REE and 1.8.7 support: I agree that it's safe to drop them.

About the Rubinius issue: Simply change the .travis.yml to state rbx-2 instead of rbx-19mode. This should fix the current issue. It would be sad to add thread-safety and drop support for one of the few ruby implementations that allows native threads ;-)

@fabrik42
Copy link
Contributor

Just for the record: I didn't mean to make the connection just a public method but to separate it from the rest of the class entirely. And this has a point, because at the moment, if you initialize 5 instances of an APN gateway, you will have 5 connection pools with the exact same configuration, which makes no sense.

Connection pools normally have to serve both purposes: multiple instances and multiple threads.
You wouldn't be too happy if ActiveRecord would open up a new connection pool for every instance ;)

@benedikt
Copy link
Contributor

In pling a gateway is basically what an adapter is in Active Record. It takes the credentials and connects to the push service, while implementing a standard interface to send messages via this particular service. The APN::Connection is just an implementation detail of the gateway, just like PG::Connection is an implementation detail of the PostgreSQL adapter in ActiveRecord. There's not much use in having multiple gateways of the same type with the same credentials. Therefore sharing the connection pool does not make much sense either.

One example use-case of having multiple gateways of the same type is to send push notifications to different iOS apps. (Think: Facebook App/Facebook Pages App or Foursquare / Swarm)

Using Thread.current is basically using a hidden global variable. This
is simpler and safer.
@mrnugget
Copy link
Contributor Author

Alright, pushed the changes. I don't get the error ruby-head is producing, but the rest seems fine. I'd like to remove ruby-head and add 2.0 and 2.1 explicitly.

@fabrik42
Copy link
Contributor

fabrik42 commented Jul 2, 2014

@benedikt Yes, these are good points. It's still some kind of unexpected behavior for me, but we can discuss this another time.

@mrnugget Thanks!

fabrik42 added a commit that referenced this pull request Jul 2, 2014
Make pling thread safe by using a connection pool and thread local storage
@fabrik42 fabrik42 merged commit 9ee6bf6 into flinc:master Jul 2, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants