Avoid using shared connection pool #15

matsluni · 2019-10-16T08:00:02Z

With the current implementation, a shared connection pool per ActorSystem is used for all requests.

It's probably better to have a dedicated connection pool per host.
See akka/alpakka#1958 and akka/alpakka#1983 for similar issue and PR.

gabfssilva · 2019-10-17T14:43:59Z

Any idea how to deal with multiple regions? I wonder, since each region has a different host, so it would have to handle multiple connection pools depending on how the AWS client is used.

matsluni · 2019-10-18T08:03:02Z

Hi @gabfssilva, thanks for giving this some thought. Yes, we would need multiple pools, one per AWS service endpoint (and possibly several regions per service).

A first naive idea that comes to mind is to take the service URL from httpRequest.uri and build a map/cache of ServiceUrl -> ConnectionPool. I don't know how feasible this is, though, since the lookup would be in the hot code path for every request.

gabfssilva · 2019-10-18T11:39:55Z

That's what I thought too. A synchronized map should be enough. Well, I'll think of something.
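The map/cache idea discussed above could be sketched roughly as follows. This is only an illustration under assumptions: `Endpoint`, `Pool`, and `PoolCache` are hypothetical names, `Pool` is a stand-in rather than an Akka HTTP type, and a `ConcurrentHashMap` is used in place of a synchronized map so that a pool is created at most once per host.

```scala
import java.net.URI
import java.util.concurrent.ConcurrentHashMap

// Hypothetical cache key: one pool per host and port.
final case class Endpoint(host: String, port: Int)

// Hypothetical stand-in for a per-host connection pool; not an Akka HTTP type.
final class Pool(val endpoint: Endpoint)

object PoolCache {
  private val pools = new ConcurrentHashMap[Endpoint, Pool]()

  // Derive the key from the request URI; fall back to the scheme's default port.
  def endpointOf(uri: URI): Endpoint = {
    val port =
      if (uri.getPort != -1) uri.getPort
      else if ("https".equalsIgnoreCase(uri.getScheme)) 443
      else 80
    Endpoint(uri.getHost, port)
  }

  // Return the pool for this URI's host, creating it atomically on first use.
  def poolFor(uri: URI): Pool = {
    val endpoint = endpointOf(uri)
    pools.computeIfAbsent(endpoint, _ => new Pool(endpoint))
  }

  def size: Int = pools.size
}
```

With this shape, every request pays for one hash lookup; whether that cost matters in the hot path would still have to be measured.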

matsluni · 2019-10-23T07:59:27Z

I had another idea for how a design for this could look.

What if we extend the builder of the Akka async client with something like withCachedPoolSettings (a more suitable name may exist), where we let the user provide the endpoints and regions used in user code? From this, we construct the map of cachedConnectionPools, and for each request it's just a simple lookup, without any thread synchronization needed.

We can also decide whether to fail with an exception if an endpoint is not in the map, or to fall back to the sharedPool.

This approach makes it configurable for the user and avoids the potential synchronization performance penalty.
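A minimal sketch of that design, assuming the user lists the hosts up front. All names here (`PoolRegistry`, `MissingEndpointPolicy`, `Pool`) are hypothetical and not part of the client's actual builder API; `Pool` is a stand-in for a real connection pool.

```scala
// Hypothetical stand-in for a connection pool; not an Akka HTTP type.
final class Pool(val name: String)

// What to do when a request targets a host that was not configured.
sealed trait MissingEndpointPolicy
case object FailOnMissing extends MissingEndpointPolicy
case object FallBackToShared extends MissingEndpointPolicy

// The map is built once and never mutated, so lookups need no synchronization.
final class PoolRegistry(hosts: Set[String], policy: MissingEndpointPolicy) {
  private val sharedPool = new Pool("shared")
  private val pools: Map[String, Pool] =
    hosts.map(host => host -> new Pool(host)).toMap

  def poolFor(host: String): Pool =
    pools.get(host) match {
      case Some(pool) => pool
      case None =>
        policy match {
          case FallBackToShared => sharedPool
          case FailOnMissing =>
            throw new IllegalArgumentException(
              s"No connection pool configured for host '$host'")
        }
    }
}
```

The trade-off is the one raised in the thread: the lookup is cheap, but the user has to know every host in advance.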

WDYT?

gabfssilva · 2019-10-24T03:52:55Z

I think it can be done. The only problem is that the user would need to know which domains they need to set up. Each AWS service has a different domain, and using a "fake AWS" also implies using different endpoints.
I fear it would become complex.

Instead of using a synchronized map, we could use an actor to handle the pools:

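// If the pool does not exist, it is created here.
val pool = (pools ? Gimme(domain)).mapTo[Pool]

for {
  p <- pool
  r <- p.offer(request, promise)
  // Check `r` to see whether the request was queued.
  // Wait on the promise here so the result is a single Future, not a nested one.
  response <- promise.future
} yield response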

I ran a POC over here and it worked quite well, but it is hard to measure any performance penalty compared with the singleRequest approach. The only issue is that the first request will always be much slower than the following ones, though I'm not sure whether that already happens with singleRequest.
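To illustrate the create-on-first-request behaviour without pulling in Akka, here is a dependency-free sketch that swaps the actor for a single-threaded executor. This is not the POC mentioned above; `PoolManager`, `gimme`, and `Pool` are hypothetical names. The point it shares with the actor version is that all access to the map is serialized on one thread, and callers get the pool back as a `Future`.

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import scala.collection.mutable
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical stand-in for a connection pool; not an Akka HTTP type.
final class Pool(val domain: String)

final class PoolManager {
  // A single daemon thread plays the role of the actor's mailbox processing.
  private val executor = Executors.newSingleThreadExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "pool-manager")
      t.setDaemon(true)
      t
    }
  })
  private val single: ExecutionContext = ExecutionContext.fromExecutor(executor)

  // Only ever touched from the single thread above, so a plain mutable map is safe.
  private val pools = mutable.Map.empty[String, Pool]

  // Rough counterpart of `pools ? Gimme(domain)`: the pool is created on first request.
  def gimme(domain: String): Future[Pool] =
    Future(pools.getOrElseUpdate(domain, new Pool(domain)))(single)
}
```

The first call for a domain pays the pool creation cost; later calls only pay for the hand-off to the manager thread.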
