HTTP client connection pool timeout doesn't free up queue #713

DerGuteMoritz · 2024-02-12T11:20:39Z

Problem

The pool-timeout option of aleph.http/request controls how long the caller is willing to await a connection to be acquired from the connection pool. When it expires, the response deferred is rejected with a PoolTimeoutException. However, the "acquire" operation will still clog up the pool's queue until it gets its (by then unnecessary) turn.

Reproduction test case (in aleph.http-test):

(deftest test-pool-timeout
  (let [starved-pool (http/connection-pool
                      {:total-connections 0
                       :max-queue-size 1})]
    (try
      (is (thrown? PoolTimeoutException
                   @(http-get "/" {:pool starved-pool :pool-timeout 2})))
      (is (thrown? PoolTimeoutException
                   @(http-get "/"  {:pool starved-pool :pool-timeout 2})))
      (finally
        (.shutdown ^Pool starved-pool)))))

The second assertion fails with:

expected: (thrown?
           PoolTimeoutException
           @(http-get "/" {:pool starved-pool, :pool-timeout 2}))
   error: java.util.concurrent.RejectedExecutionException

Solution

Eagerly removing the acquire operation from the pool's queue would require changing io.aleph.dirigiste.Pool#acquire to support this. Its (internal) queue already supports this via cancelTake, it just isn't reachable from the public API.

Alternatively: Perhaps the current behavior is fine / good enough? It just happened to surprise me but that could also be addressed with a docstring change.

The text was updated successfully, but these errors were encountered:

KingMob · 2024-02-15T10:33:17Z

Roping in @alexander-yakushev in case they have a comment.

It would be nice to cancel the pending acquire if it's taken too long, but it would also be fine w me to xform the rejected ex into a pool timeout ex.

If you exhaust all conns in reality, you're going to get exceptions of some sort.

alexander-yakushev · 2024-02-15T12:00:22Z

I have a few comments about this indeed.

I found that in practice (in my work, I can be biased), Aleph's approach to having a separate queue for acquiring connections is unnecessary. Caching connections is vital because you can't keep creating and disposing them at a high rate (due to TIME_WAIT), but limiting the total number of them is not as important. I personally set the connection limit to 16k per host and pool queue size to 0, so that a fresh connection is immediately created for the acquirer each time there are no free connections in the pool, up until the limit is hit at which point it's an error.

There are some cases where you would use the pool with the queue as a parallelism controller. I think all those cases could be rewritten to something like Semaphores if Aleph connection pools offered no queueing.

So, this is my take. In my world, the breaking Aleph2 would have connection pools with no acquisition queues. Such pools are much much easier to implement, way more performant at high loads, and give the user one less option and timeout to think about. But I understand that people can have other usecases where this won't fly.

alexander-yakushev · 2024-02-15T12:13:01Z

Sorry I haven't answered the original question. I think the current behavior is fine. I doubt anybody fine-tunes :pool-timeout and handles it somehow specially, it is an awkward option to give to the user and work with.

DerGuteMoritz · 2024-02-16T15:15:36Z

Thanks for chiming in @alexander-yakushev, very good insights! I agree that the acquire queue and timeout are quite awkward and needlessly complicate matters. I also agree that limiting parallelism should rather be handled by the caller/application itself. For example, if I'm already using core.async, I'd rather use that instead of having to also deal with RejectedExecutionException and PoolTimeoutException. I wonder whether this must necessarily be considered a breaking change: Most users probably won't notice much of a difference and, as pointed out by @KingMob, they will already have to deal with exceptions due connection exhaustion in some manner anyway. And we could just keep around the old pool implementation (maybe move to a namespace which isn't loaded by default) so users who do rely on that specific behavior can still swap it back in. I filed #716 to track this idea.

As for status quo: Given all of the above, I think it's indeed fine to keep the behavior as-is. Closing but @KingMob feel free to re-open if you'd still like to see a change!

DerGuteMoritz changed the title ~~HTTP connection pool timeout doesn't free up queue~~ HTTP client connection pool timeout doesn't free up queue Feb 12, 2024

DerGuteMoritz mentioned this issue Feb 16, 2024

Drop HTTP client connection pool's acquire queue #716

Open

DerGuteMoritz closed this as completed Feb 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HTTP client connection pool timeout doesn't free up queue #713

HTTP client connection pool timeout doesn't free up queue #713

DerGuteMoritz commented Feb 12, 2024

KingMob commented Feb 15, 2024

alexander-yakushev commented Feb 15, 2024 •

edited

Loading

alexander-yakushev commented Feb 15, 2024 •

edited

Loading

DerGuteMoritz commented Feb 16, 2024 •

edited

Loading

HTTP client connection pool timeout doesn't free up queue #713

HTTP client connection pool timeout doesn't free up queue #713

Comments

DerGuteMoritz commented Feb 12, 2024

Problem

Solution

KingMob commented Feb 15, 2024

alexander-yakushev commented Feb 15, 2024 • edited Loading

alexander-yakushev commented Feb 15, 2024 • edited Loading

DerGuteMoritz commented Feb 16, 2024 • edited Loading

alexander-yakushev commented Feb 15, 2024 •

edited

Loading

alexander-yakushev commented Feb 15, 2024 •

edited

Loading

DerGuteMoritz commented Feb 16, 2024 •

edited

Loading