Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Blocking calls swallow InterruptedException; thread-pools cannot be shutdown cleanly #266

Closed
olttwa opened this issue May 19, 2022 · 7 comments

Comments

@olttwa
Copy link

olttwa commented May 19, 2022

Issue

When making blocking calls like BLPOP from a thread-pool, carmine doesn't exit when InterruptedException is thrown to shutdown the pool. Instead, it continues polling redis until the defined timeout.

How to reproduce

(ns foobar.redis
  (:require
    [taoensso.carmine :as car :refer (wcar)]
    [com.climate.claypoole :as cp]))

(def redis-conn {:pool {} :spec {:uri "redis://localhost:6379/"}})
(defmacro wcar* [& body] `(car/wcar redis-conn ~@body))

(def pool (cp/threadpool 1))
(defn blpop-fn [pool]
  ; Block until 100s or an element is present in the list.
  (println "Long-polling redis...")
  (println
    (wcar* (car/blpop "random-list" 100000)))
  (println "Exiting the loop..."))

(cp/future pool (blpop-fn pool))

; Despite sending an InterruptedException,
; blpop-fn will not exit until 100s.
(cp/shutdown! pool)

Suggestions

Celtuce and Obiwan expose a (close-conn) function.
Can Carmine also expose a similar method? Right now, the connection pool is encapsulated with only Carmine having access to release-conn method

@ptaoussanis
Copy link
Member

@olttwa Hi Akshat,

; Despite sending an InterruptedException, blpop-fn will not exit until 100s.

Correct, in general throwing an InterruptedException will not automatically break any active connections. To do so would be quite dangerous (e.g. risk data corruption or leaving data in an inconsistent state).

Redis (and so Carmine) don't by default treat blocking commands differently to non-blocking commands. In particular, it could be that a blocking command is used as part of normal command pipeline.

Celtuce and Obiwan expose a (close-conn) function. Can Carmine also expose a similar method? Right now, the connection pool is encapsulated with only Carmine having access to release-conn method

Carmine does provide the ability to close connection pools (pools implement java.io.Closeable), and connections can be manually closed with carmine.connections/release-conn (pool method) or carmine.connections/close-conn (connection method) - though for both you'd need to provide the specific connection which the default wcar API doesn't normally give you access to.

It would be helpful to understand better exactly what you're trying to achieve.

Is there a reason you'd want to open such a long-held blocking command? If you're looking to implement polling behaviour, the typical pattern would be to set up a much shorter blocking call, then loop while (not (Thread/interrupted)), etc.

This way you don't need to force terminate any open connections (which can be risky).

I'll also note that your example as-is would not print "Exiting the loop..." even if the connection were force-terminated; you'd need that println to be in a finally block.

Hope that helps!

@olttwa
Copy link
Author

olttwa commented Jul 25, 2022

Hello @ptaoussanis,
Thanks for your response. Apologies for not getting back earlier.

It would be helpful to understand better exactly what you're trying to achieve.

I'm building a sidekiq-like background-job processing library for Clojure Goose.
When doing a graceful shutdown, a server stops accepting more traffic. In the same manner, a worker should stop pulling new jobs from queue. Hence, the need to close connections immediately.

typical pattern would be to set up a much shorter blocking call, then loop while (not (Thread/interrupted))

This is exactly what I ended up building: link to code.
However, a long-poll of 300 seconds timeout will have better performance+SLIs compared to 300 polls of 1 second timeout.

connections can be manually closed

I tried doing this but no luck. The connection still remain active. Please find my code:

(defmacro close
  {:arglists '([conn-opts :as-pipeline & body] [conn-opts & body])}
  [conn-opts]
  ; I expect to receive same connection pool since call to `conn-spec` & `conn-pool` is memoized.
  `(let [[pool# conn#] (conns/pooled-conn ~conn-opts)]
     ; Despite closing the connection pool, BLPOP stays active.
     (.close pool#)
     ; Releasing works on a single connection from the pool, hence not useful. Right?
     (conns/release-conn pool# conn#)
     (conns/close-conn conn#)))

; Call above macro with same conn used for BLPOP.
(close conn)

Can you suggest a way of closing a connection in Carmine? We're aiming for a long-polling design, not multiple short-polls.

Thanks for your help 😄

@lukaszkorecki
Copy link

Just adding my 2c - having explicit connection management would make Carmine work more like JDBC, RabbitMQ and other client libraries and make it easier to integrate with tools like Component.
Also, this comment summarizes it very well: #224 (comment)

@ptaoussanis
Copy link
Member

@lukaszkorecki Thanks for the input Łukasz, will be looking at this and several other issues for the next Carmine release (which is now next on my open-source TODO list).

@ptaoussanis
Copy link
Member

@olttwa Hi Akshat. I'm sorry, I'm afraid I'm struggling to understand your linked code or example macro.

How are you expecting to use this? It looks like you're borrowing a connection then immediately returning and closing it? Is there a reason this is a macro? It might be helpful if you could provide an example of how you're trying to call this.

I'll try address some code comments in your code below-

; Despite closing the connection pool, BLPOP stays active.

Correct, closing the connection pool currently just calls this close method - which will prevent new connections from being borrowed, but will not interrupt any connections currently in use (e.g. by a long-running blpop call).

The next version of Carmine will include an option for a hard pool shutdown that'll also interrupt in-use connections after a timeout.

; Releasing works on a single connection from the pool, hence not useful. Right?

Releasing a connection just returns it to the pool, but I'm not sure what you mean here by "not useful", sorry.

To interrupt an active connection, you can call conns/close-conn (or java.io.Closeable's .close) on the connection. That'll immediately close the underlying socket, which will interrupt any pending requests (e.g. blpop calls).

You'd need to get a hold of the connection though, so that you can call conns/close-conn on it.
That's currently needlessly tricky to do with the current API, though it is possible - you can see the wcar code for an example, and modify that for your needs.

Just make sure you pass out your polling connection returned from conns/pooled-conn so that you can close it as needed.

Apologies, I know that the connections API is currently poorly documented and quite opaque - especially if you want to do something outside of the typical use case.

@ptaoussanis
Copy link
Member

Closing due to inactivity, hopefully my responses above answer the questions posed here.
To summarise:

  • Closing a connection will indeed interrupt any blocking Redis commands.
  • Getting a hold of the connection to close it is a little awkward in the current Carmine v3 API. This will be easier with the upcoming v4 API, but in the meantime one could copy and tweak the code from the wcar macro.
  • In general though I'd recommend against the pattern of starting blocking calls without a timeout, or with an excessively long timeout. The typical pattern would be to long-poll for some reasonable amount of time (e.g. several seconds to a couple minutes), timeout of there's no activity, then loop (with or without a backoff, as appropriate). The time to establish a new poll will be milliseconds or less if pooling, and this'd avoid all kinds of issues with connections going bad.

Cheers!

ptaoussanis added a commit that referenced this issue Dec 3, 2022
…onstructor in main ns

Should hopefully help with the confusion re: pooling behaviour.
@olttwa
Copy link
Author

olttwa commented May 8, 2023

@ptaoussanis Apologies for not responding earlier & thanks for your detailed response.

The documentation in this commit & your issue comments have addressed my issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants