clj-pool-party

Simplistic, performant object pooling library for Clojure. Has no dependencies other than Clojure itself. Designed for multi-threaded environments. Supports virtual threads.

NOTE: While this library is being used in at least one production environment, the API is subject to change. Please provide any feedback you may have on the usefulness of this library with respect to your own object pooling use case.

Support this library

This library is fueled by dopamine hits generated by positive feedback. Show your support by "starring" the project. You can also DM me on the Clojurians Slack with your tales of how this library has been helpful to you.

Rationale

I was recently upgrading an application to JDK21 and was experimenting with virtual threads. A few of the libraries I was using for IO couldn't be used from virtual threads simply because of how they were using object pooling. I was able to work around these issues by writing my own object pool. In the process, I came to the conclusion that object pooling is neither complicated to implement nor should it be complicated to use.

  • Object pools should be thread-safe and virtual thread-safe
  • Object pools should be easy to use
  • One should be able to define a basic object pool using nothing but a generator function and a maximum size for the pool. Everything else is optional. Implementing an entire interface or learning the syntax of a fancy macro is needless complexity.

Does object pooling actually provide a performance benefit?

This depends on a number of factors. Suppose you have some arbitrary operation, f.

  1. Does f need to run in parallel?
  2. Does f require some object, obj, whose combined creation + cleanup time is more than half a millisecond?
  3. Is obj potentially reusable across invocations of f?

If you answered "yes" to these three questions, then object pooling with clj-pool-party is likely a good fit for your scenario.
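To answer the second question for your own object, a rough timing helper can be enough. The sketch below is not part of clj-pool-party; `avg-lifecycle-ms` and `worth-pooling?` are hypothetical names, and the measurement is naive wall-clock timing without JIT warm-up, so treat the result as an estimate only.

```clojure
;; Hypothetical helpers (not part of clj-pool-party) for estimating whether
;; an object's creation + cleanup cost crosses the half-millisecond mark.

(defn avg-lifecycle-ms
  "Average wall-clock milliseconds to create and clean up one object,
  measured over n runs (n must be positive)."
  [create-fn cleanup-fn n]
  (let [start (System/nanoTime)]
    (dotimes [_ n]
      (cleanup-fn (create-fn)))
    ;; nanoseconds -> milliseconds, then divide by the number of runs
    (/ (- (System/nanoTime) start) 1e6 n)))

(defn worth-pooling?
  "True when the average creation + cleanup time exceeds 0.5 ms."
  [create-fn cleanup-fn]
  (> (avg-lifecycle-ms create-fn cleanup-fn 100) 0.5))

;; Example: a fake object that takes about 2 ms to create and needs no cleanup.
(comment
  (worth-pooling? #(do (Thread/sleep 2) :expensive-obj) identity))
```

If the helper reports a cost well under half a millisecond, pooling is unlikely to pay for its own bookkeeping.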

Benchmarks

Preliminary, naive benchmarks show clj-pool-party to be faster than alternatives. See clj-object-pool-benchmarks for more details.

Why didn't you just use these other libraries?

  • pool - This library wraps the Apache commons object pool. This is annoying in situations where you're trying to keep your dependency footprint light or you end up needing a different, incompatible version of a transitive dependency for some reason. The library hasn't seen any real activity since 2014. It wasn't clear at first glance if it's thread safe / virtual thread safe and I'm not about to start searching Apache commons object pool code to figure that out. Also, I didn't see a way to set a maximum size for the object pool and for all of my use cases, there exists a reasonable upper bound to the size of the object pool. Furthermore, you return an object to the pool by passing that object back to the pool, so each object generated by your generator function must have a unique hash.
  • deepend - This library has some neat features, but it's heavily dependent on another library. The source code seems far more complicated than an object pool needs to be. The library also requires you to provide your own "key" for each object when calling acquire / return, and I'd rather just let the library generate and track the key on my behalf. This library hasn't seen any activity since January 2019.

Usage

Install from Clojars.

Import like so:

(ns com.example
  (:require [com.github.enragedginger.clj-pool-party.core :as pool-party]))

build-pool, with-object, and evict-all are the important functions. See their corresponding doc strings for more info after checking out the examples below.

Basic, contrived example

(ns com.example
  (:require [com.github.enragedginger.clj-pool-party.core :as pool-party])
  (:import (java.util UUID)))

;;manage at most 5 objects in the pool
(def max-size 5)

;;generate a new UUID when necessary
(defn gen-fn []
  (UUID/randomUUID))

;;build the pool
(def pool (pool-party/build-pool gen-fn max-size))

;;execute an arbitrary function in the context of an object from the pool
(pool-party/with-object pool
  (fn [uuid]
    (println (str "borrowing obj: " uuid))))

Health check example

(def id-atom
  (atom 0))

;;define a 0-arg generator function
;;clj-pool-party will call this function whenever it needs a new object for the pool
(defn sample-gen-fn []
  (let [new-id (swap! id-atom inc)]
    {:id new-id}))

;;define a health check function that takes an object from the pool as an argument
;;and returns a non-truthy value iff the object is considered unhealthy and should
;;be removed from the pool
(defn health-check-fn [x]
  (println "checking" x (-> x :id even?))
  (-> x :id even?))
  
;;construct a pool of max-size 5
;;borrow-health-check-fn will be called whenever we're about to re-use
;;an object from the pool. If it returns a non-truthy value, that object
;;will be removed from the pool and we'll acquire a different one.
;;
;;NOTE: borrow-health-check-fn is not called when a new object
;;has been created by calling `gen-fn`. clj-pool-party assumes that any
;;new, unused instances from `gen-fn` are healthy.
;;
;;return-health-check-fn will be called whenever we're returning an object
;;to the pool. If it returns a non-truthy value, that object
;;will be removed from the pool
(def pool-ref (pool-party/build-pool sample-gen-fn 5
                {:borrow-health-check-fn health-check-fn
                 :return-health-check-fn health-check-fn}))

;;Run a function in the context of an object from the pool
(pool-party/with-object pool-ref
  (fn [obj]
    (println "borrowing obj:" obj)))

Using clj-pool-party to make calls to an HTTP server using Hato and virtual threads

(ns com.example
  (:require [com.github.enragedginger.clj-pool-party.core :as pool-party]
            [hato.client :as client])
  (:import (java.net.http HttpClient)
           (java.util.concurrent Executors)))

;;limit to just 5 concurrent requests
(def max-size 5)

;;virtual threads are great for IO tasks
(def vthread-executor (Executors/newVirtualThreadPerTaskExecutor))

;;this function will be called any time the pool needs a new HttpClient instance
(defn build-fn []
  (client/build-http-client {:executor vthread-executor}))

(defn close-fn [^HttpClient http-client]
  (.close http-client))

;;wait at most 1 second for a connection from the pool and then throw an exception
(def wait-timeout-ms 1000)

;;build our pool
(def pool (pool-party/build-pool build-fn max-size {:close-fn close-fn
                                                    :wait-timeout-ms wait-timeout-ms}))

;;example of using the pool to acquire an HttpClient instance and make an HTTP call
(pool-party/with-object pool
  (fn [^HttpClient http-client]
    (client/get "https://www.github.com" {:http-client http-client})))

;;When you're done using a pool, you can remove and close all of the
;;objects in the pool by calling `evict-all`.
;;If `:close-fn` was defined when the pool was created, `evict-all`
;;will pass each instance in the pool to the `close-fn`.
;;NOTE: `evict-all` doesn't swallow errors thrown by `close-fn`.
;;If `close-fn` is likely to throw, handle those errors inside of
;;`close-fn`; otherwise, `evict-all` isn't guaranteed to clean up
;;all resources.
(pool-party/evict-all pool)
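Since `evict-all` doesn't swallow errors from `close-fn`, one way to handle them inside `close-fn` is to wrap it. The following is a minimal sketch, not something the library provides; `safe-close-fn` is a hypothetical name, and printing to stderr stands in for whatever logging you actually use.

```clojure
;; Hypothetical wrapper (not part of clj-pool-party): returns a close-fn that
;; catches exceptions so one failing object doesn't interrupt the others.
(defn safe-close-fn
  [close-fn]
  (fn [obj]
    (try
      (close-fn obj)
      (catch Exception e
        ;; Report the failure on stderr and carry on.
        (binding [*out* *err*]
          (println "failed to close" obj "-" (ex-message e)))
        nil))))

;; Assumed usage with the options map shown above:
;; {:close-fn (safe-close-fn close-fn)
;;  :wait-timeout-ms wait-timeout-ms}
```

Catching `Exception` rather than `Throwable` is a deliberate choice here, so that JVM errors such as `OutOfMemoryError` still propagate.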

Example scenario 1

I have an application that makes frequent calls to third-party APIs. It uses the JDK's built-in HttpClient via the Hato library. Establishing a connection to an HTTP server is known to be expensive, so there are potential performance improvements to be gained by re-using connections in the pool when calling the same server repeatedly.

Here's a table of the time required to make these calls when comparing pooled API calls vs non-pooled (i.e. new-connection-per-request) calls:

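| Simultaneous calls | New connection per request | clj-pool-party |
|--------------------|----------------------------|----------------|
| 10                 | 350 ms                     | 241 ms         |
| 20                 | 746 ms                     | 252 ms         |
| 30                 | 1041 ms                    | 395 ms         |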

Please note that this particular API server limits clients to 5 concurrent requests per account. For the clj-pool-party use case, we obey this restriction by setting max-size to 5. For the "new connection per request" route, we set up a basic Semaphore with 5 permits.
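The benchmark code itself isn't shown here, so the following is only a sketch of what a Semaphore-based limiter for the "new connection per request" route might look like. `permits` and `with-permit` are hypothetical names, and the commented usage is an assumption modeled on the Hato example above.

```clojure
(import '(java.util.concurrent Semaphore))

;; At most 5 callers may hold a permit at any one time.
(def permits (Semaphore. 5))

(defn with-permit
  "Acquires a permit from sem, calls the 0-arg function f, and always
  releases the permit afterwards, even if f throws."
  [^Semaphore sem f]
  (.acquire sem)
  (try
    (f)
    (finally
      (.release sem))))

;; Assumed usage: build a fresh HttpClient for every request.
;; (with-permit permits
;;   (fn []
;;     (client/get "https://www.github.com"
;;                 {:http-client (client/build-http-client {})})))
```

Both routes cap concurrency at 5; the difference is that the Semaphore route pays the client setup cost on every call.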

Using a connection pool for HttpClient instances when making repeated, parallel calls to the same server provides performance benefits that scale with the number of concurrent requests.

Dev

clj-pool-party has a mixture of unit and property-based tests. If you see a gap in test coverage, feel free to contribute a PR with additional tests. I've designed this library with JDK 21 and virtual threads in mind; you won't be able to run the tests without a JVM that has virtual threads enabled (i.e. JDK 21, or JDK 19 with preview features enabled). However, the runtime itself has no dependency on virtual threads, so you should be able to run this on Java 8 if you really want to.

License

Copyright © 2023 Enraged Ginger LLC

This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0.

This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at https://www.gnu.org/software/classpath/license.html.