Skip to content
Destructuring plus type hinting for more performant clojure.
Clojure
Branch: master
Clone or download
tom
Latest commit afb6954 Jan 1, 2020
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
doc initial commit Dec 29, 2019
src/structural wierd keyword injection error Dec 31, 2019
test/structural initial commit Dec 29, 2019
.gitignore initial commit Dec 29, 2019
CHANGELOG.md initial commit Dec 29, 2019
LICENSE initial commit Dec 29, 2019
README.org Updated readme, spelling Dec 29, 2019
project.clj Optimized let generation, :as for vectors Dec 31, 2019

README.org

structural

This is a small library that provides macros to help convert destructured code into efficient code via type hints.

One of the big problems one runs into in practice is that the expressiveness of destructuring - while elegant - typically renders inefficient code from the clojure compiler. The default is more generic, polymorphic code. If we know the types and are willing to hint to optimize for performance, we’d like to have clojure just efficiently unpack our structures.

This is particularly important for code that lives on a hot loop, where (idiomatic) forms can lead to a “death by a thousand cuts” situation.

installation

https://img.shields.io/clojars/v/structural.svg https://clojars.org/structural

Add the current version from clojars to your project.clj or deps.edn.

(ns blah 
  (:require [structural.core :as s])
  (:import [clojure.lang Indexed Counted IPersistentMap]))

(defprotocol IBlah 
 (blah [this]))

(defrecord point [^long x ^long y]
   IBlah
   (blah [_] (+ x y)))

(s/with-slots [[^IPersistentMap m ^Indexed v ^point p]  [{:a 2 :b 3} [4 5 6] (->point 1 2)]
               {:fields [count] :keys [a b]}       m
               [c d e]                             v
               {:fields [x y blah]}                p]
  (+ a b c d e blah count))

Intro

Functions like:

(defn add [[x y]]
  (+ x y))

will incur a cost in destructuring, as the code will expand to something like…

(defn add [xy]
  (let [x (nth xy 0)
        y (nth xy 1)]
    (+ x y))

where clojure.core/nth is polymorphic and has to run through several tests to determine what the input is and how to coerce it to an appropriate operation for indexed lookup.

A faster route would be something like..

(defn add [^clojure.lang.Indexed xy]
  (let [x (.nth xy 0)
        y (.nth xy 1)]
    (+ x y))

wherein we know the type of the input will always be something supporting Indexed, thus supporting the .nth method, thus something we can to the compiler to emit a direct method invocation.

For smallish function optimizations, this isn’t too bad, but it can get hairy for nested destructuring. Ideally, we’d like to preserve the destructuring forms, and broaden the compiler’s knowledge on how to emit efficient code in the face of type information (to included nested types):

For an initial cut, we define the with-slots macro to establish bindings that respect type hints but act like let bindings for destructuring purposes.

structural.core/with-slots

Allows for efficient, type-based destructuring similar to the idiomatic destructuring forms of Clojure, with some limitations. Bindings are presented as the typical vector, with an even number of entries, where the preceding odd binding establishes binds for the even successor. Unlike typical forms, bindings leverage type-hinting information - both on the left hand side and the right hand side - to establish efficient operations beyond the generic destructuring forms established with maps and vectors, e.g. get and nth.

Callers may use {:fields [a b ^clojure.lang.Counted c] }, along with a type-hinted rhs, to denote establishing bindings for a, b, c, by invoking like-named direct, type-hinted field applications on the rhs, ala (.a ^some-type rhs).

Any binding var hinted on the LHS will propogate its hint throughout later bindings. This allows an expressive form of efficient destructuring for the consenting adult, which allows idiomatic expressivity without the accompanying significant loss of performance.

map destructuring for {:keys […]} follows that of :fields, except the bindings are established via either a (.valAt ..) or (.get ..) or (get …) depending on the presented type, get being the fallback. This allows usage with types supporting the java.util.Map interface. Literal maps are automatically inferred with efficient getters.

Vector or indexed destructuring is similarly supported, [^some-type x y] ^clojure.lang.Indexed coll will invoke efficient .nth indexing operations rather than the slower, more general nth. Depending on the presented type, either .nth, .get, or nth will be used, allowing operation with structures supporting the java.util.List interface. Literal vectors are automatically inferred with efficient getters. The & rest notation is currently NOT supported…

The remaining rules act identically to let semantics. If a symbol is bound to the LHS, then the binding is passed through untouched (including hints).

with-slots tries to scan the input bindings to find discrepancies (such as duplicate binds), and to re-use existing hinted information for binds. In the case that the user decides to re-hint a RHS var that has already been hinted a-priori, with-slots will allow the hint for that binding, but revert to prior hinting unless the user continues to specify new hints. This seems rare in practice.

It’s common to import the symbols for the [clojure.lang Counted Indexed] interfaces when using with-slots.

An example:

By default, structural will warn us if we’re dispatching to slow operations inside a with-slots invocation, and how to help hint stuff:

structural.core> (let [m {:a 2 :b 3}] (with-slots [{:keys [a b]} m] a))
[:with-slots.warning/using-generic 
  :get :ns #namespace[structural.core] 
  :fields {:keys [a b]} :coll m :try-hinting [clojure.lang Associative IPersistentMap java.util.Map]]
2

If we follow the directives, we can get rid of the warning:

structural.core> (let [m {:a 2 :b 3}] (with-slots [{:keys [a b]} ^clojure.lang.IPersistentMap m] a))
2

No warnings this time, and if we look at the macroexpansion:

structural.core> (use 'clojure.pprint)
nil
structural.core> (binding [*print-meta* true] 
                      (pprint (macroexpand-1 '(with-slots [{:keys [a b]} ^clojure.lang.IPersistentMap m] a))))
(clojure.core/let
 [^clojure.lang.IPersistentMap coll18242
  ^clojure.lang.IPersistentMap m
  a
  (.valAt ^clojure.lang.IPersistentMap coll18242 :a)
  b
  (.valAt ^clojure.lang.IPersistentMap coll18242 :b)]
 a)
(ns blah
 (:import [clojure.lang Indexed Counted])
;;a botmove is a pair of vectors...hints aren't explicitly
;;necessary, but we'll use them here for edification:
(defrecord botmove [^clojure.lang.IPersistentVector path
                    ^clojure.lang.IPersistentVector position])

(with-slots
;;the :fields key allows us to define type-hinted method invocations
  [{:fields [^Counted path
             ^Indexed position]} ^botmove (->botmove [] [1 2])
;;literal structures are automatically hinted; in this case
;;we efficient destructure :keys into .valAt calls, and :fields
;;into a hinted .hashCode
   {:keys [a b] :fields [hashCode]}    {:a 2 :b 3}
;;Vectors expand into (ideally) hinted calls to .nth.  Since we've
;;hinted position as ^Indexed
   [x y]          position         
   path-length   (.count path)]
 [hashCode (+ x y)])

;;[2027821082 3]

If we examine the expression’s macroexpansion, we can see that with-slots is dutifully walking the expression, resolving types, and destructuring.

structural.core> 
(def the-expression 
  '(with-slots
    [{:fields [^Counted path
               ^Indexed position]} ^botmove (->botmove [] [1 2])
     {:keys [a b] :fields [hashCode]}    {:a 2 :b 3}
     [x y]          position         
     path-length   (.count path)]
   [hashCode (+ x y)]))

structural.core> (binding [*print-meta* true] (pprint (macroexpand-1 the-expression)))
(clojure.core/let
 [^botmove coll18285
  (->botmove [] [1 2])
  ^Counted path
  (.path ^botmove coll18285)
  ^Indexed position
  (.position ^botmove coll18285)
  ^clojure.lang.IPersistentMap coll18286
  {:a 2, :b 3}
  hashCode
  (.hashCode ^clojure.lang.IPersistentMap coll18286)
  a
  (.valAt ^clojure.lang.IPersistentMap coll18286 :a)
  b
  (.valAt ^clojure.lang.IPersistentMap coll18286 :b)
  x
  (.nth ^Indexed position 0)
  y
  (.nth ^Indexed position 1)
  path-length
  (.count path)]
 [hashCode (+ x y)])
nil

This provides a way to tune performance without deviating too far from Clojure idioms, and provides warnings when the caller is entering a slow path (e.g. causing a function call to get or nth). It’s basically a poor man’s optimizing compiler for the use-case of unpacking type-hinted structures for efficient reads.

The genesis of this library was actually for performance optimizing an ICPFC competition entry. The following examples are naive, but illustrative (a more involved setup would use criterium):

structural.core> (defn add [[x y]] (+ x y))
structural.core> (time (dotimes [i 10000000] (add [1 2])))
"Elapsed time: 140.237211 msecs"
structural.core> (defn add2 [v] (with-slots [[x y]  ^Indexed v] (+ x y)))
#'structural.core/add2
structural.core> (time (dotimes [i 10000000] (add2 [1 2])))
"Elapsed time: 86.436209 msecs"
structural.core> (defn add3 [v] (with-slots [{:fields [x y]}  ^xy v] (+ x y)))
#'structural.core/add3
structural.core> (time (dotimes [i 10000000] (add3 (->xy 1 2))))
"Elapsed time: 29.117979 msecs"

Intended Uses

This is broadly useful for any destructuring code, but will likely be most useful and practical for highly destructured code paths that happen to fall on hot paths indicated by profiling. There’s no reason the clojure compiler (or a variant using core.analyzer) couldn’t leverage this type of performance analysis directly too. It’s probably best to go with stock destructuring, and treat this as another optimization step after testing.

One area that really benefits is the field-based destructuring. At a language level, Clojure doesn’t have this at all. Being able to flow hints and unpack fields is extremely useful when trying to manage performance, particularly when leveraging interop and direct field access from records and types.

Currently, the hinting is directly focused on interop. Thus you are somewhat tied to the whatever the platform’s implementation denotes (e.g. clojure.lang for CLJ jvm). This is a bit brittle, and will likely be extended to support a generic ^counted and ^indexed hint that will dispatch to the appropriate platform-specific backend (e.g. protocols in cljs).

I’d also like to leverage far more sophisticated analyzer support, rather than the current janky code-walker macrology. We should be able to have a much more elegant set of definitions that can flow types and hints. Also, provide optional replacements for defn fn let and any other binding forms.

License

Copyright © 2019 joinr

This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0.

This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at https://www.gnu.org/software/classpath/license.html.

You can’t perform that action at this time.