Skip to content

mobileink/lab.clj.polymorphism

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

38 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

lab.clj.polymorphism

Lab for experimenting with polymorphism mechanisms in Clojure.

More (better) docs on algebras in doc.

homonymy

Mechanisms:

  • delegation: roll-your-own

  • multimethods

  • Clojure Protocols

  • multimodels

What’s the difference between Protocols and multimodels? Multimodels support dynamic rebinding. Protocols are fixed; once you extend a type to support a protocol the binding is fixed; to swap an implementation, you have to dispatch on a different type. You can mimic true multimodeling with the right technique (core.matrix does this), but the technique requires mutable global variables. Below we demonstrate a method that naturally supports multimodeling without mutable globals.

getting started

This assumes you know how to use Cider in emacs. If you use some other development environment you should be able to figure out what to do. The main example is src/clojure/multimodels/examples.clj.

  1. Fork/clone

  2. lein repl, then do refresh-all (see dev/user.clj for config stuff)

Fire up a cider session, open multimodels/examples.clj, and load it (C-c C-k). At the bottom of that file are examples of computing with algebraic models.

Then explore. Several algebras are provided; using them as a pattern it should be easy to implement your own.

  • monoid N0: (N, 0, +), underlying set is Natural numbers with zero, identity element is 0, operation is addition

  • monoid N1: (N, 1, *), underlying set is positive Natural numbers, identity element is 1, operation is multiplication

  • Q3+: quotient group mod 3 over integers, identity is 0, op is addition

user => (register-struct (algebra.struct.N0.))
user => (register-struct (algebra.struct.N1.))
user => (register-struct (algebra.struct.Q3+.))
user => (set-model :n0)
user => (** 3 5)           ;; we use ** as the abstract Group multiplication operator
8
user => (set-model :n1)
user => (** 3 5)
15
user => (set-model :quotient-3)
user => (** 3 5)
2

delegation: roll-your-own

multimethods

Clojure Protocols

You use defprotocol to declare a protocol. Note that this "defines" a protocol object but only "declares" its operators. Furthermore defprotocol is one of those pseudo-functions, like def, whose main purpose is a side-effect; def has the side-effect of interning a var (thus altering the environment), and defprotocol does something similar: it constructs a map representing the protocol and interns a var constructed using the current namespace and the name parameter of the declaration. Instead of returning this map as the "value", defprotocol interns it and returns the var.

You can inspect the (side-) result of defprotocol:

user=> (require '(clojure [pprint :as pp]))
user=> (def p (defprotocol P "test protocol" (f [t])))
#'user/p
user=> (type p)
clojure.lang.Symbol
user=> (pp/pprint p)
P
nil
user=> (pp/pprint P)
{:on user.P,
 :on-interface user.P,
 :doc "test protocol",
 :sigs
 {:f1 {:name f1, :arglists ([t]), :doc nil},
  :f2 {:name f2, :arglists ([t a]), :doc nil}},
 :var #'user/P,
 :method-map {:f1 :f1, :f2 :f2},
 :method-builders
 {#'user/f2
  #object[user$fn__14240 0x39f583af "user$fn__14240@39f583af"],
  #'user/f1
  #object[user$fn__14253 0x19811f83 "user$fn__14253@19811f83"]}}
nil
user=>

First off, notice that we’re looking at a map that is the value of the var #'user/P: user⇒ (= (var P) (:var P)). If you deref (:var P) you’ll see the same map you get by printing P.

Notice the :method-builders key: its value is a map from vars to objects. From the names we can infer they are function objects, but let’s confirm that:

user=> (clojure.test/function? (get-in P [:method-builders #'user/f1]))
true

The form of the method-builder vars tells use that in fact there is another side-effect of executing defprotocol: it also interns the functions of the protocol in the current environment. To see the evidence, run (dir user) in a fresh repl to see the list of interned vars; then execute defprotocol, then run (dir user) again; you should see the protocol symbol and the protocol functions listed as interned vars.

Yet another side-effect: defprotocol generates a Java Interface. From the docstring:

defprotocol will automatically generate a corresponding interface, with the same name as the protocol, i.e. given a protocol: my.ns/Protocol, an interface: my.ns.Protocol. The interface will have methods corresponding to the protocol functions, and the protocol will automatically work with instances of the interface.

NB: more precisely: defprotocol will call gen-interface, which will define the corresponding class (using java.lang.ClassLoader.defineClass). Under aot compilation, it will also write the bytecode to disk.

Let’s have a look:

user=> (type user/P)
clojure.lang.PersistentArrayMap
user=> (require '(clojure [reflect :as r]))
user=> (type user.P)
java.lang.Class
user=> (pp/pprint (r/reflect user.P))
{:bases nil,
 :flags #{:interface :public :abstract},
 :members
 #{{:name f2,
    :return-type java.lang.Object,
    :declaring-class user.P,
    :parameter-types [java.lang.Object],
    :exception-types [],
    :flags #{:public :abstract}}
   {:name f1,
    :return-type java.lang.Object,
    :declaring-class user.P,
    :parameter-types [],
    :exception-types [],
    :flags #{:public :abstract}}}}

Make a note of this example using pprint and clojure.reflect to inspect the side-effect products of defprotocol - the Clojure protocol map and the Java Interface.! It will come in very handy the first time you run in to the dreaded "No single method" exception (see below, troubleshooting.)

Contrast this with deftype, which "[d]ynamically generates compiled bytecode for class with the given name, in a package with the same name as the current namespace…​" but does not intern a var for the type in the current namespace.

algebras

Algebra is where logic meets mathematics. Loosely speaking, an algebra is the marriage of a signature (which is a formal logical calculus) and a structure (which is an informal mathematical "object'); what ties them together is a model, which uses the mathematical structure to interpret the linguistic expressions formable using the signature.

What this implies is that many different structures may be used to model a given signature or language. The classic example, which is implemented in this project, is the algebraic *Group*. Groups are extremely simple; they have an underlying set, one distinguished element of that set that acts as an identity element, and one binary operation; additionally, some basic rules about how the operation works (e.g. a*b=b*a). Infinitely many mathematical structures may behave as Groups. The textbook examples are: (N,0,+), where the set is Nat (with zero), the identity element is 0, and the operation is addition, and (N+,1,*), where the set is the positive Natural numbers, the identity element is 1, and the operation is multiplication. The structures are obviously not the same, but as Groups they behave in exactly the same way.

The relevance of such an algebraic perspective to programming is pretty obvious, even though it is not often explicitly noted. The distinction between signature and structure is analogous to the one between interface and implementation. If you design well, you can swap implementations of an interface without changing the behavior of the system, e.g. going from a hashmap to an arraymap.

One of the beauties of Clojure’s Protocols is that they make it relatively easy to work in this manner.

Clojure’s Protocols only include functions; algebras will always or at least usually include some constant symbols (like the digits 0-9), just as the underlying structures will contains some constant "values" like the natural numbers. So strictly speaking we should think of Protocols as analogous to the "operator signature" of an algebra, i.e. the subset of a signature consisting of all the function symbols.

Once you have a signature, (a Clojure Protocol definition), you need to relate it to a structure in order to use it to express anything meaningful. Mathematically this involves specifying an "interpretation", which is just a mapping from symbols in the signature to values in the structure; I’m calling this a model. Technically it’s a little more complicated than that but the basic concepts of signature, structure, and model seem to be pretty straightforward, and they match actual mathematical practice and terminology pretty well.

So we think of a Clojure Protocol as a Signature (an "OpSig" or operater signature, to be more precise), and we think of the code we write to implement the operators in the signature as a structure. To bind the two together, we use Clojure’s extend macro, which does precisely what we need: expresses a mapping from signature to structure, or, in more programming-oriented language, from interface to implementation.

However, to really make this work - to make it possible to switch from one model to another (swap implementations) - you need more than just defprotocol and friends.

signatures

structures

models

implementation techniques

One implementation trick, which I learned from core.matrix, is to exploit the fact that Clojure’s Protocol mechanism dispatches function calls on the first arg. Knowing this, we just parameterize operation calls with a first arg whose sole purpose is to determine dispatch - the "content" of the arg is irrelevant. Of course to do this you have to intercept the call in the first place, and then decide which type to use for dispatch. For that you keep a var; changing the var effectively switches from one model (interpretation of the signature) to another, by changing the dispatch parameter.

One reason I wrote this little app is to have a clean and simple expression of the technique used by core.matrix. I had to study that code pretty hard before the technique stood out from implementation details. I don’t mean the code is bad or hard to read, I mean it’s mixed up with the details of implementing core.matrix, so I wanted something purely focussed on demonstrating the technique with minimal extra stuff. So that I’ll be able to return to it in six months, after I’ve forgotten everything about core.matrix, and so that others can learn the technique independent of matrix stuff. Also, I wanted to highlight the algebraic structure of the technique, which I’ve tried to do by using the algebraic terminology of signature, structure, and model, and organizing the code accordingly.

The way I do it here is slightly different than the way core.matrix does it. I use a default implementation (defined on java.lang.Object) to manage dispatch. So calls to Protocol functions are always sent to the default Object implementation, which checks to see what the current model (implementation) is, and rewrites and forwards the call as required. core.matrix uses a separate API "wrapper" namespace to do this, before calls reach the Protocol interface. That approach has the virtue of separating the user interface from the Protocol interface, but that is also a vice, since it means you have to keep them in sync. I decided to use default Object as dispatcher in order to ensure that the user API always matches the Protocol signature. And also just to experiment. I don’t know which technique is preferable.

Emulating Dependent Types

from map to foo-map

This is a map: {:a 1}

This is a foo-map: {:foo 0, :a 1, :b 2}

This is a foo-vector: [:foo 1 2 3]

A foo-list: '(:foo 1 2 3)

Clojure’s Protocol mechanism (together with, say, Prismatic Schema and/or core.typed) allow us to treat these as distinct types. Since these types depend on a particular data value - :foo - they thereby emulated dependent types.

Another example: type VecInt4 - integer vectors of length 4. We start with a function f that operates on vectors:

(defn f [^PersistentVector v] ...do something with v...)

We want a function that only operates on integer vectors or length 4. We can easily do this by writing f as a dispatch function that inspects its argument at runtime and then forwards the call to an appropriate implementation function. If f receives an argument that is not a vi4 datum, it will throw an exception; otherwise, it will pass it to the implementation function, call it vi4-f.

A better way would be to use a multimethod. The same thing happens, but in this case Clojure’s built-in dispatching mechanism for multimethods will take responsibility for routing the call to the appropriate implementation function. Using a multimethod means we don’t have to give our implementation function a distinct name - it will have the same name as the dispatch function, and Clojure will take care of the housekeeping.

In both these approaches, there is only one type involve: PersistentVector. Protocols allow us to treat VecInt4 as a distinct type.

(deftype VecInt4 )
(defprotocol PVecInt4
  (f ...))
(extend VecInt4
  PVecInt4
  ...)

troubleshooting

"No single method: M of interface: I found for function: F of protocol: P"

Note the reference to by interface and protocol: it’s going from function-in-protocol (Clojure) to method-in-interface (Java). The interface is generated at runtime by defprotcol.

"IllegalArgumentException No implementation of method: M of protocol: #'P found for class: K"

Self-explanatory.

multimodels

misc

Useful tools:

  • tools.namespace

  • clojure.reflect

  • clojure.pprint

  • iroh? "a library to inspect, manipulate and game the jvm."

About

Lab for experimenting with polymorphism in Clojure

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages