Standard ML library for Vectors and Multi-dimensional Arrays.
-
lib/github.com/melsman/MoA/vec/vec.mlb
:Implementation of one-dimensional vectors. Several implementations are included, including a version based on pull and push arrays.
- signature
VEC
- structure
ListVec : VEC
- structure
Fvec : VEC
- structure
PPvec : VEC
- signature
-
lib/github.com/melsman/MoA/ilvec/il.mlb
:Implementation of code-generational one-dimensional pull arrays of type
INT * (INT -> Exp)
.- signature
ILVEC
- structure
Type : Type
- structure
Exp : EXP
- structure
Program : PROGRAM
- signature
-
lib/github.com/melsman/MoA/ilvec/il2m.mlb
:Implementation of code-generational one-dimensional pull arrays of type
INT * (INT -> Exp M)
. Support for nested operations on vectors. -
lib/github.com/melsman/MoA/moa.mlb
:Implementation of multi-dimensional array calculus. Makes use of
vec.mlb
. -
lib/github.com/melsman/MoA/ilmoa.mlb
:Implementation of code-generational multi-dimensional array calculus. Makes use of
ilvec/il.mlb
. -
lib/github.com/melsman/MoA/ilapl.mlb
:Implementation of code-generational multi-dimensional array calculus with APL semantics. Makes use of
il2m.mlb
.
The "il" versions of the vector and array libraries are implementations that generate residual intermediate language "C like" code from the specification of the vector or array program. Contrary, the non-"il" versions of the libraries are implementations for which the vector and array operations are performed in ML itself.
This library is set up to work well with the SML package manager smlpkg. To use the package, in the root of your project directory, execute the command:
$ smlpkg add github.com/melsman/MoA
This command will add a requirement (a line) to the sml.pkg
file in your
project directory (and create the file, if there is no file sml.pkg
already).
To download the library into the directory
lib/github.com/melsman/MoA
, execute the command:
$ smlpkg sync
You can now reference the mlb
-file using relative paths from within
your project's mlb
-files.
Notice that you can choose either to treat the downloaded package as
part of your own project sources (vendoring) or you can add the
sml.pkg
file to your project sources and make the smlpkg sync
command part of your build process.
The sources contain various basic vector library implementations that
are used as the basis for the MoA realization. The vec/
folder
contains three implementations of the VEC
signature:
-
A reference vector implementation based on Standard ML lists.
-
An implementation based on pull arrays, which consists of a number denoting the size of the array and a function for reading an element of the array, given an index. Pull arrays support map fusion and work well up-to a number of extensions, including drop and take functionality.
-
An implementation based on both pull arrays and push arrays, which are continuation-based representations of arrays and dual to pull arrays in the sense that they support concatenation well, but drop and take operations not so well.
The ilvec/
folder contains different vector libraries that
support map fusion and generate residual (C-like) code. The
il2.mlb
implementation supports nesting of vectors and thus,
nested folds. As an example, let's first construct a matrix containing
the small multiplification table:
val mat = tabulate (I 10) (fn i => tabulate (I 10) (fn j => i * j))
Here the I
function lifts integers into the vector expression
language, on which all calculations are performed. The lifting allows
operations on vector expressions to be residualized into the C-like
target language. Now, let's write some code for summing up all the
values in the table:
val m = foldr (fn (r, a) => foldr (ret o op +) a r) (I 0) mat
Here we see two nested folds, with the outer foldr folding over the
rows of the matrix with I 0
as the base value for the
accumulator. The inner foldr adds all elements in a row to the outer
accumulator.
Here are the types of foldr
and tabulate
:
val foldr : ('a t * 'b t -> 'b t M) -> 'b t -> 'a v -> 'b t M
val tabulate : INT -> (INT -> 'a t) -> 'a v
We see that foldr
returns a monadic value, which we can "run" using
the runM
function, which again generates a residual program
(slightly simplified):
int n49 = 0;
for (int n52 = 0; n52 < 10; n52++) {
int n53 = n49;
for (int n55 = 0; n55 < 10; n55++) {
n53 += ((9-n52)*(9-n55));
}
n49 = n53;
}
After executing the above code, the result is present in variable
n49
.
The il2m.mlb
implementation further extends the simpler solution with
an implementation of pull-arrays that maps indexes to expression
monads, so as to support simple implementations of
matrix-multiplication and nested reductions along different axes of a
multi-dimensional array.
The moa.mlb
and ilmoa.mlb
libraries provide implementations of
a multi-dimentional array calculus based on the paper:
- G. Hains et L. M. R. Mullin. An algebra of multidimensional arrays. Publication 783, DIRO, Departement d'Informatique et de Recherche Operationnelle, Universite de Montreal, 1991.
In essence, arrays are implemented as a product of two one-dimensional vectors, where the first vector specifies the shape of the array (ranks and dimensions) and where the second vector specifies the content of the array, as contiguous elements.
We demonstrate the ilmoa.mlb
library by example. Consider the
following signal processing program:
fun diff (signal (*:n*)) (*:n-1*) =
rrotate (I 1) signal >>= (fn r =>
sum Double (op -) signal r >>= (fn (r' (*:n*)) =>
ret (drop (I 1) r')))
fun maxsv s v = mmap (fn x => max x s) v (* scalar-vector max *)
fun minsv s v = mmap (fn x => min x s) v (* scalar-vector min *)
fun prodsv s v = mmap (fn x => x * s) v (* scalar-vector multiplication *)
fun addsv s v = mmap (fn x => x + s) v (* scalar-vector addition *)
fun divv v1 v2 = sum Double (op /) v1 v2 (* element-wise vector division *)
fun signal (SIG (*:n*)) =
catenate (scl (D 0.0)) SIG >>= (fn (c (*:n+1*)) =>
diff c >>= (fn (v (*:n*)) =>
divv v (addsv (D 0.01) SIG) >>= (fn tmp =>
ret(maxsv (D ~50.0) (minsv (D 50.0) (prodsv (D 50.0) tmp))))))
In APL, the code is written
diff ← {1↓⍵−¯1⌽⍵}
signal ← {¯50⌈50⌊50×(diff 0,⍵)÷0.01+⍵}
Here is the driver code for the program:
val program =
let val n = I 500
val v = iota (I 2 * n)
infix %
val v = mmap (fn x => x % (I 200)) v
val v = mmap i2d v
val v = mmap (fn d => d / D 2.0) v
in mem v >>= (fn v =>
signal v >>= (fn v' =>
red (ret o op +) (D 0.0) v'))
end
The driver code first constructs an array of 1000 elements, processes
the elements of the array (by calling the signal
function), and
finally, sums up the resulting array.
Here is the residual program generated (slightly modified):
double kernel() {
double[] n468 = alloc(1000*sizeof(double));
for (int n478 = 0; n478 < 1000; n478++) {
n468[n478] = (i2d((1+n478)%200)/2.0);
}
double n471 = 0.0;
for (int n479 = 0; n479 < 1000; n479++) {
n471 = (max(min((((((n479<1) ? 0.0 : n468[(-1+n479)])-n468[n479])/(n468[n479]+0.01))*50.0),50.0),-50.0)+n471);
}
return n471;
}
The first part of the generated program constructs a 1000-element array of doubles. The second part of the generated program processes each element of the array and sums up the results.
Although we can rely on the target language compiler to do a good job
of generating efficient code, we may want to perform certain
optimizations during target code generation. For instance, in the
example above, we could recognize the opportunity to roll out the
second for-loop one time, which will eliminate the continued test on
variable n479
.
The goal of this project is to cover a large set of APL array combinators. Composed with a parsing solution for APL, one can envision a compilation scheme for parsing a subset of APL into the MoA combinators, which again will generate low-level C-like kernels (appropriate for GPGPUs, etc.)
There are lots of related work involving implementation of array languages.
-
Nick Nickolov. Open source APL interpreter in Javascript. Github project.
-
Roy E. Lowrance. APL Literature Review. February 22, 2009.
-
Timothy A. Budd. A New Approach to Vector Code Generation for Applicative Languages. September 20, 1994.
This software is published under the MIT License.