njpipeorgan edited this page Apr 6, 2020 · 51 revisions

Contents

  1. Installation
  2. Basic usage
  3. Types
  4. Scoping
  5. Functions
  6. String Patterns
  7. Calling Wolfram Kernel and Inline C++
  8. Compilation Errors
  9. Working with C++
  10. Options to CompileToBinary
  11. External BLAS and LAPACK
  12. Caveats

Installation

  1. You can find the latest releases of MathCompile on the Releases page.

  2. Install the package using PacletInstall.

To load and test the package, execute

<<MathCompile`
CompileToCode[Function[{Typed[x, Integer]}, x + 2]]

It should output a C++ function as a string.

  3. Set up a C++ compiler (see Prerequisites for C++ Compiler).

Basic usage

Writing a function

A Wolfram Language function that can be compiled should be a function with zero or more formal parameters:

Function[{<arg1>, <arg2>, ...}, <body>]

Each argument of the function should have its type specified by Typed[<name>, <type>]. The body of the function is an expression consisting of:

For example, the following function is effectively the native Plus function for two integers:

Function[{Typed[x, Integer], Typed[y, Integer]}, x + y]

CompileToCode

The function CompileToCode compiles a Wolfram Language function to C++ code, e.g.

CompileToCode[Function[{Typed[x, Integer]}, x + 1]]

evaluates to a string:

auto main_function(const int64_t& v11) {
    return wl::val(wl::plus(WL_PASS(v11), int64_t(1)));
}

CompileToBinary

The function CompileToBinary compiles a function to binary and loads it into the current session as a library function (it requires a C++ compiler), e.g.

addone = CompileToBinary[Function[{Typed[x, Integer]}, x + 1]]

The compiled functions can be called just like normal functions when arguments of correct types are supplied, e.g.

addone[5]  (* gives 6 *)

If a compiled function needs to be recompiled, it should be unloaded first by LibraryFunctionUnload.

Types

Supported types

The following table shows the types supported by MathCompile, and how the types are represented:

Type     Width             Type specifier
Boolean                    "Boolean"
Integer  8-bit             "Integer8", "UnsignedInteger8"
Integer  16-bit            "Integer16", "UnsignedInteger16"
Integer  32-bit            "Integer32", "UnsignedInteger32"
Integer  64-bit            "Integer64", "UnsignedInteger64"
Real     single precision  "Real32"
Real     double precision  "Real64"
Complex  single precision  "ComplexReal32"
Complex  double precision  "ComplexReal64"
String                     "String"
Array                      {<Type>, <Rank>}

Four commonly used types, Integer, Real, Complex, and String, are aliases of "Integer64", "Real64", "ComplexReal64", and "String", respectively.

The function Typed is used to specify the type of function arguments. For example,

Function[{Typed[x, {Integer, 2}]}, ...]

denotes a function taking a matrix (rank-2 array) of integers named x as its argument.

Type promotion

When an arithmetic function is applied to two numbers with distinct types, one or both of them are promoted to a different type. For addition, subtraction, and multiplication, a common type is defined for each combination of types in the table below; the numbers are converted to the common type before the operation is applied. For division, the common type of two integral types is defined to be double instead.

Types                                 Common type
integral of different widths          the wider integral
integral of different signedness      the unsigned integral
integral and floating-point           the floating-point
floating-point of different widths    the wider floating-point
integral and complex                  the complex
floating-point T and complex<T>       complex<T>
float and complex<double>             complex<double>
double and complex<float>             complex<double>
complex<double> and complex<float>    complex<double>

Other functions that implicitly depend on arithmetic operations also follow the rules of common types. For example, applying LinearSolve to two integral matrices yields a solution matrix of double-precision numbers.
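The table can be read as a small type-level function. The sketch below encodes it with standard C++17 type traits; it is only an illustration of the rules, not MathCompile's actual implementation, and every name in namespace sketch is hypothetical. One assumption goes beyond the table: for two integral types that differ in both width and signedness, the wider type is chosen.

```cpp
#include <complex>
#include <cstdint>
#include <type_traits>

// Hypothetical sketch of the common-type table; NOT MathCompile's implementation.
namespace sketch {

// Strip std::complex<T> down to T; leave real types unchanged.
template <typename T> struct real_of { using type = T; };
template <typename T> struct real_of<std::complex<T>> { using type = T; };

template <typename T> struct is_complex : std::false_type {};
template <typename T> struct is_complex<std::complex<T>> : std::true_type {};

// Common type of two real (non-complex) arithmetic types.
template <typename A, typename B>
struct common_real {
    static constexpr bool float_a = std::is_floating_point_v<A>;
    static constexpr bool float_b = std::is_floating_point_v<B>;
    using wider = std::conditional_t<(sizeof(A) > sizeof(B)), A, B>;
    // Same width: prefer the unsigned operand.
    using same_width = std::conditional_t<std::is_unsigned_v<A>, A, B>;
    using type =
        std::conditional_t<(float_a && !float_b), A,         // floating-point beats integral
        std::conditional_t<(float_b && !float_a), B,
        std::conditional_t<(sizeof(A) != sizeof(B)), wider,  // ASSUMPTION: wider wins even if signedness differs
                           same_width>>>;
};

// Common type for addition, subtraction, and multiplication.
template <typename A, typename B>
struct common {
    using scalar = typename common_real<typename real_of<A>::type,
                                        typename real_of<B>::type>::type;
    using type = std::conditional_t<(is_complex<A>::value || is_complex<B>::value),
                                    std::complex<scalar>, scalar>;
};
template <typename A, typename B>
using common_t = typename common<A, B>::type;

// Division: two integral operands are promoted to double.
template <typename A, typename B>
using quotient_t =
    std::conditional_t<(std::is_integral_v<A> && std::is_integral_v<B>),
                       double, common_t<A, B>>;

}  // namespace sketch
```

For instance, sketch::common_t<double, std::complex<float>> names std::complex<double>, and sketch::quotient_t<std::int64_t, std::int64_t> names double, matching the rows above.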

Scoping

Wolfram Language has two scoping mechanisms, lexical and dynamic scoping, of which MathCompile supports only lexical scoping using Module. A module creates a scope; local variables can be introduced within the scope, and they are invalidated at the end of the scope.

Note that a local variable must be initialized before use. It is recommended that local variables are initialized at the beginning of a module, either by value or using Typed with a type specifier:

Module[{x = 2}, ...]                (* OK, x is integer 2 *)
Module[{y = Typed[{Real, 1}]}, ...] (* OK, y is an empty list of reals *)
Module[{z}, Sin[z]]                 (* error, z is used before initialization *)

The initialization of local variables can be delayed, but it must be one of the compound expressions in the module. A common cause of the delay is that the initialization of one variable depends on another. For example,

Module[{x = 3, y}, y = x^2; ...]      (* OK, y is initialized *)
Module[{x = 3, y}, Sin[y = x^2]; ...] (* error, not a proper initialization *)

Delayed initialization also applies to multiple assignment. For example,

Module[{lu, p, c}, {lu, p, c} = LUDecomposition[{{1, 1}, {1, 0}}];]

Functions

The functions nested in the body of the compiled function do not need to have the types of their arguments specified. For example, the two snippets below are equivalent:

Module[{f = Function[{Typed[x, Real]}, x^2]}, f[5.5]]  (* OK *)
Module[{f = #^2 &}, f[5.5]]                            (* OK *)

Functions as values

When compiled, the functions can be used as values in most cases:

Module[{f, g}, f = #^2 &; g = f; ...]  (* OK *)

They can also be returned by divergent code paths; the code compiles when the types from both paths are consistent:

Module[{f = If[Pi > 3, # + 1 &, Sin[#] &]}, f[1.6]]  (* OK *)
Module[{f = If[Pi > 3, # + 1 &, Sin[#] &]}, f[1]]    (* error *)
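One way to picture the consistency requirement is in terms of C++ return types. The sketch below uses two hypothetical function objects, add_one and sine, as stand-ins for # + 1 & and Sin[#] &, and checks whether they return the same type for a given argument type. It is an analogy under the assumption that Sin of an integer yields a real number; it is not the mechanism MathCompile itself uses.

```cpp
#include <cmath>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for `# + 1 &`: the result type follows the argument.
struct add_one {
    template <typename T>
    constexpr auto operator()(T x) const { return x + 1; }
};

// Hypothetical stand-in for `Sin[#] &`: the result is always a double.
struct sine {
    template <typename T>
    double operator()(T x) const { return std::sin(static_cast<double>(x)); }
};

// True when both branches return the same type for an argument of type Arg.
template <typename F, typename G, typename Arg>
constexpr bool consistent_v =
    std::is_same_v<std::invoke_result_t<F, Arg>, std::invoke_result_t<G, Arg>>;
```

With a double argument both branches return double, so consistent_v<add_one, sine, double> is true. With a std::int64_t argument one branch returns an integer and the other a double, so the check is false, which parallels the error for f[1].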

Slot and SlotSequence

Slot can be compiled, except for #0. Arguments without a corresponding slot are ignored:

(#1 + #2 &)[3, 4, 5]      (* OK; 5 is ignored *)
(#1 + #2 &) @@ Range[5]   (* OK; 3, 4, 5 are ignored *)

SlotSequence is designed to be used as the argument of variadic functions, such as List, Plus, and BitXor. Some native functions, such as Range and ArcTan, accept different numbers of arguments, but SlotSequence cannot be used with them. For example,

ArcTan[##1] &     (* error *)
ArcTan[#1] &      (* OK *)
ArcTan[#1, #2] &  (* OK *)

Recursive functions

Recursive functions can be defined in two steps:

  1. declare the type of the function in the form of {<argument types>} → <return type>;
  2. set the definition of the function.

For example, factorial takes an integer and returns an integer, so it can be defined as

Module[{f = Typed[{Integer} → Integer]},
  f = If[# == 1, 1, # f[# - 1]] &;
  f[10]    (* gives 3628800 *)
]

As a guideline, a recursive function is favored in these conditions:

  1. the algorithm is naturally recursive and a recursive definition greatly reduces complexity;
  2. the depth of recursion is relatively small, especially when arrays are defined inside the function, so that stack overflow is unlikely to occur.

Otherwise, employ an iterative definition of the function instead.
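To make the trade-off concrete, here is the factorial from above written both ways in plain C++. This is an illustrative sketch, not code generated by MathCompile; the function names are hypothetical, and the base case uses n <= 1 rather than n == 1 so that the recursion also terminates for inputs below 1.

```cpp
#include <cstdint>

// Recursive definition: one stack frame per level of recursion.
constexpr std::int64_t factorial_recursive(std::int64_t n) {
    return n <= 1 ? 1 : n * factorial_recursive(n - 1);
}

// Iterative definition: constant stack depth regardless of n.
constexpr std::int64_t factorial_iterative(std::int64_t n) {
    std::int64_t result = 1;
    for (std::int64_t k = 2; k <= n; ++k) {
        result *= k;  // multiply in 2, 3, ..., n
    }
    return result;
}
```

Both return 3628800 for n = 10. The iterative form is the safer choice when the recursion depth could grow large.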

String Patterns

String patterns are the only places where pattern matching functions are supported. There are a few restrictions on string patterns and the related functions:

  1. Regular expressions cannot be mixed with patterns;
  2. RuleDelayed has the same meaning as Rule;
  3. The right hand side of a Rule can only be a string pattern;
  4. Functions that return a list of strings or integers can only take one string as their first argument;
  5. Overlaps->True and Overlaps->False are the only options supported.

Internally, a string pattern needs to be compiled to a regular expression object before being used. In order to avoid compiling the same string pattern multiple times, you can explicitly create a compiled string expression by assigning it to a variable. For example,

Module[{p = "a"~~_~~"b"}, ...]

compiles the pattern p once and it can be used multiple times in the Module.

Calling Wolfram Kernel and Inline C++

Calling Wolfram Kernel

Functions that need to be executed in Wolfram Language are represented by Extern[<function>,<return type>].

For example,

CompileToBinary@Function[{Typed[x, {Real, 1}]},
    Extern[AiryAi, Typed[{Real, 1}]][x]
  ]

calls the function AiryAi in Wolfram Kernel and the return type is Typed[{Real, 1}].

Another usage of this functionality is to pass variables to the function after being compiled and loaded:

f = CompileToBinary@Function[{Typed[x, Real]},
    x + Extern[gety, Typed[Real]][]
  ];
gety[] := 5;
f[10]  (* returns 15 *)
gety[] := 8;
f[10]  (* returns 18 *)

Inline C++

Within a function to be compiled, a piece of C++ is represented by CXX["<C++ code>"].

For example,

CompileToCode@Function[{Typed[x, Real]}, CXX["std::trunc"][x]]

calls the function std::trunc in C++.

To access variables, wrap the names of the variables in pairs of backticks. For example,

CompileToCode@Function[{Typed[n, Integer]},
  Module[{x = 1}, CXX["`x` += 5"]; x]
]

compiles to

auto main_function(const int64_t& v38) {
    return wl::val([&] {
        auto v37 = wl::val(int64_t(1));
        v37 += 5;
        return wl::val(v37);
    }());
}

If the inlined C++ code depends on external header files or libraries, you can specify them in options "Includes" and "Libraries" when calling CompileToBinary.

Compilation Errors

The compilation errors are divided into two categories:

  • syntactic and semantic errors, issued in Wolfram Language → C++ stage;
  • overload resolution and type errors, issued in C++ → binary stage.

Syntax and semantics

CompileToCode[Function[{}, Table[i + j]]];
  syntax::bad: Table[Plus[i,j]] does not have a correct syntax for Table.

Explanation: Functions with iterators such as Table and Sum require a certain syntax, including how to specify the iterators.

CompileToCode[Function[{}, Range[5] += 1]];
  syntax::bad: AddTo[Range[5],1] does not have a correct syntax for AddTo.

Explanation: A function that changes any of its arguments requires that argument to be modifiable, but Range[5] is not.

CompileToCode[Function[{}, Module[{x}, x + 1]]];
  semantics::noinit: Variable x is declared but not initialized.
  semantics::badref: Variable x is referenced before initialization.

Explanation: A variable must be initialized in order to be used. In this case, an initialization x = 0 will fix the issue.

CompileToCode[Function[{x}, Dot[x, x]]];
  codegen::notype: One or more arguments of the main function is declared without types.

Explanation: The argument types of the main function must be specified. In this case, if the input to this function is a list of real numbers, declare {Typed[x, {Real, 1}]} instead of {x}.

Overload resolution and type errors

These errors are given by the C++ compiler, and they are formatted and forwarded by MathCompile. Be aware that the error message is always accurate but the location of the error is estimated and can be inaccurate.

CompileToBinary[Function[{}, Module[{x = 5}, x = Sin[x]]]];
  cxx::error:
    ...[List[],Module[List[Set[x,5]],Set[x,Sin[x]]]]...
                                     ∧
    The type of the source cannot be converted to that of the target.

Explanation: Each variable can only have one type. Sin[x] gives a real number and it cannot be assigned back to x.

CompileToBinary[Function[{}, Range[10][[;; 2.5]]]];
  cxx::error:
    ..unction[List[],Part[Range[10],Span[1,2.5]]]...
                                    ∧
            The arguments should be integers.

Explanation: Span specifications can only be integers.

CompileToBinary[Function[{}, Sin[1, 2, 3]]];
  cxx::error:
    ...Function[List[],Sin[1,2,3]]...
                       ∧
    no matching function for call to 'sin(int64_t, int64_t, int64_t)'

Explanation: Each function has its requirements for arguments (Sin can only take one numerical argument). When a function is called with some inappropriate argument types, the compiler is unable to resolve it.

CompileToBinary[Function[{}, Nest[List, {1}, 3]]];
  cxx::error:
    ...Function[List[],Nest[List,List[1],3]]...
                       ∧
    The type should be consistent when the function is applied repeatedly.

Explanation: The return type of a function is determined at compile time. Applying List repeatedly on a variable does not give a consistent type.
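The same constraint can be expressed directly in C++. The sketch below is a hypothetical analogue of Nest (it is not wl::nest or any other MathCompile function) that rejects, at compile time, any function whose return type differs from its argument type.

```cpp
#include <type_traits>

// Hypothetical analogue of Nest: apply f to x, n times.
template <typename F, typename T>
constexpr T nest(F f, T x, int n) {
    // f must map T to T, otherwise the result type would depend on n.
    static_assert(std::is_same_v<std::invoke_result_t<F&, T&>, T>,
                  "The type should be consistent when the function is applied repeatedly.");
    for (int i = 0; i < n; ++i) {
        x = f(x);
    }
    return x;
}

// Example function object whose return type matches its argument type.
struct doubler {
    constexpr int operator()(int v) const { return 2 * v; }
};
```

Here nest(doubler{}, 1, 3) evaluates to 8. A function that wraps its argument in a container, as List does, would return a different type on every application and would trip the static_assert.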

Working with C++

Since MathCompile implements all supported functions in C++, the generated code can be used with other C++ code without the Wolfram runtime libraries or a Mathematica installation.

A simple example

The implementation of functions comes with MathCompile as a header-only library, which is located in MathCompile/IncludeFiles. You need to download the source code of MathCompile or clone the repository and provide that to the C++ compiler later.

First, we transform a function that calculates the sum of the squares of the first n positive integers.

CompileToCode@Function[{Typed[n, Integer]}, Sum[i^2, {i, n}]]

gives the function in C++ as a string:

auto main_function(const int64_t& v38) {
    return wl::val(wl::clause_sum([&](auto&& v37, auto&&...) {
        return wl::val(wl::power(WL_PASS(v37), wl::const_int<2>{}));
    },wl::var_iterator(WL_PASS(v38))));
}

Then we wrap it with some code handling input and output:

#include <cstdlib>
#include <iostream>
#include "math_compile.h"

auto main_function(const int64_t& v12) {
    return wl::val(wl::clause_sum([&](auto&& v11, auto&&...) {
        return wl::val(wl::power(WL_PASS(v11), wl::const_int<2>{}));
    },wl::var_iterator(WL_PASS(v12))));
}

int main(int argc, char* argv[]) {
    if (argc <= 1)  // no argument provided
        return 1;
    int64_t n = std::atoi(argv[1]);  // read the argument
    auto result = main_function(n);  // call the function
    std::cout << result << '\n';     // print the result
    return 0;
}

Now we compile the source file above, called source.cpp, using GCC.

$ g++ -std=c++1z -I<path-to-MathCompile>/IncludeFiles -o example source.cpp

where <path-to-MathCompile> should be replaced by the path to the package. The flag -std=c++1z, or an equivalent flag, is necessary to select the C++17 standard.

Finally, we run the program:

$ ./example 10
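To cross-check the program's output without MathCompile, the same sum can be computed in plain C++. The sketch below is independent of the generated code; both function names are hypothetical. It computes the sum with a loop and with the closed form n(n+1)(2n+1)/6.

```cpp
#include <cstdint>

// Direct loop: 1^2 + 2^2 + ... + n^2.
constexpr std::int64_t sum_of_squares(std::int64_t n) {
    std::int64_t total = 0;
    for (std::int64_t i = 1; i <= n; ++i) {
        total += i * i;
    }
    return total;
}

// Closed form: n(n+1)(2n+1)/6, exact for non-negative n that do not overflow.
constexpr std::int64_t sum_of_squares_closed(std::int64_t n) {
    return n * (n + 1) * (2 * n + 1) / 6;
}
```

For n = 10 both give 385, which is the value Sum[i^2, {i, 10}] evaluates to and therefore what ./example 10 would be expected to print.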

C++ types

Each compilable type in Wolfram Language corresponds to a type in C++. They are summarized below.

Wolfram Language      C++                      Comment
type of Null          wl::void_type            an empty class
"Boolean"             wl::boolean              convertible from and to bool
"Integer<n>"          int<n>_t                 n can be 8, 16, 32, or 64
"UnsignedInteger<n>"  uint<n>_t                n can be 8, 16, 32, or 64
"Real64"              double
"Real32"              float
"ComplexReal64"       wl::complex<double>      same as std::complex<double>
"ComplexReal32"       wl::complex<float>       same as std::complex<float>
"String"              wl::string               UTF-8 string
{<Type>, <Rank>}      wl::ndarray<Type, Rank>  multi-dimensional array

Interact with wl::ndarray

wl::ndarray is a multi-dimensional array type that corresponds to PackedArray and NumericArray in Wolfram Language.

Construction

wl::ndarray<Type,Rank> can be constructed from its dimensions (of type std::array<size_t, Rank>) and one of the following:

  1. a value (fill the array by this value);
  2. an initializer list (fill the array by the contents in the list);
  3. a pair of iterators (fill the array by the contents in [iter1, iter2));
  4. an rvalue std::vector (fill the array by the contents in the vector).

Arrays x1 through x4 below are initialized by these four methods respectively.

wl::ndarray<int, 1> x1({5}, 12);
wl::ndarray<int, 2> x2({3, 2}, {1, 2, 3, 4, 5, 6});
std::vector<int> vec{1, 2, 3, 4};
wl::ndarray<int, 2> x3({2, 2}, std::begin(vec), std::end(vec));
wl::ndarray<int, 2> x4({2, 2}, std::move(vec));

Member functions

Member functions that are commonly used are as follows.

Function signature                      Purpose
std::array<size_t, Rank> dims() const   dimensions of the array
size_t size() const                     flattened size of the array
Type* data()                            pointer to the first element
const Type* data() const                const pointer to the first element
void copy_to(Iterator iter) const       copy the elements to iter
void copy_from(Iterator iter)           copy the elements from iter
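The relationship between dims() and size() can be sketched in standard C++ without the MathCompile headers. The helpers below are hypothetical and work on a plain std::array of dimensions. The flattened size is the product of the dimensions; the offset helper additionally assumes row-major element order, which is an assumption for illustration and is not stated on this page.

```cpp
#include <array>
#include <cstddef>

// Flattened size: the product of all dimensions.
template <std::size_t Rank>
constexpr std::size_t flat_size(const std::array<std::size_t, Rank>& dims) {
    std::size_t n = 1;
    for (std::size_t i = 0; i < Rank; ++i) {
        n *= dims[i];
    }
    return n;
}

// ASSUMPTION: row-major layout. Maps a 0-based multi-index to a 0-based
// offset into the flattened storage.
template <std::size_t Rank>
constexpr std::size_t flat_offset(const std::array<std::size_t, Rank>& dims,
                                  const std::array<std::size_t, Rank>& index) {
    std::size_t offset = 0;
    for (std::size_t i = 0; i < Rank; ++i) {
        offset = offset * dims[i] + index[i];
    }
    return offset;
}
```

For dimensions {3, 2}, flat_size gives 6, and under the row-major assumption the last element, at 0-based index {2, 1}, sits at offset 5.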

Accessing elements

wl::part can be used to access elements in an array efficiently. wl::part uses 1-based indexing by default:

auto list = wl::range(5);   // {1, 2, 3, 4, 5}
int& x = wl::part(list, 3); // x equals 3
x = -3;                     // changes 3 to -3 in list

wl::part can take multiple indices to extract elements from multi-dimensional arrays:

wl::part(a, 2, 4, -1);  // means a[[2, 4, -1]] in Wolfram Language

0-based indexing is specified by wl::cidx, e.g.

int x = wl::part(wl::range(5), wl::cidx(3)); // x equals 4
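The two indexing conventions differ only in how an index is mapped to a position. The helper below is a hypothetical sketch of that mapping in standard C++, not part of MathCompile: it converts a Wolfram-style index (1-based, with negative values counting from the end) to a 0-based offset.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: convert a Wolfram-style index into a 0-based offset.
//   index > 0 counts from the front (1 is the first element);
//   index < 0 counts from the back (-1 is the last element);
//   index == 0 or an out-of-range index is rejected.
constexpr std::size_t to_offset(std::int64_t index, std::size_t size) {
    const auto n = static_cast<std::int64_t>(size);
    if (index > 0 && index <= n) {
        return static_cast<std::size_t>(index - 1);
    }
    if (index < 0 && index >= -n) {
        return static_cast<std::size_t>(n + index);
    }
    throw std::out_of_range("index is zero or out of range");
}
```

For a list of length 5, index 3 maps to offset 2 and index -1 maps to offset 4, while a 0-based index as used with wl::cidx is already the offset.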

String patterns

Functions that involve string patterns, e.g. StringMatchQ, StringReplace, etc., depend on the PCRE2 library. The MathCompile package includes precompiled PCRE2 for several platforms in MathCompile/LibraryResources. If these functions are used in C++ source code, the library must be provided during linking.

Options to CompileToBinary

The following options can be given to CompileToBinary:

Option name           Default value  Meaning
"TargetDirectory"     Automatic      output directory of the compiled library
"WorkingDirectory"    Automatic      directory for temporary files
"Debug"               False          passes debug flags to the C++ compiler
"MonitorAbort"        True           allows interruption by Abort[]
"Defines"             {}             preprocessor definitions
"CompileOptions"      ""             options passed to the C++ compiler
"LinkerOptions"       ""             options passed to the linker
"Includes"            {}             additional include files
"IncludeDirectories"  {}             directories for header file lookup
"Libraries"           {}             library dependencies
"LibraryDirectories"  {}             directories for library lookup

External BLAS and LAPACK

MathCompile supports linear algebra functions based on Eigen. Alternatively, you can link external BLAS and LAPACK libraries with a compiled function to replace Eigen.

For now, you need to define the corresponding macros to tell MathCompile to use external libraries: define WL_USE_CBLAS for a BLAS library, and define WL_USE_LAPACKE for a LAPACK library. In the following example, Intel MKL is linked and the code is compiled by the Microsoft Visual C++ compiler, where <path-to-mkl> is the installation path of MKL:

f = CompileToBinary[Function[{}, Inverse@RandomReal[1., {5, 5}]],
    "Defines"            -> {"WL_USE_LAPACKE", "WL_USE_CBLAS"},
    "IncludeDirectories" -> "<path-to-mkl>/include",
    "Includes"           -> "mkl.h",
    "LibraryDirectories" -> "<path-to-mkl>/lib/intel64_win",
    "Libraries"          -> {"mkl_intel_lp64","mkl_sequential","mkl_core"}
  ]

Caveats

1. Arrays must be rectangular

Example

f = CompileToBinary[Function[{}, Table[Range[i], {i, 5}]]]

Attempting to create a ragged array causes a run-time error.

2. Avoid signed integer overflow

Example

f = CompileToBinary[Function[{Typed[x, Integer]}, x + 1]];

f[10] evaluates to 11, but the behavior of f[2^63 - 1] is undefined.
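When the generated function is called from your own C++ code, one way to stay clear of the undefined behavior is to test the operands before adding. The function below is a hypothetical guard written in standard C++; MathCompile does not insert such a check.

```cpp
#include <cstdint>
#include <limits>

// True when a + b would overflow int64_t. The check itself never overflows,
// because it only subtracts b from the limit on the side where that is safe.
constexpr bool add_overflows(std::int64_t a, std::int64_t b) {
    constexpr auto max = std::numeric_limits<std::int64_t>::max();
    constexpr auto min = std::numeric_limits<std::int64_t>::min();
    return (b > 0 && a > max - b) || (b < 0 && a < min - b);
}
```

add_overflows(10, 1) is false, while add_overflows(2^63 - 1, 1) is true, so a caller can reject or widen the input before calling a function that computes x + 1.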

3. The evaluation order of function arguments is generally unspecified

Example

f = CompileToBinary[Function[{}, Module[{y = 0}, {y += 5, y *= 5}; y]]]

The order of evaluating y += 5 and y *= 5 is not specified; f[] may give either 5 or 25.
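The two possible outcomes correspond to the two orders in which the updates can run. The sketch below spells out each order as separate statements in plain C++ (hypothetical function names), which is also the general remedy: when the order matters, write the steps as a sequence of statements rather than as arguments of one call.

```cpp
#include <cstdint>

// y += 5 runs first, then y *= 5: (0 + 5) * 5.
constexpr std::int64_t add_then_multiply() {
    std::int64_t y = 0;
    y += 5;
    y *= 5;
    return y;
}

// y *= 5 runs first, then y += 5: (0 * 5) + 5.
constexpr std::int64_t multiply_then_add() {
    std::int64_t y = 0;
    y *= 5;
    y += 5;
    return y;
}
```

The first order yields 25 and the second yields 5, the two values mentioned above.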

4. Avoid reading or writing an invalidated variable

Example

Module[{g, x = 3}, g = Module[{y = 4}, x + y &]; g[]]

In the definition of the function g, the local variable y is referenced, but by the time g[] is called, y has been invalidated. A simple fix is to pull y out of the local scope:

Module[{g, x = 3, y = 4}, g = x + y &; g[]]  (* good *)
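The generated C++ shown earlier on this page wraps scopes in lambdas that capture by reference ([&]), so this caveat resembles a dangling reference in C++. The sketch below is a hand-written analogy, not MathCompile output, and make_sum is a hypothetical name. It shows a closure that stays valid because it captures copies; the unsafe shape appears only as a comment.

```cpp
// Safe: the closure stores its own copies of x and y, so it remains valid
// after make_sum returns.
constexpr auto make_sum(int x, int y) {
    return [x, y] { return x + y; };
}

// Unsafe shape, for contrast only (calling the result is undefined behavior):
//   auto make_sum_dangling(int x) {
//       int y = 4;
//       return [&] { return x + y; };  // x and y are gone once this returns
//   }
```

make_sum(3, 4)() evaluates to 7. In Wolfram Language code for MathCompile, the fix shown above, declaring y in the enclosing Module, achieves the same effect by making y live as long as g.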