Built-in Functions

angavrilov edited this page Sep 13, 2010 · 5 revisions

Standard Common Lisp functions

These standard functions are supported in GPU code without changes, or with slight limitations:

Arithmetic operations

  • abs
    Translated to the C abs, llabs, fabs or fabsf function, depending on the argument type.
  • + - * / 1+ 1-
    These operations are directly translated to C arithmetic operators, and thus follow C type promotion and rounding rules: e.g. integer division truncates.

As a special exception, the one-argument form (/ x) always returns a floating-point value, because an integer reciprocal would be useless and dangerous: integer division truncates it to 0 for any |x| > 1.
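As a rough host-side illustration (plain Common Lisp, not GPU code), the C-style behaviour described above can be modelled with truncate. The helper names c-int-div and gpu-reciprocal are hypothetical and exist only for this sketch; they are not part of the package.

```lisp
;; Host-side model only: these helpers are hypothetical and merely mimic
;; the C semantics described above. They are not GPU code.

(defun c-int-div (a b)
  "Model C integer division: the quotient is truncated toward zero."
  (values (truncate a b)))

(defun gpu-reciprocal (x)
  "Model the (/ x) exception: the result is always floating-point."
  (/ 1.0 x))

(c-int-div 7 2)     ; => 3, whereas standard (/ 7 2) yields the ratio 7/2
(c-int-div -7 2)    ; => -3, truncation is toward zero
(gpu-reciprocal 4)  ; => 0.25, neither the ratio 1/4 nor a truncated 0
```

The point of the sketch is the contrast with host Lisp, where / on integers produces exact rationals instead of truncating.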

Boolean operations and comparisons

  • t nil
    These constants are interpreted as having boolean type; nil may also be used as an init expression for a declared variable of any type to specify the lack of initialization.
  • and or
    These support only boolean arguments, and return booleans.
  • zerop < <= = /= >= >
  • max min
    Should behave as specified by the Common Lisp standard.
  • nonzerop
    A counterpart to zerop, defined by this package.
  • eq eql
    Basically equivalent to =, implemented for better macro support.

Bitwise logic operations

  • logand logior logxor logeqv lognot
  • logandc1 logandc2 lognand lognor logorc1 logorc2

Transcendental functions

  • sin asin sinh asinh
  • cos acos cosh acosh
  • tan atan tanh atanh
  • exp sqrt log expt

These functions are mapped to C standard library calls and have equivalent precision properties. Where the result would normally be a complex value, NaN is returned.
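For contrast, standard Common Lisp on the host returns complex numbers in these cases. The forms below are ordinary host Lisp, not GPU code; per the description above, the same calls in GPU code would yield NaN instead.

```lisp
;; Plain Common Lisp evaluated on the host, shown only for contrast.
(sqrt -1.0)             ; a complex number on the host
(complexp (sqrt -1.0))  ; => T
(complexp (log -1.0))   ; => T
```

Code that relies on complex results from these functions therefore cannot be moved into a kernel unchanged.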

Rounding functions

Unlike the standard Lisp library versions, these functions return only one value:

  • ffloor fceiling ftruncate fround
    Implemented via equivalent library calls; always return a floating-point value.
  • floor ceiling truncate round
    Always return an integer. In complex cases they are implemented via the floating-point equivalents.
  • rem
    Follows standard C arithmetic promotion rules for arguments. Difficult integer cases are implemented through floating-point code.
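To make the single-value difference concrete, here is a small host-side sketch in plain Common Lisp (not GPU code). It shows the two values the standard functions return and uses nth-value to keep only the primary one, which is what the GPU versions are described as returning.

```lisp
;; Host Lisp: the standard rounding functions return two values,
;; the quotient and the remainder.
(multiple-value-list (floor 7 2))      ; => (3 1)
(multiple-value-list (ffloor 7.5))     ; => (7.0 0.5)

;; Keeping only the primary value models the behaviour described above.
(nth-value 0 (floor 7 2))              ; => 3
(nth-value 0 (truncate -7 2))          ; => -3

;; For integers, rem takes the sign of the dividend, as C's % does.
(rem -7 2)                             ; => -1
```

Host code that destructures the second value, for example with multiple-value-bind, would need to compute the remainder separately in GPU code.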

GPU-specific functions

These functions are specific to GPU code:

Array access functions

  • (aref array indexes…)
  • (array-total-size array)
  • (array-dimension array dim-idx)

Identical to the standard Lisp versions.

  • (raw-aref array index)
    Because GPU arrays frequently use pitched allocation to achieve proper data alignment, row-major-aref is difficult to implement transparently. raw-aref is its functional equivalent that acknowledges the presence of alignment holes in its index range.
  • (array-raw-extent array)
    Returns the size of the index range for raw-aref. For pitched arrays it is greater than array-total-size.
  • (array-raw-stride array dim-idx)
    Returns the stepping for the corresponding dimension with corrections for pitch.
  • (array-raw-index array indexes…)
    Similar to array-row-major-index, but intended to be used with raw-aref.

These expressions are equivalent:

(aref a i j k) 
(raw-aref a (array-raw-index a i j k))
(raw-aref a (+ (* i (array-raw-stride a 0)) (* j (array-raw-stride a 1)) k))

The value of the array parameter must be statically resolvable to the original global variable, kernel parameter or fully declared local array variable.
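The relationship between strides, raw indexes and the raw extent can be sketched on the host in plain Common Lisp. This is only a model: pitched-strides and raw-index are hypothetical helpers, and the padding scheme (innermost rows padded to a fixed number of elements) is an assumption made for illustration, not a statement about how the package lays out memory.

```lisp
;; Host-side model only. PITCHED-STRIDES and RAW-INDEX are hypothetical
;; helpers; padding each innermost row to PITCH elements is an assumption.

(defun pitched-strides (dims pitch)
  "Return per-dimension element strides when innermost rows occupy PITCH elements."
  (let* ((rank (length dims))
         (strides (make-list rank :initial-element 1)))
    ;; The innermost stride stays 1; walk outwards from the next dimension.
    (loop for k from (- rank 2) downto 0
          do (setf (nth k strides)
                   (if (= k (- rank 2))
                       pitch
                       (* (nth (1+ k) strides) (nth (1+ k) dims)))))
    strides))

(defun raw-index (strides indexes)
  "Sum of index * stride, mirroring the expanded expression above."
  (reduce #'+ (mapcar #'* indexes strides)))

(defparameter *dims* '(2 3 5))                        ; a 2x3x5 array
(defparameter *strides* (pitched-strides *dims* 8))   ; => (24 8 1)

(raw-index *strides* '(1 2 3))           ; => 43
(* (first *dims*) (first *strides*))     ; => 48, modelled raw extent
(reduce #'* *dims*)                      ; => 30, total size without padding
```

In this model the raw extent (48) exceeds the total size (30) because each row of 5 elements is followed by 3 unused padding slots.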

Tuples

Tuples are special fixed-size vectors that the hardware supports directly. The exact implementation, the allowed combinations of type and size, and the supported operations depend heavily on the target. The tuple type is denoted by (tuple elt-type size).

  • (tuple x y z…)
    Creates a tuple out of its arguments. The element type is determined using rules similar to the ones used by C arithmetic operations, but without mandatory upgrading to int.
  • (untuple val) → x, y, z…
    Unpacks a tuple into multiple returned values.
  • (tuple-aref array indexes…)
    Accesses the innermost dimension of the array as a tuple. The list of indexes must contain one fewer value than the rank of the array. The innermost dimension must have a constant size that is allowed for a tuple.
  • (tuple-raw-aref array index size)
    Accesses size elements starting at index as a tuple. The size argument must be specified as an integer constant.

Thread grid dimensions

The following functions and symbol macros may be used to retrieve thread grid dimensions and indexes.

The macros expand to invocations of the corresponding functions; “x”, “y” and “z” correspond to dimensions 0, 1 and 2. Without arguments the functions return information for all dimensions as a tuple.

  • thread-idx
  • thread-idx-x thread-idx-y thread-idx-z
  • (thread-index &optional dim-idx)
    Retrieves the index of the current thread within the block.
  • thread-cnt
  • thread-cnt-x thread-cnt-y thread-cnt-z
  • (thread-count &optional dim-idx)
    Retrieves the in-block thread grid dimensions.
  • block-idx
  • block-idx-x block-idx-y block-idx-z
  • (block-index &optional dim-idx)
    Retrieves the index of the current block within the global grid.
  • block-cnt
  • block-cnt-x block-cnt-y block-cnt-z
  • (block-count &optional dim-idx)
    Retrieves the dimensions of the global block grid.

The number of supported dimensions depends on the compilation target. The dimension index argument must be specified as an integer constant.
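A common use of these values is to derive a grid-wide index for the current thread along one dimension. The sketch below models that arithmetic in plain host Lisp; global-thread-index and its parameter names are hypothetical. In GPU code the same computation would presumably be written with the symbol macros listed above, for example as (+ (* block-idx-x thread-cnt-x) thread-idx-x), but that form is an assumption and is not taken from the package documentation.

```lisp
;; Host-side model only; GLOBAL-THREAD-INDEX is a hypothetical helper.
(defun global-thread-index (block-index threads-per-block thread-index)
  "Combine a block index and an in-block thread index into a grid-wide index."
  (+ (* block-index threads-per-block) thread-index))

;; Thread 5 of block 2, with 64 threads per block along this dimension.
(global-thread-index 2 64 5)   ; => 133
```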

Thread synchronization

The following built-in function can be used for thread synchronization:

  • (barrier &optional mode)

The mode may be:

  • :block (default)
    Waits until all threads in the block reach the same point.
  • :block-fence
    Ensures that all preceding global and shared variable writes are visible to other threads in the block, and that all following reads will see up-to-date data.
  • :grid-fence
    Likewise, but for all currently running blocks.
  • :system-fence
    Likewise for the whole system, including the implicit PCI-E bus transfers to/from main memory (requires Fermi).

The set of actually supported modes depends on the target.