# BOAST Introduction Tutorial

Documentation can be found here: [online documentation](http://www.rubydoc.info/github/Nanosim-LIG/boast/master).

## Simple Definitions and Declarations
BOAST is a Ruby library:

In [1]:
require 'BOAST'

true

Defining and declaring simple variables; a variable's name can be anything that evaluates to a string. Note that by default BOAST writes to the standard output and generates FORTRAN.

In [2]:
a = BOAST::Int "a"
b = BOAST::Real "b"
BOAST::decl a, b

integer(kind=4) :: a
real(kind=8) :: b


[a, b]

Defining a procedure construct, opening and closing it:

In [3]:
p = BOAST::Procedure("test_proc", [a , b ] )
BOAST::opn p
BOAST::close p
nil;

SUBROUTINE test_proc(a, b)
  integer, parameter :: wp=kind(1.0d0)
  integer(kind=4) :: a
  real(kind=8) :: b
END SUBROUTINE test_proc


Changing the language used by BOAST:

In [4]:
BOAST::lang = BOAST::C
BOAST::opn p
BOAST::close p
nil;

void test_proc(int32_t a, double b){
}


BOAST procedure parameters should be declared as input, output or input-output parameters, using the `:dir` option:

In [5]:
a = BOAST::Real("a",:dir => :in)
b = BOAST::Real("b",:dir => :out)
p = BOAST::Procedure("test_proc", [a , b ] ) {
  BOAST::pr b === a + 2
}
BOAST::lang = BOAST::FORTRAN
BOAST::pr p
nil;

SUBROUTINE test_proc(a, b)
  integer, parameter :: wp=kind(1.0d0)
  real(kind=8), intent(in) :: a
  real(kind=8), intent(out) :: b
  b = a + 2
END SUBROUTINE test_proc


## Creating and Calling a Kernel

Writing ```BOAST::``` all the time is tedious, so let's import BOAST's namespace into the global namespace:

In [6]:
include BOAST

Object

### Creating a Kernel
Defining a procedure that takes arrays as parameters. Notice that, akin to FORTRAN, arrays start at index 1 by default.

In [7]:
n = Int("n" , :dir => :in)
a = Real("a", :dir => :in, :dim => [Dim(n)])
b = Real("b", :dir => :out, :dim => [Dim(n)])
p = Procedure("vector_increment", [n, a, b]) {
  decl i = Int("i")
  pr For(i, 1, n) {
    pr b[i] === a[i] + 2
  }
}

SUBROUTINE vector_increment(n, a, b)
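
To make the indexing convention concrete, here is a minimal plain-Ruby model of what the loop in `vector_increment` computes. It is only an illustration, not code generated by BOAST: the name `vector_increment_reference` is hypothetical, and the sketch assumes the loop bounds `1` and `n` are both inclusive, as in a FORTRAN `do` loop.

```ruby
# Plain-Ruby model of the loop above (illustrative, not BOAST output).
# i runs from 1 to n inclusive; element i of a 1-based array is stored
# at Ruby index i - 1.
def vector_increment_reference(a)
  n = a.length
  b = Array.new(n, 0.0)
  (1..n).each do |i|
    b[i - 1] = a[i - 1] + 2
  end
  b
end

puts vector_increment_reference([1.0, 2.5, 4.0]).inspect
```

A reference like this can also serve as the expected result when checking a generated kernel.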

Creating a computing kernel from a procedure is straightforward if you have only one procedure:

In [8]:
k = p.ckernel
nil

### Building and Calling a Kernel
Building the kernel using BOAST's default compilation flags:

In [9]:
k.build
nil

To see what BOAST does during the build, switch it to verbose mode. Notice the three compilation phases:

In [10]:
set_verbose(true)
k.build
nil

gcc -O2 -Wall  -fPIC -I/usr/lib/x86_64-linux-gnu/ruby/2.5.0 -I/usr/include/ruby-2.5.0 -I/usr/include/ruby-2.5.0/x86_64-linux-gnu -I/usr/include/x86_64-linux-gnu/ruby-2.5.0 -I/var/lib/gems/2.5.0/extensions/x86_64-linux/2.5.0/narray-0.6.1.2 -march=native -DHAVE_NARRAY_H -c -o /tmp/Mod_vector_increment20180705_13190_16hmbg.o /tmp/Mod_vector_increment20180705_13190_16hmbg.c
gfortran -O2 -Wall -fPIC -march=native -c -o /tmp/vector_increment20180705_13190_16hmbg.o /tmp/vector_increment20180705_13190_16hmbg.f90
gcc -shared -o /tmp/Mod_vector_increment20180705_13190_16hmbg.so /tmp/Mod_vector_increment20180705_13190_16hmbg.o /tmp/vector_increment20180705_13190_16hmbg.o  -Wl,-Bsymbolic-functions -Wl,-z,relro -rdynamic -Wl,-export-dynamic  -L/usr/lib -march=native -lruby-2.5 -lrt


In order to call the kernel we need memory areas for the input and output parameters. For this we use the NArray library (C arrays wrapped in Ruby).

In [11]:
input = NArray.float(1024).random
output = NArray.float(1024)
nil

Running and checking result:

In [12]:
k.run(input.length, input, output)
raise "Error !" if (output - input - 2).abs.max > 1e-15
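
The check in the cell above reduces the whole array to one number, the largest absolute deviation, and compares it to a tolerance. The same idea can be sketched in plain Ruby without NArray. The helper name `max_abs_error` and the sample values are assumptions made for this illustration.

```ruby
# Largest absolute difference between two equally sized sequences.
def max_abs_error(expected, actual)
  raise ArgumentError, "size mismatch" unless expected.size == actual.size
  expected.zip(actual).map { |e, x| (e - x).abs }.max || 0.0
end

# Hand-written stand-ins for the kernel input and output.
sample_input  = [0.0, 0.25, 0.5]
sample_output = [2.0, 2.25, 2.5]
expected      = sample_input.map { |v| v + 2 }

error = max_abs_error(expected, sample_output)
raise "Error !" if error > 1e-15
puts "max abs error: #{error}"
```

Reducing to a single maximum keeps the check to one comparison, and the value can be reported when the check fails.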

Taking a performance measurement:

In [13]:
stats = k.run(input.length, input, output)
puts " #{ stats[:duration]} s"

 5.082000000000001e-06 s
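
A single run of a few microseconds is noisy, so it can help to repeat the measurement and keep the smallest duration. The sketch below assumes only what the cell above shows, namely that a run returns a hash with a `:duration` entry in seconds. The helper `best_duration` is hypothetical, and canned durations stand in for a real kernel so the example runs on its own.

```ruby
# Calls the block `repetitions` times and returns the smallest :duration.
# The block must return a hash with a :duration entry (in seconds).
def best_duration(repetitions)
  raise ArgumentError, "repetitions must be positive" unless repetitions > 0
  Array.new(repetitions) do
    stats = yield
    stats[:duration]
  end.min
end

# Canned durations standing in for real kernel runs.
durations = [5.1e-06, 4.8e-06, 6.3e-06]
calls = 0
best = best_duration(durations.length) do
  stats = { :duration => durations[calls] }
  calls += 1
  stats
end
puts " #{best} s"
```

With the kernel built earlier, the block would simply be `{ k.run(input.length, input, output) }`.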


## Metaprogramming Example
This kernel differs substantially between OpenCL and C/FORTRAN, so we encapsulate it in a function that returns a different kernel depending on the current language.

In [19]:
set_verbose(false)
set_array_start(0)
def vector_add
  n = Int("n", :dir => :in, :signed => false)
  a = Real("a", :dir => :in, :dim => [Dim(n)])
  b = Real("b", :dir => :in, :dim => [Dim(n)])
  c = Real("c", :dir => :out, :dim => [Dim(n)])
  i = Int("i")
  pr p = Procedure("vector_add", [n, a, b, c]) {
    decl i
    if [CL, CUDA].include?(get_lang) then
      pr i === get_global_id(0)
      pr c[i] === a[i] + b[i]
    else
      pr For(i, 0, n - 1) {
        pr c[i] === a[i] + b[i]
      }
    end
  }
  return p.ckernel
end

:vector_add
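
The branching idea in the function above can be shown in isolation with a toy, string-based generator. This is not how BOAST works internally. It is a hypothetical sketch in which `toy_vector_add_body` and the language symbols `:c` and `:cl` are made up for the illustration.

```ruby
# Toy generator: returns the body source for a given target language.
# :cl maps one work-item to one element; :c uses an explicit loop.
def toy_vector_add_body(lang)
  case lang
  when :cl
    "i = get_global_id(0);\nc[i] = a[i] + b[i];"
  when :c
    "for (i = 0; i <= n - 1; i += 1) {\n  c[i] = a[i] + b[i];\n}"
  else
    raise ArgumentError, "unsupported language: #{lang}"
  end
end

[:c, :cl].each do |lang|
  puts "#{lang}:"
  puts toy_vector_add_body(lang)
end
```

The element-wise statement is identical in both branches. Only the way the index `i` is obtained changes, which is the part worth isolating behind a language test.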

In [20]:
n = 1024*1024
a = NArray.float( n ).random!
b = NArray.float( n ).random!
c = NArray.float( n )
epsilon = 10e-15
c_ref = a + b
nil

In [21]:
[:FORTRAN, :C, :CL].each {|l|
  puts "#{l}:"
  push_env( :lang => BOAST.const_get(l) ) {
    k = vector_add
    puts k.print
    c.random!
    k.run(n, a, b, c, :global_work_size => [n ,1 ,1], :local_work_size => [32 ,1 ,1])
    diff = (c_ref - c).abs
    diff.each {|elem|
      raise "Warning: residue too big: #{elem}" if elem > epsilon
    }
  }
}
puts "Success !"

FORTRAN:
SUBROUTINE vector_add(n, a, b, c)
  integer, parameter :: wp=kind(1.0d0)
  integer(kind=4), intent(in) :: n
  real(kind=8), intent(in), dimension(0:n - (1)) :: a
  real(kind=8), intent(in), dimension(0:n - (1)) :: b
  real(kind=8), intent(out), dimension(0:n - (1)) :: c
  integer(kind=4) :: i
  do i = 0, n - (1), 1
    c(i) = a(i) + b(i)
  end do
END SUBROUTINE vector_add


C:
void vector_add(const uint32_t n, const double * a, const double * b, double * c){
  int32_t i;
  for (i = 0; i <= n - (1); i += 1) {
    c[i] = a[i] + b[i];
  }
}


CL:
__kernel void vector_add(const uint n, const __global double * a, const __global double * b, __global double * c){
  int i;
  i = get_global_id(0);
  c[i] = a[i] + b[i];
}


Success !
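
The OpenCL run above uses a global work size of `n` and a local work size of 32, which fits because `n = 1024*1024` is a multiple of 32. OpenCL 1.x requires the global size to be a multiple of the local size, so for other values of `n` one common approach is to round the global size up. The helper below is a hypothetical sketch of that rounding. Note that the padded work-items would then need an `i < n` guard in the kernel, which the kernel above does not have.

```ruby
# Smallest multiple of `local` that is >= `n` (both positive integers).
def round_up_work_size(n, local)
  raise ArgumentError, "sizes must be positive" unless n > 0 && local > 0
  ((n + local - 1) / local) * local
end

puts round_up_work_size(1024 * 1024, 32)  # already a multiple of 32
puts round_up_work_size(1000, 32)         # padded to the next multiple
```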
