# Serialbox Tutorial

This notebook walks through an example that familiarizes developers with the basic Serialbox features for saving data from Fortran and loading that data into Python.

The developer writes their Fortran code below the `%%writefile serialBox_tutorial.F90` line in the following cell. Once the code is complete, running the cell writes it to a Fortran file, `serialBox_tutorial.F90`, that is compiled later.

The Fortran code should use Serialbox to do the following:

* Initialize Serialbox, have the serialized data written to a directory called `./data`, and set the Serialbox file prefix to `example`.
* Create a Serialbox savepoint called `input_data` that contains all the serialized data.
* Create an integer scalar with the value 7 and write it to a Serialbox variable called `int0`.
* Create a real scalar with the value 8.9 and write it to a Serialbox variable called `real0`.
* Create a 2D double precision array of size (10,11) whose value at index `(i,j)` is `(j-1) + i + 0.1`, filled by looping over `i` and `j`. Save the array to a Serialbox variable called `dp_arr0`.
* Create a Fortran derived data type that contains an integer, a real, and a 2D double precision array set to the same values as in the three previous bullets (integer equal to 7, real equal to 8.9, and so on), then serialize all three members. Write the integer to a Serialbox variable called `ddt_int0`, the real to `ddt_real0`, and the 2D double precision array to `ddt_arr0`.
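
As a quick check of the array formula in the list above, the following minimal Python sketch builds the same 10 x 11 table with plain lists. It is not part of the Fortran solution; `expected_value` and `table` are hypothetical names introduced here, and the sketch assumes Fortran's 1-based indices `i = 1..10` and `j = 1..11`.

```python
# Sketch: expected contents of the (10, 11) array, using Fortran-style
# 1-based indices i = 1..10 and j = 1..11.

def expected_value(i, j):
    """Value stored at Fortran index (i, j): (j - 1) + i + 0.1."""
    return (j - 1) + i + 0.1

NI, NJ = 10, 11

# Python lists are 0-based, so Fortran (i, j) lives at table[i - 1][j - 1].
table = [[expected_value(i, j) for j in range(1, NJ + 1)]
         for i in range(1, NI + 1)]

print(table[0][0])            # element at Fortran index (1, 1)
print(table[NI - 1][NJ - 1])  # element at Fortran index (10, 11)
```

By the formula, the element at Fortran index (1,1) is 1.1 and the element at (10,11) is 20.1. The off-by-one shift between the two index conventions is the detail to keep in mind when building a reference array on the Python side.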

For a helpful reference, see slide 8. The Serialbox directive API is summarized below:

* `!$ser init directory='Location for data files' prefix='Prefix for data files'` - Call to initialize Serialbox

* `!$ser savepoint 'SavepointName'` - Create a new ‘savepoint’ to collect data

* `!$ser data var_othername=var_fortranname` - Save fields with simple syntax

* `!$ser verbatim if (x > 0.0) then …` - Execute additional Fortran code only when Serialbox is in use

* `!$ser <on/off>` - Turn serializing on or off for different segments of code
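
Every directive in the list above starts with `!`, so a Fortran compiler that sees the unprocessed file treats it as an ordinary comment. The short Python sketch below illustrates that line format by scanning a Fortran snippet for directive lines. It is only an illustration under stated assumptions: the regular expression and the `find_directives` helper are hypothetical and are not how `pp_ser.py` is actually implemented.

```python
import re

# Assumption for this sketch: a directive is a line whose first non-blank
# characters are exactly "!$ser". This is NOT pp_ser.py's actual parser.
DIRECTIVE = re.compile(r"^\s*!\$ser\s+(\w+)\s*(.*)$")

def find_directives(source):
    """Return (keyword, arguments) pairs for each directive line in source."""
    found = []
    for line in source.splitlines():
        match = DIRECTIVE.match(line)
        if match:
            found.append((match.group(1), match.group(2).strip()))
    return found

fortran_source = """
program demo
    implicit none
    integer :: x
    !$ser init directory='./data' prefix='example'
    x = 7
    !$ser savepoint 'input_data'
    !$ser data x_saved=x
    ! an ordinary comment
end program demo
"""

directives = find_directives(fortran_source)
for keyword, arguments in directives:
    print(keyword, "->", arguments)
```

In this snippet the scan reports the `init`, `savepoint`, and `data` lines and skips the ordinary comment. Note the `name=variable` form of the `data` directive: the name on the left is the one later used to read the field back.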

In [12]:
%%writefile serialBox_tutorial.F90

module mod0
    type der_data_type
    
        integer :: int0
        real    :: real0
            
        double precision, dimension(:,:), allocatable :: dp_arr0
            
    end type der_data_type
end module mod0

program serialBox_tutorial
    use mod0
    implicit none

    integer          :: int0, ii, jj
    real             :: real0

    double precision, dimension(:,:), allocatable :: dp_arr0

    type(der_data_type) :: dd_Type
            
    ! Initialize Serialbox
    !$ser init directory='./data' prefix='example' unique_id=.true.
    !$ser mode write
    !$ser on
    
    ! Set up the data as indicated in the above cell
    int0  = 7
    real0 = 8.9

    allocate(dp_arr0(10,11))

    dd_Type%int0  = 7
    dd_Type%real0 = 8.9
    allocate(dd_Type%dp_arr0(10,11))
    
    do jj = 1, 11
        do ii = 1, 10
            dp_arr0(ii,jj) = (jj-1) + ii + 0.1
            dd_Type%dp_arr0(ii,jj) = (jj-1) + ii + 0.1
            !write(*,*) 'dp_arr0(', ii, ',', jj, ') = ', dp_arr0(ii,jj)
        enddo
    enddo
    ! Write out the data as indicated in the above cell using Serialbox
    !$ser savepoint 'input_data'
    !$ser data int0=int0 real0=real0
    !$ser data dp_arr0=dp_arr0
    !$ser data ddt_int0=dd_Type%int0 ddt_real0=dd_Type%real0 ddt_arr0=dd_Type%dp_arr0
    !!$ser cleanup
   
end program

Overwriting serialBox_tutorial.F90


Once the Fortran file is written, the developer can run the cell below, which uses a Bash environment to execute the `pp_ser.py` Python script. The script creates a Fortran file, `s_serialBox_tutorial.F90`, that contains the appropriate Serialbox library calls. The generated code is then compiled with `gfortran` and executed.

In [13]:
%%bash

[ -f tutorial_run ] && rm tutorial_run
[ -f s_serialBox_tutorial.F90 ] && rm s_serialBox_tutorial.F90

python3 ${SERIALBOX_ROOT}/python/pp_ser/pp_ser.py -s -v --output=s_serialBox_tutorial.F90 serialBox_tutorial.F90

gfortran -O3 -cpp -DSERIALIZE \
    -o tutorial_run s_serialBox_tutorial.F90 \
    -I${SERIALBOX_ROOT}/include \
    ${SERIALBOX_ROOT}/lib/libSerialboxFortran.a \
    ${SERIALBOX_ROOT}/lib/libSerialboxC.a \
    ${SERIALBOX_ROOT}/lib/libSerialboxCore.a \
    -lpthread -lstdc++ -lstdc++fs 
./tutorial_run

Processing file serialBox_tutorial.F90
 >>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<
 >>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<


Once the binary has run and written out the data, the developer can run the following cell, a Python script that verifies whether the data was serialized correctly. The script assumes that the data was set up exactly as specified in the bulleted list above. If the only message printed is `Finished running comparison tests!`, all the tests have passed.

In [14]:
#!/usr/bin/env python3

import numpy as np
import sys
import os
sys.path.append(os.environ.get('SERIALBOX_ROOT') + '/python')
import serialbox as ser

serializer = ser.Serializer(ser.OpenModeKind.Read,'./data', 'example')

sp = serializer.get_savepoint('input_data')

int0  = serializer.read('int0',  sp[0])[0]
real0 = serializer.read('real0', sp[0])[0]

dp_arr0   = serializer.read('dp_arr0',   sp[0])

ddt_int0    = serializer.read('ddt_int0', sp[0])
ddt_real0   = serializer.read('ddt_real0', sp[0])
ddt_dp_arr0 = serializer.read('ddt_arr0', sp[0])

int0_ref = 7
real0_ref = np.float32(8.9)

dp_arr0_ref = np.zeros((10,11), dtype=np.float32)

for j in range(11):
    for i in range(10):
        dp_arr0_ref[i,j] = j + i+1 + 0.1
try:
    assert int0_ref == int0, "int0 does not match!"
    assert real0_ref == real0, "real0 does not match!"
    assert np.array_equal(dp_arr0_ref, dp_arr0), "dp_arr0 does not match!"
    assert np.allclose(int0_ref, ddt_int0), "ddt_int0 does not match!"
    assert np.allclose(real0_ref, ddt_real0), "ddt_real0 does not match!"
    assert np.allclose(dp_arr0_ref, ddt_dp_arr0), "ddt_arr0 does not match!"
except AssertionError as msg:
    print(msg)
print("Finished running comparison tests!")

Finished running comparison tests!
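
The reference for `real0` in the cell above is `np.float32(8.9)` rather than the plain Python float `8.9`. A default Fortran `real` is typically single precision with `gfortran`, while a Python float is double precision, and 8.9 rounds to a slightly different value in each. The sketch below shows the effect using only the standard library; `to_float32` is a hypothetical helper that stands in for `np.float32`.

```python
import struct

def to_float32(value):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]

as_double = 8.9
as_single = to_float32(8.9)

print(as_double == as_single)             # False: the two roundings differ
print(abs(as_double - as_single) < 1e-6)  # True: they agree to single precision
```

This is why an exact comparison needs a reference of the same precision as the serialized value, while a tolerance-based check such as `np.allclose` also accepts a reference of a different precision.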


## Optional Exercise 2
Setting a savepoint can often produce an excessive amount of data. If data volume becomes an issue, or if the repeated data adds no unique tests, be selective about when data is saved: either wrap the savepoint in `!$ser verbatim if (condition) then` and `!$ser verbatim endif`, or use `!$ser on` and `!$ser off`.
In this example, try to save `arr` only for n = 1 and n = 50.
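
To get a feel for the savings, the sketch below estimates the raw data volume for the 100 x 100 x 100 double precision array used in this exercise when it is saved on every one of the 100 loop iterations, compared with saving it only for n = 1 and n = 50. The numbers are a rough estimate that ignores any metadata Serialbox writes; `should_save` is a hypothetical helper that mirrors the Fortran condition.

```python
# Sketch: count savepoints and estimate raw data volume for a
# 100 x 100 x 100 double precision array saved inside a 100-step loop.
# Metadata overhead is ignored (an assumption of this estimate).

BYTES_PER_DOUBLE = 8
ARRAY_BYTES = 100 * 100 * 100 * BYTES_PER_DOUBLE
STEPS = range(1, 101)

def should_save(n):
    """Mirror of the Fortran condition n == 1 .or. n == 50."""
    return n == 1 or n == 50

saves_always = len(STEPS)
saves_selected = sum(1 for n in STEPS if should_save(n))

print(saves_always, "saves:", saves_always * ARRAY_BYTES / 1e6, "MB")
print(saves_selected, "saves:", saves_selected * ARRAY_BYTES / 1e6, "MB")
```

Each save of the array is 8 MB of raw data, so saving on every iteration costs about 800 MB, while the two selected savepoints cost about 16 MB.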

In [35]:
%%writefile serialBox_tutorial2.F90


program serialBox_tutorial
    implicit none

    integer          :: ii, jj, kk, n

    double precision, dimension(100,100,100) :: arr

    ! Initialize Serialbox
    !$ser init directory='./data2' prefix='example2' unique_id=.true.
    !$ser mode write
    !$ser on

    ! Start from a defined state; the Python check below assumes zeros
    arr = 0.0d0
    do n = 1, 100
        do kk = 1, 100
            do jj = 2, 100
                do ii = 2, 100
                    arr(ii,jj,kk) = arr(ii,jj,kk) + (arr(ii-1,jj,kk) + arr(ii,jj-1,kk) + 2 * arr(ii,jj,kk)) / (4 * n)
                enddo
            enddo
        enddo
        !$ser verbatim if (n == 1 .or. n == 50) then
        !$ser savepoint 'loop_data'
        !$ser data arr=arr
        !$ser verbatim endif
    enddo
    !$ser cleanup
    
end program

Overwriting serialBox_tutorial2.F90


In [36]:
%%bash

[ -f tutorial_run2 ] && rm tutorial_run2
[ -f s_serialBox_tutorial2.F90 ] && rm s_serialBox_tutorial2.F90

python3 ${SERIALBOX_ROOT}/python/pp_ser/pp_ser.py -s -v --output=s_serialBox_tutorial2.F90 serialBox_tutorial2.F90
gfortran -O3 -cpp -DSERIALIZE \
    -o tutorial_run2 s_serialBox_tutorial2.F90 \
    -I${SERIALBOX_ROOT}/include \
    ${SERIALBOX_ROOT}/lib/libSerialboxFortran.a \
    ${SERIALBOX_ROOT}/lib/libSerialboxC.a \
    ${SERIALBOX_ROOT}/lib/libSerialboxCore.a \
    -lpthread -lstdc++ -lstdc++fs 
if [ -d ./data2 ]; then
   rm -r ./data2
fi
echo "Running"
./tutorial_run2

Processing file serialBox_tutorial2.F90
Running
 >>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<
 >>>>>>>>>>>>>>>>>><<<<<<<<<<<<<<<<<<


In [7]:
#!/usr/bin/env python3

import numpy as np
import sys
import os
sys.path.append(os.environ.get('SERIALBOX_ROOT') + '/python')
import serialbox as ser
import gt4py.gtscript as gtscript
import gt4py.storage as gt_storage

@gtscript.stencil(backend="numpy")
def update_arr(arr: gtscript.Field[np.float64], n: int):
    with computation(PARALLEL), interval(...):
        arr = arr + (arr[-1, 0, 0] + arr[0, -1, 0] + 2 * arr) / (4 * n)
        

serializer = ser.Serializer(ser.OpenModeKind.Read,'./data2', 'example2')
saved_arrs = []
for savepoint in serializer.savepoint_list():
    if savepoint.name == 'loop_data':
        saved_arrs.append(serializer.read('arr',   savepoint))
shape = saved_arrs[0].shape
arr0_ref = np.zeros(shape, dtype=np.float64)
for n in range(1, 101):
    update_arr(arr0_ref, n, origin=(1, 1, 0), domain=(shape[0] - 1, shape[1] - 1, shape[2]))
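    if n == 1 or n == 50:
        try:
            # Take the oldest saved array first, assuming savepoint_list()
            # returns savepoints in the order they were written (n = 1, then n = 50)
            assert np.array_equal(arr0_ref, saved_arrs.pop(0)), "arr does not match!"
        except AssertionError as msg:
            print(msg)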


print("Finished running comparison tests!")

Finished running comparison tests!
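
One subtlety when comparing a Fortran loop with a Python reference: the reference above starts from an all-zero array, and for zeros this update leaves every value at zero. For non-zero data, a loop that updates the array in place (each point sees neighbours already changed in the same sweep) and an update that reads only the values from before the sweep can give different results. The pure-Python sketch below illustrates that general distinction on a small 3 x 3 grid. The function names are hypothetical, and the sketch makes no claim about how any particular gt4py version evaluates the stencil.

```python
def sweep_in_place(grid, n):
    """Sequential sweep: each point sees neighbours already updated in this sweep."""
    rows, cols = len(grid), len(grid[0])
    for j in range(1, cols):
        for i in range(1, rows):
            grid[i][j] += (grid[i - 1][j] + grid[i][j - 1] + 2 * grid[i][j]) / (4 * n)

def sweep_from_copy(grid, n):
    """Parallel-style sweep: every point reads only values from before the sweep."""
    old = [row[:] for row in grid]
    rows, cols = len(grid), len(grid[0])
    for j in range(1, cols):
        for i in range(1, rows):
            grid[i][j] = old[i][j] + (old[i - 1][j] + old[i][j - 1] + 2 * old[i][j]) / (4 * n)

def make_grid(fill):
    """Build a 3 x 3 grid with every entry set to fill."""
    return [[fill] * 3 for _ in range(3)]

zeros_a, zeros_b = make_grid(0.0), make_grid(0.0)
sweep_in_place(zeros_a, 1)
sweep_from_copy(zeros_b, 1)
print(zeros_a == zeros_b)  # True: all zeros stay zero either way

ones_a, ones_b = make_grid(1.0), make_grid(1.0)
sweep_in_place(ones_a, 1)
sweep_from_copy(ones_b, 1)
print(ones_a == ones_b)    # False: later points saw updated neighbours
```

If the exercise is extended to non-zero initial data, it is worth checking which of the two behaviours the reference implementation has before treating a mismatch as a serialization error.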
