[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ericyang125/gpu-mp/blob/windef/Openpiv_Python_Tutorial.ipynb)

# Openpiv-python-gpu Tutorial

## Introduction
This tutorial demonstrates how to use the GPU functionality of openpiv-python-gpu.
The library provides two GPU-accelerated functions; both will be shown here.
Multiprocessing can be used to run multiple PIV processes on the same GPU or on multiple GPUs, and this will be demonstrated as well.

### openpiv.gpu_process.widim(frame_a, frame_b, **pars)
This advanced function uses window displacement and deformation to handle higher velocity gradients.

Parameters:

    frame_a, frame_b : array_like
        2D, Image data as arrays. Floats are cast to int32.
    window_sizes : tuple or int
        Window sizes to interrogate over. Default is (32, 16).
    iterations : tuple or int
        Number of iterations to perform. Default is 1.
    overlap_ratio : float
        Ratio of overlap of interrogation windows to size of windows. Default is 0.5.
    mask : array_like
        2D bool, mask to apply to the image field. If None, no points are masked. Default is None.
    win_def : bool
        Whether to apply deformation using the gradient computed on the results of the previous iteration. Default is True.
    dt : int
        Time separation between frames. The velocity vectors are scaled by the inverse of this quantity. Default is 1.
    nb_validation_iter : tuple or int
        Number of validation iterations to perform each iteration. Default is 1.
    median_tol : float
        Tolerance of the median validation. Equal to (value at point - median of surrounding points) / (median of (value of surrounding points - median of surrounding points)). Default is 2.
    trust_1st_iter : bool
        Whether the validation is performed on the first iteration. Typically can be set to False for 62 px windows. Default is True.

Returns:

    x, y : ndarray
        Coordinates of the velocity vectors.
    u, v : ndarray
        Resulting velocity field.
    mask : ndarray
        The masked points on the returned coordinates.
    s2n : ndarray
        The signal-to-noise ratio at each point of the result.
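
The `median_tol` criterion described above can be illustrated with a small NumPy sketch. This is not the library's implementation: the function name `median_residual`, the 3x3 neighbourhood, the use of absolute values, and the small `eps` term that avoids division by zero are all assumptions made for illustration.

```python
import numpy as np


def median_residual(u, eps=0.1):
    """Normalized median residual on the interior points of a 2D field.

    Hypothetical helper: uses the 8 surrounding points of each interior point.
    eps is an assumed regularization term that prevents division by zero.
    """
    rows, cols = u.shape
    residual = np.zeros((rows, cols), dtype=float)
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            block = u[i - 1:i + 2, j - 1:j + 2].astype(float).ravel()
            neighbours = np.delete(block, 4)  # drop the centre value
            med = np.median(neighbours)
            spread = np.median(np.abs(neighbours - med))
            residual[i, j] = abs(u[i, j] - med) / (spread + eps)
    return residual


# A uniform field with one spurious vector in the middle.
u = np.ones((5, 5))
u[2, 2] = 10.0

residual = median_residual(u)
outliers = residual > 2  # 2 is the documented default of median_tol
```

In this toy field only the centre point exceeds the tolerance, so it is the only vector that a validation step of this kind would flag.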

### Example

In [None]:
# Show the available GPU(s) and driver version.
!nvidia-smi

In [None]:
import os
import numpy
import imageio as io

# import the gpu module and the tools module
import openpiv.gpu_process as gpu_process
import openpiv.tools as tools

In [4]:
# PIV parameters
pars = {
    'window_sizes': (32, 16),
    'iterations': (1, 2),
    'overlap_ratio': 0.5,
    'mask': None,
    'win_def': True,
    'dt': 1,
    'nb_validation_iter': 1,
    'median_tol': 2,
    'trust_1st_iter': False,
}
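
As a rough sanity check on `window_sizes` and `overlap_ratio`, the spacing between vectors follows directly from the definition of the overlap ratio. The sketch below is an illustration only: `vector_spacing` and `vectors_per_axis` are hypothetical helpers, the 512 px image size is made up, and the vector count assumes windows are tiled from the image edge without padding, which may differ from what the library does.

```python
def vector_spacing(window_size, overlap_ratio):
    """Distance in pixels between neighbouring interrogation windows."""
    return int(round(window_size * (1 - overlap_ratio)))


def vectors_per_axis(image_size, window_size, overlap_ratio):
    """Assumed count of windows that fit along one axis (no padding)."""
    spacing = vector_spacing(window_size, overlap_ratio)
    return (image_size - window_size) // spacing + 1


final_window = 16  # last entry of window_sizes in the parameters above
spacing = vector_spacing(final_window, 0.5)
count = vectors_per_axis(512, final_window, 0.5)  # 512 px is a made-up image size
print(spacing, count)
```

With a 16 px final window and an overlap ratio of 0.5, neighbouring vectors are 8 px apart.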

In [None]:
# The images can be loaded using imageio.
frame_a = io.imread('exp1_001_a.bmp')
frame_b = io.imread('exp1_001_b.bmp')
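
If the sample images are not at hand, a synthetic pair with a known shift can stand in for a quick check. This is a minimal sketch, assuming NumPy is installed; the names `synthetic_a` and `synthetic_b`, the image size, the particle density, and the shift are illustrative choices and not part of the tutorial data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sparse bright pixels act as stand-in particles; int32 matches the documented cast.
synthetic_a = ((rng.random((64, 64)) > 0.95) * 255).astype(np.int32)

# Shift the whole image by a known number of pixels along the columns (x-axis).
# np.roll wraps around, so the leftmost columns of synthetic_b come from the right edge.
shift_px = 3
synthetic_b = np.roll(synthetic_a, shift=shift_px, axis=1)
```

Away from the wrapped edge, the second image is an exact copy of the first displaced by `shift_px`, which gives a known reference displacement to compare any computed field against.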

In [3]:
# The velocity fields are computed.
# The module was imported under the alias gpu_process, so it is called through that name.
x, y, u, v, mask, s2n = gpu_process.WiDIM(frame_a, frame_b, **pars)

In [5]:
# Save the results to a text file.
tools.save(x, y, u, v, mask, s2n, 'exp1_001_gpu.txt')

In [None]:
# The results can be visualized using the openpiv.tools module.
tools.display_vector_field('exp1_001_gpu.txt', scale=5000, width=0.0025)

## Multiprocessing
Multiprocessing is useful when multiple GPUs are available or when a single process does not use all of the GPU memory.
Running multiple processes decreases batch processing time by making fuller use of the memory transfer bandwidth, which is a bottleneck.
However, there are diminishing returns because each GPU can only execute a single CUDA kernel at a time under the current implementation.

This section is a work in progress.

In [None]:
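from contextlib import redirect_stdout
from multiprocessing import current_process

gpus = 1  # assumed number of GPUs; set this to match your machine


def mp_func():
    # Importing gpu_process starts a CUDA context, so do it inside the worker.
    import openpiv.gpu_process as gpu_process

    # Pool workers have names such as 'ForkPoolWorker-1'; map the worker number to a GPU index.
    cpu_name = current_process().name
    k = (int(cpu_name[cpu_name.find('-') + 1:]) - 1) % gpus  # not yet used to select a device

    # Suppress stdout to avoid confusing output.
    with redirect_stdout(None):
        x, y, u, v, mask, s2n = gpu_process.WiDIM(frame_a, frame_b, **pars)

    return x, y, u, v, mask, s2n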
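
The worker function above computes a GPU index from the process name but does not yet use it. One possible way to complete the idea is sketched below, using only the standard library. The helpers `gpu_index_for_worker`, `init_worker`, `process_pair`, and `run_batch` are hypothetical names, and `process_pair` is a placeholder that does no PIV work. The sketch assumes that setting the standard `CUDA_VISIBLE_DEVICES` environment variable before the CUDA context is created is enough to pin a process to one GPU; this has not been verified against this library.

```python
import os
from multiprocessing import Pool, current_process


def gpu_index_for_worker(worker_name, n_gpus):
    """Map a pool worker name such as 'ForkPoolWorker-3' to a GPU index."""
    worker_number = int(worker_name.rsplit('-', 1)[1])
    return (worker_number - 1) % n_gpus


def init_worker(n_gpus):
    # Runs once per worker, before any CUDA context exists in that process.
    index = gpu_index_for_worker(current_process().name, n_gpus)
    os.environ['CUDA_VISIBLE_DEVICES'] = str(index)


def process_pair(pair_id):
    # Placeholder: a real worker would import the GPU module here,
    # load the image pair identified by pair_id, and run the PIV function.
    return pair_id, os.environ.get('CUDA_VISIBLE_DEVICES')


def run_batch(pair_ids, n_gpus=1, n_workers=2):
    """Distribute image pairs over n_workers processes spread across n_gpus."""
    with Pool(processes=n_workers, initializer=init_worker, initargs=(n_gpus,)) as pool:
        return pool.map(process_pair, pair_ids)
```

Calling `run_batch(range(8), n_gpus=2, n_workers=4)` from a script guarded by `if __name__ == '__main__':` would spread eight placeholder tasks over four processes, two per GPU. Given the diminishing returns noted above, the number of workers per GPU is something to tune by measurement.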