[GSoC 2021] Implementing GPU Accelerated Sub-gridding #297

ThenoobMario · 2021-08-19T09:50:54Z

[GSoC 2021] GPU Accelerated Sub-gridding

This pull request explains the code structure and implementation of the HSG sub-gridding algorithm, ported to the GPU for improved computational performance. The work was done during GSoC 2021 with the help of my mentors.

The implementation of the GPU kernels for sub-gridding closely follows the Cython implementation of the same algorithm.

The sub-gridding implementation differs in how the grid is traversed. While the main-grid computation looks at the whole 3D grid simultaneously, the sub-grid computation cuts the grid into 2D slices of arrays.
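To make the "2D slices" idea concrete, here is a minimal pure-Python sketch of taking a slice of a 3D array at a fixed x, y, or z index. It is only an illustration: `make_grid`, `x_slice`, `y_slice`, and `z_slice` are hypothetical helpers and are not part of gprMax, which stores fields in NumPy/GPU arrays rather than nested lists.

```python
def make_grid(nx, ny, nz):
    """Build a toy 3D grid whose cells hold their own (i, j, k) subscript."""
    return [[[(i, j, k) for k in range(nz)] for j in range(ny)] for i in range(nx)]


def x_slice(grid, i):
    """2D slice at a fixed x index; shape (ny, nz)."""
    return grid[i]


def y_slice(grid, j):
    """2D slice at a fixed y index; shape (nx, nz)."""
    return [plane[j] for plane in grid]


def z_slice(grid, k):
    """2D slice at a fixed z index; shape (nx, ny)."""
    return [[row[k] for row in plane] for plane in grid]


# Hypothetical grid size, chosen only for illustration.
grid = make_grid(2, 3, 4)
face = z_slice(grid, 3)   # every cell on this face has k == 3
```

Within each slice only two indices vary, which is why the sub-grid kernels can work with a two-index subscript.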

Code Structure

The CUDA Kernels

The main GPU kernels can be found in the file cuda/hsg_field_updates.py. These kernels are then used in subgrids/subgrid_hsg.py, where the main update computations happen.

In the CUDA kernels, the main difference between the main GPU FDTD implementation and the GPU Sub-grid FDTD implementation is the way the subscript indexing is calculated for traversal of EM fields.

In the main GPU FDTD implementation (found in cuda/field_updates.py), the indexing is calculated as follows:

// Linear Index to Subscript
int i = idx / ($NY_FIELDS * $NZ_FIELDS);
int j = (idx % ($NY_FIELDS * $NZ_FIELDS)) / $NZ_FIELDS;
int k = (idx % ($NY_FIELDS * $NZ_FIELDS)) % $NZ_FIELDS;

Where:

idx = The linear index corresponding to current CUDA thread
$NY_FIELDS = Size of array in y direction.
$NZ_FIELDS = Size of array in z direction.

Here, we assume that the traversal happens in the z-direction first, i.e. k is the fastest-varying index.
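The index arithmetic above can be checked on the host. The following is a minimal Python sketch of the same calculation, not the gprMax kernel itself; `linear_to_subscript`, `subscript_to_linear`, and the sizes `nx`, `ny`, `nz` are hypothetical names standing in for the CUDA thread index and the `$NY_FIELDS` / `$NZ_FIELDS` template values.

```python
def linear_to_subscript(idx, ny, nz):
    """Mirror the kernel arithmetic: split a linear index into (i, j, k)."""
    plane = ny * nz              # number of cells in one x-plane
    i = idx // plane
    j = (idx % plane) // nz
    k = (idx % plane) % nz       # z varies fastest
    return i, j, k


def subscript_to_linear(i, j, k, ny, nz):
    """Inverse mapping for a z-fastest (row-major) layout."""
    return (i * ny + j) * nz + k


# Hypothetical field array sizes, chosen only for illustration.
nx, ny, nz = 3, 4, 5
for idx in range(nx * ny * nz):
    i, j, k = linear_to_subscript(idx, ny, nz)
    # Every linear index must map to exactly one subscript and back.
    assert subscript_to_linear(i, j, k, ny, nz) == idx
```

The round trip holding for every index confirms that consecutive threads touch consecutive z cells.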

This subscript calculation changes for sub-gridding because:

We don't need to wrap around the FDTD grid.
Only two indices change during one computation.

Hence the index calculation is as follows:

// Linear Index to Subscript
int l = idx / ($NZ_FIELDS * $NY_FIELDS); 
int m = idx % ($NZ_FIELDS * $NY_FIELDS);

Where:

l, m = The two varying indices (taken from x, y and z), depending on which face of the cube is being calculated.
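As a host-side illustration, the two-index split can be written as a single `divmod`. This is a sketch under assumptions: `linear_to_face_subscript` and `stride` are hypothetical names, and `stride` is simply set to `$NZ_FIELDS * $NY_FIELDS` as in the snippet above. How the kernels map `l` and `m` onto x, y or z for each face is not shown here.

```python
def linear_to_face_subscript(idx, stride):
    """Split a linear index into the two subscripts (l, m) used on a face.

    Equivalent to: l = idx / stride; m = idx % stride (integer arithmetic).
    """
    l, m = divmod(idx, stride)
    return l, m


# Hypothetical sizes standing in for $NY_FIELDS and $NZ_FIELDS.
ny_fields, nz_fields = 4, 5
stride = nz_fields * ny_fields

l, m = linear_to_face_subscript(47, stride)
# The split is lossless: the linear index can be rebuilt from (l, m).
assert l * stride + m == 47
```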

The Abstraction and Structure of Classes

The sub-gridding directory contains all the classes and methods related to sub-gridding. The ones that changed significantly, or that matter for this explanation, are:

The classes CPUSubGridBase and CUDASubGridBase inherit from FDTDGrid and CUDAGrid respectively, along with SubGridBase.
The create_updates() method is called in solver.py, which integrates the sub-grid computations into the existing gprMax structure.
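The inheritance described above can be sketched with empty stub classes. This is only a skeleton for orientation: the class bodies are placeholders, the order of the base classes is an assumption, and any inheritance between the stubs themselves (for example between the grid classes) is deliberately left out because it is not stated here.

```python
class FDTDGrid:
    """Stub standing in for the CPU grid class."""


class CUDAGrid:
    """Stub standing in for the GPU grid class."""


class SubGridBase:
    """Stub standing in for the logic shared by all sub-grids."""


class CPUSubGridBase(SubGridBase, FDTDGrid):
    """CPU sub-grid: shared sub-grid logic plus the CPU grid (base order assumed)."""


class CUDASubGridBase(SubGridBase, CUDAGrid):
    """GPU sub-grid: shared sub-grid logic plus the GPU grid (base order assumed)."""


# Both variants share SubGridBase but differ in the grid backend they extend.
cpu_bases = [c.__name__ for c in CPUSubGridBase.__mro__]
gpu_bases = [c.__name__ for c in CUDASubGridBase.__mro__]
```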

Further Work to be Done

GPU kernels for precursor node calculations.
Refactoring the CUDASubGridHSG class to reduce redundancy.

ThenoobMario and others added 29 commits June 15, 2021 13:23

Added WSL2 integration doc

9e47b58

Update gprMax_WSL_integration.rst

79012ab

Merge branch 'gprMax:devel' into devel

4e30b35

Updated WSL2 integration guide

870a14d

Merge branch 'devel' of https://github.com/ThenoobMario/gprMax into d…

58b7e33

…evel

Merge branch 'gprMax:devel' into devel

b1c1c13

Merge branch 'gprMax:devel' into devel

6867b70

Added initial CUDA code

a1225e7

changed variables a and b

a8ad2de

Corrected block and grid dimensions

20385de

Corrected bpg

75cb8a7

Updated subgrid_hsg.py

c9532b6

changed indexing

9987a80

Updated subgrid_hsg.py

1fca40f

Changed inc_field Index Calc

1a813be

Made the variables constant

b5a49c0

Successful implementation of CUDA kernel

5932d6c

Completed gpu_update_magnetic_os

eecefb1

Completed gpu_hsg_update_electric_os

7646e14

Completed hsg_update_is_gpu

ab0ab40

Kernel implementation of all functions done

c51aae0

Added CUDASubgridUpdater class

c3a19af

Shifted kernels to diff file

ab7d99d

Added Subgrid Abstractions to run CUDA code

32b23df

Added comments in the HSG kernels

a85aa28

Added more comments

a69a556

Added Subgrid Class Structure diagram

3908cef

Added windows 10 note

54dc66d

added subgrid_gpu example notebook

fa95951

ThenoobMario marked this pull request as ready for review August 20, 2021 14:58

Merge branch 'gprMax:devel' into devel

0d20484
