[GSoC 2021] Implementing GPU Accelerated Sub-gridding #297

Status: Open (30 commits, base: devel)

@ThenoobMario ThenoobMario commented Aug 19, 2021

[GSoC 2021] GPU Accelerated Sub-gridding

The purpose of this Pull Request is to explain the code structure and implementation of the HSG sub-gridding algorithm ported to the GPU for improved computational performance. This work was done during GSoC 2021 with the help of my mentors.

The implementation of the GPU kernels for subgridding closely follows the Cython implementation of the same algorithm.

In the subgridding implementation, the grid is computed differently. While the Main_grid computation looks at the whole 3D grid simultaneously, the sub-grid computation works by cutting the whole grid into 2D slices of arrays.

Code Structure

The CUDA Kernels

The main GPU kernels can be found in the file cuda/hsg_field_updates.py. These kernels are then invoked from subgrids/subgrid_hsg.py, where the main update computations happen.

In the CUDA kernels, the main difference between the main GPU FDTD implementation and the GPU Sub-grid FDTD implementation is the way the subscript indexing is calculated for traversal of EM fields.

In the main GPU FDTD implementation (found in cuda/field_updates.py), the indexing is calculated as follows:

// Linear Index to Subscript
int i = idx / ($NY_FIELDS * $NZ_FIELDS);
int j = (idx % ($NY_FIELDS * $NZ_FIELDS)) / $NZ_FIELDS;
int k = (idx % ($NY_FIELDS * $NZ_FIELDS)) % $NZ_FIELDS;

Where:

  • idx = The linear index corresponding to current CUDA thread
  • $NY_FIELDS = Size of array in y direction.
  • $NZ_FIELDS = Size of array in z direction.

This assumes that traversal happens along the z-direction first, i.e. the z index varies fastest in the linear ordering.

This subscript calculation changes for subgridding because:
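As a quick check of the arithmetic above, here is a minimal Python sketch that mirrors the kernel's integer division and modulo and verifies that it inverts a z-fastest (row-major) flattening. The function names and the small grid sizes are hypothetical and chosen for illustration only; they are not part of this PR.

```python
def linear_to_subscript(idx, ny, nz):
    """Mirror of the kernel arithmetic: z varies fastest, then y, then x."""
    i = idx // (ny * nz)
    j = (idx % (ny * nz)) // nz
    k = (idx % (ny * nz)) % nz
    return i, j, k


def subscript_to_linear(i, j, k, ny, nz):
    """Row-major flattening that the mapping above is expected to invert."""
    return (i * ny + j) * nz + k


# Hypothetical grid sizes, for illustration only.
nx, ny, nz = 2, 3, 4
for idx in range(nx * ny * nz):
    i, j, k = linear_to_subscript(idx, ny, nz)
    assert subscript_to_linear(i, j, k, ny, nz) == idx
```

Consecutive values of idx step through k first, which is what "traversal in the z-direction first" means in practice.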

  • We don't need to wrap around the FDTD grid.
  • Only two indices change during one computation.

Hence the index calculation happens as follows:

// Linear Index to Subscript
int l = idx / ($NZ_FIELDS * $NY_FIELDS); 
int m = idx % ($NZ_FIELDS * $NY_FIELDS);

Where:

  • l and m = The two indices that vary (taken from x, y and z), depending on which face of the cube is being calculated.
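To illustrate the idea of two varying indices on a face, here is a generic Python sketch. It is not the kernel code from this PR: the divisor is a hypothetical parameter n_cols (the number of cells along the fastest-varying edge of the face), whereas the kernels use the expression shown above. The helper names and the face_axis convention are also assumptions made for the example.

```python
def face_index(idx, n_cols):
    """Split a linear thread index into two face-local indices (l, m).

    n_cols is a hypothetical parameter: the number of cells along the
    fastest-varying edge of the face.
    """
    l = idx // n_cols
    m = idx % n_cols
    return l, m


def face_to_subscript(face_axis, fixed, l, m):
    """Place (l, m) into a 3D subscript, holding the face-normal axis fixed.

    face_axis: 0 for an x-normal face, 1 for y-normal, 2 for z-normal
    (an assumed convention for this sketch).
    """
    sub = [l, m]
    sub.insert(face_axis, fixed)
    return tuple(sub)


l, m = face_index(7, n_cols=4)
print(face_to_subscript(1, 9, l, m))
```

On a y-normal face, l plays the role of the x index and m the role of the z index, while y stays fixed; on the other faces the roles shift accordingly.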

The Abstraction and Structure of Classes

The Subgridding directory contains all the classes and methods related to subgridding. The ones that changed significantly, or that matter for this explanation, are listed here:

  • The classes CPUSubGridBase and CUDASubGridBase inherit from FDTDGrid and CUDAGrid respectively, along with SubGridBase.
  • The create_updates() method is called in solver.py, which integrates the subgrid computations into the existing gprMax structure.
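The inheritance described in the first bullet can be pictured with the stub classes below. This is only a sketch: the class bodies are empty stand-ins, the order of the base classes is an assumption, and any relationship between FDTDGrid and CUDAGrid themselves is deliberately not modelled.

```python
# Empty stand-ins for the real gprMax classes (illustration only).
class FDTDGrid:
    pass


class CUDAGrid:
    pass


class SubGridBase:
    pass


# Base-class order here is an assumption, not taken from the PR.
class CPUSubGridBase(SubGridBase, FDTDGrid):
    pass


class CUDASubGridBase(SubGridBase, CUDAGrid):
    pass


# Show the method resolution order of the CUDA variant.
print([cls.__name__ for cls in CUDASubGridBase.__mro__])
```

With this layout, logic shared by both backends can live in SubGridBase, while the grid storage and update machinery come from the CPU or CUDA grid class.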

Further Work to be Done

  • GPU kernels for precursor node calculations.
  • Refactoring the CUDASubGridHSG class to reduce redundancy.

@ThenoobMario ThenoobMario marked this pull request as ready for review August 20, 2021 14:58