
Production de hpc tracer #594

Draft: wants to merge 7 commits into production_DE
Conversation

@basava70 (Collaborator) commented Jun 6, 2024

Creating a draft pull request for the tracer porting. I will keep updating this pull request from now on. When I reach a milestone, I can turn it into a final pull request with updated computation timings.
Also, this is created on top of the changes from the production_DE_openacc_test branch.

@mandresm (Collaborator) left a comment

Looks good to me, but I have some questions out of curiosity; see below.

Comment on lines 51 to 58
do nz = 1, mesh%nl-1
   if (tracers%data(tr_num)%AB_order==2) then
      tracers%data(tr_num)%valuesAB(nz, n) =-(0.5_WP+epsilon)*tracers%data(tr_num)%valuesold(1, nz, n)+(1.5_WP+epsilon)*tracers%data(tr_num)%values(nz, n)
   elseif (tracers%data(tr_num)%AB_order==3) then
      tracers%data(tr_num)%valuesAB(nz, n) =5.0_WP*tracers%data(tr_num)%valuesold(2, nz, n)-16.0_WP*tracers%data(tr_num)%valuesold(1, nz, n)+23.0_WP*tracers%data(tr_num)%values(nz, n)
      tracers%data(tr_num)%valuesAB(nz, n) =tracers%data(tr_num)%valuesAB(nz, n)/12.0_WP
   end if
end do
Collaborator:

This change is where you move the loop out of the if condition, correct?

do n=1, partit%myDim_nod2D+partit%eDim_nod2D
   ! AB interpolation
   if (tracers%data(tr_num)%AB_order==2) then
      tracers%data(tr_num)%valuesAB(:, n) =-(0.5_WP+epsilon)*tracers%data(tr_num)%valuesold(1, :, n)+(1.5_WP+epsilon)*tracers%data(tr_num)%values(:, n)
Collaborator:

Just so that I understand, can you let me know if the following is true, @basava70?

If we run this on GPUs, one kernel takes care of performing this operation for the whole vector, but instead we want one kernel to apply that formula to each element of the vector, correct? (And that is what you are doing with your change.)

@suvarchal and @basava70, would this change slow down the code for CPUs?
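
To make the question concrete, here is a minimal, self-contained sketch of the two loop shapes being compared (this is not the FESOM code; the names told/tnew/tab, the sizes nn/nl, and eps are placeholders chosen only for illustration). Shape 1 is the array-slice form of the original code; shape 2 is the explicit per-level loop of the change, which exposes each (nz, n) pair as an independent piece of GPU work. Whether shape 2 is any slower on CPUs, where compilers already vectorise the slice form, is exactly the question raised above.

! Minimal sketch only; placeholder names and sizes, not the FESOM data structures.
program ab_loop_shapes
   implicit none
   integer, parameter  :: WP = kind(1.0d0)
   integer, parameter  :: nn = 1000, nl = 47
   real(WP), parameter :: eps = 0.01_WP
   real(WP) :: told(nl, nn), tnew(nl, nn), tab(nl, nn)
   integer  :: n, nz

   call random_number(told)
   call random_number(tnew)

   ! Shape 1: array-slice form (as in the original CPU code). One statement
   ! handles the whole column; the compiler must map the implicit slice loop
   ! onto the GPU itself.
   !$ACC parallel loop
   do n = 1, nn
      tab(:, n) = -(0.5_WP+eps)*told(:, n) + (1.5_WP+eps)*tnew(:, n)
   end do

   ! Shape 2: explicit level loop (as in the change under review). Each
   ! (nz, n) pair is an independent piece of work that the gang/vector
   ! scheduling can map directly onto GPU threads.
   !$ACC parallel loop
   do n = 1, nn
      !$ACC loop
      do nz = 1, nl
         tab(nz, n) = -(0.5_WP+eps)*told(nz, n) + (1.5_WP+eps)*tnew(nz, n)
      end do
   end do

   print *, 'checksum:', sum(tab)
end program ab_loop_shapes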

!$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(elem, elnodes, nz, nzmin, nzmax)
#else
!$ACC UPDATE DEVICE(gradient_sca)
!$ACC parallel loop private(elnodes)
Collaborator:

Isn't this missing a collapse(2), or would that create some sort of conflict/race condition related to elnodes being declared private?

Collaborator:

I see... the nz loop depends on the number of levels, which varies between elements, so the loops cannot be collapsed. You can ignore my question above.
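
For reference, a simplified, self-contained sketch of that point (placeholder names ne, nlmax, nzmin, nzmax, work; this is not the actual tracer_gradient_elements loop): because the inner bounds come from per-element arrays nzmin(elem) and nzmax(elem), the nest is not rectangular, so collapse(2) does not apply; the element loop is spread across gangs while the level loop can still run as a vector loop.

! Sketch only: the per-element level range is made up, just to show the
! varying inner trip count that prevents collapsing the two loops.
program variable_depth_levels
   implicit none
   integer, parameter :: WP = kind(1.0d0)
   integer, parameter :: ne = 1000, nlmax = 48
   integer  :: nzmin(ne), nzmax(ne)
   real(WP) :: work(nlmax, ne)
   integer  :: elem, nz

   call random_number(work)
   do elem = 1, ne
      nzmin(elem) = 1
      nzmax(elem) = 10 + mod(elem, 30)   ! depth varies per element
   end do

   ! The inner bounds depend on elem, so the nest cannot be collapsed;
   ! gangs take elements, the vector loop takes the levels of each element.
   !$ACC parallel loop
   do elem = 1, ne
      !$ACC loop vector
      do nz = nzmin(elem), nzmax(elem)
         work(nz, elem) = 2.0_WP*work(nz, elem)
      end do
   end do

   print *, 'checksum:', sum(work)
end program variable_depth_levels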

@@ -105,7 +144,12 @@ SUBROUTINE tracer_gradient_elements(ttf, partit, mesh)
#include "associate_mesh_def.h"
#include "associate_part_ass.h"
#include "associate_mesh_ass.h"
#ifndef ENABLE_OPENACC
!$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(elem, elnodes, nz, nzmin, nzmax)
Collaborator:

Why are so many variables private for OpenMP and only elnodes for OpenACC? (A question out of curiosity.)
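
Not an answer from the thread, just a hedged sketch of the usual explanation (simplified routine; the arguments ne, nl, elem2d, values, grad and the formula are made up, only the directive shapes mirror the diff above): in an OpenACC parallel construct, scalars that appear in no data clause are treated as firstprivate and loop variables of loop directives are private, so typically only small work arrays such as elnodes need an explicit private clause, whereas OpenMP with DEFAULT(SHARED) needs every per-iteration scalar listed by hand.

! Simplified sketch, not the FESOM routine: placeholder arguments and formula.
subroutine gradient_sketch(ne, nl, elem2d, values, grad)
   implicit none
   integer, parameter   :: WP = kind(1.0d0)
   integer,  intent(in) :: ne, nl
   integer,  intent(in) :: elem2d(3, ne)
   real(WP), intent(in) :: values(nl, *)
   real(WP), intent(out):: grad(nl, ne)
   integer :: elem, nz, elnodes(3)

#ifndef ENABLE_OPENACC
   ! OpenMP with DEFAULT(SHARED): every per-iteration scalar and the small
   ! work array must be listed PRIVATE explicitly to avoid races.
   !$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(elem, elnodes, nz)
#else
   ! OpenACC: scalars not in any data clause are firstprivate by default in
   ! a parallel construct, so only the array elnodes needs the clause.
   !$ACC parallel loop private(elnodes)
#endif
   do elem = 1, ne
      elnodes = elem2d(:, elem)
      do nz = 1, nl
         grad(nz, elem) = ( values(nz, elnodes(1)) &
                          + values(nz, elnodes(2)) &
                          + values(nz, elnodes(3)) ) / 3.0_WP
      end do
   end do
#ifndef ENABLE_OPENACC
   !$OMP END PARALLEL DO
#endif
end subroutine gradient_sketch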
