Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backend structure #7

Merged
merged 17 commits into from
Dec 1, 2023
Merged

Backend structure #7

merged 17 commits into from
Dec 1, 2023

Conversation

semi-h
Copy link
Member

@semi-h semi-h commented Sep 12, 2023

First commit to showcase the backend structure in the new framework. It's possible that this won't compile, but it demonstrates the main idea.

Subsequent commits added a lot of functionality to the codebase and we can run the transport equation on the CUDA backend now.

@semi-h semi-h marked this pull request as draft September 12, 2023 14:54
@semi-h semi-h changed the title Backend structure, incomplete but gives an idea. Backend structure Sep 12, 2023
src/backend.f90 Outdated Show resolved Hide resolved
src/backend.f90 Outdated Show resolved Hide resolved
@semi-h
Copy link
Member Author

semi-h commented Nov 7, 2023

So as of now with the latest commit, the cuda backend that extends base backend also compiles (however extremely simplified and basically does nothing inside). Also, we have a time integrator that executes the algorithm within its subroutine called 'run'. All this is managed in the main file xcompact.f90, which is ultimately used for creating an executable.

The issue I noticed at this level is that our allocator is not advanced enough to deal with different nx, ny, nz. To fix this I need to introduce a simple functionality in the allocator using the 'bounds remapping' introduced in fortran 2003. I have already used this 'bounds remapping' technique with the nvidia compiler and gfortran and haven't got any issues.

@slaizet slaizet self-requested a review November 8, 2023 13:33
@semi-h semi-h linked an issue Nov 9, 2023 that may be closed by this pull request
@semi-h semi-h force-pushed the backend branch 3 times, most recently from 4516a0a to 0f9177b Compare November 16, 2023 17:10
@semi-h semi-h marked this pull request as ready for review November 17, 2023 15:09
@semi-h
Copy link
Member Author

semi-h commented Nov 22, 2023

I fixed the issue I mentioned on Monday about kernels not being executed. Now the transport equation runs in the backend structure.

Next few days I'll work on separating a solver class from backend class and then the PR be ready to merge.

Copy link
Collaborator

@Nanoseb Nanoseb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good overall. Just had a handful of very minor comments.

src/common.f90 Outdated Show resolved Hide resolved
Comment on lines +19 to +20
subroutine set_pprev_pnext(xprev, xnext, yprev, ynext, zprev, znext, &
xnproc, ynproc, znproc, nrank)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how you want to go about tests, but this function would be well suited to have one as it is easy to make a typo with sign/variables and it is isolated from the rest of the codebase.

src/cuda/backend.f90 Outdated Show resolved Hide resolved
src/xcompact.f90 Outdated
Comment on lines 103 to 110
! GPU only
!allocate(cuda_allocator_t :: allocator)
!allocate(cuda_backend_t :: backend)

!cuda_backend = cuda_backend_t(allocator, xdirps, ydirps, zdirps)

!allocator = cuda_allocator_t([SZ, 512, 512*512/SZ])
!backend = cuda_backend_t(allocator, xdirps, ydirps, zdirps)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be removed? as I understand there shouldn't be any implementation specifics calls in this part of the code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to make the main file a bit more elaborate so that it can handle both backends. The problem is if you're not on a GPU cluster you probably don't have the NVIDIA compiler and then all the CUDA Fortran extentions NVIDIA compiler supports are not recognized by standard Fortran compilers. Our Cmake file compiles the cuda backend only when the compiler is NVIDIA compiler. Other times cuda backend is not even compiled and I think we'll need to use 'ifdef' stuff just in a few places in the main file to sort this out.

@slaizet slaizet merged commit b1a78de into xcompact3d:main Dec 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants