Backend structure #7

semi-h · 2023-09-12T14:53:50Z

~~First commit to showcase the backend structure in the new framework. It's possible that this won't compile, but it demonstrates the main idea.~~

Subsequent commits added a lot of functionality to the codebase and we can run the transport equation on the CUDA backend now.

src/backend.f90

semi-h · 2023-11-07T14:30:30Z

As of the latest commit, the CUDA backend, which extends the base backend, also compiles (although it is extremely simplified and does essentially nothing inside yet). We also have a time integrator that executes the algorithm within its subroutine called 'run'. All of this is managed in the main file xcompact.f90, which is ultimately used to build the executable.

The issue I noticed at this level is that our allocator is not advanced enough to deal with different nx, ny, nz. To fix this I need to add a simple feature to the allocator using the 'bounds remapping' introduced in Fortran 2003. I have already used this 'bounds remapping' technique with the NVIDIA compiler and gfortran and haven't had any issues.
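For readers unfamiliar with the technique, here is a minimal sketch of pointer bounds remapping. It is not the allocator's actual code: the names `pool` and `field` and the tiny sizes are hypothetical, chosen only to show a rank-1 memory block being viewed as an (nx, ny, nz) array.

```fortran
program bounds_remapping_demo
   implicit none
   ! Hypothetical sizes, for illustration only
   integer, parameter :: nx = 4, ny = 3, nz = 2
   real, allocatable, target :: pool(:)   ! rank-1 block, as an allocator might hand out
   real, pointer :: field(:, :, :)        ! rank-3 view onto the same memory
   integer :: i

   allocate (pool(nx*ny*nz))
   pool = [(real(i), i=1, nx*ny*nz)]

   ! Pointer bounds remapping (Fortran 2003): no copy is made,
   ! field simply aliases pool with new rank and bounds.
   field(1:nx, 1:ny, 1:nz) => pool

   print *, 'shape: ', shape(field)
   ! Column-major layout: these are elements 2, 1 + nx and 1 + nx*ny of pool
   print *, field(2, 1, 1), field(1, 2, 1), field(1, 1, 2)

   nullify (field)
   deallocate (pool)
end program bounds_remapping_demo
```

The same block could be remapped to a different (nx, ny, nz) shape later, as long as the requested view does not need more elements than the block holds.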

src/cuda/backend.f90

semi-h · 2023-11-22T17:18:38Z

I fixed the issue I mentioned on Monday about kernels not being executed. Now the transport equation runs in the backend structure.

Over the next few days I'll work on separating a solver class from the backend class, and then the PR will be ready to merge.

Nanoseb

Looking good overall. Just had a handful of very minor comments.

src/common.f90

Nanoseb · 2023-11-28T15:35:18Z

src/common.f90

+   subroutine set_pprev_pnext(xprev, xnext, yprev, ynext, zprev, znext, &
+                              xnproc, ynproc, znproc, nrank)


Not sure how you want to go about tests, but this function would be well suited to having one: it is easy to make a typo in a sign or a variable name, and it is isolated from the rest of the codebase.
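As a rough idea of what such a test could look like, here is a self-contained sketch. It does not call set_pprev_pnext itself, whose body is not shown here. Instead it uses a hypothetical one-dimensional stand-in, `periodic_neighbours`, and assumes periodic wrap-around with zero-based ranks, neither of which is confirmed by the PR. The checks are invariants that catch sign and variable mix-ups.

```fortran
program test_neighbours
   implicit none
   integer :: nproc, rank, prev, next, pprev, pnext, dummy
   logical :: ok

   ok = .true.
   do nproc = 1, 5
      do rank = 0, nproc - 1
         call periodic_neighbours(rank, nproc, prev, next)
         ! Both neighbours must be valid ranks
         if (prev < 0 .or. prev >= nproc) ok = .false.
         if (next < 0 .or. next >= nproc) ok = .false.
         ! Round trip: the next of my prev, and the prev of my next, is me
         call periodic_neighbours(prev, nproc, dummy, pnext)
         call periodic_neighbours(next, nproc, pprev, dummy)
         if (pnext /= rank .or. pprev /= rank) ok = .false.
      end do
   end do

   if (ok) then
      print *, 'PASS'
   else
      print *, 'FAIL'
      error stop 1
   end if

contains

   ! Hypothetical stand-in for one dimension of the real routine,
   ! assuming periodic neighbours and zero-based ranks.
   subroutine periodic_neighbours(rank, nproc, prev, next)
      integer, intent(in) :: rank, nproc
      integer, intent(out) :: prev, next
      prev = modulo(rank - 1, nproc)
      next = modulo(rank + 1, nproc)
   end subroutine periodic_neighbours

end program test_neighbours
```

A swapped sign in either `modulo` call makes the round-trip check fail for any nproc greater than 2, which is the kind of typo this test is meant to catch.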

src/cuda/backend.f90

Nanoseb · 2023-11-28T15:53:26Z

src/xcompact.f90

+   ! GPU only
+   !allocate(cuda_allocator_t :: allocator)
+   !allocate(cuda_backend_t :: backend)
+
+   !cuda_backend = cuda_backend_t(allocator, xdirps, ydirps, zdirps)
+
+   !allocator = cuda_allocator_t([SZ, 512, 512*512/SZ])
+   !backend = cuda_backend_t(allocator, xdirps, ydirps, zdirps)


Should this be removed? As I understand it, there shouldn't be any implementation-specific calls in this part of the code.

semi-h · 2023-11-28

We need to make the main file a bit more elaborate so that it can handle both backends. The problem is that if you're not on a GPU cluster, you probably don't have the NVIDIA compiler, and standard Fortran compilers don't recognise the CUDA Fortran extensions it supports. Our CMake file compiles the CUDA backend only when the compiler is the NVIDIA compiler; otherwise the CUDA backend is not compiled at all. I think we'll need to use 'ifdef' guards in just a few places in the main file to sort this out.
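A minimal sketch of the 'ifdef' idea, assuming a macro named `CUDA` that the build would define only when the NVIDIA compiler is in use. The macro name and the `backend_name` variable are hypothetical; in the real main file the guarded lines would be the CUDA-specific allocate calls rather than a string assignment.

```fortran
program backend_select
   implicit none
   character(len=16) :: backend_name

   ! Needs the preprocessor: use a .F90 extension or a flag such as -cpp (gfortran).
#ifdef CUDA
   ! Compiled only when the build defines CUDA, e.g. with -DCUDA
   backend_name = 'cuda'
#else
   ! Fallback path that any standard Fortran compiler can build
   backend_name = 'omp'
#endif

   print *, 'Selected backend: ', trim(backend_name)
end program backend_select
```

Keeping the guards to a few places in the main file means the rest of the code only ever sees the abstract backend type.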

semi-h marked this pull request as draft September 12, 2023 14:54

semi-h changed the title ~~Backend structure, incomplete but gives an idea.~~ Backend structure Sep 12, 2023

semi-h commented Sep 14, 2023

src/backend.f90 (outdated, resolved)

semi-h force-pushed the backend branch from 591ef3d to 018f9ef Compare September 26, 2023 10:02

semi-h force-pushed the backend branch from 3733150 to 1c67269 Compare October 16, 2023 12:15

semi-h force-pushed the backend branch 3 times, most recently from 3e9a7d9 to 67723b6 Compare November 6, 2023 12:04

pbartholomew08 reviewed Nov 7, 2023

src/backend.f90 (outdated, resolved)

semi-h force-pushed the backend branch from 67723b6 to 3db002d Compare November 8, 2023 10:39

semi-h commented Nov 8, 2023

src/cuda/backend.f90 (resolved)

slaizet self-requested a review November 8, 2023 13:33

semi-h mentioned this pull request Nov 9, 2023

Paire (sym) issue in the transport equation #14

Closed

semi-h linked an issue Nov 9, 2023 that may be closed by this pull request

Paire (sym) issue in the transport equation #14

Closed

semi-h force-pushed the backend branch 3 times, most recently from 4516a0a to 0f9177b Compare November 16, 2023 17:10

semi-h marked this pull request as ready for review November 17, 2023 15:09

semi-h requested review from pbartholomew08, mathrack, Nanoseb and rfj82982 November 17, 2023 15:10

semi-h added 7 commits November 22, 2023 16:12

feat: Backend structure, incomplete but gives an idea.

1151430

chore: Add comments.

be4b39b

fix: base backend compiles.

be783c4

feat: Simplified cuda_backend compiles and we have an executable.

16ea142

feat: Backend structure is ready to execute kernels.

ca7cb48

feat: Example use of a CUDA kernel in the CUDA backend.

d3bde27

feat(cuda): Move towards running transeq via the cuda backend.

f134f94

semi-h added 3 commits November 22, 2023 16:12

feat(cuda): Transport equation implementation is complete.

62c6eca

feat(cuda): Add set_fields and get_fields for host-device transfers.

5c46780

fix(cuda): Fix a bug in the CUDA backend with blocks and threads.

71f177f

semi-h force-pushed the backend branch from 70c76ae to 71f177f Compare November 22, 2023 17:14

semi-h added 2 commits November 23, 2023 16:26

refactor: Extract a solver class from backend class.

d416702

feat: Simplify tdsops initialisation.

679bfb8

semi-h force-pushed the backend branch from 84727ae to 679bfb8 Compare November 28, 2023 12:45

semi-h linked an issue Nov 28, 2023 that may be closed by this pull request

Potential refactoring of the backends into a solver class and a backend class #13

Closed

Nanoseb reviewed Nov 28, 2023

semi-h added 5 commits November 30, 2023 12:38

feat(cuda): Handle halo data copy in a dedicated subroutine.

238e48a

refactor: Simplify setting prev and next for each dimension.

73886dc

chore: Cleanups.

ef6f800

feat: Use preprocessor to handle the CUDA backend in the main file.

2b9c91b

feat(omp): Add an empty OpenMP backend to test the backend structure.

5ac4f30

slaizet merged commit b1a78de into xcompact3d:main Dec 1, 2023
