FEM Mass matrix element assembly kernel #330

Merged

MrBurmark merged 14 commits into develop from artv3/mass-ea

Jul 21, 2023

Member

artv3 commented Jun 1, 2023

Kernel for element local mass matrix assembly
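For context, an element-local mass matrix has entries M_ij = ∫ φ_i φ_j dx over one element, usually evaluated by quadrature. The following is a minimal 1D sketch of that idea, assuming linear basis functions and two-point Gauss quadrature. The names `LocalMass`, `element_mass_1d`, and `assemble_element_masses` are hypothetical, and this is not the MASS3DEA kernel from this PR, which operates on 3D elements.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: 2x2 local mass matrix of a 1D linear element.
using LocalMass = std::array<std::array<double, 2>, 2>;

// Computes M_ij = sum_q w_q * detJ * phi_i(xi_q) * phi_j(xi_q)
// for an element of length h mapped from the reference interval [-1, 1].
LocalMass element_mass_1d(double h) {
  const double gp = 1.0 / std::sqrt(3.0);   // two-point Gauss abscissa
  const double xi[2] = {-gp, gp};
  const double w[2] = {1.0, 1.0};
  const double detJ = 0.5 * h;              // Jacobian of the reference map

  LocalMass M{};                            // zero-initialized
  for (int q = 0; q < 2; ++q) {
    // Linear basis functions evaluated at the quadrature point.
    const double phi[2] = {0.5 * (1.0 - xi[q]), 0.5 * (1.0 + xi[q])};
    for (int i = 0; i < 2; ++i) {
      for (int j = 0; j < 2; ++j) {
        M[i][j] += w[q] * detJ * phi[i] * phi[j];
      }
    }
  }
  return M;
}

// "Element assembly": store one local matrix per element, with no global matrix.
std::vector<LocalMass> assemble_element_masses(const std::vector<double>& h) {
  std::vector<LocalMass> out(h.size());
  for (std::size_t e = 0; e < h.size(); ++e) {
    out[e] = element_mass_1d(h[e]);
  }
  return out;
}
```

For an element of length h, this reproduces the textbook result (h/6) · [[2, 1], [1, 2]].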

artv3 added 4 commits

June 1, 2023 11:33


          initial commit for mass matrix assembly kernel

d59e416


          add in sequential variant

23f1f03


          add omp variant

0f61d29


          add mass 3D ea kernel

23ea952

artv3 requested review from MrBurmark and rhornung67

June 1, 2023 21:44


          Empty-Commit

96630b3

Member Author

artv3 commented Jul 5, 2023

@rhornung67 @MrBurmark could I get a review on this PR?

rhornung67 reviewed

src/apps/MASS3DEA.hpp (outdated, resolved)

rhornung67 reviewed

src/apps/MASS3DEA.hpp (outdated, resolved)

rhornung67 reviewed

src/apps/MASS3DEA.cpp (resolved)

rhornung67 reviewed

src/apps/MASS3DEA-Seq.cpp (outdated, resolved)

MrBurmark reviewed

src/apps/MASS3DEA-OMP.cpp (outdated, resolved)

MrBurmark reviewed

src/apps/MASS3DEA-Seq.cpp (outdated, resolved)

rhornung67 reviewed

src/apps/MASS3DEA-Seq.cpp (outdated, resolved)

rhornung67 reviewed

src/apps/MASS3DEA-Seq.cpp (resolved)

rhornung67 reviewed

src/apps/MASS3DEA-OMP.cpp (outdated, resolved)

rhornung67 reviewed

src/apps/MASS3DEA-OMP.cpp (resolved)

MrBurmark reviewed

src/apps/MASS3DEA.hpp (outdated, resolved)

rhornung67 reviewed

src/apps/MASS3DEA-Cuda.cpp (resolved)

rhornung67 reviewed

src/apps/MASS3DEA-Cuda.cpp (resolved)

rhornung67 reviewed

src/apps/MASS3DEA-Hip.cpp (resolved)

rhornung67 reviewed

src/apps/MASS3DEA-Hip.cpp (resolved)

rhornung67 reviewed

src/apps/MASS3DEA-OMPTarget.cpp (resolved)

artv3 added 3 commits

July 6, 2023 09:25


          fix link and add c style implementation

25cccc2


          remove duplicate macros

ad5554f


          spacing and fix team sync placement

a00fa2d

Member Author

artv3 commented Jul 6, 2023

@rhornung67 I added some notes on the z-loops with an iteration space of [0,1) that the kernels have. Let me know if it makes sense; I can expand more.

rhornung67 mentioned this pull request

v2023.06.0 Release #344

Closed

24 tasks

Member

MrBurmark commented Jul 11, 2023 (edited)

My opinion on the "extra" length-1 loops is that if they are not necessary in some of the Base/Lambda implementations, it's fine to remove them. In this case, that seems to be how you would naturally code up the native variants. On the other hand, it's important for the RAJA implementation to be single-source and the same for all RAJA variants, because that is how we write RAJA code.
To expand a little further afield (this is not the case here): sometimes a sequential optimization is large enough that the sequential and parallel implementations are substantially different. In that case, it's probably best to have two kernels rather than significant differences between the Base and RAJA variants. An example is the basic INDEXLIST and INDEXLIST_3LOOP kernels, which are two ways to create the same list of indices. Splitting them into two kernels allows variant-to-variant comparisons that are apples to apples, and kernel-to-kernel comparisons that show the performance implications of each algorithm/implementation.
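To illustrate the INDEXLIST / INDEXLIST_3LOOP distinction, here is a rough sketch of two ways to build the same index list. It is not the RAJAPerf source: the function names are hypothetical, and the selection predicate `x[i] < 0.0` is an assumption chosen for illustration. The single-pass form carries a running count from one iteration to the next, so it is naturally sequential. The three-loop form (flag, exclusive scan, fill) separates that dependency into a scan, which is the shape that parallelizes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: single pass with a running count (sequential by nature).
// The predicate x[i] < 0.0 is an assumption made for this example.
std::vector<int> index_list_single_pass(const std::vector<double>& x) {
  std::vector<int> list;
  for (std::size_t i = 0; i < x.size(); ++i) {
    if (x[i] < 0.0) {
      list.push_back(static_cast<int>(i));
    }
  }
  return list;
}

// Hypothetical sketch: the same list built as flag -> exclusive scan -> fill.
std::vector<int> index_list_three_loop(const std::vector<double>& x) {
  const std::size_t n = x.size();
  std::vector<int> counts(n + 1, 0);

  // Loop 1: flag the entries that satisfy the predicate.
  for (std::size_t i = 0; i < n; ++i) {
    counts[i] = (x[i] < 0.0) ? 1 : 0;
  }

  // Loop 2: in-place exclusive scan; counts[n] ends up holding the total.
  int running = 0;
  for (std::size_t i = 0; i <= n; ++i) {
    const int flag = counts[i];
    counts[i] = running;
    running += flag;
  }

  // Loop 3: an entry was flagged exactly where the scan value increases.
  std::vector<int> list(static_cast<std::size_t>(counts[n]));
  for (std::size_t i = 0; i < n; ++i) {
    if (counts[i] != counts[i + 1]) {
      list[static_cast<std::size_t>(counts[i])] = static_cast<int>(i);
    }
  }
  return list;
}
```

Both functions return the same list for the same input, which is what makes a kernel-to-kernel comparison between the two algorithms meaningful.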

Member Author

artv3 commented Jul 18, 2023

TODO: Remove extra loop in host base variants with iteration range [0, 1)
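As a rough illustration of what this TODO amounts to, the sketch below contrasts a loop nest that keeps a z loop over [0, 1) with the same nest after that loop is dropped. It is a hypothetical stand-in, not the MASS3DEA code: the names `fill_with_unit_loop` and `fill_without_unit_loop`, the constant `D1D`, and the loop body are all assumptions. The point is only that a loop with a single iteration can be removed from a host Base variant without changing the result.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration, not the MASS3DEA source.
constexpr int D1D = 2;  // assumed number of points per direction

// Mirrors an x/y/z thread layout in which the z extent is 1.
void fill_with_unit_loop(std::vector<double>& out, int num_elems) {
  for (int e = 0; e < num_elems; ++e) {
    for (int tz = 0; tz < 1; ++tz) {  // iteration range [0, 1)
      for (int ty = 0; ty < D1D; ++ty) {
        for (int tx = 0; tx < D1D; ++tx) {
          const std::size_t idx =
              static_cast<std::size_t>((e * D1D + ty) * D1D + tx);
          out[idx] = e + 0.1 * ty + 0.01 * tx;
        }
      }
    }
  }
}

// Same computation with the single-iteration z loop removed.
void fill_without_unit_loop(std::vector<double>& out, int num_elems) {
  for (int e = 0; e < num_elems; ++e) {
    for (int ty = 0; ty < D1D; ++ty) {
      for (int tx = 0; tx < D1D; ++tx) {
        const std::size_t idx =
            static_cast<std::size_t>((e * D1D + ty) * D1D + tx);
        out[idx] = e + 0.1 * ty + 0.01 * tx;
      }
    }
  }
}
```

Both functions write identical values into `out`; only the loop structure differs.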

artv3 added 6 commits

July 18, 2023 09:43


          remove loop with iteration 1

5db5914


          Merge branch 'develop' into artv3/mass-ea

dcc6b36


          Merge branch 'develop' into artv3/mass-ea

69b71dc


          build fixes

be1c72e


          Merge branch 'develop' into artv3/mass-ea

9234d8d


          loop_exec -> seq_exec

47b62ee

MrBurmark approved these changes

MrBurmark enabled auto-merge

July 20, 2023 23:15

MrBurmark merged commit ab03e6e into develop

16 of 17 checks passed

MrBurmark deleted the artv3/mass-ea branch

July 21, 2023 03:07
