-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fast integration changes for Vectorial Envelope and Optimizations #36
Conversation
@jbadger95 When you have the PR all cleaned up and ready-for-review, we'll have to also ask Brandon and Federico to take a look at the changes you are proposing within the shape functions before we merge. |
@federicofuentes @brendankeith Hi, I made some changes to the enriched shape functions on Quads and Prisms essentially mirroring what's in Hexa, I was hoping you guys could take a look. The change is for convenient ordering of the shape functions in sum-factorization routines, it's not really about performance or anything (the expense of computing the shape functions is insignificant even in these sum-factorization elem routines). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbadger95 great work!!! I only have minor suggestions; but here are a few thoughts I generally had when looking over the changes:
- Fast integration implementations have become increasingly complex because there are so many different terms to track that entirely avoiding introduction of new bugs appears almost impossible. In particular:
- Adding additional terms to an existing formulation is quite challenging (as evident by the work you had to do for "simply" adding envelope terms to bilinear form and Gram matrix);
- A thorough code review by looking only at the changes in fast integration routines is difficult; one needs to basically review the entire implementation (thousands of lines);
- If you have ideas for general simplifications of implementing fast integration in the future, I am open to ideas (maybe modularize somehow?).
- Did you check for possible memory leaks, either through debugging tool, or an easy hack is by solving the same linear system over and over while keeping track of memory usage (e.g. via
htop
) -- if in the process of the repeated solving memory usage increases there's a leak.
- I haven't done verification runs yet, but will comment again once I have.
trunk/problems/LASER/UW_COUPLED/elem/elem_maxwell_gram_hexa.F90
Outdated
Show resolved
Hide resolved
trunk/problems/LASER/UW_COUPLED/elem/elem_maxwell_gram_hexa.F90
Outdated
Show resolved
Hide resolved
I agree these routines are nasty, comparing to the standard routines is helpful checking but it can take a while to track down bugs in these routines (as evidenced by my glacial pace). I have some ideas for modularization, we might have to explore a bit to try to find a solution that has a minimal impact on performance though. I'll maybe work on some simple examples when I get a chance. |
Maybe some performance hit (a constant factor) is alright as long as the asymptotic speedup is still there. |
If we want to look at this, we should probably do it with a simpler implementation where testing is easier and faster. Maybe a linear UW Maxwell model problem or so. |
The simplest way to modularize these would be to separate computation of the Gram and stiffness matrix. We already have a routine for the Gram matrix, we could make one exclusively for the stiffness matrices (maybe even separate ones for the element and boundary) and then have a different subroutine that does the LA. That would at least make routines more manageable. I can outline what I was thinking for further modularization but I doubt it’s performance will be great. |
This would help modularizing, but not really simplify implementation and debugging in any significant way. I was trying to think further than that, somehow coming up with a more adaptable implementation where things like handling auxiliary matrices would become easier. |
Or the heat problem? |
Possible, yes. But using the existing primal DPG Poisson model problem as a testing ground makes things a lot easier than having to run the LASER application for testing. |
…d up linear algebra operations
…und 1e-9 tolerance
…n routine accordingly Also removed some of the previous interior integration optimizations, can reimplement later
I think we can merge this. @jbadger95 do you agree?
In addition, it would be nice to have the PML parameters be automatically tuned depending on the user input for the envelope wavenumber. This could go into a separate PR. |
@jbadger95 you mentioned fixing some minor issue recently in the FI routines for the envelope formulation? Did you make the change on a different branch? If so, would like to add it here or should we merge this branch into |
I made the changes on my bending branch, I'll fix the errors quick on this one though. I have some more extensive changes to generalize the PML and stuff but I'll make those on a separate PR. |
Ok, sounds good. I believe otherwise, once you have added the fix, this branch should be ready for merge? |
@@ -103,7 +103,7 @@ subroutine elem_residual_maxwell(Mdle,Fld_flag, & | |||
! | |||
!..Gram matrix in packed format | |||
!VTYPE, dimension(NrTest*(NrTest+1)/2) :: gramTest | |||
VTYPE, allocatable :: gramP(:) | |||
VTYPE, allocatable :: gramP(:), Grfp(:) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stefanhenneking I'll remove this RFP format stuff from the residual routine as well. This is one of those cases where it's a MatVec operation so it's still only using BLAS2 and it's actually somewhat slower using RFP format.
Yeah, give me a bit to look over my changes on my other branch and make sure nothing else is crucial then we should be good. |
…_maxwell to used packed format
Design
Fast integration routines need to be updated to include additional vectorial envelope terms. While implementing these changes the following optimizations were also identified:
Finally, computation of enriched shape functions on prism and quad elements should be updated to use tensor produce ordering. This simplifies the DoF maps in the prism routine (the prism map is trivial, however when using the prism routine to assemble hexas it's necessary to convert from the 2D-1D tensor ordering to the 1Dx1Dx1D tensor ordering)
Implementation
The following routines need to be updated to include vectorial envelope terms:
The following routines need to be updated to include boundary and linear algebra optimizations:
The following enriched shape functions routines shoud be updated to use tensor product ordering (the hexa already uses tensor product ordering):