- 
                Notifications
    
You must be signed in to change notification settings  - Fork 121
 
Refactoring + Additional Optimizations #43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| 
           @anandrdbz @wilfonba CI failed. Why have all the golden files changed?  | 
    
          
 @sbryngelson should be fixed now, there was a deallocate statement in m_viscous that was causing the bug (which only shows on GCC)  | 
    
| @@ -0,0 +1,4 @@ | |||
| #!bash | |||
| 
               | 
          |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this file
| qL_rsx_vf_flat, qL_rsy_vf_flat, qL_rsz_vf_flat, & | ||
| qR_rsx_vf_flat, qR_rsy_vf_flat, qR_rsz_vf_flat, & | ||
| qL_rsx_vf, qL_rsy_vf, qL_rsz_vf, & | ||
| qR_rsx_vf, qR_rsy_vf, qR_rsz_vf, & | ||
| id) | ||
| 
               | 
          ||
| iv%beg = mom_idx%beg; iv%end = mom_idx%end | ||
| if (weno_Re_flux) then | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems like this subroutine s_reconstruct_cell_boundary_values_visc_deriv should be moved to the viscous module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must have missed that one. I'll move it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @wilfonba -- i already pulled your request so please create a new one for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do!
Code is fully refactored (RHS only 2800 lines) and passes all test cases on GPUs. Dead code removed from m_cbc, m_weno, m_monopole and m_bubbles as well (FYPP, removing _flat, _alt etc)
Run times are identical (slightly quicker in fact) for the fully resolved cases.
Sub-grid bubbles, which was previously quite slow on GPUs, now runs 20X faster (had to modify the bubble source kernel). Now, they exhibit identical speedups to the fully resolved case.
QBMM cases also show slight (5%) improvement from previous state. However GPU speedup is limited (3-4X slower than fully resolved / bubbles) due to the vast number of power operations (This is quite slow on OpenACC + F90).
In summary, runtimes for all cases are either identical to previous state or significantly better, with a vast reduction in number of lines