Allocates within compute code are fine on the CPU. However, when directives are added to accelerate the code on a GPU, these allocations remain on the CPU and (with managed memory) the arrays are then copied over to the GPU, which leads to a lot of unnecessary data traffic. Ideally we should be able to transform these allocates in a similar manner to what we do for automatic arrays, i.e. make the arrays module scope if they aren't already and then only reallocate them if their size changes.
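To illustrate the target code shape, here is a minimal hand-written sketch of the "module scope, reallocate only on size change" pattern. The names `work_mod`, `work` and `compute` are hypothetical and chosen only for this example; it is not the output of any existing transformation, and it contains no GPU directives.

```fortran
module work_mod
  implicit none
  private
  public :: compute

  ! Hypothetical work array promoted to module scope so that it
  ! persists between calls instead of being allocated on every call.
  real, allocatable, save :: work(:)

contains

  subroutine compute(n, total)
    integer, intent(in) :: n
    real, intent(out) :: total
    integer :: i

    ! Release the array only if the required size has changed.
    if (allocated(work)) then
      if (size(work) /= n) deallocate(work)
    end if
    ! Allocate on the first call, or after a size change.
    if (.not. allocated(work)) allocate(work(n))

    ! Stand-in for the compute region that directives would accelerate.
    do i = 1, n
      work(i) = real(i)
    end do
    total = sum(work)
  end subroutine compute

end module work_mod

program demo
  use work_mod, only: compute
  implicit none
  real :: total

  call compute(4, total)   ! first call: allocates work(4)
  call compute(4, total)   ! same size: existing allocation is reused
  call compute(8, total)   ! size changed: deallocated and reallocated
  print *, total
end program demo
```

With this shape, repeated calls with the same size reuse one allocation, so there is no fresh CPU-side allocation to migrate on each call.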
This also happens in at least one place in the NEMO 4.0 metoffice configuration, and the profiler shows a lot of data traffic there. Improving this will therefore also benefit our NEMO 4.0 GPU performance.