
Have SZ, nx_loc, n_groups etc. accessible "globally" #77

Closed · Nanoseb opened this issue Mar 14, 2024 · 3 comments

Comments

Nanoseb (Collaborator) commented Mar 14, 2024

At the moment, all the variables that define block sizes are defined in various places and sometimes duplicated across multiple structures. For example, backend_t has nx_loc, ny_loc, nz_loc, but so does backend_t%xdirps%n, etc. SZ is accessed via m_omp_common or m_cuda_common depending on the backend, meaning we can't access it outside the backend functions (unless it is passed down as an argument from there).

It would be good to have a single, consistent way of accessing all of these variables. It would also reduce the number of arguments that need to be passed around.
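For illustration, the duplication looks roughly like this (a rough sketch; only the names quoted above come from the code, the rest is made up to show the shape):

```fortran
! Sketch of the current layout; only the members named in this issue
! are from the codebase, the rest is illustrative.
module m_sketch_duplication
   implicit none

   type :: dirps_t
      integer :: n   ! local size along this direction
   end type dirps_t

   type :: backend_t
      ! block sizes stored directly on the backend...
      integer :: nx_loc, ny_loc, nz_loc
      ! ...and duplicated inside the per-direction structures,
      ! e.g. backend%xdirps%n mirrors backend%nx_loc
      type(dirps_t) :: xdirps, ydirps, zdirps
   end type backend_t
end module m_sketch_duplication
```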

semi-h (Member) commented Mar 21, 2024

I agree, there is a bit of duplication in some variables and it's better to clean them up sooner rather than later. Let's try to come up with some sensible notation here, and then implementing it will be easy.

I think the most important point is having a clear distinction between padded dimensions and actual domain dimensions. This will help with a smooth transition from periodic to non-periodic BCs. Currently in both backends we use nx/y/z_loc as if they're padded dimensions, especially in reorders, but their names don't suggest this at all. Maybe we can drop _loc because we almost always work with the local dimensions, and rarely with the global ones. We can highlight in the name when a size is global and just not bother for local dimensions, I think. So maybe we can replace nx/y/z_loc with dims_padded(3)?
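As a sketch of that naming scheme (dims_global is just my assumption for how a global size could be highlighted, it isn't settled here):

```fortran
! Hypothetical sketch of the proposed renaming.
module m_sketch_dims
   implicit none

   type :: backend_t
      ! padded dimensions used in reorders, replacing nx/y/z_loc
      integer :: dims_padded(3)
      ! sizes without a suffix are local; global sizes say so
      ! explicitly in the name (dims_global is an assumption)
      integer :: dims_global(3)
   end type backend_t
end module m_sketch_dims
```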

For the actual local domain size, I think dirps%n is the best place to store it. tds_solve is passed the dirps, and the domain size is already taken from dirps%n, I suppose.

Also, dirps%n_blocks is probably the best place for accessing the number of pencil groups in the domain. Its name is the worst though; back then there was only CUDA stuff around, and because the CUDA block dimension is the number of pencil groups, this influenced me to give it the wrong name. So maybe we can rename n_blocks to n_groups and get rid of globs%n_groups_x/y/z, as it is completely a duplicate and has very few uses that can easily be replaced. n_groups may not be the best though, so please share if you have an idea. n_pencilettes could be a fun one 😆, as pencils were those chunky groups of lines in a 2D domain decomposition setting, and the new SZ-many lines grouped together are small compared to that.
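Something like this sketch, with the caveat that dirps_t internals beyond n and n_blocks aren't pinned down in this thread:

```fortran
! Hypothetical sketch of the rename.
module m_sketch_groups
   implicit none

   type :: dirps_t
      integer :: n          ! actual local domain size in this direction
      integer :: n_groups   ! was n_blocks: number of SZ-wide pencil groups
   end type dirps_t
   ! globs%n_groups_x/y/z would be removed in favour of dirps%n_groups.
end module m_sketch_groups
```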

SZ is a different story. It's annoying that it comes from two different sources based on the backend, but having a single source can be a problem for performance when running a heterogeneous simulation. We don't support this yet; however, if we give up now and collapse the SZs into one, then we will stop passing it as an argument, and it'll be a pain to bring this separation back into the codebase. I think this affects very few places in the code though, so it would be good if we can keep it as is.
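For reference, the two sources look roughly like this (the parameter values are placeholders, not the actual choices in either backend):

```fortran
! Sketch of SZ coming from two backend-specific modules.
module m_omp_common
   implicit none
   integer, parameter :: SZ = 32   ! placeholder CPU vector length
end module m_omp_common

module m_cuda_common
   implicit none
   integer, parameter :: SZ = 32   ! placeholder, tied to the CUDA block width
end module m_cuda_common
```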

Nanoseb (Collaborator, Author) commented Mar 29, 2024

I agree with getting rid of _loc, as we are almost always dealing with the local domain indeed.

Happy with n_pencilettes :), but n_groups would also work for me.

About SZ, I wasn't suggesting we have a single value; it should be based on the architecture/backend. But having a single way of accessing it would be good. At the moment we need to write either use m_omp_common, only: SZ or use m_cuda_common, only: SZ. And while most of the time that is OK, because it only happens in backend code, there are moments when we need to get SZ outside of that, and I'd rather not use ifdef cuda... unless we really have to. I had to do something like that for the ordering, for example, because that module shouldn't live under any backend.
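The preprocessor workaround I'd rather avoid would look something like this sketch (the module name is hypothetical):

```fortran
! Sketch of the #ifdef pattern this comment argues against
! in backend-agnostic modules.
module m_ordering
#ifdef CUDA
   use m_cuda_common, only: SZ
#else
   use m_omp_common, only: SZ
#endif
   implicit none
end module m_ordering
```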

Just a thought, but nx/ny/nz, n_groups, and SZ define the dimensions of the fields. Could they be stored in field_t instead of dirps and be adapted depending on the dir?
For example, if a field is in the y direction, field%n => ny, field%n_padded => ny_padded, field%n_groups => n_groups_y, field%SZ => SZ, etc. All of these variables would be defined in the allocator and set up automatically when issuing get_block(DIR_Y).

That way, any loop working on a particular field will always use the same loop ranges (field%n_groups, field%n and field%SZ), and these values will always be 'at hand' without passing dirps around when it isn't needed.
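A minimal sketch of the idea, assuming a free-function allocator; the real get_block would presumably take these sizes from the backend/dirps state rather than explicit arguments:

```fortran
! Hypothetical sketch: per-field sizes filled in by the allocator.
module m_sketch_field
   implicit none

   integer, parameter :: DIR_X = 1, DIR_Y = 2, DIR_Z = 3

   type :: field_t
      real, allocatable :: data(:, :, :)
      integer :: dir
      integer :: n          ! actual size along the field's direction
      integer :: n_padded   ! padded size used in reorders
      integer :: n_groups   ! number of pencil groups
      integer :: SZ         ! backend vector length
   end type field_t

contains

   function get_block(dir, n, n_padded, n_groups, sz) result(f)
      ! Assumed signature for illustration only.
      integer, intent(in) :: dir, n, n_padded, n_groups, sz
      type(field_t) :: f

      f%dir = dir
      f%n = n
      f%n_padded = n_padded
      f%n_groups = n_groups
      f%SZ = sz
      allocate (f%data(sz, n_padded, n_groups))
   end function get_block

   subroutine zero_field(f)
      ! Loop ranges come straight from the field, no dirps needed.
      type(field_t), intent(inout) :: f
      integer :: i, j, k

      do k = 1, f%n_groups
         do j = 1, f%n
            do i = 1, f%SZ
               f%data(i, j, k) = 0.0
            end do
         end do
      end do
   end subroutine zero_field

end module m_sketch_field
```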

Nanoseb (Collaborator, Author) commented Jul 30, 2024

solved by #95

Nanoseb closed this as completed Jul 30, 2024