Implement field::for_each capabilities #139

wdeconinck · 2023-05-16T07:37:52Z

Develop field::for_each functionality. For instance:

Field f1, f2, f3;
// ... assign f1, f2 ...
field::for_each_value(f1, f2, f3, [](const double& x1, const double& x2, double& x3) {
    x3 = x1 + x2;
});

We can also loop over Field::horizontal_dimension() only, hence extracting column-views.
Note that Field::horizontal_dimension() returns a std::vector<idx_t> to cater for multiple horizontal dimensions in a field which don't need to be following eachother (e.g. BlockStructuredColumns fields of dimensions {nblock,nlev,nproma} where nblock and nproma are the horizontal dimensions).

Field f1, f2, f3;
// ... assign f1, f2 ...
field::for_each_column(f1, f2, f3, [](array::View<const double,1>&& x1,
                                      array::View<const double,1>&& x2,
                                      array::View<double,1>&& x3
    const idx_t nlev = x1.size();
    for (idx_t jlev=0; jlev<nlev; ++jlev) {
        x3(jlev) = x1(jlev) + x2(jlev);
    }
});

In case there are extra dimensions (e.g. a "variable" dimension), then the column views could be multidimensional:

Field f1, f2, f3;
// ... assign f1, f2 ...
field::for_each_column(f1, f2, f3, [](array::View<const double,2>&& x1,
                                      array::View<const double,2>&& x2,
                                      array::View<double,2>&& x3) {
    const idx_t nlev = x1.shape(0);
    const idx_t nvar = x1.shape(1);
    for (idx_t jlev=0; jlev<nlev; ++jlev) {
        for (idx_t jvar=0; jvar<nvar; ++jvar) {
            x3(jlev,jvar) = x1(jlev,jar) + x2(jlev,jvar);
        }
    }
});

There are further functions that accept a masking function or masking field (such as a ghost field).
The ghost field is typically a one-dimensional field corresponding to the horizontal_dimension.
If there are multiple horizontal_dimensions, then either the ghost field can be multi-dimensional, or
a one-dimensional view of the horizontal_dimensions with indexing like (ni*j+i).

Field f1, f2, f3; 
Field ghost;
// ... assign f1, f2, ghost ...
field::for_each_value_masked(ghost, f1, f2, f3, [](const double& x1, const double& x2, double& x3) {
    x3 = x1 + x2;
});

Field f1, f2, f3;
Field ghost;
// ... assign f1, f2, ghost ...
field::for_each_column_masked(ghost, f1, f2, f3, [](array::View<const double,1>&& x1,
                                                    array::View<const double,1>&& x2,
                                                    array::View<double,1>&& x3
    const idx_t nlev = x1.size();
    for (idx_t jlev=0; jlev<nlev; ++jlev) {
        x3(jlev) = x1(jlev) + x2(jlev);
    }
});

If a masking function is supplied instead of a ghost field, and only the first field dimension is to be masked, then define it as:

auto mask = [](idx_t i, auto&&... args) {
    // mask depending on i only.
};

wdeconinck · 2023-05-16T07:39:03Z

Please review @odlomax @ytremolet

codecov-commenter · 2023-05-16T07:58:54Z

Codecov Report

Patch coverage: 88.80% and project coverage change: +0.04 🎉

Comparison is base (9a2def8) 78.18% compared to head (a61d667) 78.22%.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #139      +/-   ##
===========================================
+ Coverage    78.18%   78.22%   +0.04%     
===========================================
  Files          817      802      -15     
  Lines        59325    59645     +320     
===========================================
+ Hits         46385    46660     +275     
- Misses       12940    12985      +45

Impacted Files	Coverage Δ
src/atlas/field/detail/for_each.h	`53.08% <53.08%> (ø)`
src/atlas/field/for_each.h	`79.51% <79.51%> (ø)`
src/tests/field/test_field_foreach.cc	`99.69% <99.69%> (ø)`
src/atlas/array/helpers/ArrayForEach.h	`96.29% <100.00%> (+3.11%)`	⬆️

... and 155 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

odlomax · 2023-05-16T17:12:27Z

src/atlas/field/detail/for_each.h

+    constexpr auto dims = std::make_index_sequence<Rank>();
+    if constexpr (valid_value_function<Function,Value>())
+    {
+        // Surely this can be improved to be more generic w.r.t. num_fields


Surely we can!

template <int FieldIdx, int NumFields, typename Value, int Rank, typename... Field> auto make_view_tuple(std::tuple<Field...>&& fields) { if constexpr (FieldIdx < NumFields) { return std::tuple_cat(std::make_tuple( array::make_view<Value, Rank>(std::get<FieldIdx>(fields))), make_view_tuple<FieldIdx + 1, NumFields, Value, Rank>(std::move(fields))); } else { return std::make_tuple(); } } template <typename Value, int Rank, typename Config, typename Mask, typename... Field, typename Function> void for_each_value_masked_rank(const Config& config, const Mask& mask, std::tuple<Field...>&& fields, const Function& function ) { constexpr auto num_fields = std::tuple_size_v<std::tuple<Field...>>; constexpr auto dims = std::make_index_sequence<Rank>(); if constexpr (valid_value_function<Function,Value>()) { // Surely this can be improved to be more generic w.r.t. num_fields // We have the technology! if constexpr (num_fields <= 4) { for_each_value_masked_view(dims, config, mask, function, make_view_tuple<0, num_fields, Value, Rank>(std::move(fields))); return; } ATLAS_THROW_EXCEPTION("Unsupported number of arguments in function"); } }

And here I thought we'd seen the last of tuple_cats 😉
Thanks for your suggestion!

It's my new favourite thing!

I don't think we need to pass NumFields into the make_view_tuple function, we can just use the size of the fields tuple.

Also, if we want to be really clever, I think we can pass in Function as a template parameter and use your template kung fu (function_traits) to get the Value type from each of the functor args.

One problem with the function_traits is that we cannot use auto arguments in a "Function" lambda. This is related to runtime dispatch of fields based on ranks and data types, and it cannot infer this from auto.
So one has to be explicit w.r.t. data types and ranks in a view, and has to match what is encoded in the Field

I was trying to wrap my head around this. You'd basically end up with thousands of compiled loop functions if you tried to dispatch every template combination. I interpreted your non-generic lambdas as trying to sneak in the template arguments through the back door ;-)

odlomax · 2023-05-16T17:30:48Z

src/atlas/field/detail/for_each.h

+
+    static constexpr std::size_t arity = sizeof...(Arguments);
+};
+


Took me a while to get my head around this. I am in awe.

DJDavies2 · 2023-05-17T12:17:25Z

I am getting build failures when building against oops (JCSDA) with this:

In file included from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/ArrayView.h:18:0,
from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/functionspace/CubedSphereColumns.h:13,
from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/functionspace.h:16,
from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/oops/src/oops/base/GeometryData.h:15,
from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/oops/src/oops/base/Geometry.h:23,
from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/oops/src/oops/assimilation/ControlVariable.h:20,
from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/oops/src/oops/assimilation/ControlIncrement.h:21,
from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/oops/src/oops/assimilation/MinimizerUtils.h:16,
from /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/oops/src/oops/assimilation/MinimizerUtils.cc:8:
/home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/native/NativeArrayView.h:138:66: error: 'is_convertible_v' is not a member of 'std'
template <typename ValueTp, typename = std::enable_if_t<std::is_convertible_v<ValueTp*,value_type*>>>
^~~~~~~~~~~~~~~~
/home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/native/NativeArrayView.h:138:66: note: suggested alternative: 'is_convertible'
template <typename ValueTp, typename = std::enable_if_t<std::is_convertible_v<ValueTp*,value_type*>>>
^~~~~~~~~~~~~~~~
is_convertible
/home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/native/NativeArrayView.h:138:66: error: 'is_convertible_v' is not a member of 'std'
/home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/native/NativeArrayView.h:138:66: note: suggested alternative: 'is_convertible'
template <typename ValueTp, typename = std::enable_if_t<std::is_convertible_v<ValueTp*,value_type*>>>
^~~~~~~~~~~~~~~~
is_convertible
/home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/native/NativeArrayView.h:138:102: error: template argument 1 is invalid
template <typename ValueTp, typename = std::enable_if_t<std::is_convertible_v<ValueTp*,value_type*>>>
^
/home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/native/NativeArrayView.h:138:105: error: expected unqualified-id before '>' token
template <typename ValueTp, typename = std::enable_if_t<std::is_convertible_v<ValueTp*,value_type*>>>
^
gmake[2]: *** [oops/src/CMakeFiles/oops.dir/oops/generic/UnstructuredInterpolator.cc.o] Error 1
gmake[2]: *** [oops/src/CMakeFiles/oops.dir/oops/assimilation/MinimizerUtils.cc.o] Error 1

wdeconinck · 2023-05-22T08:40:49Z

@DJDavies2 Thanks for verifying compilation. You have observed this in multiple PRs now, so could you see if it is triggered in develop branch as well? If so create a GitHub issue specifically for the develop branch. Please then mention the compiler, version, and compile line including include directories and flags.

I have added to develop branch the public propagation of the -std=c++17 flag to downstream projects, as I believe that's what is missing. A reintegration of this branch on develop should then hopefully fix this.

wdeconinck · 2023-05-22T09:47:22Z

@ytremolet when you have time could you comment? Otherwise I'll just merge this in soon.

DJDavies2 · 2023-05-22T13:31:39Z

@DJDavies2 Thanks for verifying compilation. You have observed this in multiple PRs now, so could you see if it is triggered in develop branch as well? If so create a GitHub issue specifically for the develop branch. Please then mention the compiler, version, and compile line including include directories and flags.

I have added to develop branch the public propagation of the -std=c++17 flag to downstream projects, as I believe that's what is missing. A reintegration of this branch on develop should then hopefully fix this.

I have tested against head of develop and it seems to work now, there are been changes updating to std=c++17 in some downstream repositories so that has probably fixed this issue.

ytremolet

I don't think I can review all the advanced C++. As others I am in awe...
So I mostly looked at the test and how this would be used. It looks great, I think we are going to get good use of this.

ytremolet

Should a test with a non-default execution policy be added? I'd like to see that for the for_each_column, for example the tests starting at line 425 and 483.

wdeconinck · 2023-05-31T10:05:49Z

I have added performance tests with OMP_NUM_THREADS=4 with following output:

timing with for loops seq  = 0.00333232  (triple nested for-loop, sequential)
timing with execution::seq = 0.0790008   (field::for_each_value, sequential)
timing with for loops par  = 0.00102567  (triple nested for-loop with OpenMP for for outermost loop)
timing with execution::par = 0.0215601   (field::for_each_value, parallel)

The difference between normal for-loops and the field::for_each is pretty dramatic (field::for_each is factor 20-24 slower)!
I don't know why with the ArrayForEach tests we had no difference in result, but we do have a lot of difference here.

This should foremost not stop using this capability.
The benefit is that optimisations can and WILL be applied in following developments.

DJDavies2 · 2023-06-01T15:07:16Z

The new test fails when I run it:

Running case 2: test field::for_each_value_masked; horizontal_dimension {0,2} mask_rank1 ...
[0] �[31mTest "test field::for_each_value_masked; horizontal_dimension {0,2} mask_rank1" failed with unhandled eckit::Exception: Assertion failed: (std::is_invocable_v<Mask,MaskArgs...>) in apply, line 265 of /home/d03/frwd/cylc-run/atlas-139/share/mo-bundle/atlas/src/atlas/array/helpers/ArrayForEach.h @ �[0m
[0] Stack trace: backtrace [1] stack has 38 addresses
[0] (/home/d03/frwd/cylc-run/atlas-139/share/build-mo-cray_gnu_debug/lib/libeckit.so+eckit::BackTrace::dumpabi:cxx11)0x91
[0] (/home/d03/frwd/cylc-run/atlas-139/share/build-mo-cray_gnu_debug/lib/libeckit.so+eckit::Exception::Exception())0xb6
[0] (/home/d03/frwd/cylc-run/atlas-139/share/build-mo-cray_gnu_debug/lib/libeckit.so+eckit::AssertionFailed::AssertionFailed(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, eckit::CodeLocation const&))0x41
[0] (/home/d03/frwd/cylc-run/atlas-139/share/build-mo-cray_gnu_debug/lib/libeckit.so+eckit::handle_assert(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, eckit::CodeLocation const&))0x2a5
[0] (/home/d03/frwd/cylc-run/atlas-139/share/build-mo-cray_gnu_debug/lib/libatlas.so.0.33+atlas::throw_AssertionFailed(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, eckit::CodeLocation const&))0x53
[0] (/home/d03/frwd/cylc-run/atlas-139/share/build-mo-cray_gnu_debug/atlas/src/tests/field/atlas_test_field_foreach+atlas::detail::Assert(bool, char const*, char const*, int, char const*))0xa1

and similar for rank2.

…each * origin/develop: Workaround compiler bug in nvhpc-22.11-release build Update GHA nvidia-21.9 to nvidia-22.11 Avoid and suppress some compiler warnings with nvhpc Fix bug where DelaunayMeshGenerator with 1 partition was not setting the grid in the mesh (#143) Use Eigen 3.4 Disable floating point signals for tests on nvidia Add nvidia compiler specific HPC build config

wdeconinck · 2023-06-01T16:05:14Z

@DJDavies2 Is this with gnu 7.3? Which build-type? (You could print output of atlas --info)

This failure should not have anything to do with the new test. Not sure how this occurs. It seems to pass on so many other compilers and build-types, including gnu 7.5.

@odlomax could you perhaps try to investigate or debug this on this particular configuration?

DJDavies2 · 2023-06-01T16:07:36Z

Yes, this is gnu 7.3.

odlomax · 2023-06-02T08:58:34Z

@DJDavies2 Is this with gnu 7.3? Which build-type? (You could print output of atlas --info)

This failure should not have anything to do with the new test. Not sure how this occurs. It seems to pass on so many other compilers and build-types, including gnu 7.5.

@odlomax could you perhaps try to investigate or debug this on this particular configuration?

I'm just having a look. Will get back to you later today.

odlomax · 2023-06-02T15:14:25Z

@DJDavies2 Is this with gnu 7.3? Which build-type? (You could print output of atlas --info)
This failure should not have anything to do with the new test. Not sure how this occurs. It seems to pass on so many other compilers and build-types, including gnu 7.5.
@odlomax could you perhaps try to investigate or debug this on this particular configuration?

I'm just having a look. Will get back to you later today.

I might have found a bug. If I can confirm it, shall I PR a solution into this branch?

odlomax · 2023-06-05T10:22:35Z

I think I've fixed it #145

tl;dr: GCC 7.3 is old and rubbish.

wdeconinck · 2023-06-05T13:10:30Z

With new changes from d97f973 the field::for_each_value timings did not seem to improve:

timing with for loops seq  = 0.0035155
timing with for loops par  = 0.00106079
timing with execution::seq = 0.092391
timing with execution::par = 0.0240693

odlomax · 2023-06-05T13:24:27Z

With new changes from d97f973 the field::for_each_value timings did not seem to improve:
timing with for loops seq  = 0.0035155
timing with for loops par  = 0.00106079
timing with execution::seq = 0.092391
timing with execution::par = 0.0240693

Strange. I got the following when I ran the test in release:

37: timing with for loops seq  = 0.000203149
37: timing with for loops par  = 5.73367e-05
37: timing with execution::seq = 0.000196719
37: timing with execution::par = 5.94674e-05

Admittedly, there's a fair bit of variation between runs.

wdeconinck · 2023-06-05T13:41:01Z

Apologies @odlomax I indeed do recover the raw for loop performance when I have optimisations on. I had hacked my compile flags in my build and forgot to reset them.

wdeconinck · 2023-06-05T13:47:49Z

@DJDavies2 Please let me know if you still have issues with updated branch, and thanks once more for spotting this!

DJDavies2 · 2023-06-05T14:29:17Z

The test works now, thanks.

odlomax · 2023-06-05T14:31:59Z

Great success!

wdeconinck · 2023-06-06T08:02:24Z

Finally merged. Thanks all your help!

* develop: (56 commits) Implement field::for_each capabilities (ecmwf#139) Avoid harmless FE_INVALID with nvhpc build Avoid harmless FE_INVALID with nvhpc build Remove ATLAS_FPE=0 environment variable as not needed anymore now Avoid harmless FE_DIVBYZERO with nvhpc build Optimize modules and dependencies for caching Add HPC build options matrix Workaround compiler bug in nvhpc-22.11-release build Update GHA nvidia-21.9 to nvidia-22.11 Avoid and suppress some compiler warnings with nvhpc Fix bug where DelaunayMeshGenerator with 1 partition was not setting the grid in the mesh (ecmwf#143) Use Eigen 3.4 Disable floating point signals for tests on nvidia Add nvidia compiler specific HPC build config Add function to build mesh from imported connectivity data (ecmwf#135) Disable GHA "linux gnu-12" OpenMP for CXX as "omp.h" header is not found :( Add gnu-12 ci to github actions (github-hosted runners) Add gnu-7 ci to github actions with github-hosted runners (ecmwf#136) Setup horizontal_dimensions() for BlockStructuredColumns fields Add function Field::horizontal_dimension() -> std::vector<idx_t> ...

wdeconinck self-assigned this May 16, 2023

wdeconinck force-pushed the feature/field_for_each branch from 3260b3b to 48e31b4 Compare May 16, 2023 12:20

odlomax reviewed May 16, 2023

View reviewed changes

Implement field::for_each_value and field::for_each_column

bef82b0

Use tuple_cat to create tuple of views from tuple of fields

9e19a7a

wdeconinck force-pushed the feature/field_for_each branch from 48e31b4 to 9e19a7a Compare May 22, 2023 09:08

ytremolet reviewed May 24, 2023

View reviewed changes

wdeconinck added 3 commits May 30, 2023 17:41

Add performance test with multiple execution policies

502d101

Avoid and suppress some compiler warnings with nvhpc

40f0b2f

Update GHA nvidia-21.9 to nvidia-22.11

275ef87

Workaround compiler bug in nvhpc-22.11-release build

6b57c87

wdeconinck added 2 commits June 1, 2023 18:00

Try to be safer

51f56eb

odlomax added 2 commits June 5, 2023 14:41

Tidied up static_asserts in ArrayForEach.

d97f973

Workaround for GCC bug 64095.

f85f3b5

Removed extraneous include.

a61d667

wdeconinck merged commit d0babe3 into develop Jun 6, 2023
65 of 66 checks passed

wdeconinck deleted the feature/field_for_each branch September 19, 2023 12:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement field::for_each capabilities #139

Implement field::for_each capabilities #139

wdeconinck commented May 16, 2023 •

edited

Loading

wdeconinck commented May 16, 2023

codecov-commenter commented May 16, 2023 •

edited

Loading

odlomax May 16, 2023 •

edited

Loading

wdeconinck May 17, 2023

odlomax May 17, 2023 •

edited

Loading

wdeconinck May 22, 2023

odlomax May 22, 2023 •

edited

Loading

odlomax May 16, 2023

DJDavies2 commented May 17, 2023

wdeconinck commented May 22, 2023

wdeconinck commented May 22, 2023

DJDavies2 commented May 22, 2023

ytremolet left a comment

ytremolet left a comment •

edited

Loading

wdeconinck commented May 31, 2023 •

edited

Loading

DJDavies2 commented Jun 1, 2023

wdeconinck commented Jun 1, 2023

DJDavies2 commented Jun 1, 2023

odlomax commented Jun 2, 2023

odlomax commented Jun 2, 2023

odlomax commented Jun 5, 2023 •

edited

Loading

wdeconinck commented Jun 5, 2023

odlomax commented Jun 5, 2023

wdeconinck commented Jun 5, 2023

wdeconinck commented Jun 5, 2023

DJDavies2 commented Jun 5, 2023

odlomax commented Jun 5, 2023

wdeconinck commented Jun 6, 2023


		static constexpr std::size_t arity = sizeof...(Arguments);
		};

Implement field::for_each capabilities #139

Implement field::for_each capabilities #139

Conversation

wdeconinck commented May 16, 2023 • edited Loading

wdeconinck commented May 16, 2023

codecov-commenter commented May 16, 2023 • edited Loading

Codecov Report

odlomax May 16, 2023 • edited Loading

Choose a reason for hiding this comment

wdeconinck May 17, 2023

Choose a reason for hiding this comment

odlomax May 17, 2023 • edited Loading

Choose a reason for hiding this comment

wdeconinck May 22, 2023

Choose a reason for hiding this comment

odlomax May 22, 2023 • edited Loading

Choose a reason for hiding this comment

odlomax May 16, 2023

Choose a reason for hiding this comment

DJDavies2 commented May 17, 2023

wdeconinck commented May 22, 2023

wdeconinck commented May 22, 2023

DJDavies2 commented May 22, 2023

ytremolet left a comment

Choose a reason for hiding this comment

ytremolet left a comment • edited Loading

Choose a reason for hiding this comment

wdeconinck commented May 31, 2023 • edited Loading

DJDavies2 commented Jun 1, 2023

wdeconinck commented Jun 1, 2023

DJDavies2 commented Jun 1, 2023

odlomax commented Jun 2, 2023

odlomax commented Jun 2, 2023

odlomax commented Jun 5, 2023 • edited Loading

wdeconinck commented Jun 5, 2023

odlomax commented Jun 5, 2023

wdeconinck commented Jun 5, 2023

wdeconinck commented Jun 5, 2023

DJDavies2 commented Jun 5, 2023

odlomax commented Jun 5, 2023

wdeconinck commented Jun 6, 2023

wdeconinck commented May 16, 2023 •

edited

Loading

codecov-commenter commented May 16, 2023 •

edited

Loading

odlomax May 16, 2023 •

edited

Loading

odlomax May 17, 2023 •

edited

Loading

odlomax May 22, 2023 •

edited

Loading

ytremolet left a comment •

edited

Loading

wdeconinck commented May 31, 2023 •

edited

Loading

odlomax commented Jun 5, 2023 •

edited

Loading