ADParsedFunctorMaterial JIT compile failure when out-of-tree on ARM Macs #26129

Open
cticenhour opened this issue Nov 22, 2023 · 10 comments
Labels
C: Framework P: normal A defect affecting operation with a low possibility of significantly affects. T: defect An anomaly, which is anything that deviates from expectations.

Comments

@cticenhour
Member

Bug Description

When running the finite_volume/wcns/natural_convection.natural_circulation_pipe test out-of-tree after #21230 was merged into next, the test fails to run. The JIT error output is as follows:

In file included from ./tmp_jit_cVLX2n.cc:3:
In file included from /Users/user/mambaforge3/envs/build/conda-bld/moose_1700678935528/work/framework/include/utils/ADReal.h:52:
In file included from /Users/user/mambaforge3/envs/moose-test/libmesh/include/Eigen/Core:70:
/Users/user/mambaforge3/envs/moose-test/include/omp.h:489:20: error: static declaration of 'omp_is_initial_device' follows non-static declaration
 static inline int omp_is_initial_device(void) { return 1; }
                   ^
/Users/user/mambaforge3/envs/moose-test/include/omp.h:134:16: note: previous declaration is here
    extern int omp_is_initial_device (void);
               ^
/Users/user/mambaforge3/envs/moose-test/include/omp.h:492:20: error: static declaration of 'omp_is_initial_device' follows non-static declaration
 static inline int omp_is_initial_device(void) { return 0; }
                   ^
/Users/user/mambaforge3/envs/moose-test/include/omp.h:134:16: note: previous declaration is here
    extern int omp_is_initial_device (void);
               ^

with a brief stack trace pointing to ADParsedFunctorMaterial as a possible culprit:

*** ERROR ***
ADFParser::JITCompile() failed. Evaluation not possible.

Stack frames: 17
0: 0   libmesh_opt.0.dylib                 0x000000010e0cb09c libMesh::print_trace(std::__1::basic_ostream<char, std::__1::char_traits<char>>&) + 1044
1: 1   libmoose-opt.0.dylib                0x0000000109abce50 moose::internal::mooseErrorRaw(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>) + 680
2: 2   libmoose-opt.0.dylib                0x0000000108d20260 void mooseError<char const (&) [57]>(char const (&) [57]) + 216
3: 3   libmoose-opt.0.dylib                0x000000010943f6c0 ADFParser::JITCompile() + 948
4: 4   libmoose-opt.0.dylib                0x00000001092b77ec ParsedFunctorMaterialTempl<true>::buildParsedFunction() + 496
5: 5   libmoose-opt.0.dylib                0x00000001093099f8 ParsedFunctorMaterialTempl<true>::ParsedFunctorMaterialTempl(InputParameters const&) + 988
6: 6   libmoose-opt.0.dylib                0x00000001093095a0 RegistryEntry<ParsedFunctorMaterialTempl<true>>::build(InputParameters const&) + 64
7: 7   libmoose-opt.0.dylib                0x0000000109aa49d8 Factory::create(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, InputParameters const&, unsigned int, bool) + 344
8: 8   libmoose-opt.0.dylib                0x0000000109347958 std::__1::shared_ptr<MaterialBase> Factory::create<MaterialBase>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, InputParameters const&, unsigned int) + 56
9: 9   libmoose-opt.0.dylib                0x00000001093a4934 auto FEProblemBase::addFunctorMaterial(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, InputParameters&)::$_1::operator()<InputParameters, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>(InputParameters const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) const + 172
10: 10  libmoose-opt.0.dylib                0x00000001093a46d8 FEProblemBase::addFunctorMaterial(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, InputParameters&) + 148
11: 11  libmoose-opt.0.dylib                0x0000000109792300 ActionWarehouse::executeActionsWithAction(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 1172
12: 12  libmoose-opt.0.dylib                0x00000001097c96b8 ActionWarehouse::executeAllActions() + 244
13: 13  libmoose-opt.0.dylib                0x0000000109acc9a0 MooseApp::runInputFile() + 152
14: 14  libmoose-opt.0.dylib                0x0000000109ac90b8 MooseApp::run() + 1368
15: 15  combined-opt                        0x0000000104643b4c main + 164
16: 16  dyld                                0x000000018701d0e0 start + 2360
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
[unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=1
:
system msg for write_line failure : Bad file descriptor

The CIVET output for this failure can be seen here: https://civet.inl.gov/job/1895204/

Steps to Reproduce

Run the test named above using the pre-built moose conda package as built after #21230.

Impact

Prevents this capability from being fully tested across all installation configurations and platforms.

CC: @lindsayad @loganharbour

@cticenhour cticenhour added C: Framework T: defect An anomaly, which is anything that deviates from expectations. P: normal A defect affecting operation with a low possibility of significantly affects. labels Nov 22, 2023
@lindsayad
Member

Also FYI @joshuahansel @dschwen

@lindsayad
Member

The mixture_model tests in navier_stokes that are currently marked installation_type = in_tree are subject to this issue.

@dschwen
Member

dschwen commented Feb 5, 2024

I can take a look at this. It looks vaguely familiar to me...

Basically this means the function implementation is at odds with the forward declaration. So out-of-tree we probably have a preprocessor symbol set wrongly that triggers this (it all happens inside the same omp.h file).
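
For readers unfamiliar with this class of error, here is a minimal sketch of the same kind of conflict outside of omp.h. The names `demo_is_initial_device`, `demo_self_check` and `SHOW_LINKAGE_CONFLICT` are hypothetical and exist only for illustration; this is not the actual content of omp.h. With the macro undefined the file compiles. Defining it (for example with `-DSHOW_LINKAGE_CONFLICT`) should make the compiler reject the `static` definition, because the earlier `extern` declaration already gave the name external linkage.

```cpp
#include <cassert>

// Forward declaration with external linkage, analogous to the
// `extern int omp_is_initial_device (void);` line in the error log.
extern int demo_is_initial_device(void);

#ifdef SHOW_LINKAGE_CONFLICT
// Hypothetical switch: with this branch enabled the compiler should reject
// the definition, because `static` (internal linkage) contradicts the
// earlier `extern` declaration of the same function.
static inline int demo_is_initial_device(void) { return 1; }
#else
// A definition whose linkage matches the forward declaration compiles fine.
int demo_is_initial_device(void) { return 1; }
#endif

// Small self-check so the sketch can be exercised from a test or from main().
void demo_self_check()
{
  assert(demo_is_initial_device() == 1);
}
```

Calling `demo_self_check()` from a `main()` is enough to exercise the default branch.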

@milljm
Member

milljm commented Feb 5, 2024

When testing this on my Apple Silicon machine:

In file included from /Users/milljm/miniforge3/envs/moose/libmesh/include/Eigen/Core:70:
/Users/milljm/miniforge3/envs/moose/include/omp.h:489:20: error: static declaration of 'omp_is_initial_device' follows non-static declaration
 static inline int omp_is_initial_device(void) { return 1; }
                   ^
/Users/milljm/miniforge3/envs/moose/include/omp.h:134:16: note: previous declaration is here
    extern int omp_is_initial_device (void);
               ^
/Users/milljm/miniforge3/envs/moose/include/omp.h:492:20: error: static declaration of 'omp_is_initial_device' follows non-static declaration
 static inline int omp_is_initial_device(void) { return 0; }

Linux does not appear to have an issue.

Edit: just realized this is already posted in OP.

@dschwen
Member

dschwen commented Feb 5, 2024

Looking at the envs/moose/include/omp.h file I'm quite surprised... There is nothing that would be switching stuff around. If defined(_OPENMP) and _OPENMP >= 201811 there is always a

static inline int

implementation following a

extern int

forward declaration (the _KAI_KMPC_CONVENTION symbol is just for Windows).

@dschwen
Member

dschwen commented Feb 5, 2024

The key must be in the ADRealMonolithic.h file. On Linux it contains only the line

extern int omp_is_initial_device (void) throw ();

for the omp_is_initial_device function. On Mac it contains


extern int omp_is_initial_device (void);

[...]

#pragma omp begin declare variant match(device={kind(host)})
static inline int omp_is_initial_device(void) { return 1; }
#pragma omp end declare variant
#pragma omp begin declare variant match(device={kind(nohost)})
static inline int omp_is_initial_device(void) { return 0; }
#pragma omp end declare variant

@dschwen
Member

dschwen commented Feb 5, 2024

What I don't get is why the in-tree run (which does not use ADRealMonolithic.h) works. The ADRealMonolithic.h is built using the preprocessor, and it should process the ADReal.h file the same way it gets processed when building the JIT files using the in-tree version...

Update: The preprocessor options for building the monolith are probably not the same as the ones we pass in for JIT. I'm investigating this.
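
One way to check this hypothesis would be to compile a tiny probe twice, once with the flags used to generate the monolithic header and once with the flags passed to the JIT compiler, and compare the results. The sketch below is only an assumption about how such a probe could look; the function names are hypothetical and not part of MOOSE. It relies only on the standard `_OPENMP` macro and the `201811` threshold quoted from omp.h earlier in this thread.

```cpp
#include <cassert>

// Value of the _OPENMP macro as seen by this translation unit,
// or 0 when the compiler was not invoked with OpenMP enabled.
long probe_openmp_macro()
{
#ifdef _OPENMP
  return _OPENMP;
#else
  return 0L;
#endif
}

// True when the guard `defined(_OPENMP) && _OPENMP >= 201811` would be taken
// under the flags this file was compiled with.
bool probe_variant_block_active()
{
#if defined(_OPENMP) && _OPENMP >= 201811
  return true;
#else
  return false;
#endif
}

// Consistency check between the two probes (hypothetical helper).
void probe_self_check()
{
  assert(probe_variant_block_active() == (probe_openmp_macro() >= 201811L));
}
```

If the two flag sets disagree on `probe_variant_block_active()`, that would be consistent with the suspicion that the monolith and the JIT step preprocess omp.h differently.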

GiudGiud added a commit to GiudGiud/moose that referenced this issue Feb 21, 2024
pbehne pushed a commit to pbehne/moose that referenced this issue Feb 27, 2024
schakrabortygithub pushed a commit to schakrabortygithub/moose that referenced this issue Mar 12, 2024
@joshuahansel
Contributor

Just ran into this recently; conversation on Slack: https://moosedevelopers.slack.com/archives/C01054VRUEM/p1711729364638029. I must have forgotten this existed, since I see I was tagged, or maybe I didn't know what "out-of-tree" meant at the time.

@joshuahansel
Contributor

So in my case the consequence is that I cannot use parsed things in my workshop, since most users are using a conda MOOSE executable, so I just decided to use non-parsed classes instead (yuck!).

@lindsayad
Member

Duplicate issue created at #27255 with fix at #27256
