Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'success' in 'struct nest::StaticAssert...' does not name a type #1031

Closed
sanjayankur31 opened this issue Sep 15, 2018 · 17 comments · Fixed by #1065
Closed

'success' in 'struct nest::StaticAssert...' does not name a type #1031

sanjayankur31 opened this issue Sep 15, 2018 · 17 comments · Fixed by #1065
Assignees
Labels
I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) S: High Should be handled next T: Bug Wrong statements in the code or documentation ZC: Kernel DO NOT USE THIS LABEL ZP: In progess DO NOT USE THIS LABEL

Comments

@sanjayankur31
Copy link
Contributor

I'm getting a failed build with GCC 8.2.1 on Fedora 30 systems for the nest 2.16.0 release. I'm still looking at the cause, but here is the error message.

/builddir/build/BUILD/nest-simulator-2.16.0/nestkernel/target.h:135:48: error: 'success' in 'struct nest::StaticAssert<false>' does not name a type
 typedef StaticAssert< sizeof( Target ) == 8 >::success success_target_size;
                                                ^~~~~~~

The compiler flags are:

C compiler flags    : -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m32 -march=i686 -mtune=generic -msse2 -mfpmath=sse -mstackrealign -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -O2 -Wall -fopenmp -fdiagnostics-color=auto

C++ compiler flags  : -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m32 -march=i686 -mtune=generic -msse2 -mfpmath=sse -mstackrealign -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -std=c++11 -O2 -Wall -fopenmp -fdiagnostics-color=auto

Cheers!

@sanjayankur31
Copy link
Contributor Author

This is the complete build log: build-log.txt

@sanjayankur31
Copy link
Contributor Author

Another note: the build is only failing on 2 architectures: i686 and armv7hl (All builds here: https://koji.fedoraproject.org/koji/taskinfo?taskID=29705080)

@heplesser heplesser added ZC: Kernel DO NOT USE THIS LABEL T: Bug Wrong statements in the code or documentation ZP: In progess DO NOT USE THIS LABEL S: High Should be handled next I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) labels Sep 16, 2018
@heplesser
Copy link
Contributor

@sanjayankur31 From 'struct nest::StaticAssert<false>' it looks to me that sizeof(Target) does not return 8 as NEST requires. So the question would be why that is not the case om some systems. Since unsigned long data_; is the only data member of Target, it seems NEST implicitly requires that sizeof(unsigned long) is 8 byte, while it may be 4 on some systems.

@jakobj Could you take a look?

@heplesser heplesser assigned jakobj and unassigned apeyser Sep 16, 2018
@apeyser
Copy link
Contributor

apeyser commented Sep 17, 2018

This is an issue with inter-machine aliasing (both of those architectures are 32 bit architectures where sizeof(long) == 4. Data is being sent in a character array "naked" and then reconstructed on the other side -- thus the size of the package should be the same on both ends (and it should also be the same endian, but that isn't checked) and needs to be large enough to fit all the elements.

We could fix this by defining data_ to be uint64_t, however that would then also require defining thread and index types in a word-size neutral way.

However, really, this aliasing is a bit of a mess, depending on raw c types without fixed lengths, and in fact that indexes have less than 27 bits and so on. If you allow 32 bit compilation, then you would need to also reduce the bits allocated for lcid, rank, tid, and id's. so that entire section should be type-trait based values, or make NEST fully 64-bit types even on 32 bit machines.

The only reason not to have fixed-size types is that someone intends to run on a 32 bit supercomputer -- and I'd hope that everyone building an ARM supercomputer is waiting for the new vectorized operators anyhow.

@heplesser
Copy link
Contributor

@apeyser Thanks for the explanation! I am not entirely sure I fully understand your comment on 32-bit supercomputers (is some irony hiding there?). It seems we need to have 64 bits for data_, to have enough space for all information that needs to go there. Could we find some pragmatic solution here, e.g., using uint64_t, provided it is defined in the C++11 standard. Would we really need to change the definitions of thread and index then? Wouldn't it suffice to change casts that currently are to unsigned long to uint64_t?

Another approach would be require that unsigned long is 64 bit, but that might exclude too many systems even today?

@apeyser
Copy link
Contributor

apeyser commented Sep 25, 2018

I'd definitely go for using well-defined length types for as many types as possible, such as uint64_t.
Some of the ARM hardware is still unusual, and that's the only place that I'd expect any large machines to have some preference for 32-bits, at least in some of the early stages of the machines.

@heplesser
Copy link
Contributor

@ikitayama Could you comment on this issue with respect to coming ARM-based systems?

@jakobj
Copy link
Contributor

jakobj commented Sep 26, 2018

At multiple points we require the variables to have exactly the number of bits (64 in our case) we would expect on a normal 64bit machine, e.g., for the bitmasks in https://github.com/nest/nest-simulator/blob/master/nestkernel/target.h#L46 or the bitfields in https://github.com/nest/nest-simulator/blob/master/nestkernel/source.h#L43. The easiest option I see here is to define all the relevant variables as uint64_t. Unless required by some relevant system, I do not see any value in supporting a 32bit version.

@heplesser
Copy link
Contributor

@jakobj I also do not see the value of a 32 bit version.

@terhorstd Could you put this on the agenda for the next NEST Developer VC as an explicit decision item?

@ikitayama
Copy link

@ikitayama Could you comment on this issue with respect to coming ARM-based systems?

I've built today's master branch without an issue in Rawhide (aka Fedora 30 candidate with GCC 8.2.1) VM on a ThunderX server (arm64). As @jakobj mentioned, the Community should decide they want to support 32-bit binaries. I'd just drop the support and be done with it. I think Arm-based supercomputers will all be 64-bit machines.

@sanjayankur31
Copy link
Contributor Author

I had discussed this during the package review too. The reviewer had suggested we support whatever archs were available in case someone wants to run smaller sims on a chromebook or something. However, I agree that if it's too much trouble, then we drop support here (as upstream), and downstream packagers like me at Fedora can drop support too. 2.14 didn't error out---this only turned up when I tried to build 2.16.

@ikitayama
Copy link

ikitayama commented Sep 27, 2018 via email

@sanjayankur31
Copy link
Contributor Author

Not much other than supporting a wider range of hardware. If 32 bit is going away and 64 bit will become the norm, dropping 32bit support should be fine. I'm not as up to date as the others here on these things, unfortunately.

@apeyser
Copy link
Contributor

apeyser commented Oct 19, 2018

The worst case is a 2-fold memory increase -- that's just meaningless, computationally.

Fuggedaboutit and just go to well-defined 64 bit types, and if someone's laptop needs twice as much memory, they should borrow someone else's in the lab with twice as much memory.

@gtrensch
Copy link
Contributor

I have worked out a pragmatic solution which I tested on a 32-bit ARM Raspberry Pi. See also pull request #1065.

Regarding the word size, I believe, one can safely assume that future ARM-based HPC systems are 64-bit architectures. Nonetheless, there is also the very interesting XILINX Zynq device family (ARM + FPGA) that is still 32-bit. Being prepared and keeping an eye on writing portable code is not wrong.

Unfortunately, NEST is not well prepared for that, e.g., we do not have user defined platform-independent type definitions - and this is just one aspect! I also agree with @apeyser that relying on integral C-types and bit-shifting data is not the best solution here.

I worked around the current problem by doing the following:

  • Let CMake identify when compiling for a 32-bit architecture and then include the appropriate headers. Even on 32-bit platforms there is usually an uint64_t available in stdint.h.
  • Define the gidexplicitly as 64-bit. Both, unsigned long and unsigend long long are not appropriate for this. They are platform-dependent data types!
  • The 64-bit gid on 32-bit required a few more changes down to sli-types, i.e., overloading constructors and ensuring that this does not conflict with data types on 64-bit platforms, i.e., uint64_t might be equivalent to unsigned long preventing overloading.

I found that it is not needed for the moment to change the types of index and thread as they are compiled from a subset of the bits still fitting 32-bit. Making this uint64_t breaks sli-types and will require more changes. Since we decided to follow the C++11 standard, I also introduced constexpr for the constant-definitions to ensure their evaluation at compile-time.

@apeyser
Copy link
Contributor

apeyser commented Nov 13, 2018

PR #1065 needs a second reviewer so we can merge and close it and this issue. It's already been approved by a first reviewer (me). @sanjayankur31 @ikitayama @jakobj @heplesser

@ikitayama
Copy link

ikitayama commented Nov 15, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) S: High Should be handled next T: Bug Wrong statements in the code or documentation ZC: Kernel DO NOT USE THIS LABEL ZP: In progess DO NOT USE THIS LABEL
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants