arm64 #68

Closed
AleksZhuravlyov opened this issue Oct 21, 2021 · 31 comments

Comments

@AleksZhuravlyov

I have slightly patched OpenFOAM-9-2b1d7d67c.patch to build OpenFOAM on the ARM architecture. It works fine. If it is relevant, I can share the patch.

@mrklein
Owner

mrklein commented Oct 22, 2021

Thank you for your efforts. You can share the patch through a merge request. Though, since I do not have access to M1 hardware, all questions regarding the patch will be redirected to you.

Alternatively, I can add a link to your repository with the patch to the top of README.

@AleksZhuravlyov
Author

AleksZhuravlyov commented Oct 22, 2021

Thank you for your excellent project. I'll make the merge request soon. Let's keep it in one place. The patch is really gentle.

And, of course, I will be happy to answer any questions that follow.

@BrushXue

@AleksZhuravlyov Have you figured out how to deal with the sigfpe problem? Seems like M1 doesn't support this x86 feature. Anyway I disabled it in my build.

@AleksZhuravlyov
Author

AleksZhuravlyov commented Oct 28, 2021

I partially disabled its functionality. Find patch attached.

sigFpe.C.patch.zip

And since you are working on your own build, here is my patch for OpenFOAM-9-2b1d7d67c.patch. It just changes the architecture key in 3 places.

OpenFOAM-9-2b1d7d67c.patch.patch.zip

@BrushXue

Thanks. In the patch for the .com version (actually OpenCFD has merged the patch into their repo), sigfpe is treated differently. I'll see if I can merge them.

@AleksZhuravlyov
Author

Not at all. Do you still need the merge request with my patches?

@BrushXue

Good for now. I'm testing my M1 patch for v2106.

@mrklein
Owner

mrklein commented Nov 1, 2021

@AleksZhuravlyov I have taken a look at the pieces you have posted. sigFpe.C.patch basically removes sigFPE handling, and your patch for the OpenFOAM-9 patch makes ad-hoc changes to the wmake rules. It would be more interesting to make the patch conform to the wmake naming scheme, i.e. you create new rules in darwinArm64Clang and, instead of commenting pieces out, you add conditionally compiled pieces.
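
As a rough illustration of the conditional-compilation idea (not the actual OpenFOAM code): the helper name `fpeTrapBackend` is hypothetical, and selecting the branch on `__APPLE__` and `__arm64__` (macros that Clang predefines when targeting Apple Silicon) is an assumption about how such a patch might detect the platform. The point is only that the x86 path stays in place and the arm64 path is selected at compile time instead of being commented out.

```cpp
#include <string>

// Hypothetical helper: reports which FPE-trapping code path was compiled in.
// The x86 branch is left as it was; only the Apple arm64 branch differs.
inline std::string fpeTrapBackend()
{
#if defined(__APPLE__) && defined(__arm64__)
    // Apple Silicon: there is no x87 control word or MXCSR register,
    // so the x86-specific trap code is compiled out here.
    return "darwin-arm64: x86 trap code skipped";
#elif defined(__APPLE__)
    // Intel Mac: keep the existing behaviour.
    return "darwin-x86_64: existing trap code";
#else
    // Any other platform: fall through to whatever it already does.
    return "other platform: platform default";
#endif
}
```

Calling `fpeTrapBackend()` at start-up and printing the result would show which branch a given build picked up.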

@BrushXue In the .com version, sigFPE handling was slightly modified to be more maintainable, but it is basically the same code: instead of doing things directly in the code, @olesenm asked me to make a separate header for FPE-related functions, which I have taken from the public domain (https://develop.openfoam.com/Development/openfoam/-/blob/master/src/OSspecific/POSIX/signals/feexceptErsatz.H). So I think these pieces won't work on ARM64.

@BrushXue

BrushXue commented Nov 1, 2021

I created a patch that keeps x86 untouched. For the sigfpe issue, the only available guide is from ARM, and there's zero information from Apple.

@olesenm

olesenm commented Nov 2, 2021

If possible, I would like to work with the existing feexceptErsatz.H, which was recently updated with a patch from @ttsyshmz (issue #2240) so that it presumably also works with ARM64:
https://develop.openfoam.com/Development/openfoam/-/blob/develop/src/OSspecific/POSIX/signals/feexceptErsatz.H

I don't have access to OSX (intel, arm64 or otherwise) so I am working blind for the most part.

@BrushXue

BrushXue commented Nov 2, 2021

I will test that part and let you know the result today.

@BrushXue

BrushXue commented Nov 2, 2021

The code compiles smoothly without -ftrapping-math.
If I enable -ftrapping-math, I see tons of warnings (basically for every file):

warning: overriding currently unsupported use of floating point exceptions on this target [-Wunsupported-floating-point-opt]

@mrklein
Owner

mrklein commented Nov 2, 2021

Obviously, you can suppress these warnings with the -Wno-unsupported-floating-point-opt flag. The more interesting question is: does it catch sigFPE?

@BrushXue

BrushXue commented Nov 2, 2021

Thanks for the hint. Now these warnings are gone.
Do you have any test case that can 100% trigger sigfpe? I can compare the output between Intel and M1.

@mrklein
Owner

mrklein commented Nov 2, 2021

For example, icoFoam's cavity case with negative viscosity.

@BrushXue

BrushXue commented Nov 2, 2021

x86 output:

Starting time loop

Time = 0.005

Courant Number mean: 0 max: 0
smoothSolver:  Solving for Ux, Initial residual = 1, Final residual = 3.87689e+220, No Iterations 1000
smoothSolver:  Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
#0  Foam::error::printStack(Foam::Ostream&) in ~/OpenFOAM/OpenFOAM-v2106/platforms/darwin64ClangDPInt32Opt/lib/libOpenFOAM.dylib
#1  Foam::sigFpe::sigHandler(int) in ~/OpenFOAM/OpenFOAM-v2106/platforms/darwin64ClangDPInt32Opt/lib/libOpenFOAM.dylib
#2  _sigtramp in /usr/lib/system/libsystem_platform.dylib
#3  ? in /usr/lib/system/libsystem_platform.dylib
#4  Foam::PCG::scalarSolve(Foam::Field<double>&, Foam::Field<double> const&, unsigned char) const in ~/OpenFOAM/OpenFOAM-v2106/platforms/darwin64ClangDPInt32Opt/lib/libOpenFOAM.dylib
#5  Foam::PCG::solve(Foam::Field<double>&, Foam::Field<double> const&, unsigned char) const in ~/OpenFOAM/OpenFOAM-v2106/platforms/darwin64ClangDPInt32Opt/lib/libOpenFOAM.dylib
#6  Foam::fvMatrix<double>::solveSegregated(Foam::dictionary const&) in ~/OpenFOAM/OpenFOAM-v2106/platforms/darwin64ClangDPInt32Opt/lib/libfiniteVolume.dylib
#7  Foam::fvMatrix<double>::solveSegregatedOrCoupled(Foam::dictionary const&) in ~/OpenFOAM/OpenFOAM-v2106/platforms/darwin64ClangDPInt32Opt/lib/libfiniteVolume.dylib
#8  Foam::fvMesh::solve(Foam::fvMatrix<double>&, Foam::dictionary const&) const in ~/OpenFOAM/OpenFOAM-v2106/platforms/darwin64ClangDPInt32Opt/lib/libfiniteVolume.dylib
#9  main in ~/OpenFOAM/OpenFOAM-v2106/platforms/darwin64ClangDPInt32Opt/bin/icoFoam
#10  start in /usr/lib/system/libdyld.dylib
[1]    75476 floating point exception  icoFoam

arm64 output: (Added the new feexceptErsatz.H, compiled with clang++ -std=c++14 -pthread -ftrapping-math -Wno-unsupported-floating-point-opt )

Starting time loop

Time = 0.005

Courant Number mean: 0 max: 0
smoothSolver:  Solving for Ux, Initial residual = 1, Final residual = 3.87689e+220, No Iterations 1000
smoothSolver:  Solving for Uy, Initial residual = 0, Final residual = 0, No Iterations 0
DICPCG:  Solving for p, Initial residual = 1, Final residual = nan, No Iterations 1000
time step continuity errors : sum local = nan, global = nan, cumulative = nan
DICPCG:  Solving for p, Initial residual = nan, Final residual = nan, No Iterations 1000


--> FOAM FATAL IO ERROR: (openfoam-2106)
Wrong token type - expected scalar value, found on line 0: word 'nan'

file: /Volumes/OpenFOAM/cases/sigfpe/system/data.solverPerformance.p at line 0.

    From Foam::Istream &Foam::operator>>(Foam::Istream &, Foam::doubleScalar &)
    in file lnInclude/Scalar.C at line 172.

FOAM exiting

@BrushXue

BrushXue commented Nov 3, 2021

Here's fenv.h extracted from the macOS 12 SDK:

/*
 * Copyright (c) 2002-2013 Apple Computer, Inc. All rights reserved.
 *
 * @APPLE_LICENSE_HEADER_START@
 * 
 * The contents of this file constitute Original Code as defined in and
 * are subject to the Apple Public Source License Version 1.1 (the
 * "License").  You may not use this file except in compliance with the
 * License.  Please obtain a copy of the License at
 * http://www.apple.com/publicsource and read it before using this file.
 * 
 * This Original Code and all software distributed under the License are
 * distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY KIND, EITHER
 * EXPRESS OR IMPLIED, AND APPLE HEREBY DISCLAIMS ALL SUCH WARRANTIES,
 * INCLUDING WITHOUT LIMITATION, ANY WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT.  Please see the
 * License for the specific language governing rights and limitations
 * under the License.
 * 
 * @APPLE_LICENSE_HEADER_END@
 */
 
/******************************************************************************
 *                                                                            *
 *  File:  fenv.h                                                             *
 *                                                                            *
 *  Contains: typedefs and prototypes for C99 floating point environment.     *
 *                                                                            *
 *  A collection of functions designed to provide access to the floating      *
 *  point environment for numerical programming. It is compliant with the     *
 *  floating-point requirements in C99.                                       *
 *                                                                            *
 *  The file <fenv.h> declares many functions in support of numerical         *
 *  programming. Programs that test flags or run under non-default mode       *
 *  must do so under the effect of an enabling "fenv_access" pragma:          *
 *                                                                            *
 *      #pragma STDC FENV_ACCESS on                                           *
 *                                                                            *
 ******************************************************************************/

#ifndef __FENV_H__
#define __FENV_H__

#ifdef __cplusplus
extern "C" {
#endif
    
/******************************************************************************
 *                                                                            *
 *  Architecture-specific types and macros.                                   *
 *                                                                            *
 *      fenv_t          a type for representing the entire floating-point     *
 *                      environment in a single object.                       *
 *                                                                            *
 *      fexcept_t       a type for representing the floating-point            *
 *                      exception flag state collectively.                    *
 *                                                                            *
 *      FE_INEXACT      macros representing the various floating-point        *
 *      FE_UNDERFLOW    exceptions.                                           *
 *      FE_OVERFLOW                                                           *
 *      FE_DIVBYZERO                                                          *
 *      FE_INVALID                                                            *
 *      FE_ALL_EXCEPT                                                         *
 *                                                                            *
 *      FE_TONEAREST    macros representing the various floating-point        *
 *      FE_UPWARD       rounding modes                                        *
 *      FE_DOWNWARD                                                           *
 *      FE_TOWARDZERO                                                         *
 *                                                                            *
 *      FE_DFL_ENV      a macro expanding to a pointer to an object           *
 *                      representing the default floating-point environemnt   *
 *                                                                            *
 ******************************************************************************/
    
/******************************************************************************
 *  ARM definitions of architecture-specific types and macros.                *
 ******************************************************************************/
     
#if defined __arm__ && !defined __SOFTFP__
     
typedef struct {
    unsigned int            __fpscr;    
    unsigned int            __reserved0;
    unsigned int            __reserved1;
    unsigned int            __reserved2;
} fenv_t;

typedef unsigned short fexcept_t;
    
#define FE_INEXACT          0x0010
#define FE_UNDERFLOW        0x0008
#define FE_OVERFLOW         0x0004
#define FE_DIVBYZERO        0x0002
#define FE_INVALID          0x0001
/*  FE_FLUSHTOZERO
    An ARM-specific flag that is raised when a denormal is flushed to zero.
    This is also called the "input denormal exception"                        */
#define FE_FLUSHTOZERO      0x0080 
#define FE_ALL_EXCEPT       0x009f
    
#define FE_TONEAREST        0x00000000
#define FE_UPWARD           0x00400000
#define FE_DOWNWARD         0x00800000
#define FE_TOWARDZERO       0x00C00000

/*  Masks for values that may be controlled in the FPSCR.  Modifying any other
    bits invokes undefined behavior.                                          */
enum {
    __fpscr_trap_invalid   = 0x00000100,
    __fpscr_trap_divbyzero = 0x00000200,
    __fpscr_trap_overflow  = 0x00000400,
    __fpscr_trap_underflow = 0x00000800,
    __fpscr_trap_inexact   = 0x00001000,
    __fpscr_trap_denormal  = 0x00008000,
    __fpscr_flush_to_zero  = 0x01000000,
    __fpscr_default_nan    = 0x02000000,
    __fpscr_saturation     = 0x08000000,
};

extern const fenv_t _FE_DFL_ENV;
#define FE_DFL_ENV &_FE_DFL_ENV
    
/******************************************************************************
 *  ARM64 definitions of architecture-specific types and macros.              *
 ******************************************************************************/
    
#elif defined __arm64__
    
typedef struct {
    unsigned long long      __fpsr;
    unsigned long long      __fpcr;
} fenv_t;
    
typedef unsigned short fexcept_t;
    
#define FE_INEXACT          0x0010
#define FE_UNDERFLOW        0x0008
#define FE_OVERFLOW         0x0004
#define FE_DIVBYZERO        0x0002
#define FE_INVALID          0x0001
/*  FE_FLUSHTOZERO
    An ARM-specific flag that is raised when a denormal is flushed to zero.
    This is also called the "input denormal exception"                        */
#define FE_FLUSHTOZERO      0x0080 
#define FE_ALL_EXCEPT       0x009f
    
#define FE_TONEAREST        0x00000000
#define FE_UPWARD           0x00400000
#define FE_DOWNWARD         0x00800000
#define FE_TOWARDZERO       0x00C00000
    
/*  Masks for values that may be controlled in the FPCR.  Modifying any other
    bits invokes undefined behavior.                                          */
enum {
    __fpcr_trap_invalid   = 0x00000100,
    __fpcr_trap_divbyzero = 0x00000200,
    __fpcr_trap_overflow  = 0x00000400,
    __fpcr_trap_underflow = 0x00000800,
    __fpcr_trap_inexact   = 0x00001000,
    __fpcr_trap_denormal  = 0x00008000,
    __fpcr_flush_to_zero  = 0x01000000,
};

/*  Mask for the QC bit of the FPSR                                           */
enum { __fpsr_saturation  = 0x08000000 };
    
extern const fenv_t _FE_DFL_ENV;
#define FE_DFL_ENV &_FE_DFL_ENV

/*  FE_DFL_DISABLE_DENORMS_ENV
 
    A pointer to a fenv_t object with the default floating-point state modified
    to set the FZ (flush to zero) bit in the FPCR.  When using this environment
    denormals encountered by floating-point calculations will be treated as
    zero.  Denormal results of floating-point operations will also be treated
    as zero.  This calculation mode is not IEEE-754 compliant, but it may
    prevent lengthy stalls that occur in code that encounters denormals.  It is
    suggested that you do not use this mode unless you have established that
    denormals are the source of measurable performance problems.
 
    Note that the math library, and other system libraries, are not guaranteed
    to do the right thing if called in this mode.  Edge cases may be incorrect.
    Use at your own risk.                                                     */
extern const fenv_t _FE_DFL_DISABLE_DENORMS_ENV;
#define FE_DFL_DISABLE_DENORMS_ENV &_FE_DFL_DISABLE_DENORMS_ENV
    
/******************************************************************************
 *  x86 definitions of architecture-specific types and macros.                *
 ******************************************************************************/
    
#elif defined __i386__ || defined __x86_64__

typedef struct {
    unsigned short          __control;      /* x87 control word               */
    unsigned short          __status;       /* x87 status word                */
    unsigned int            __mxcsr;        /* SSE status/control register    */
    char                    __reserved[8];  /* Reserved for future expansion  */   
} fenv_t;

typedef unsigned short fexcept_t;
    
#define FE_INEXACT          0x0020
#define FE_UNDERFLOW        0x0010
#define FE_OVERFLOW         0x0008
#define FE_DIVBYZERO        0x0004
#define FE_INVALID          0x0001
/*  FE_DENORMALOPERAND
    An Intel-specific flag that is raised when an operand to a floating-point
    arithmetic operation is denormal, or a single- or double-precision denormal
    value is loaded on the x87 stack.  This flag is not raised by SSE
    arithmetic when the DAZ control bit is set.                               */
#define FE_DENORMALOPERAND  0x0002
#define FE_ALL_EXCEPT       0x003f

#define FE_TONEAREST        0x0000
#define FE_DOWNWARD         0x0400
#define FE_UPWARD           0x0800
#define FE_TOWARDZERO       0x0c00

extern const fenv_t _FE_DFL_ENV;
#define FE_DFL_ENV &_FE_DFL_ENV

/*  FE_DFL_DISABLE_SSE_DENORMS_ENV
 
    A pointer to a fenv_t object with the default floating-point state modifed
    to set the DAZ and FZ bits in the SSE status/control register.  When using
    this environment, denormals encountered by SSE based calculation (which
    normally should be all single and double precision scalar floating point
    calculations, and all SSE/SSE2/SSE3 computation) will be treated as zero.
    Calculation results that are denormals will also be truncated to zero.
    This calculation mode is not IEEE-754 compliant, but may prevent lengthy
    stalls that occur in code that encounters denormals. It is suggested that
    you do not use this mode unless you have established that denormals are
    causing trouble for your code. Please use wisely.
    
    CAUTION: The math library currently is not architected to do the right
    thing in the face of DAZ + FZ mode.  For example, ceil( +denormal) might
    return +denormal rather than 1.0 in some versions of MacOS X. In some
    circumstances this may lead to unexpected application behavior. Use at
    your own risk.
 
    It is not possible to disable denormal stalls for calculations performed
    on the x87 FPU                                                            */
extern const fenv_t _FE_DFL_DISABLE_SSE_DENORMS_ENV;
#define FE_DFL_DISABLE_SSE_DENORMS_ENV  &_FE_DFL_DISABLE_SSE_DENORMS_ENV

/******************************************************************************
 *  Totally generic definitions and macros if we don't know anything about    *
 *  the target platform, or if the platform does not have hardware floating-  *
 *  point support.                                                            *
 ******************************************************************************/
    
#else /* Unknown architectures */

typedef int fenv_t;
typedef unsigned short fexcept_t;
#define FE_ALL_EXCEPT       0
#define FE_TONEAREST        0
extern const fenv_t _FE_DFL_ENV;
#define FE_DFL_ENV &_FE_DFL_ENV

#endif

/******************************************************************************
 *  The following functions provide high level access to the exception flags. *  
 *  The "int" input argument can be constructed by bitwise ORs of the         *
 *  exception macros: for example: FE_OVERFLOW | FE_INEXACT.                  *
 *                                                                            *
 *  The function "feclearexcept" clears the supported floating point          *
 *  exceptions represented by its argument.                                   *
 *                                                                            *
 *  The function "fegetexceptflag" stores a implementation-defined            *
 *  representation of the states of the floating-point status flags indicated *
 *  by its integer argument excepts in the object pointed to by the argument, * 
 *  flagp.                                                                    *
 *                                                                            *
 *  The function "feraiseexcept" raises the supported floating-point          *
 *  exceptions represented by its argument. The order in which these          *
 *  floating-point exceptions are raised is unspecified.                      *
 *                                                                            *
 *  The function "fesetexceptflag" sets or clears the floating point status   *
 *  flags indicated by the argument excepts to the states stored in the       *
 *  object pointed to by flagp. The value of the *flagp shall have been set   *
 *  by a previous call to fegetexceptflag whose second argument represented   *
 *  at least those floating-point exceptions represented by the argument      *
 *  excepts. This function does not raise floating-point exceptions; it just  *
 *  sets the state of the flags.                                              *
 *                                                                            *
 *  The function "fetestexcept" determines which of the specified subset of   *
 *  the floating-point exception flags are currently set.  The excepts        *
 *  argument specifies the floating-point status flags to be queried. This    *
 *  function returns the value of the bitwise OR of the floating-point        *
 *  exception macros corresponding to the currently set floating-point        *
 *  exceptions included in excepts.                                           *
 ******************************************************************************/

extern int feclearexcept(int /* excepts */);
extern int fegetexceptflag(fexcept_t * /* flagp */, int /* excepts */);
extern int feraiseexcept(int /* excepts */);
extern int fesetexceptflag(const fexcept_t * /* flagp */, int /* excepts */);
extern int fetestexcept(int /* excepts */);

/******************************************************************************
 *  The following functions provide control of rounding direction modes.      *
 *                                                                            *
 *  The function "fegetround" returns the value of the rounding direction     *
 *  macro which represents the current rounding direction, or a negative      *
 *  if there is no such rounding direction macro or the current rounding      *
 *  direction is not determinable.                                            *
 *                                                                            *
 *  The function "fesetround" establishes the rounding direction represented  *
 *  by its argument "round". If the argument is not equal to the value of a   *
 *  rounding direction macro, the rounding direction is not changed.  It      *
 *  returns zero if and only if the argument is equal to a rounding           *
 *  direction macro.                                                          *
 ******************************************************************************/
    
extern int fegetround(void);
extern int fesetround(int /* round */);

/******************************************************************************
 *  The following functions manage the floating-point environment, exception  *
 *  flags and dynamic modes, as one entity.                                   *
 *                                                                            *
 *  The fegetenv function stores the current floating-point enviornment in    *
 *  the object pointed to by envp.                                            *
 *                                                                            *
 *  The feholdexcept function saves the current floating-point environment in *
 *  the object pointed to by envp, clears the floating-point status flags,    *
 *  and then installs a non-stop (continue on floating-point exceptions)      *
 *  mode, if available, for all floating-point exceptions. The feholdexcept   *
 *  function returns zero if and only if non-stop floating-point exceptions   *
 *  handling was successfully installed.                                      *
 *                                                                            *
 *  The fesetnv function establishes the floating-point environment           *
 *  represented by the object pointed to by envp. The argument envp shall     *
 *  point to an object set by a call to fegetenv or feholdexcept, or equal to *
 *  a floating-point environment macro to be C99 standard compliant and       *
 *  portable to other architectures. Note that fesetnv merely installs the    *
 *  state of the floating-point status flags represented through its          *
 *  argument, and does not raise these floating-point exceptions.             *
 *                                                                            *
 *  The feupdateenv function saves the currently raised floating-point        *
 *  exceptions in its automatic storage, installs the floating-point          *
 *  environment represented by the object pointed to by envp, and then raises *
 *  the saved floating-point exceptions. The argument envp shall point to an  *
 *  object set by a call to feholdexcept or fegetenv or equal a               *
 *  floating-point environment macro.                                         *
 ******************************************************************************/
    
extern int fegetenv(fenv_t * /* envp */);
extern int feholdexcept(fenv_t * /* envp */);
extern int fesetenv(const fenv_t * /* envp */);
extern int feupdateenv(const fenv_t * /* envp */);

#ifdef __cplusplus
}
#endif

#endif /* __FENV_H__ */
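
The flag-query functions declared in this header can be used even where trapping is unavailable. Here is a minimal sketch, assuming only the standard C99/C++11 `<cfenv>` interface (`feclearexcept`, `fetestexcept`); the function name `divisionRaisesDivByZero` is made up for illustration. It polls the sticky status flag after the operation instead of relying on a signal.

```cpp
#include <cfenv>

// Hypothetical helper: returns true if evaluating num / den raised FE_DIVBYZERO.
// The volatile locals keep the compiler from folding the division away
// or moving it across the flag calls.
inline bool divisionRaisesDivByZero(double num, double den)
{
    std::feclearexcept(FE_ALL_EXCEPT);   // start from a clean flag state

    volatile double n = num;
    volatile double d = den;
    volatile double q = n / d;           // performed at run time
    (void)q;

    // Non-zero if the divide-by-zero status flag is now set
    return std::fetestexcept(FE_DIVBYZERO) != 0;
}
```

Polling like this detects the problem only after the fact and only where the check is placed, so it is a diagnostic aid rather than a replacement for trapping.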

@geoffrey4444

I've been trying to figure out floating-point-exception trapping on Apple Silicon Macs, and I might have found a solution here:

https://opensource.apple.com/source/xnu/xnu-6153.11.26/tests/fp_exception.c.auto.html

Based on what Apple's code does, here is a little test program that seems to work on my M1 Mac; the only oddity is that the signal that gets thrown is SIGILL, not SIGFPE, when a floating-point exception is encountered. Maybe this would be helpful?

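#include <cmath>
#include <csignal>
#include <cstdint>   // uint64_t
#include <cstdlib>   // exit
#include <fenv.h>    // __fpcr_trap_divbyzero (Apple-specific enum)
#include <iostream>

uint64_t old_fpcr;

void fpe_signal_handler(int /*signal*/) {
  // iostream output is not async-signal-safe; acceptable for a quick test only
  std::cerr << "Floating point exception!\n";
  exit(1);
}

// Set the divide-by-zero trap bit in FPCR and catch the resulting SIGILL
void enable_floating_point_exceptions() {
  old_fpcr = __builtin_arm_rsr64("FPCR");
  uint64_t fpcr = old_fpcr | __fpcr_trap_divbyzero;
  __builtin_arm_wsr64("FPCR", fpcr);
  signal(SIGILL, fpe_signal_handler);
}

// Restore the FPCR value saved by the last enable call
void disable_floating_point_exceptions() {
  __builtin_arm_wsr64("FPCR", old_fpcr);
}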

int main() {
  double x = -4.0;
  double y = 0.0;

  // Should not signal
  std::cout << x / y << "\n";

  // Should not signal
  enable_floating_point_exceptions();
  disable_floating_point_exceptions();
  std::cout << x / y << "\n";

  // Should signal
  enable_floating_point_exceptions();
  std::cout << x / y << "\n";
}
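
Since the test program reports SIGILL rather than SIGFPE, a handler may need to be registered for both. The following is a sketch under stated assumptions, not code from OpenFOAM or SpECTRE: the names `fpTrapHandler` and `installFpTrapHandlers` are hypothetical, and it uses only POSIX `sigaction`, `write` and `_exit`, which are async-signal-safe (unlike iostream output).

```cpp
#include <signal.h>   // sigaction, SIGFPE, SIGILL (POSIX)
#include <unistd.h>   // write, _exit (POSIX)

// Hypothetical handler restricted to async-signal-safe calls.
extern "C" void fpTrapHandler(int /*signal*/)
{
    static const char msg[] = "Floating point exception!\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;
    _exit(1);   // do not return: the faulting instruction would be retried
}

// Registers the handler for SIGFPE and SIGILL.
// Returns true only if both registrations succeed.
inline bool installFpTrapHandlers()
{
    struct sigaction sa = {};
    sa.sa_handler = fpTrapHandler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;

    return sigaction(SIGFPE, &sa, nullptr) == 0
        && sigaction(SIGILL, &sa, nullptr) == 0;
}
```

Catching SIGILL this broadly also catches genuine illegal-instruction faults, so the message is only a guess about the cause.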

@mrklein
Owner

mrklein commented Dec 2, 2021

@geoffrey4444 Thank you for the update. Could you also test the square root of a negative number? A fractional power of a negative number? I wonder what signals are sent for these errors.

@olesenm

olesenm commented Dec 3, 2021

Hi @geoffrey4444 - good to have you onboard. Did you try with the updated fpe handling?

I'm guessing that this will be a topic that we may have to continue next year (it is getting really too close to the December release to fiddle with much anymore).

@geoffrey4444

Hi @mrklein @olesenm! Please accept my apologies for the delay in getting back to you. I'm actually not an openfoam user, so I haven't tried the updated fpe handling yet. I just saw this thread while investigating FPEs for a numerical-relativity code (https://github.com/sxs-collaboration/spectre) that I help develop. I just thought I'd share what I had come up with, in case it might be helpful for you, since there are very few resources I could find online about how to properly handle FPEs on apple silicon.

If you enable the correct exceptions, the code I posted will signal SIGILL (e.g., set __fpcr_trap_invalid in the FPCR to get a signal for sqrt of negative numbers). The relevant masks are all in the file posted above:

enum {
    __fpcr_trap_invalid   = 0x00000100,
    __fpcr_trap_divbyzero = 0x00000200,
    __fpcr_trap_overflow  = 0x00000400,
    __fpcr_trap_underflow = 0x00000800,
    __fpcr_trap_inexact   = 0x00001000,
    __fpcr_trap_denormal  = 0x00008000,
    __fpcr_flush_to_zero  = 0x01000000,
};
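
To trap several exception kinds at once, the masks can be OR-ed together before writing the register. This is a minimal sketch, assuming the mask values quoted above and the `__builtin_arm_rsr64` / `__builtin_arm_wsr64` builtins already used in the earlier test program; the names `withTraps`, `withoutTraps` and `enableTraps` are hypothetical. The register write is compiled only for Apple arm64, so the pure helpers can be checked on any machine.

```cpp
#include <cstdint>

// Values copied from the enum quoted above (macOS fenv.h, arm64 branch).
constexpr std::uint64_t kTrapInvalid   = 0x00000100;
constexpr std::uint64_t kTrapDivByZero = 0x00000200;
constexpr std::uint64_t kTrapOverflow  = 0x00000400;

// Pure helpers: compute a new FPCR value without touching the register.
constexpr std::uint64_t withTraps(std::uint64_t fpcr, std::uint64_t traps)
{
    return fpcr | traps;    // setting a bit enables that trap
}

constexpr std::uint64_t withoutTraps(std::uint64_t fpcr, std::uint64_t traps)
{
    return fpcr & ~traps;   // clearing a bit disables that trap
}

// Writes the trap bits when built for Apple arm64; a no-op elsewhere.
// Returns true if the register was actually written.
inline bool enableTraps(std::uint64_t traps)
{
#if defined(__APPLE__) && defined(__arm64__)
    const std::uint64_t fpcr = __builtin_arm_rsr64("FPCR");
    __builtin_arm_wsr64("FPCR", withTraps(fpcr, traps));
    return true;
#else
    (void)traps;
    return false;
#endif
}
```

For example, `enableTraps(kTrapInvalid | kTrapDivByZero | kTrapOverflow)` would request traps for invalid operations, division by zero and overflow in one write.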

I do plan to try your approach to FPEs and see if it works for our code as well...I'll let you know how it goes!

@geoffrey4444

@olesenm I tried a simple test executable using your updated fpe handling code (pasted below). I compiled with clang++ -o fpe -std=c++17 fpe.cpp, and when I ran it, it did not trap any floating point exceptions, and fpe_signal_handler() never gets called. Maybe I'm doing something wrong?

// fpe.cpp
#include <cmath>
#include <csignal>
#include <fenv.h>
#include <iostream>
#include <limits>

void fpe_signal_handler(int /*signal*/) {
  std::cerr << "Floating point exception!\n";
  exit(1);
}

inline int feenableexcept(unsigned int excepts)
{
    static fenv_t fenv;
    unsigned int new_excepts = excepts & FE_ALL_EXCEPT;
    unsigned int old_excepts;   // previous masks

    if (fegetenv(&fenv))
    {
        return -1;
    }

#if defined __arm64__
    old_excepts = fenv.__fpcr & FE_ALL_EXCEPT;
#else
    old_excepts = fenv.__control & FE_ALL_EXCEPT;
#endif

    // unmask
#if defined __arm64__
    fenv.__fpcr &= ~new_excepts;
    fenv.__fpsr &= ~(new_excepts << 7);
#else
    fenv.__control &= ~new_excepts;
    fenv.__mxcsr   &= ~(new_excepts << 7);
#endif
    return fesetenv(&fenv) ? -1 : old_excepts;
}


inline int fedisableexcept(unsigned int excepts)
{
    static fenv_t fenv;
    unsigned int new_excepts = excepts & FE_ALL_EXCEPT;
    unsigned int old_excepts;   // all previous masks

    if (fegetenv(&fenv))
    {
        return -1;
    }

#if defined __arm64__
    old_excepts = fenv.__fpcr & FE_ALL_EXCEPT;
#else
    old_excepts = fenv.__control & FE_ALL_EXCEPT;
#endif

    // mask
#if defined __arm64__
    fenv.__fpcr |= new_excepts;
    fenv.__fpsr |= new_excepts << 7;
#else
    fenv.__control |= new_excepts;
    fenv.__mxcsr   |= new_excepts << 7;
#endif

    return fesetenv(&fenv) ? -1 : old_excepts;
}


int main() {
  std::signal(SIGFPE, fpe_signal_handler);
  std::signal(SIGILL, fpe_signal_handler);
  std::cout << std::numeric_limits<double>::has_signaling_NaN << "\n";
  double x = -4.0;
  double y = 0.0;

  // Should not signal
  std::cout << x / y << "\n";

  unsigned int excepts = FE_DIVBYZERO | FE_INEXACT | FE_INVALID | FE_OVERFLOW;

  // Should not signal
  feenableexcept(excepts);
  fedisableexcept(excepts);
  std::cout << x / y << "\n";

  // Should not signal
  double z = std::numeric_limits<double>::signaling_NaN();
  z = 4.0;
  std::cout << z << "\n";

  // Should signal
  feenableexcept(excepts);
  z = std::numeric_limits<double>::signaling_NaN();
  std::cout << sqrt(z) << "\n";

  // Should signal
  std::cout << x / y << "\n";
}
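Since the program above depends on a trap actually being delivered, it can help to first confirm that the operations raise the IEEE flags at all. Below is a minimal sketch that polls the flags with the standard <cfenv> calls std::feclearexcept and std::fetestexcept. The helper name flags_raised_by_division is made up for this example, and flag polling is a swapped-in technique: it is not what the test above or OpenFOAM does, and it does not deliver any signal.

```cpp
#include <cassert>
#include <cfenv>

// Returns the subset of `watched` exception flags raised by computing x / y.
// Only standard <cfenv> calls are used, so no trap or signal is involved.
// Note: this clears all currently raised flags before the division.
inline int flags_raised_by_division(double x, double y, int watched) {
    assert((watched & ~FE_ALL_EXCEPT) == 0);  // only standard flags allowed

    // volatile keeps the compiler from folding the division at compile time
    volatile double num = x;
    volatile double den = y;

    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double result = num / den;
    (void)result;

    return std::fetestexcept(watched);
}
```

Calling flags_raised_by_division(-4.0, 0.0, FE_DIVBYZERO) mirrors the x / y case in the test program and should report FE_DIVBYZERO whether or not traps are enabled.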

@mrklein
Owner

mrklein commented Aug 7, 2022

@AleksZhuravlyov Since there is no way to catch SIGFPE on Apple Silicon hardware, shall we close the issue?

@AleksZhuravlyov
Author

Hi @mrklein,

Yeah, it makes no sense to keep the issue open, since it seems to be a technical limitation of Apple Silicon. I believe it will be resolved soon by the community one way or another.

From my perspective, OpenFOAM can be used with this simple temporary solution. It is just important to keep in mind that division by zero is not going to be caught :)

And again, thank you for your project,

Aleks

@SxnSess

SxnSess commented Apr 30, 2023

Hello, I am working on my university thesis and need to run OpenFOAM on an NVIDIA module called Jetson AGX Xavier. The architecture of this module is ARM64 (aarch64). How can I install some version of OpenFOAM on it?

@mrklein
Owner

mrklein commented Apr 30, 2023

Hi,

Usually this module has Linux installed, so you can just follow the official compilation guides. There are different wmake rules for ARM compilation (linuxARM64Arm, linuxARM64Clang, linuxARM64Fujitsu, linuxARM64Gcc, linuxARM64Nvidia). Use the ones that correspond to your installation.

Since I do not have any Jetson AGX Xavier available for testing, if you can provide remote access to the hardware I can take a look.

@SxnSess

SxnSess commented Apr 30, 2023

If I can give you access through a VPN, is that OK?
How can we communicate better?

@mrklein
Owner

mrklein commented Apr 30, 2023

You can use email from my profile to contact me in private.

@SxnSess

SxnSess commented Apr 30, 2023

I'll contact you, @mrklein. Thanks.
I have already sent you an email, @mrklein.

@BrushXue

BrushXue commented Aug 3, 2023

@olesenm I tried a simple test executable using your updated fpe handling code (pasted below). I compiled with clang++ -o fpe -std=c++17 fpe.cpp, and when I ran it, it did not trap any floating point exceptions, and fpe_signal_handler() never gets called. Maybe I'm doing something wrong?

@geoffrey4444 According to https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/aarch64/fpu/feenablxcpt.c;h=cc98a649db58b961ededc8d6266c7bca18062030;hb=refs/heads/master and https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/aarch64/fpu/fedisblxcpt.c;h=8041f438bdca4be929e9302e5d03bde72a4a48a1;hb=refs/heads/master the implementation on arm64 is different from x86, so the code won't work here. The modified code below does raise a floating point exception (you may need Xcode > 14.3). However, I still don't get SIGFPE in OpenFOAM.

// fpe.cpp
#include <cmath>
#include <csignal>
#include <cstdlib>
#include <fenv.h>
#include <iostream>
#include <limits>

void fpe_signal_handler(int /*signal*/) {
  std::cerr << "Floating point exception!\n";
  exit(1);
}

inline int feenableexcept(unsigned int excepts)
{
    static fenv_t fenv;
    unsigned int new_excepts = excepts & FE_ALL_EXCEPT;
    unsigned int old_excepts;   // previous masks

    if (fegetenv(&fenv))
    {
        return -1;
    }

#if defined __arm64__
    old_excepts = fenv.__fpsr & FE_ALL_EXCEPT;
#else
    old_excepts = fenv.__control & FE_ALL_EXCEPT;
#endif

    // unmask
#if defined __arm64__
    fenv.__fpsr |= new_excepts;
    fenv.__fpcr |= new_excepts << 8;
#else
    fenv.__control &= ~new_excepts;
    fenv.__mxcsr   &= ~(new_excepts << 7);
#endif
    return fesetenv(&fenv) ? -1 : old_excepts;
}


inline int fedisableexcept(unsigned int excepts)
{
    static fenv_t fenv;
    unsigned int new_excepts = excepts & FE_ALL_EXCEPT;
    unsigned int old_excepts;   // all previous masks

    if (fegetenv(&fenv))
    {
        return -1;
    }

#if defined __arm64__
    old_excepts = fenv.__fpsr & FE_ALL_EXCEPT;
#else
    old_excepts = fenv.__control & FE_ALL_EXCEPT;
#endif

    // mask
#if defined __arm64__
    fenv.__fpsr &= ~new_excepts;
    fenv.__fpcr &= ~(new_excepts << 8);
#else
    fenv.__control |= new_excepts;
    fenv.__mxcsr   |= new_excepts << 7;
#endif

    return fesetenv(&fenv) ? -1 : old_excepts;
}


int main() {
  std::signal(SIGFPE, fpe_signal_handler);
  std::signal(SIGILL, fpe_signal_handler);
  std::cout << std::numeric_limits<double>::has_signaling_NaN << "\n";
  double x = -4.0;
  double y = 0.0;

  // Should not signal
  std::cout << x / y << "\n";

  unsigned int excepts = FE_DIVBYZERO | FE_INEXACT | FE_INVALID | FE_OVERFLOW;

  // Should not signal
  feenableexcept(excepts);
  fedisableexcept(excepts);
  std::cout << x / y << "\n";

  // Should not signal
  double z = std::numeric_limits<double>::signaling_NaN();
  z = 4.0;
  std::cout << z << "\n";

  // Should signal
  feenableexcept(excepts);
  z = std::numeric_limits<double>::signaling_NaN();
  std::cout << sqrt(z) << "\n";

  // Should signal
  std::cout << x / y << "\n";
}

@BrushXue

BrushXue commented Aug 3, 2023

After some investigation, this code only triggers SIGILL instead of SIGFPE

zsh: illegal hardware instruction

which is a well-known problem on M1.
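Given that the trap shows up as SIGILL rather than SIGFPE, one option is to register the same handler for both signals. The following is only a sketch using POSIX sigaction; the names fp_trap_handler and install_fp_trap_handlers are made up for this example, and it is not the handler OpenFOAM installs. Note that catching SIGILL this way also catches genuine illegal instructions, so the message can only say that a trap was reported as SIGILL.

```cpp
#include <cassert>   // only needed for caller-side checks
#include <signal.h>
#include <unistd.h>

// Handler shared by SIGFPE and SIGILL. It uses only async-signal-safe
// calls (write, _exit). It must not return normally, because returning
// would re-execute the faulting instruction.
extern "C" void fp_trap_handler(int signo, siginfo_t* /*info*/, void* /*context*/) {
    static const char fpe_msg[] = "Floating point exception (SIGFPE)\n";
    static const char ill_msg[] = "Trap reported as SIGILL (possibly floating point)\n";

    if (signo == SIGFPE) {
        const ssize_t n = write(STDERR_FILENO, fpe_msg, sizeof(fpe_msg) - 1);
        (void)n;
    } else {
        const ssize_t n = write(STDERR_FILENO, ill_msg, sizeof(ill_msg) - 1);
        (void)n;
    }
    _exit(1);
}

// Registers fp_trap_handler for SIGFPE and SIGILL.
// Returns true only if both registrations succeed.
inline bool install_fp_trap_handlers() {
    struct sigaction sa {};
    sa.sa_sigaction = fp_trap_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);

    return sigaction(SIGFPE, &sa, nullptr) == 0
        && sigaction(SIGILL, &sa, nullptr) == 0;
}
```

Calling install_fp_trap_handlers() once at the start of main, before enabling any traps, would replace the two std::signal calls in the test program.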

@olesenm On arm64, the shift is 8 bits instead of 7, and the mask definition is inverted. You can probably update the code in https://develop.openfoam.com/Development/openfoam/-/blob/master/src/OSspecific/POSIX/signals/feexceptErsatz.H since it now really triggers a signal on Apple arm64.

inline int feenableexcept(unsigned int excepts)
{
    static fenv_t fenv;
    unsigned int new_excepts = excepts & FE_ALL_EXCEPT;
    unsigned int old_excepts;   // previous masks

    if (fegetenv(&fenv))
    {
        return -1;
    }

#if defined __arm64__
    old_excepts = fenv.__fpsr & FE_ALL_EXCEPT;
#else
    old_excepts = fenv.__control & FE_ALL_EXCEPT;
#endif

    // unmask
#if defined __arm64__
    fenv.__fpsr |= new_excepts;
    fenv.__fpcr |= new_excepts << 8;
#else
    fenv.__control &= ~new_excepts;
    fenv.__mxcsr   &= ~(new_excepts << 7);
#endif

    return fesetenv(&fenv) ? -1 : old_excepts;
}


inline int fedisableexcept(unsigned int excepts)
{
    static fenv_t fenv;
    unsigned int new_excepts = excepts & FE_ALL_EXCEPT;
    unsigned int old_excepts;   // all previous masks

    if (fegetenv(&fenv))
    {
        return -1;
    }

#if defined __arm64__
    old_excepts = fenv.__fpsr & FE_ALL_EXCEPT;
#else
    old_excepts = fenv.__control & FE_ALL_EXCEPT;
#endif

    // mask
#if defined __arm64__
    fenv.__fpsr &= ~new_excepts;
    fenv.__fpcr &= ~(new_excepts << 8);
#else
    fenv.__control |= new_excepts;
    fenv.__mxcsr   |= new_excepts << 7;
#endif

    return fesetenv(&fenv) ? -1 : old_excepts;
}
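For comparison, here is a platform-independent sketch of the bit manipulation as I read it from the glibc aarch64 sources linked earlier: the trap-enable bits are set or cleared in FPCR only, and the previously enabled set is derived from FPCR shifted right by 8, not from FPSR. The functions work on a plain integer so they can be checked anywhere; FpcrUpdate, enable_traps and disable_traps are hypothetical names, and wiring them to fenv.__fpcr via fegetenv/fesetenv as in the code above is left out.

```cpp
#include <cstdint>

constexpr std::uint64_t kExceptShift = 8;     // trap bits sit 8 above the flag bits
constexpr std::uint64_t kAllExcepts  = 0x1f;  // invalid|divbyzero|overflow|underflow|inexact

struct FpcrUpdate {
    std::uint64_t new_fpcr;       // value to write back to FPCR
    unsigned previously_enabled;  // traps that were enabled before the update
};

// Set the trap-enable bits for `excepts`; all other FPCR bits are preserved.
constexpr FpcrUpdate enable_traps(std::uint64_t fpcr, unsigned excepts) {
    return { fpcr | ((excepts & kAllExcepts) << kExceptShift),
             static_cast<unsigned>((fpcr >> kExceptShift) & kAllExcepts) };
}

// Clear the trap-enable bits for `excepts`; all other FPCR bits are preserved.
constexpr FpcrUpdate disable_traps(std::uint64_t fpcr, unsigned excepts) {
    return { fpcr & ~((excepts & kAllExcepts) << kExceptShift),
             static_cast<unsigned>((fpcr >> kExceptShift) & kAllExcepts) };
}
```

With this layout, enabling invalid and divide-by-zero on a zeroed register gives 0x300, which matches __fpcr_trap_invalid | __fpcr_trap_divbyzero from the enum posted earlier in the thread.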
