Add VSX support #195

Merged
shibatch merged 33 commits into master from Add_VSX_Support on May 31, 2018

3 participants
@shibatch
Owner

shibatch commented May 27, 2018

This patch adds support for POWER VSX.
clang-5.0 and later are supported at this time.
gcc-8 should be able to compile the VSX code, but I haven't checked.

@avmgithub Please check if this works on your environment.

shibatch added some commits May 27, 2018

@shibatch shibatch requested a review from fpetrogalli-arm May 27, 2018

@avmgithub

avmgithub commented May 27, 2018

@shibatch OK, will do. Do you know which unit test in PyTorch exercises SLEEF?

@shibatch

Owner

shibatch commented May 27, 2018

@avmgithub I don't fully understand what you mean, but please check whether ctest in sleef passes.
I only checked with binfmt_misc + qemu, so I'm not confident that this works in other environments.
For PyTorch, I haven't done any test. Do you know what the standard set of tests is in a case like this?

shibatch added some commits May 27, 2018

@avmgithub

avmgithub commented May 28, 2018

@shibatch I cloned the repository and checked out commit ca35e1a. I hope that is the correct one. Here are my steps:
cd sleef
git checkout ca35e1a
mkdir build
cd build
cmake ..
make VERBOSE=1
This step failed with a message saying I need to add -std=c99. I had to add the flag in two places in Configure.cmake:
line 193: set(FLAGS_FASTMATH "-std=c99 -ffast-math")
line 236: set(SLEEF_C_FLAGS "-std=c99 " "${FLAGS_WALL} ${FLAGS_STRICTMATH} ${FLAGS_NO_ERRNO}")

Here is the error:
/home/freddie/projects/sleef/src/libm/mkmasked_gnuabi.c:58:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
for(int i=0;funcList[i].name != NULL;i++) {
^
/home/freddie/projects/sleef/src/libm/mkmasked_gnuabi.c:58:3: note: use option -std=c99 or -std=gnu99 to compile your code
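
The diagnostic above arises because gcc 4.8 defaults to a pre-C99 dialect, in which a variable cannot be declared inside the for statement. Besides adding -std=c99, the other usual remedy is to hoist the declaration out of the loop. The following is a minimal sketch of that alternative, not the change made in this pull request; the func_entry type, the func_list table, and count_entries are hypothetical stand-ins for the real funcList table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the NULL-terminated function table. */
typedef struct {
  const char *name;
} func_entry;

static const func_entry func_list[] = { {"sin"}, {"cos"}, {NULL} };

/* Counts entries up to the NULL terminator.
   The loop variable is declared before the loop, so this also
   compiles in C89/gnu89 mode (the default for gcc 4.8). */
int count_entries(const func_entry *list) {
  int i;
  for (i = 0; list[i].name != NULL; i++) {
    /* nothing to do, only counting */
  }
  return i;
}
```

Either approach removes the error; passing -std=c99 has the advantage of not touching the source.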

After I fixed the error above, everything built fine. I then ran "make test".
8 of the last 9 tests failed (see below):
......
18/26 Test #18: roundtriptest2ddp_4_4 ............***Failed 10.44 sec
Start 19: roundtriptest2ddp_8_8
19/26 Test #19: roundtriptest2ddp_8_8 ............***Failed 0.53 sec
Start 20: roundtriptest2ddp_10_10
20/26 Test #20: roundtriptest2ddp_10_10 ..........***Failed 0.21 sec
Start 21: roundtriptest2ddp_5_15
21/26 Test #21: roundtriptest2ddp_5_15 ...........***Failed 0.30 sec
Start 22: roundtriptest2dsp_2_2
22/26 Test #22: roundtriptest2dsp_2_2 ............ Passed 0.16 sec
Start 23: roundtriptest2dsp_4_4
23/26 Test #23: roundtriptest2dsp_4_4 ............***Failed 8.49 sec
Start 24: roundtriptest2dsp_8_8
24/26 Test #24: roundtriptest2dsp_8_8 ............***Failed 0.50 sec
Start 25: roundtriptest2dsp_10_10
25/26 Test #25: roundtriptest2dsp_10_10 ..........***Failed 0.20 sec
Start 26: roundtriptest2dsp_5_15
26/26 Test #26: roundtriptest2dsp_5_15 ...........***Failed 0.30 sec

69% tests passed, 8 tests failed out of 26

Total Test time (real) = 29.56 sec

The following tests FAILED:
18 - roundtriptest2ddp_4_4 (Failed)
19 - roundtriptest2ddp_8_8 (Failed)
20 - roundtriptest2ddp_10_10 (Failed)
21 - roundtriptest2ddp_5_15 (Failed)
23 - roundtriptest2dsp_4_4 (Failed)
24 - roundtriptest2dsp_8_8 (Failed)
25 - roundtriptest2dsp_10_10 (Failed)
26 - roundtriptest2dsp_5_15 (Failed)
Errors while running CTest
make: *** [test] Error 8

@shibatch

Owner

shibatch commented May 28, 2018

Can I see the output from cmake?

@avmgithub

avmgithub commented May 28, 2018

Sure, here it is:
$ cmake ..
-- The C compiler identification is GNU 4.8.5
-- Check for working C compiler: /bin/cc
-- Check for working C compiler: /bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Setting build type to 'Release' (required for full support).
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of long double
-- Check size of long double - done
-- Performing Test COMPILER_SUPPORTS_LONG_DOUBLE
-- Performing Test COMPILER_SUPPORTS_LONG_DOUBLE - Success
-- Performing Test COMPILER_SUPPORTS_FLOAT128
-- Performing Test COMPILER_SUPPORTS_FLOAT128 - Failed
-- Performing Test COMPILER_SUPPORTS_SSE2
-- Performing Test COMPILER_SUPPORTS_SSE2 - Failed
-- Performing Test COMPILER_SUPPORTS_SSE4
-- Performing Test COMPILER_SUPPORTS_SSE4 - Failed
-- Performing Test COMPILER_SUPPORTS_AVX
-- Performing Test COMPILER_SUPPORTS_AVX - Failed
-- Performing Test COMPILER_SUPPORTS_FMA4
-- Performing Test COMPILER_SUPPORTS_FMA4 - Failed
-- Performing Test COMPILER_SUPPORTS_AVX2
-- Performing Test COMPILER_SUPPORTS_AVX2 - Failed
-- Performing Test COMPILER_SUPPORTS_SVE
-- Performing Test COMPILER_SUPPORTS_SVE - Failed
-- Performing Test COMPILER_SUPPORTS_AVX512F
-- Performing Test COMPILER_SUPPORTS_AVX512F - Failed
-- Performing Test COMPILER_SUPPORTS_VSX
-- Performing Test COMPILER_SUPPORTS_VSX - Failed
-- Try OpenMP C flag = [-fopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Success
-- Found OpenMP: -fopenmp
-- Performing Test COMPILER_SUPPORTS_OPENMP
-- Performing Test COMPILER_SUPPORTS_OPENMP - Success
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES - Success
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH - Success
-- Unroll target for DP : unroll_0_vecextdp.c;unroll_2_vecextdp.c
-- Unroll target for SP : unroll_0_vecextsp.c;unroll_2_vecextsp.c
-- Configuring build for SLEEF-v3.2
Target system: Linux-3.10.0-693.11.6.el7.ppc64le
Target processor: ppc64le
Host system: Linux-3.10.0-693.11.6.el7.ppc64le
Host processor: ppc64le
Detected C compiler: GNU @ /bin/cc
-- Using option -Wall -Wno-unused -Wno-attributes -Wno-unused-result -Wno-psabi -ffp-contract=off -fno-math-errno -fno-trapping-math to compile libsleef
-- Building shared libs : ON
-- MPFR : LIB_MPFR-NOTFOUND
-- GMP : LIBGMP-NOTFOUND
-- RUNNING_ON_TRAVIS : 0
-- COMPILER_SUPPORTS_OPENMP : 1
-- Configuring done
-- Generating done
-- Build files have been written to: /home/freddie/projects/sleef/build

@shibatch

Owner

shibatch commented May 28, 2018

Your compiler is too old. Try using clang-5 or later, or gcc-8.
VSX intrinsics are fully supported only by recent compilers.
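
As a rough illustration of the compiler requirement, the sketch below reports at compile time whether VSX is enabled and whether the compiler meets the versions named in this thread (clang 5, gcc 8). It relies only on predefined macros (__VSX__, __clang_major__, __GNUC__); the function name vsx_build_status is hypothetical, and this is not how the SLEEF build actually performs its COMPILER_SUPPORTS_VSX check.

```c
#include <assert.h>
#include <string.h>

/* Describes the VSX situation of the current build.
   clang is tested first because it also defines __GNUC__. */
const char *vsx_build_status(void) {
#if !defined(__VSX__)
  return "VSX not enabled for this build";
#elif defined(__clang__)
  return (__clang_major__ >= 5) ? "VSX enabled, clang >= 5"
                                : "VSX enabled, clang older than 5";
#elif defined(__GNUC__)
  return (__GNUC__ >= 8) ? "VSX enabled, gcc >= 8"
                         : "VSX enabled, gcc older than 8";
#else
  return "VSX enabled, unrecognized compiler";
#endif
}
```

On a non-POWER host, or without -mvsx, the first branch is taken.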

@avmgithub

avmgithub commented May 28, 2018

I'll see what I can do. Unfortunately, most ppc64le users use the OS distro gcc. Right now we're supporting RH 7.4, which comes with gcc 4.8. Even if I switch to the latest Advanced Toolkit for ppc64le, it only goes up to gcc 7.3.1, and that becomes a problem with PyTorch, as PyTorch only seems to support gcc 6.

@shibatch

Owner

shibatch commented May 28, 2018

I see, but I think it's basically impossible to support VSX with gcc 4.8.
We need to wait until newer compilers are supported.

shibatch added some commits May 28, 2018

@@ -192,7 +208,7 @@ if(CMAKE_C_COMPILER_ID MATCHES "(GNU|Clang)")
# src/arch/helpervecext.h:88
string(CONCAT FLAGS_WALL ${FLAGS_WALL} " -Wno-psabi")
set(FLAGS_ENABLE_NEON32 "-mfpu=neon")
endif(CMAKE_C_COMPILER_ID MATCHES "GNU")
endif()

@fpetrogalli-arm

fpetrogalli-arm May 29, 2018

Collaborator

It is good practice to repeat the if condition in the endif statement. We don't have a coding standard, but I think we should keep it. Please revert this change.

#include <time.h>
#if defined(UNDEF_USE_EXTERN_INLINES)

@fpetrogalli-arm

fpetrogalli-arm May 29, 2018

Collaborator

This is a workaround specific to this patch. Could you please add a comment as you did in the CMakeLists.txt file?

#include <time.h>
#include <unistd.h>
#include <sys/time.h>
#if defined(UNDEF_USE_EXTERN_INLINES)

@fpetrogalli-arm

fpetrogalli-arm May 29, 2018

Collaborator

This is a workaround specific to this patch. Could you please add a comment as you did in the CMakeLists.txt file?

@@ -7,9 +7,17 @@
#include <stdlib.h>
#include <stdint.h>
#include <assert.h>
#include <time.h>
#if defined(UNDEF_USE_EXTERN_INLINES)

@fpetrogalli-arm

fpetrogalli-arm May 29, 2018

Collaborator

This is a workaround specific to this patch. Could you please add a comment as you did in the CMakeLists.txt file?

@@ -268,7 +277,7 @@ EXPORT CONST vdouble xsin(vdouble d) {
d = vmla_vd_vd_vd_vd(dqh, vcast_vd_d(-PI_C), d);
d = vmla_vd_vd_vd_vd(dql, vcast_vd_d(-PI_C), d);
d = vmla_vd_vd_vd_vd(vadd_vd_vd_vd(dqh, dql), vcast_vd_d(-PI_D), d);

@fpetrogalli-arm

fpetrogalli-arm May 29, 2018

Collaborator

Please don't change this line; we don't have a coding standard that requires removing whitespace-only lines.

@@ -289,7 +298,7 @@ EXPORT CONST vdouble xsin(vdouble d) {
vor_vo_vo_vo(visnegzero_vo_vd(r),
vgt_vo_vd_vd(vabs_vd_vd(r), vcast_vd_d(TRIGRANGEMAX)))),
vcast_vd_d(-0.0), u);

@fpetrogalli-arm

fpetrogalli-arm May 29, 2018

Collaborator

Please don't change this line; we don't have a coding standard that requires removing whitespace-only lines.

char *vfloatname = argv[4];
char *vintname = argv[5];
char *vint2name = argv[6];
char *vdoublename = argv[3], *vdoublename_escspace = escapeSpace(vdoublename);

@fpetrogalli-arm

fpetrogalli-arm May 29, 2018

Collaborator

Can you do it the other way around, so that you don't have to change the variable used in the printf calls? We have a patch downstream that adds the AArch64 Vector PCS attribute [1] to the renaming functions (of course, we will upstream it); your changes will be easier to merge if you leave the original name in the printf calls.

[1] https://developer.arm.com/products/software-development-tools/hpc/arm-compiler-for-hpc/vector-function-abi

@shibatch

shibatch May 30, 2018

Owner

Not all the arguments of printf calls are space-escaped.
I can't think of a good way of fulfilling your request.

@fpetrogalli-arm

fpetrogalli-arm May 30, 2018

Collaborator

Let me put it this way: how is it possible that argv[3] has spaces in it? Wouldn't a space split the argument into two arguments?
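
For context, a single argv entry can contain a space when the caller quotes it, for example by invoking a program as prog "vector double" from a shell. Whether the SLEEF build passes its type names exactly this way is an assumption here. The sketch below counts such arguments; count_args_with_space is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <string.h>

/* Counts how many arguments (excluding argv[0]) contain a space.
   A quoted argument such as "vector double" reaches the program
   as one entry, so it is counted once. */
int count_args_with_space(int argc, char **argv) {
  int i, n = 0;
  for (i = 1; i < argc; i++) {
    if (strchr(argv[i], ' ') != NULL) {
      n++;
    }
  }
  return n;
}
```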

@fpetrogalli-arm

fpetrogalli-arm May 30, 2018

Collaborator

Got it, thank you. Please add your useful answer as a comment to the definition of the function escapeSpace.

@fpetrogalli-arm

fpetrogalli-arm May 30, 2018

Collaborator

Hi @shibatch, sorry for bothering you again with this. I had a second (or maybe third!!) thought, and if my understanding is correct, the following variables don't have spaces in their names, except when targeting VSX:

vdoublename
vfloatname
vintname
vint2name

Given that the function escapeSpace does not modify the values of those variables when not targeting VSX, I'd rather have you redefine them as follows:

char *vdoublename = escapeSpace(argv[3]);
char *vfloatname = escapeSpace(argv[4]);
char *vintname = escapeSpace(argv[5]);
char *vint2name = escapeSpace(argv[6]);

This should keep the functionality of the current patch, without the need to modify the variable names in the printf calls.

# When cross compiling for ppc64, this bug-workaround is needed
if(CMAKE_CROSSCOMPILING AND CMAKE_SYSTEM_PROCESSOR MATCHES "^(powerpc|ppc)64")
set(COMMON_TARGET_DEFINITIONS ${COMMON_TARGET_DEFINITIONS} UNDEF_USE_EXTERN_INLINES=1)

@fpetrogalli-arm

fpetrogalli-arm May 29, 2018

Collaborator

I have commented all the places where you use this define, asking to add a comment saying that the use of UNDEF_USE_EXTERN_INLINES is platform specific.

I was thinking, instead of adding the comments, how about making the variable name specific to PPC64, so there is no doubt that it is target-specific? May I ask you to change it to POWER64_UNDEF_USE_EXTERN_INLINES? With this name change, I am happy for you to skip adding the comments everywhere.
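
A guard using the proposed name might look like the sketch below. Only the macro name and its placement after the system includes come from this thread; the body of the guard (undefining glibc's internal __USE_EXTERN_INLINES) and the POWER64_WORKAROUND_ACTIVE helper macro are assumptions made for illustration.

```c
#include <assert.h>
#include <time.h>

/* The build system is assumed to define
   POWER64_UNDEF_USE_EXTERN_INLINES=1 only when cross compiling
   for ppc64, as in the CMake snippet above. */
#if defined(POWER64_UNDEF_USE_EXTERN_INLINES)
/* Assumed workaround body: turn off glibc's extern-inline switch
   before any later header is processed. */
#ifdef __USE_EXTERN_INLINES
#undef __USE_EXTERN_INLINES
#endif
#define POWER64_WORKAROUND_ACTIVE 1 /* hypothetical helper macro */
#else
#define POWER64_WORKAROUND_ACTIVE 0
#endif

/* Lets callers and tests see whether the guard was taken. */
int power64_workaround_active(void) {
  return POWER64_WORKAROUND_ACTIVE;
}
```

With the target named in the macro itself, a reader of any file that contains the guard can tell it is platform-specific without a separate comment.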

@@ -15,6 +15,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
- 3.5-ULP versions of sinh, cosh, tanh, sinhf, coshf, tanhf, and the
corresponding testing functionalities are added.
https://github.com/shibatch/sleef/pull/192
- Power VSX target support is added to libm.

@fpetrogalli-arm

fpetrogalli-arm May 29, 2018

Collaborator

Love this! :)

UPDATE: I just realized the changelog should say libsleef, not libm.

shibatch added some commits May 30, 2018

shibatch added some commits May 30, 2018

@@ -10,6 +10,13 @@
#include "funcproto.h"
char *escapeSpace(char *str) {
char *ret = malloc(strlen(str) + 10);

@fpetrogalli-arm

fpetrogalli-arm May 30, 2018

Collaborator

This memory allocation is never released. You should modify the content of str in place, and definitely not return ret.

@shibatch

shibatch May 30, 2018

Owner

I think this is harmless, but I added free for these variables.

@shibatch

Owner

shibatch commented May 30, 2018

The contents of vdoublename and other variables are used for its own type name and part of typedef name. So, we need both the original one and the replaced one. They cannot be unified.
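
To illustrate why both forms are needed, here is a minimal sketch. It assumes that escapeSpace replaces each space with an underscore and returns a newly allocated string; the thread only shows its first line, so that behavior, the escape_space and format_typedef names, and the Sleef_..._2 typedef pattern are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for escapeSpace: returns a newly allocated
   copy of str with every ' ' replaced by '_'. The caller frees it. */
char *escape_space(const char *str) {
  size_t len, i;
  char *ret;
  assert(str != NULL);
  len = strlen(str);
  ret = malloc(len + 1);
  if (ret == NULL) return NULL;
  for (i = 0; i < len; i++) {
    ret[i] = (str[i] == ' ') ? '_' : str[i];
  }
  ret[len] = '\0';
  return ret;
}

/* Writes a typedef line that uses the original name as the type and
   the escaped name inside the new identifier.
   Returns 0 on success, -1 on allocation failure or truncation. */
int format_typedef(char *out, size_t outsize, const char *type_name) {
  char *escaped = escape_space(type_name);
  int n;
  if (escaped == NULL) return -1;
  n = snprintf(out, outsize, "typedef %s Sleef_%s_2;", type_name, escaped);
  free(escaped); /* release the copy, as discussed in the review */
  return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

The original string must stay intact because it is still a valid type name, while an identifier cannot contain a space; escaping in place would lose the former.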

@fpetrogalli-arm

Collaborator

fpetrogalli-arm commented May 30, 2018

The contents of vdoublename and other variables are used for its own type name and part of typedef name. So, we need both the original one and the replaced one. They cannot be unified.

@shibatch Sigh - apologies, now I have seen it. Sorry for keeping you on this.

@fpetrogalli-arm

Collaborator

fpetrogalli-arm commented May 30, 2018

Thank you for adding this new target @shibatch! This shows all the goodness of SLEEF!

@shibatch shibatch merged commit 41b62dd into master May 31, 2018

4 checks passed

continuous-integration/appveyor/branch AppVeyor build succeeded
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed

@shibatch shibatch deleted the Add_VSX_Support branch May 31, 2018
