Add VSX support #195

Merged
merged 33 commits into from
May 31, 2018

shibatch
Owner

@shibatch shibatch commented May 27, 2018

This patch adds support for POWER VSX.
clang-5.0 and later is supported at this time.
gcc-8 should be able to compile the VSX code, but I haven't checked.

@avmgithub Please check if this works on your environment.

@shibatch shibatch requested a review from fpetrogalli May 27, 2018 15:17
@avmgithub

@shibatch OK, will do. Do you know which unit test in PyTorch exercises SLEEF?

@shibatch
Owner Author

@avmgithub I don't fully understand what you mean, but please check whether ctest in sleef passes.
I only checked with binfmt_misc + qemu, so I'm not confident that this works in other environments.
For PyTorch, I haven't done any testing. Do you know what the standard set of tests is in a case like this?

@avmgithub

@shibatch I cloned the repository and checked out commit ca35e1a. I hope that is the correct one. Here were my steps:
cd sleef
git checkout ca35e1a
mkdir build
cd build
cmake ..
make VERBOSE=1    <<<< this failed, saying I need to add -std=c99. I had to add it in two places in Configure.cmake:
line 193: set(FLAGS_FASTMATH "-std=c99 -ffast-math")
line 236: set(SLEEF_C_FLAGS "-std=c99 " "${FLAGS_WALL} ${FLAGS_STRICTMATH} ${FLAGS_NO_ERRNO}")

Here is the error:
/home/freddie/projects/sleef/src/libm/mkmasked_gnuabi.c:58:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
for(int i=0;funcList[i].name != NULL;i++) {
^
/home/freddie/projects/sleef/src/libm/mkmasked_gnuabi.c:58:3: note: use option -std=c99 or -std=gnu99 to compile your code
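
For context: gcc 4.8 defaults to gnu90, which rejects a declaration inside the `for` initializer. As an alternative to adding -std=c99, the loop can be written so that it compiles in either mode. The following is only a minimal sketch; the `func_entry` type, the table contents, and `count_functions` are hypothetical stand-ins, not the actual definitions in mkmasked_gnuabi.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the table iterated in mkmasked_gnuabi.c. */
typedef struct {
  const char *name;
} func_entry;

/* The list is terminated by an entry whose name is NULL. */
static const func_entry funcList[] = { {"sin"}, {"cos"}, {"exp"}, {NULL} };

/* Counts entries up to the NULL sentinel.
 * The index is declared before the loop, so this also compiles
 * under -std=gnu90 / -std=c89, not only under C99 and later. */
static int count_functions(const func_entry *list) {
  int i;
  assert(list != NULL);
  for (i = 0; list[i].name != NULL; i++) {
    /* nothing to do per entry in this sketch */
  }
  return i;
}
```

Calling `count_functions(funcList)` walks the table until it reaches the sentinel entry.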

Once I corrected the error above, everything built fine. I then ran "make test".
It reported failures in 8 of the last 9 tests (see below):
......
18/26 Test #18: roundtriptest2ddp_4_4 ............***Failed 10.44 sec
Start 19: roundtriptest2ddp_8_8
19/26 Test #19: roundtriptest2ddp_8_8 ............***Failed 0.53 sec
Start 20: roundtriptest2ddp_10_10
20/26 Test #20: roundtriptest2ddp_10_10 ..........***Failed 0.21 sec
Start 21: roundtriptest2ddp_5_15
21/26 Test #21: roundtriptest2ddp_5_15 ...........***Failed 0.30 sec
Start 22: roundtriptest2dsp_2_2
22/26 Test #22: roundtriptest2dsp_2_2 ............ Passed 0.16 sec
Start 23: roundtriptest2dsp_4_4
23/26 Test #23: roundtriptest2dsp_4_4 ............***Failed 8.49 sec
Start 24: roundtriptest2dsp_8_8
24/26 Test #24: roundtriptest2dsp_8_8 ............***Failed 0.50 sec
Start 25: roundtriptest2dsp_10_10
25/26 Test #25: roundtriptest2dsp_10_10 ..........***Failed 0.20 sec
Start 26: roundtriptest2dsp_5_15
26/26 Test #26: roundtriptest2dsp_5_15 ...........***Failed 0.30 sec

69% tests passed, 8 tests failed out of 26

Total Test time (real) = 29.56 sec

The following tests FAILED:
18 - roundtriptest2ddp_4_4 (Failed)
19 - roundtriptest2ddp_8_8 (Failed)
20 - roundtriptest2ddp_10_10 (Failed)
21 - roundtriptest2ddp_5_15 (Failed)
23 - roundtriptest2dsp_4_4 (Failed)
24 - roundtriptest2dsp_8_8 (Failed)
25 - roundtriptest2dsp_10_10 (Failed)
26 - roundtriptest2dsp_5_15 (Failed)
Errors while running CTest
make: *** [test] Error 8

@shibatch
Owner Author

Can I see the output from cmake?

@avmgithub

Sure, here it is:
$ cmake ..
-- The C compiler identification is GNU 4.8.5
-- Check for working C compiler: /bin/cc
-- Check for working C compiler: /bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Setting build type to 'Release' (required for full support).
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of long double
-- Check size of long double - done
-- Performing Test COMPILER_SUPPORTS_LONG_DOUBLE
-- Performing Test COMPILER_SUPPORTS_LONG_DOUBLE - Success
-- Performing Test COMPILER_SUPPORTS_FLOAT128
-- Performing Test COMPILER_SUPPORTS_FLOAT128 - Failed
-- Performing Test COMPILER_SUPPORTS_SSE2
-- Performing Test COMPILER_SUPPORTS_SSE2 - Failed
-- Performing Test COMPILER_SUPPORTS_SSE4
-- Performing Test COMPILER_SUPPORTS_SSE4 - Failed
-- Performing Test COMPILER_SUPPORTS_AVX
-- Performing Test COMPILER_SUPPORTS_AVX - Failed
-- Performing Test COMPILER_SUPPORTS_FMA4
-- Performing Test COMPILER_SUPPORTS_FMA4 - Failed
-- Performing Test COMPILER_SUPPORTS_AVX2
-- Performing Test COMPILER_SUPPORTS_AVX2 - Failed
-- Performing Test COMPILER_SUPPORTS_SVE
-- Performing Test COMPILER_SUPPORTS_SVE - Failed
-- Performing Test COMPILER_SUPPORTS_AVX512F
-- Performing Test COMPILER_SUPPORTS_AVX512F - Failed
-- Performing Test COMPILER_SUPPORTS_VSX
-- Performing Test COMPILER_SUPPORTS_VSX - Failed
-- Try OpenMP C flag = [-fopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Success
-- Found OpenMP: -fopenmp
-- Performing Test COMPILER_SUPPORTS_OPENMP
-- Performing Test COMPILER_SUPPORTS_OPENMP - Success
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES
-- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES - Success
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH
-- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH - Success
-- Unroll target for DP : unroll_0_vecextdp.c;unroll_2_vecextdp.c
-- Unroll target for SP : unroll_0_vecextsp.c;unroll_2_vecextsp.c
-- Configuring build for SLEEF-v3.2
Target system: Linux-3.10.0-693.11.6.el7.ppc64le
Target processor: ppc64le
Host system: Linux-3.10.0-693.11.6.el7.ppc64le
Host processor: ppc64le
Detected C compiler: GNU @ /bin/cc
-- Using option -Wall -Wno-unused -Wno-attributes -Wno-unused-result -Wno-psabi -ffp-contract=off -fno-math-errno -fno-trapping-math to compile libsleef
-- Building shared libs : ON
-- MPFR : LIB_MPFR-NOTFOUND
-- GMP : LIBGMP-NOTFOUND
-- RUNNING_ON_TRAVIS : 0
-- COMPILER_SUPPORTS_OPENMP : 1
-- Configuring done
-- Generating done
-- Build files have been written to: /home/freddie/projects/sleef/build

@shibatch
Owner Author

Your compiler is too old. Try using clang-5 or later, or gcc-8.
VSX intrinsics are fully supported only by recent compilers.

@avmgithub

I'll see what I can do. Unfortunately, most ppc64le users use the gcc that ships with their OS distro. Right now we're supporting RH 7.4, which comes with gcc 4.8. Even if I switch to the latest Advanced Toolkit for ppc64le, it only goes up to gcc 7.3.1, and that becomes a problem with PyTorch, as PyTorch only seems to support gcc 6.

@shibatch
Owner Author

I see, but I think it's basically impossible to support VSX with gcc 4.8.
We need to wait until newer compilers are supported.
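
The compiler requirement discussed above can be approximated with a compile-time check. This is a sketch under stated assumptions, not the check SLEEF's CMake configuration actually performs: it relies on the `__VSX__` macro that GCC and Clang predefine when VSX code generation is enabled, and it uses the version thresholds mentioned in this thread (clang 5, gcc 8; the gcc threshold was not verified by the author). The function name `vsx_status` is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Reports, at compile time, whether this translation unit was built
 * with VSX enabled and with a compiler version named in the thread.
 * The returned strings are arbitrary labels chosen for this sketch. */
static const char *vsx_status(void) {
#if !defined(__VSX__)
  /* Not a POWER target, or VSX code generation is not enabled. */
  return "no-vsx";
#elif defined(__clang__) && __clang_major__ >= 5
  return "vsx-clang";
#elif !defined(__clang__) && defined(__GNUC__) && __GNUC__ >= 8
  return "vsx-gcc";
#else
  /* VSX is enabled, but the compiler is older than the versions above. */
  return "vsx-old-compiler";
#endif
}
```

On a non-POWER host this returns "no-vsx"; on ppc64le the result depends on the compiler and flags in use.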

Configure.cmake Outdated
@@ -192,7 +208,7 @@ if(CMAKE_C_COMPILER_ID MATCHES "(GNU|Clang)")
# src/arch/helpervecext.h:88
string(CONCAT FLAGS_WALL ${FLAGS_WALL} " -Wno-psabi")
set(FLAGS_ENABLE_NEON32 "-mfpu=neon")
endif(CMAKE_C_COMPILER_ID MATCHES "GNU")
endif()
Collaborator

It is good practice to repeat the if condition in the endif statement. We don't have a coding standard, but I think we should keep it. Please revert this change.

#include <time.h>

#if defined(UNDEF_USE_EXTERN_INLINES)
Collaborator

This is a workaround specific to this patch. Could you please add a comment as you did in the CMakeLists.txt file?

#include <time.h>

#if defined(UNDEF_USE_EXTERN_INLINES)
Collaborator

This is a workaround specific to this patch. Could you please add a comment as you did in the CMakeLists.txt file?

#include <time.h>
#include <unistd.h>
#include <sys/time.h>

#if defined(UNDEF_USE_EXTERN_INLINES)
Collaborator

This is a workaround specific to this patch. Could you please add a comment as you did in the CMakeLists.txt file?

@@ -7,9 +7,17 @@
#include <stdlib.h>
#include <stdint.h>
#include <assert.h>
#include <time.h>

#if defined(UNDEF_USE_EXTERN_INLINES)
Collaborator

This is a workaround specific to this patch. Could you please add a comment as you did in the CMakeLists.txt file?

@@ -268,7 +277,7 @@ EXPORT CONST vdouble xsin(vdouble d) {
d = vmla_vd_vd_vd_vd(dqh, vcast_vd_d(-PI_C), d);
d = vmla_vd_vd_vd_vd(dql, vcast_vd_d(-PI_C), d);
d = vmla_vd_vd_vd_vd(vadd_vd_vd_vd(dqh, dql), vcast_vd_d(-PI_D), d);

Collaborator

Please don't change this line; we don't have a coding standard that requires removing whitespace-only lines.

@@ -289,7 +298,7 @@ EXPORT CONST vdouble xsin(vdouble d) {
vor_vo_vo_vo(visnegzero_vo_vd(r),
vgt_vo_vd_vd(vabs_vd_vd(r), vcast_vd_d(TRIGRANGEMAX)))),
vcast_vd_d(-0.0), u);

Collaborator

Please don't change this line; we don't have a coding standard that requires removing whitespace-only lines.

char *vfloatname = argv[4];
char *vintname = argv[5];
char *vint2name = argv[6];
char *vdoublename = argv[3], *vdoublename_escspace = escapeSpace(vdoublename);
Collaborator

Can you do it the other way around, so that you don't have to change the variable used in the printf calls? We have a patch downstream that adds the AArch64 Vector PCS attribute [1] to the renaming functions (of course, we will upstream it); your changes will be easier to merge if you leave the original name in the printf calls.

[1] https://developer.arm.com/products/software-development-tools/hpc/arm-compiler-for-hpc/vector-function-abi

Owner Author

Not all the arguments of the printf calls are space-escaped.
I can't think of a good way of fulfilling your request.

Collaborator

Let me put it this way: how is it possible that argv[3] has spaces in it? Wouldn't a space break the argument into two arguments?

Owner Author

Collaborator

Got it, thank you. Please add your useful answer as a comment to the definition of the function escapeSpace.

Collaborator

Hi @shibatch, sorry for bothering you again with this. I had a second (or maybe third!!) thought, and if my understanding is correct, the following variables don't have spaces in their names unless we are targeting VSX:

vdoublename
vfloatname
vintname
vint2name

Given that the function escapeSpace does not modify the values of those variables when not targeting VSX, I'd rather have you redefine them as follows:

char *vdoublename = escapeSpace(argv[3]);
char *vfloatname = escapeSpace(argv[4]);
char *vintname = escapeSpace(argv[5]);
char *vint2name = escapeSpace(argv[6]);

This should keep the functionality of the current patch, without the need to modify the variable names in the printf calls.

Configure.cmake Outdated

# When cross compiling for ppc64, this bug-workaround is needed
if(CMAKE_CROSSCOMPILING AND CMAKE_SYSTEM_PROCESSOR MATCHES "^(powerpc|ppc)64")
set(COMMON_TARGET_DEFINITIONS ${COMMON_TARGET_DEFINITIONS} UNDEF_USE_EXTERN_INLINES=1)
Collaborator

I have commented on all the places where you use this define, asking you to add a comment saying that the use of UNDEF_USE_EXTERN_INLINES is platform specific.

I was thinking: instead of adding the comments, how about making the macro name specific to PPC64, so there is no doubt that it is target specific? May I ask you to change it to POWER64_UNDEF_USE_EXTERN_INLINES? With this name change, I am happy for you to skip adding the comments everywhere.
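
To illustrate how a target-specific guard like this might look in a source file, here is a hedged sketch. It assumes that the workaround undefines glibc's `__USE_EXTERN_INLINES` feature macro before later headers are included; the thread does not show the body of the guard, so that part is an assumption. The macro is defined at the top only to make the sketch self-contained (in the patch it is passed by the build system), and `extern_inlines_enabled` is a hypothetical helper.

```c
/* Normally supplied by the build system (e.g. as a compile definition)
 * only when cross compiling for ppc64; defined here for the sketch. */
#define POWER64_UNDEF_USE_EXTERN_INLINES 1

#include <assert.h>
#include <stdlib.h> /* on glibc this typically pulls in <features.h> */

#if defined(POWER64_UNDEF_USE_EXTERN_INLINES)
/* Assumed workaround body: stop later headers from providing
 * extern inline definitions of library functions. */
#ifdef __USE_EXTERN_INLINES
#undef __USE_EXTERN_INLINES
#endif
#endif

#include <math.h> /* included after the guard, as in the patched files */

/* Returns 1 if __USE_EXTERN_INLINES is still defined at this point. */
static int extern_inlines_enabled(void) {
#ifdef __USE_EXTERN_INLINES
  return 1;
#else
  return 0;
#endif
}
```

Because the name carries the POWER64 prefix, a reader of any file containing the guard can tell it is target specific without a separate comment.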

CHANGELOG.md Outdated
@@ -15,6 +15,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
- 3.5-ULP versions of sinh, cosh, tanh, sinhf, coshf, tanhf, and the
corresponding testing functionalities are added.
https://github.com/shibatch/sleef/pull/192
- Power VSX target support is added to libm.
Collaborator

@fpetrogalli fpetrogalli May 29, 2018

Love this! :)

UPDATE: I just realized the changelog should say libsleef, not libm.

@@ -10,6 +10,13 @@

#include "funcproto.h"

char *escapeSpace(char *str) {
char *ret = malloc(strlen(str) + 10);
Collaborator

This memory allocation is never released. You should modify the content of str in place, and definitely not return ret.

Owner Author

I think this is harmless, but I added free for these variables.
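
The ownership pattern discussed here (return a fresh copy, let the caller free it) can be sketched as follows. This is not the patch's code: the thread only shows the first line of escapeSpace, so the replacement of spaces with underscores is an assumption, as are the name `escape_space_copy` and the sample type name "vector double".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of str in which every space is
 * replaced by '_' (assumed behavior). The input is left unmodified.
 * The caller owns the result and must release it with free().
 * Returns NULL if the allocation fails. */
static char *escape_space_copy(const char *str) {
  size_t i, n;
  char *ret;

  assert(str != NULL);
  n = strlen(str);
  ret = malloc(n + 1);
  if (ret == NULL) return NULL;

  /* Copy up to and including the terminating '\0'. */
  for (i = 0; i <= n; i++) {
    ret[i] = (str[i] == ' ') ? '_' : str[i];
  }
  return ret;
}
```

Returning a copy rather than editing in place keeps the original string available, at the cost of one free() per call.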

@shibatch
Owner Author

The contents of vdoublename and other variables are used for its own type name and part of typedef name. So, we need both the original one and the replaced one. They cannot be unified.
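
A small sketch of why both strings are needed: the original name is emitted as the type itself, while the escaped form is embedded in an identifier. The output format, the `example_` prefix, the function `format_typedef`, and the sample names are all hypothetical; they are not what the generator in this patch prints.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes a typedef line into out. type_name is used verbatim as the
 * type (it may contain spaces); escaped_name must be a valid identifier
 * fragment and becomes part of the new typedef name.
 * Returns the value of snprintf, i.e. the length that was needed. */
static int format_typedef(char *out, size_t outlen,
                          const char *type_name,
                          const char *escaped_name) {
  assert(out != NULL && type_name != NULL && escaped_name != NULL);
  return snprintf(out, outlen, "typedef %s example_%s_t;",
                  type_name, escaped_name);
}
```

With a type name that contains a space, using the unescaped string in the identifier position would produce an invalid declaration, which is why the two values cannot be merged into one variable.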

@fpetrogalli
Collaborator

> The contents of vdoublename and other variables are used for its own type name and part of typedef name. So, we need both the original one and the replaced one. They cannot be unified.

@shibatch Sigh - apologies, now I have seen it. Sorry for keeping you on this.

@fpetrogalli
Collaborator

Thank you for adding this new target @shibatch! This shows all the goodness of SLEEF!

@shibatch shibatch merged commit 41b62dd into master May 31, 2018
@shibatch shibatch deleted the Add_VSX_Support branch May 31, 2018 00:33