Generate inline headers #283

shibatch · 2020-02-21T04:17:09Z

This patch implements the functionality described in issue #282.

The header files for inlining SLEEF functions are generated if -DBUILD_INLINE_HEADERS=TRUE is specified as a CMake option.

To use one of these headers, the following three macros must be defined.

SLEEF_ALWAYS_INLINE : The function attribute applied to helper functions.
SLEEF_INLINE : The function attribute applied to SLEEF functions.
SLEEF_CONST : Usually the const attribute.

Each header file corresponds to one vector extension. Only one of the header files can be included from the same source file.

This patch also builds libsleefinline.a, which is referenced by the inlined functions.
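As a rough illustration of the setup described above, the sketch below defines the three macros and then declares functions the way a generated header might use them. The macro values are assumptions for GCC or Clang, and my_helper_mul2 and my_scale are hypothetical stand-ins; the real generated header is only referred to in a comment because it exists only after building with -DBUILD_INLINE_HEADERS=TRUE.

```c
/* Assumed GCC/Clang definitions; the patch only says these must be defined. */
#define SLEEF_ALWAYS_INLINE inline __attribute__((always_inline))
#define SLEEF_INLINE static inline
#define SLEEF_CONST __attribute__((const))

/* In real use, exactly one generated header would be included here, e.g.
 *   #include "sleefinline_avx2.h"
 * and the program would be linked against libsleefinline.a. */

/* Hypothetical stand-in for an internal helper function. */
static SLEEF_ALWAYS_INLINE double my_helper_mul2(double x) {
  return x * 2.0;
}

/* Hypothetical stand-in for a SLEEF function exposed by the header. */
SLEEF_CONST SLEEF_INLINE double my_scale(double x) {
  return my_helper_mul2(x) + 1.0;
}
```

Because each header corresponds to one vector extension, a source file written this way would pick a single header and keep code for other extensions in separate source files.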

Aarch32 is not supported at this time.

--

The main changes to the source code can be summarized in the following two points.

src/libm/CMakeLists.txt : This patch adds instructions for building the header files. It applies the C preprocessor and sed commands multiple times to generate the header files.
Addition to helper files : This patch adds lines beginning with //@, which are treated specially during generation of the header files.
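The description does not spell out the exact sed expressions, so the following is only a sketch of the idea, assuming that a later pass simply drops the //@ prefix so that the rest of the line becomes a live directive for the next run of the preprocessor. The function name strip_genheader_marker is hypothetical and is not part of SLEEF.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if `line` starts with the marker "//@", copy the
 * remainder of the line into `out` and return 1. Otherwise copy the line
 * unchanged and return 0. `out_size` is the capacity of `out`, including
 * the terminating NUL. */
int strip_genheader_marker(const char *line, char *out, size_t out_size) {
  static const char marker[] = "//@";
  const size_t marker_len = sizeof(marker) - 1;
  int marked = strncmp(line, marker, marker_len) == 0;
  const char *src = marked ? line + marker_len : line;

  if (out_size == 0) return marked;
  strncpy(out, src, out_size - 1);
  out[out_size - 1] = '\0'; /* strncpy does not terminate on truncation */
  return marked;
}
```

Under this assumption, a line such as //@#define ENABLE_DP survives the first preprocessing pass as a comment (comments are preserved) and turns into #define ENABLE_DP afterwards.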

It seems that parallel builds do not work, even on Linux computers. According to the following page, parallel builds are always unsafe if multiple COMMANDs are used in add_custom_command. Accordingly, I changed the CI settings to disable parallel builds.

https://stackoverflow.com/questions/60105230/how-to-run-cmake-commands-in-add-custom-command-in-order

Currently, Jenkins servers are not available. I hope they will be available again before this patch is approved. I manually ran the builds and tests on every platform.

@colesbury GitHub does not allow me to add you as a reviewer, but please tell us your thoughts.

colesbury · 2020-02-24T16:46:41Z

Thanks, I will test this out today

colesbury · 2020-02-24T17:31:30Z

Thanks, this addresses the request in #230

In my limited testing, the functions are now successfully inlined at call sites. In our sigmoid function, I measured a ~14% improvement for 1 million floats by inlining the call to Sleef_expf8_u10. I haven't done measurements of other functions.

Currently, we include Sleef as a submodule and build it as part of the PyTorch build process. The changes to the build make this more difficult. I think we may switch to building Sleef separately and just committing the artifacts (e.g. sleefinline_avx2.h) to the PyTorch repo. I think that should be fine.

shibatch · 2020-02-24T23:21:18Z

Good.
Please credit SLEEF if the generated source code will be mixed into the source tree of PyTorch.

fpetrogalli · 2020-02-25T21:53:10Z

Jenkinsfile

@@ -15,10 +15,10 @@ pipeline {
 			 mkdir build
 			 cd build
 			 cmake -DCMAKE_INSTALL_PREFIX=../install -DSLEEF_SHOW_CONFIG=1 -DENFORCE_TESTER3=TRUE -DBUILD_QUAD=TRUE ..
-			 make -j 6 all
+			 make -j 1 all


Why are changes like this needed in this PR?

Because add_custom_command does not support parallel builds. In this patch, many COMMANDs are executed within one add_custom_command, and parallel builds fail even on Linux computers.

https://stackoverflow.com/questions/60105230/how-to-run-cmake-commands-in-add-custom-command-in-order

fpetrogalli · 2020-02-25T21:54:01Z

src/arch/helperadvsimd.h


 #define ENABLE_DP
+//@#define ENABLE_DP


What is this syntax for? I have seen it in other places, so I suspect there is a reason for using it?

Please read my first comment. Generating the header files requires several passes of cpp, and some of the macros are needed in the later passes.

Addition to helper files : This patch adds lines beginning with //@ which is specially treated during generation of the header files.

fpetrogalli · 2020-02-25T21:56:51Z

src/arch/helperavx.h

@@ -649,6 +664,7 @@ static INLINE vargquad vcast_aq_vm2(vmask2 vm2) {
  return a;
 #endif
 }
+#endif


Suggested change

-#endif
+#endif // #if !defined(SLEEF_GENHEADER)

fpetrogalli · 2020-02-25T22:03:56Z

src/libm-tester/iutsimd.c


 //

 #ifdef ENABLE_DP
-int check_featureDP() {
-  if (vavailability_i(1) == 0) return 0;
+int check_featureDP(double d) {


Is this needed for the header functionality? Same question for the float version below.

With the headers, functions like this have a greater tendency to be optimized away.
This change is needed to prevent the optimizer from removing this function.
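A scalar sketch of the effect, under the assumption that the problem is constant folding of an inlined call: when the input is a compile-time constant, the compiler may evaluate the inlined function at compile time and never execute the instructions being checked. The names below are hypothetical and do not appear in the tester.

```c
/* Hypothetical stand-in for a function inlined from a generated header. */
static inline double my_inlined_square(double x) {
  return x * x;
}

/* Constant input: after inlining, the compiler is free to reduce this
 * whole function to "return 1" without executing the multiplication. */
int check_feature_constant(void) {
  double r = my_inlined_square(3.0);
  return r == r;
}

/* Run-time input: the result depends on a value the compiler cannot see
 * when compiling this function, so the computation has to be kept. */
int check_feature_runtime(double d) {
  double r = my_inlined_square(d);
  return r == r; /* 1 unless r is NaN */
}
```

Passing the value in from the caller, as check_featureDP(double d) now does, follows the second pattern.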

fpetrogalli · 2020-02-25T22:05:49Z

src/libm/CMakeLists.txt

+	  OUTPUT ${INLINE_HEADER_FILE} 
+
+	  COMMAND echo Generating sleefinline_${SIMDLC}.h
+	  COMMAND "${CMAKE_C_COMPILER}" ${FLAG_PREPROCESS} ${FLAG_PRESERVE_COMMENTS} ${FLAG_INCLUDE}${PROJECT_SOURCE_DIR}/src/common ${FLAG_INCLUDE}${PROJECT_SOURCE_DIR}/src/arch ${FLAG_INCLUDE}${CMAKE_CURRENT_BINARY_DIR}/include/ ${FLAG_DEFINE}SLEEF_GENHEADER ${FLAG_DEFINE}ENABLE_${SIMD} ${FLAG_DEFINE}DORENAME ${CMAKE_CURRENT_SOURCE_DIR}/sleefsimddp.c > ${CMAKE_CURRENT_BINARY_DIR}/sleef${SIMD}.h.tmp1


Please break long lines. It is hard to understand what is going on here. Please try not to exceed the 80-character limit.

I added comments in this section.

fpetrogalli · 2020-02-25T22:06:35Z

src/libm/rempitab.c

+#if !defined(SLEEF_GENHEADER)
+#define FUNCATR NOEXPORT ALIGNED(64)
+#else
+#define FUNCATR EXPORT ALIGNED(64)
+#endif


Why is this needed?

This table is referenced from source code that includes the generated header.
So, this table has to be exported and compiled into a library.
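The linkage requirement can be sketched as follows, with both sides shown in one file for brevity. my_table and my_lookup are hypothetical names: the first part plays the role of the generated header, which can only declare the table, and the last part plays the role of the library, which has to provide the single definition with external linkage.

```c
/* Header side (sketch): the table is declared, not defined, so every
 * source file including the header refers to the same object. */
extern const double my_table[4];

/* An inlined function in the header that reads from the table. */
static inline double my_lookup(int i) {
  return my_table[i & 3]; /* mask keeps the index within the table */
}

/* Library side (sketch): the one definition. If this symbol were not
 * exported, code including the header would fail to link. */
const double my_table[4] = { 0.5, 1.0, 1.5, 2.0 };
```

This is why the table cannot stay NOEXPORT when SLEEF_GENHEADER is defined: the user's object file, not the SLEEF library itself, is the one referencing it.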

fpetrogalli · 2020-02-25T22:09:07Z

src/libm/sleefsimdsp.c

+#if !defined(SLEEF_GENHEADER)
 #include "helpersse2.h"
+#else
+#include "macroonlySSE2.h"
+#endif


Do we need two header files? Couldn't we conditionally compile the same header file, exposing only the macros when SLEEF_GENHEADER is defined? This would avoid the need to generate the macroonly* files.

Instead of exposing only some of the macros, I chose to define some of the macros on lines beginning with //@.

Below is a comparison of the pros and cons.
Approach 1: Expose part of the macros depending on whether SLEEF_GENHEADER is defined.
Pros: The mechanism is straightforward.
Cons: We need to protect most of the code with a large #ifdef, which is hard to read.
Generation of the macros is not very understandable anyway.

Approach 2: Define some of the macros on lines beginning with //@.
Pros: The source code is easier to read and manage.
Cons: This mechanism is specific to SLEEF, and it requires understanding how the macros are generated.

fpetrogalli · 2020-02-25T22:13:36Z

travis/before_script.osx-clang.sh

@@ -2,4 +2,4 @@
 set -ev
 mkdir sleef.build
 cd sleef.build
-cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../install -DSLEEF_SHOW_CONFIG=1 -DOPENSSL_ROOT_DIR=/usr/local/opt/openssl -DENFORCE_TESTER3=TRUE -DBUILD_QUAD=TRUE ..
+cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../install -DSLEEF_SHOW_CONFIG=1 -DOPENSSL_ROOT_DIR=/usr/local/opt/openssl -DENFORCE_TESTER3=TRUE -DBUILD_QUAD=TRUE -DBUILD_INLINE_HEADERS=TRUE ..


Do I understand correctly that you are always generating the header files because you want to test them on each architecture with the iuti* executables? Has this increased the running time in CI?

Yes, the running time in CI has increased.
If that is a problem, we can just turn off generation.

fpetrogalli · 2020-02-25T22:15:08Z

How come CI hasn't completed yet?

shibatch · 2020-02-25T23:46:19Z

Because of a policy change in network security at my institute, I now have to move the CI servers to a different network segment.
I had a meeting with the people in charge of network management, and I was told to change the network settings of the servers.
However, they did not know what CI servers are, and I was asked to have a meeting with them again.
In the second meeting, I told them how the servers work, and they are now saying that additional security measures are required.

shibatch added 5 commits February 21, 2020 09:33

no message

739f049

no message

6b0e416

no message

283295c

no message

855b900

no message

6c8a6d3

shibatch requested a review from fpetrogalli February 21, 2020 04:17

shibatch mentioned this pull request Feb 24, 2020

Redundant rounding in AVX512 exp d8? #285

Closed

fpetrogalli reviewed Feb 25, 2020

shibatch and others added 10 commits February 26, 2020 09:56

no message

a8c6f53

no message

1f90c2a

Merge branch 'master' into generate_inline_headers

4be4a9d

Merge branch 'master' into generate_inline_headers

3dbddb4

Merge branch 'master' into generate_inline_headers

07ea2b0

Merge branch 'master' into generate_inline_headers

7e8ddac

no message

557aba9

no message

1467582

no message

5d6be85

no message

f91f2fa

shibatch requested a review from fpetrogalli April 7, 2020 00:15

shibatch and others added 6 commits April 7, 2020 10:08

no message

164d6e8

no message

306862c

Update Jenkinsfile

1dece23

no message

02e6974

Merge branch 'master' into generate_inline_headers

5e90166

no message

0fe6a61

shibatch and others added 3 commits June 26, 2020 13:27

no message

99af8e8

Merge branch 'master' into generate_inline_headers

8a81e0c

no message

400756a

shibatch merged commit 135e9b5 into master Jul 23, 2020

shibatch deleted the generate_inline_headers branch July 23, 2020 08:21

armanbilge mentioned this pull request Nov 7, 2021

sleef: Also build inline headers Homebrew/homebrew-core#88923

Closed

6 tasks

blapie mentioned this pull request Nov 23, 2023

Add generators of headers for inlining whole sleef functions #282

Closed
