-
Notifications
You must be signed in to change notification settings - Fork 6.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CRC32 Power Optimization Changes #2353
Conversation
util/crc32c_ppc.c
Outdated
* All rights reserved. | ||
* | ||
* This program is free software; you can redistribute it and/or | ||
* modify it under the terms of the GNU General Public License |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure of the answer to this, but I see that this is GPL licensed. Is introducing GPL code acceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't answer it for sure either but can provide some additional info.
https://github.com/facebook/rocksdb/blob/master/COPYING
Contains the GPL v2 license.
This code originally came from
https://github.com/antonblanchard/crc32-vpmsum
Which allows license choice for either GPL or Apache 2.0. If necessary we could probably update these files to choose Apache 2.0 if that is preferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@siying Could you please comment on the license used for the new files in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just spoke to @antonblanchard who indicated you can merge this under any open source license you choose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sagar0 is going to ask our lawyer for that. In the mean time, I think these files should go under third-party/.
@grooverdan what do you mean by "you can merge this under any open source license you choose". Do you mean you can change this header if needed according to Facebook's lawyer?
@grooverdan You may want to take a look at this PR as well. |
util/ppc.cc
Outdated
int arch_ppc_crc32 = 0; | ||
|
||
#if __linux__ && __powerpc64__ | ||
#include <stdio.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stdio not used.
util/ppc.cc
Outdated
#include "util/ppc.h" | ||
|
||
/* flags we export */ | ||
int arch_ppc_crc32 = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be static (or local to arch_ppc_probe as its not exposed).
util/crc32c_ppc.c
Outdated
#define VMX_ALIGN 16 | ||
#define VMX_ALIGN_MASK (VMX_ALIGN-1) | ||
|
||
#ifdef REFLECT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code blocks marked by CRC_XOR and REFLECT aren't needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those values (#define REFLECT and #define CRC_XOR) are defined in crc32c_ppc_constants.h and those code blocks are needed. Without them you'll get different output values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor changes only.
Apologies for the ones that go a bit off topic.
util/crc32c.cc
Outdated
} | ||
} | ||
# endif | ||
|
||
static inline Function Choose_Extend() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FB folks - what's your opinion on replacing this with an IFUNC?
ref: https://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Function-Attributes.html#index-g_t_0040code_007bifunc_007d-attribute-2529 (binutils-2.20.1+ GNU C library (at least version 2.11.1).
Some old syntax may be needed to support older clang versions (added https://reviews.llvm.org/D15525)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never mind. saw Windows support.
util/crc32c_ppc.c
Outdated
{ | ||
unsigned char *buf2; | ||
|
||
if (!data) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't see a case where data is a null pointer. Perhaps crc32c::Extend can have an attribute nonnull to enforce this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't clear to me either whether rocksdb currently has any callers passing a NULL data pointer to the crc function or not. Even if there are none now, there could be some in the future. The question is what should the crc service we're providing do about it. The way this code is dropped now, we handle it gracefully but slowly (because of the malloc/free overhead).
In a separate, future pull request we were planning to add code that will take a very fast shortcut if the data pointer is NULL (ie handle it gracefully and quickly). The fastzero experiments we've done are shown in this github repo:
https://github.com/kestrels/crc32c-fastzero/blob/master/output-data/example-output.txt
The very first part of the page shows that to actually allocate a 1GB buffer and crc it takes 137613320 nsec but using the fastzero shortcut (-d) takes only 809 nsec.
return isSSE42() ? ExtendImpl<Fast_CRC32> : ExtendImpl<Slow_CRC32>; | ||
#else | ||
return isAltiVec() ? ExtendPPCImpl : ExtendImpl<Slow_CRC32>; | ||
#endif | ||
} | ||
|
||
bool IsFastCrc32Supported() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its use by DumpSupportInfo and potential for future enhancement this probably should return a const char * describing the implementation rather than a bool. Bit outside the scope of the PR however.
HAS_ALTIVEC=1 | ||
endif | ||
|
||
ifeq (,$(shell $(CXX) -fsyntax-only -mcpu=power8 -xc /dev/null 2>&1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks rather tied to current architecture.
Can ifdef(__powerpc64__)
be used instead of HAVE_POWER8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This path is deliberately tied to current architecture, yes. We needed to disallow this code from running on POWER7 and earlier models because those older hardware models don't have the special vector instructions that are used here to compute the CRC in crc32c_ppc_asm.S. Those special vector instructions were introduced in POWER8.
So this happens both in the Makefile as we check the compiler capabilities here, and also the architecture probe code (arch_ppc_probe function) checks the hardware capability.
So that is why we couldn't use ifdef(powerpc64) in this case.
note #2330 refactors bits of the same code. util/ppc.{h,cc} could be moved into util/crc32c.cc too. |
This same optimized assembly crc algorithm was integrated into Ceph as well: We'd like to get the same performance improvement for the PowerPC architecture into rocksdb as well, since Ceph's Bluestore feature relies heavily on rocksdb. |
@kamasubb updated the pull request - view changes |
@kamasubb updated the pull request - view changes |
@kamasubb updated the pull request - view changes |
@yiwu-arbug Could you please have a look at this? We integrated this code into |
I'm probably not the best person to review. cc @siying and current oncall @maysamyabandeh |
// LICENSE file in the root directory of this source tree. An additional grant | ||
// of patent rights can be found in the PATENTS file in the same directory. | ||
// This source code is also licensed under the GPLv2 license found in the | ||
// COPYING file in the root directory of this source tree. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move all those files to under third-party/?
Or, if you want to put it here, is there a way to change it to the standard RocksDB header?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to leave it here as this directory is where the Intel CRC32 code lives.
As for the header I think the intent was to make it be the standard header. It looks like the header on the other files with the exception of the copyright. This is following the example of OpenStack where larger LOC contributions will add copyrights to the top. For example this file is copyrighted by 5 different contributing entities: https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still waiting to hear back from FB open-source legal team on the question of whether this kind of copyright header is acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FB open-source legal team approved the copyright headers in this PR, and we are good to go here as far as license/copyright is concerned.
Makefile
Outdated
@@ -545,14 +557,27 @@ $(SHARED2): $(SHARED4) | |||
$(SHARED3): $(SHARED4) | |||
ln -fs $(SHARED4) $(SHARED3) | |||
endif | |||
shared_cc_objects = $(filter %.o,$(LIB_SOURCES:.cc=.cc.o)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use capital letters to be consistent with the rest of Makefile?
Makefile
Outdated
LIBOBJECTS = $(LIB_SOURCES:.cc=.o) | ||
LIBOBJECTS += $(TOOL_LIB_SOURCES:.cc=.o) | ||
MOCKOBJECTS = $(MOCK_LIB_SOURCES:.cc=.o) | ||
LIBOBJECTS = $(filter %.o,$(LIB_SOURCES:.cc=.cc.o) $(LIB_SOURCES_C:.c=.c.o) $(LIB_SOURCES_ASM:.S=.S.o)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the we need filter here? Is there any corner case in LIB_SOURCES that does not end with ".cc"?
Makefile
Outdated
java_libobjects = $(patsubst %,jl/%,$(LIBOBJECTS)) | ||
java_cc_objects = $(filter %.o,$(LIB_SOURCES:.cc=.cc.o)) | ||
java_c_objects = $(filter %.o,$(LIB_SOURCES_C:.c=.c.o)) | ||
java_asm_objects = $(filter %.o,$(LIB_SOURCES_ASM:.S=.S.o)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not reusing shared_asm_objects variable?
Same for the two lines above.
util/crc32c.cc
Outdated
#include "util/crc32c_ppc_constants.h" | ||
#include "util/crc32c_ppc.h" | ||
|
||
#if __linux__ && __powerpc64__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is not it already inside ifdef powerpc64 clause?
util/crc32c.cc
Outdated
return false; | ||
} | ||
} | ||
# endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you make sure that you have run "make format"?
@kamasubb updated the pull request - view changes |
@kamasubb updated the pull request - view changes |
@maysamyabandeh - Created a new branch in my local workspace with updated to master of rocksdb and updated with all comments addressed. |
@kamasubb updated the pull request - view changes |
Makefile
Outdated
LIBOBJECTS = $(LIB_SOURCES:.cc=.o) | ||
LIBOBJECTS += $(TOOL_LIB_SOURCES:.cc=.o) | ||
MOCKOBJECTS = $(MOCK_LIB_SOURCES:.cc=.o) | ||
#LIBOBJECTS = $(filter %.o,$(LIB_SOURCES:.cc=.cc.o) $(LIB_SOURCES_C:.c=.c.o) $(LIB_SOURCES_ASM:.S=.S.o)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the commented line.
@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
LGTM. I let @siying land it after copyright issue is resolved. |
util/crc32c_ppc_constants.h
Outdated
// of patent rights can be found in the PATENTS file in the same directory. | ||
// This source code is also licensed under the GPLv2 license found in the | ||
// COPYING file in the root directory of this source tree. | ||
#define CRC 0x1edc6f41 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is missing an include guard.
@kamasubb updated the pull request - view changes - changes since last import |
Travis build failed due to lack of space on the file system. |
@kamasubb Re-kicked the test that failed due to lack of space, and it passed this time. |
Support for PowerPC Architecture Detecting AltiVec Support
@kamasubb updated the pull request - view changes - changes since last import |
@sagar0 Travis build failed due to lack of space on the file system. |
Travis build failed due to lack of space on the file system. |
Re-triggered the test, and it succeeded this time. |
LIBOBJECTS = $(LIB_SOURCES:.cc=.o) | ||
LIBOBJECTS += $(TOOL_LIB_SOURCES:.cc=.o) | ||
MOCKOBJECTS = $(MOCK_LIB_SOURCES:.cc=.o) | ||
LIBOBJECTS = $(LIB_SOURCES:.cc=.cc.o) $(LIB_SOURCES_C:.c=.c.o) $(LIB_SOURCES_ASM:.S=.S.o) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain the ration behind changing from ".o" to ".cc.o"?
I saw in the build script both were built, so it's likely we have missed something.
Sorry but I need to revert it first. Travis build fails after this: https://travis-ci.org/facebook/rocksdb/jobs/257805952 It's likely to be related to the change in Makefile with the .o -> .cc.o change. Please resummit a pull request with this problem resolved after I revert it. Sorry about that. |
I think it's a better idea to keep the ".o" convention. If there is a file name conflict for a .c and a .cc file, it's OK to rename the RocksDB files. |
@siying We have three file extensions .cc, .c and .S |
@siying We need to differentiate these file extensions. Makefile right now does not differentiate it as LIBSOURCES has only files with .cc extension. LIBOBJECTS will have object list of .cc files with extension .o. If we rename the .S file to .cc, compiler will not recognize the assembly instructions and report error. |
@kamasubb is there a way to make sure that all the file names are different for .cc, .c, and .S? Just rename one of them if they are the same name. Another thing to consider is to roll forward and make everything consistent in this Makefile to avoid we build the same file twice, one to ".o" and one to ".cc.o" (which was what was happening). I tried to fix it here: #2651 . It became better but there may be bugs there. It's something can be considered too if renaming is not possible because of some reason. |
@siying, @sagar0 I have been trying to re-create the build failure on the PowerPC system and haven't been successful. The CRC32 patch build did not fail. Can you share details regarding the build environment and build tools of Travis? |
@kamasubb what I saw previously was that some of the .cc files were compiled twice, for example, one for threadpool_imp.cc.o and one for threadpool_impl.cc.o. This didn't cause problem on my previous environment, but seem to have some problem with ccache in Travis (just a speculation of the reason of ccache failure). It's anyway not efficient to have the same .cc file to be compiled twice. If you are able to fix this double compiling, we are good to go and we can merge it. Sorry for the inconvenience of the reverting. |
@siying Thank you for pointing out ccache. I am able to re-create the issue after I installed ccache. |
…acebook#2353 is not compatible with Clang 11. The code relies on a GCC header file ppc-asm.h. To solve this, I have taken the same approach as the the upstream project from which the CRC code came antonblanchard/crc32-vpmsum@ffc8018#diff-ec3e62c56fbcddeb07230f2a4673c1abd7f0f1cc8e48a2aa560056cfc1b25d60 and imported a copy of the GCC header file into our code-base which will be used when Clang is the compiler. NOTE: This may have licensing implications which need to be approved by FB before this is merged.
Summary: Closes #7691 The optimised CRC code for PPC64le which was originally imported in #2353 is not compatible with Clang 11. It looks like the code most likely originated from https://github.com/antonblanchard/crc32-vpmsum. The code relied on a GCC header file `ppc-asm.h` which is not available in Clang. To solve this, I have taken the same approach as the the upstream project from which the CRC code came antonblanchard/crc32-vpmsum@ffc8018#diff-ec3e62c56fbcddeb07230f2a4673c1abd7f0f1cc8e48a2aa560056cfc1b25d60 and simply imported a copy of the GCC header file into our code-base which will be used when Clang is the compiler on pcc64le. **NOTE**: The new file `util/ppc-asm.h` may have licensing implications which I guess need to be approved by RocksDB/Facebook before this is merged Pull Request resolved: #7713 Reviewed By: jay-zhuang Differential Revision: D25222645 Pulled By: pdillinger fbshipit-source-id: e3fec9136f26ce1eb7a027048bcf77a6cb3c769c
Summary: Closes facebook#7691 The optimised CRC code for PPC64le which was originally imported in facebook#2353 is not compatible with Clang 11. It looks like the code most likely originated from https://github.com/antonblanchard/crc32-vpmsum. The code relied on a GCC header file `ppc-asm.h` which is not available in Clang. To solve this, I have taken the same approach as the the upstream project from which the CRC code came antonblanchard/crc32-vpmsum@ffc8018#diff-ec3e62c56fbcddeb07230f2a4673c1abd7f0f1cc8e48a2aa560056cfc1b25d60 and simply imported a copy of the GCC header file into our code-base which will be used when Clang is the compiler on pcc64le. **NOTE**: The new file `util/ppc-asm.h` may have licensing implications which I guess need to be approved by RocksDB/Facebook before this is merged Pull Request resolved: facebook#7713 Reviewed By: jay-zhuang Differential Revision: D25222645 Pulled By: pdillinger fbshipit-source-id: e3fec9136f26ce1eb7a027048bcf77a6cb3c769c
Summary: Closes facebook#7691 The optimised CRC code for PPC64le which was originally imported in facebook#2353 is not compatible with Clang 11. It looks like the code most likely originated from https://github.com/antonblanchard/crc32-vpmsum. The code relied on a GCC header file `ppc-asm.h` which is not available in Clang. To solve this, I have taken the same approach as the the upstream project from which the CRC code came antonblanchard/crc32-vpmsum@ffc8018#diff-ec3e62c56fbcddeb07230f2a4673c1abd7f0f1cc8e48a2aa560056cfc1b25d60 and simply imported a copy of the GCC header file into our code-base which will be used when Clang is the compiler on pcc64le. **NOTE**: The new file `util/ppc-asm.h` may have licensing implications which I guess need to be approved by RocksDB/Facebook before this is merged Pull Request resolved: facebook#7713 Reviewed By: jay-zhuang Differential Revision: D25222645 Pulled By: pdillinger fbshipit-source-id: e3fec9136f26ce1eb7a027048bcf77a6cb3c769c
Support for PowerPC Architecture
Detecting AltiVec Support