Enable the use of [SU]Int32Size and EnumSize templates for AArch64
When benchmarking proto_benchmark from fleetbench on an AArch64 target we found
that clang is able to vectorize these functions and they offer better
performance than the scalar alternative.
avieira-arm committed Jan 6, 2023
1 parent de5fae0 commit cf77f0e
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions src/google/protobuf/wire_format_lite.cc
@@ -704,7 +704,7 @@ static size_t VarintSize64(const T* data, const int n) {
// and other platforms are untested, in those cases using the optimized
// varint size routine for each element is faster.
// Hence we enable it only for clang
-#if defined(__SSE__) && defined(__clang__)
+#if (defined(__SSE__) || defined(__aarch64__)) && defined(__clang__)
size_t WireFormatLite::Int32Size(const RepeatedField<int32_t>& value) {
return VarintSize<false, true>(value.data(), value.size());
}
@@ -722,7 +722,7 @@ size_t WireFormatLite::EnumSize(const RepeatedField<int>& value) {
return VarintSize<false, true>(value.data(), value.size());
}

-#else // !(defined(__SSE4_1__) && defined(__clang__))
+#else // !((defined(__SSE4_1__) || defined(__aarch64__) && defined(__clang__))

size_t WireFormatLite::Int32Size(const RepeatedField<int32_t>& value) {
size_t out = 0;
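For context, the kind of loop that a compiler can auto-vectorize here looks roughly like the sketch below. This is an illustration only, not the VarintSize template from wire_format_lite.cc: the names SumVarintSizesSignExtended and ScalarVarintSize are hypothetical, and the sketch covers only the sign-extended int32 case (a negative int32 is widened to 64 bits on the wire and takes 10 bytes). The idea is to replace the per-element size computation with fixed threshold comparisons, so the loop body has no data-dependent branches or inner loops.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not protobuf's API): total number of bytes needed to
// varint-encode n int32 values, where negative values are sign-extended to
// 64 bits and therefore occupy 10 bytes each.
inline std::size_t SumVarintSizesSignExtended(const std::int32_t* data, int n) {
  std::uint32_t sum = static_cast<std::uint32_t>(n);  // every value needs >= 1 byte
  std::uint32_t msb_sum = 0;                          // count of negative values
  for (int i = 0; i < n; ++i) {
    const std::uint32_t x = static_cast<std::uint32_t>(data[i]);
    msb_sum += x >> 31;
    // One extra byte for each 7-bit boundary the value crosses. Each line is
    // a compare-and-add, which maps well onto SIMD compare instructions.
    sum += (x > 0x7Fu);
    sum += (x > 0x3FFFu);
    sum += (x > 0x1FFFFFu);
    sum += (x > 0xFFFFFFFu);
  }
  // A negative value was counted as 5 bytes above; it needs 10, so add 5 more.
  return sum + static_cast<std::size_t>(msb_sum) * 5;
}

// Hypothetical scalar reference: size of one sign-extended int32 varint,
// computed by repeatedly shifting out 7 bits.
inline std::size_t ScalarVarintSize(std::int32_t v) {
  std::uint64_t x = static_cast<std::uint64_t>(static_cast<std::int64_t>(v));
  std::size_t bytes = 1;
  while (x >= 0x80) {
    x >>= 7;
    ++bytes;
  }
  return bytes;
}
```

Whether a given compiler actually vectorizes such a loop depends on the compiler, its version, and the optimization flags. That dependence is why the guard in the diff above is tied to clang and to specific targets.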
