-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
utlity: Add benchmark for string-utility functions #2348
Conversation
to the other handlers with a browser. Add a private mechanism to have admin handlers supply arbitrary headers. Note that the handler callback doesn't expose a header structure, so the public v2 API does not provide this header-control. However, it was specifying text/html as content-type by default, so I changed the default to text/plain (with nosniff). Added sanitization of both the prefix string and the help text, which need to be safely injected into the HTML doc. Added some styling and a reference to the existing Envoy favicon as well. Note that no existing handlers get any response bytes changed as a result of this commit, just their content-type. Signed-off-by: Joshua Marantz <jmarantz@google.com>
…ms in particular. Signed-off-by: Joshua Marantz <jmarantz@google.com>
… can set the content-type and/or cache-control. Signed-off-by: Joshua Marantz <jmarantz@google.com>
Adds an alternative string right-trimming function based on absl::string_view, which improves on the existing std::string-trimming function by about 3x: 2018-01-11 00:45:49 Run on (12 X 3800 MHz CPU s) CPU Caches: L1 Data 32K (x6) L1 Instruction 32K (x6) L2 Unified 256K (x6) L3 Unified 15360K (x1) ***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead. avg trimmed=3 of 1 iters. avg trimmed=3 of 10 iters. avg trimmed=3 of 100 iters. avg trimmed=3 of 1000 iters. avg trimmed=3 of 10000 iters. avg trimmed=3 of 100000 iters. avg trimmed=3 of 1000000 iters. avg trimmed=3 of 8390885 iters. avg trimmed=3 of 18864031 iters. ---------------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------------- BM_RTrimString 34 ns 34 ns 18864031 avg trimmed=3 of 1 iters. avg trimmed=3 of 10 iters. avg trimmed=3 of 100 iters. avg trimmed=3 of 1000 iters. avg trimmed=3 of 10000 iters. avg trimmed=3 of 100000 iters. avg trimmed=3 of 1000000 iters. avg trimmed=3 of 10000000 iters. avg trimmed=3 of 64745962 iters. BM_RTrimStringView 11 ns 11 ns 64745962 Signed-off-by: Joshua Marantz <jmarantz@google.com>
Rewrites the rightTrim function to coincide with the impl in https://github.com/envoyproxy/envoy/pull/2087/files which is shorter (although it benchmarks out to be the same speed). Signed-off-by: Joshua Marantz <jmarantz@google.com>
source/common/common/utility.cc
Outdated
return absl::string_view(source.data(), pos + 1); | ||
} | ||
return absl::string_view(""); | ||
absl::string_view StringUtil::rightTrim(absl::string_view source) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmarantz I created a new class StringViewUtil with similar method/implementation in my last commit. Check this out: https://github.com/gsagula/envoy/blob/02ba40dd236645b89aec7689a81c9ff89f71dfc2/source/common/common/utility.cc#L222
Comment:
I think it's important to explicitly differentiate methods that use string view. Let me know what you think, so I might just move everything under StringUtil instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In absl -- and in earlier versions of similar libraries in Google & Chromium, they are co-mingled. Note that absl::string_view is I think intended as a stop-gap until C++17, when std::string_view (http://en.cppreference.com/w/cpp/string/basic_string_view) becomes available.
The reason -- I think -- that they are co-mingled is that there's no reason for the ones that take std::string arguments to exist at all. E.g. in my example if you want to right-strip a std::string you can just pass in it, because absl::string_view has an implicit constructor from std::string, though it requires an explicit std::string out-conversion, e.g.:
std::string foo;
std::string bar = std::string(StringUtil::rightStrip(foo));
Ordinarily, though, if the backing-store stays alive along enough, you can just live in the string_view world and never make copies.
See https://github.com/abseil/abseil-cpp/blob/master/absl/strings/strip.h where the doc speaks in terms of std::string but actually all the functions take string_view.
So my preference is that you replace the ones in StringUtil:: rather than adding new alternatives. This will help spread the knowledge about string_view's performance by eliminating the slower equivalent.
Having said that, I knew your PR was coming, and the only reason I added my own version of rightStrip was as a testing vehicle for the benchmarking integration. And I want the benchmarking for something unrelated (faster startup in the presence of giant configs). Do you think you could split your StringUtil stuff in your PR into a smaller one, in which you also eliminate entirely the old rtrim and fix its call-sites? Unfortunately the rtrim function's signature (modifying a std::string&) can't be fixed in place.
What I'll do in any case is move my new utility function into the benchmark code itself, and your version can be authoritative, and I'll have a TODO on mine to delete it it and call yours instead once it's in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…rt it. The authoritative work for this is in envoyproxy#2087 . Another speed-test case I thought might be worth covering, where no trimming was necessary, and the result needs to be converted back to a std::string. Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat.
|
||
int main(int argc, char** argv) { | ||
benchmark::Initialize(&argc, argv); | ||
if (benchmark::ReportUnrecognizedArguments(argc, argv)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: braces on even single line if
preferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done; this was just a c&p from the example in benchmark.h, but I added the braces.
for (auto _ : state) { | ||
absl::string_view text(AlreadyTrimmed, AlreadyTrimmedLength); | ||
std::string string_copy = std::string(rightTrim(text)); | ||
accum += AlreadyTrimmedLength - string_copy.size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given how small the trim text is, I think the other loop overheads might diminish the actual speedup from the optimization, could be worth using much larger inner loop text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It still showed a consistent 3x improvement.
@@ -0,0 +1,100 @@ | |||
// Note: this should be run with --compilation_mode=opt. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on a some experience doing this kind of microbenchmark work on Google workstations, I think the caveats are wider than just opt
. For example, you want to disable cstate power management, ideally have a quiescent system and/or use CPU affinity with taskset to carve off some CPU to work on. Ideally we would run these in something equivalent to perflab.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some more text to that effect, though my results were pretty stable after multiple iterations. I guess a good place to put the results is the PR description.
|
||
set -x | ||
|
||
export COMMIT="e1c3a83b8197cf02e794f61228461c27d4e78cfb" # benchmark @ Jan 11, 2018 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, the new preferred way to add external deps is https://github.com/envoyproxy/envoy/blob/master/bazel/EXTERNAL_DEPS.md#adding-external-dependencies-to-envoy-genrule-repository.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'll continue to iterate on that.
test/common/common/BUILD
Outdated
envoy_cc_binary( | ||
name = "utility_speed_test", | ||
deps = ["utility_speed_test_lib"], | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come the binary/lib split here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
envoy_cc_binary does not allow an external lib. I'll add that as a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also extend envoy_cc_binary
to do this, it's only a couple of lines in the .bzl
if you look at what envoy_cc_library
does.
Signed-off-by: Joshua Marantz <jmarantz@google.com>
…ate library for the speed_test. Signed-off-by: Joshua Marantz <jmarantz@google.com>
This is extracted from envoyproxy#2348 and needs to go in first -- with no code depdning on it -- so the CI works in the PR which uses the new dependency. Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool stuff. @jmarantz as a drive by comment, as long as you are doing stuff like this, can you audit all the string utility functions we have and potentially see which ones should be outright swapped for absl versions? For sure split
should be replaced with the string view one. Maybe others.
test/common/common/BUILD
Outdated
@@ -69,3 +71,17 @@ envoy_cc_test( | |||
srcs = ["callback_impl_test.cc"], | |||
deps = ["//source/common/common:callback_impl_lib"], | |||
) | |||
|
|||
# This is broken out as a separate library because envoy_cc_binary does | |||
# not allow for external_deps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stale comment.
test/common/common/BUILD
Outdated
@@ -2,6 +2,8 @@ licenses(["notice"]) # Apache 2 | |||
|
|||
load( | |||
"//bazel:envoy_build_system.bzl", | |||
"envoy_cc_binary", | |||
"envoy_cc_library", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed.
This is a prereq for running CI on #2348 Description: Integrates github.com/google/benchmark so it can be used in future PRs for demonstrating improvement. Note this PR does not actually depend on it, so it can pass CI. Risk Level: Low Tested: //test/... Signed-off-by: Joshua Marantz <jmarantz@google.com>
…not ready yet. Signed-off-by: Joshua Marantz <jmarantz@google.com>
CI fails due to precompiled benchmark library not being available on asan/tsan/release. Format is probably something else :) I don't really need to the speed_tests to be built with asan or tsan, but release-mode binaries would be good. How do I:
Note that libbenchmark is a 'prebuilt' at this point. I could try to do this the "right" way as a resolution, but other prebuilts must of the same issue? |
@jmarantz for third party libraries like the one you added, you need to bump the CI build image SHA: https://github.com/envoyproxy/envoy/blob/master/.circleci/config.yml#L3 https://hub.docker.com/r/envoyproxy/envoy-build/tags/ That should work for all builds. |
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Thanks @mattklein123 ! That helped. I'm having a little trouble now understanding the format failure, as the one .cc file I modified does not seem affected running fix_format locally. WDYT of a PR to check_format.py to remove the >/dev/null redirection on the check commands, so we can see what's wrong in the CI log? |
It's interesting to see the overhead of creating the split-vector per-token, which resolved by just looking for '=' and calling substr manually. Signed-off-by: Joshua Marantz <jmarantz@google.com>
@jmarantz are you running fix_format under Docker or locally? It's basically useless if you run it outside of Docker, unless you have precisely the same version of clang-format (and potentially other tools) as those used under CI. Best practice is to run under Docker IMHO. I agree that better debugging in check_format would be useful, +1 to any change here. |
Signed-off-by: Joshua Marantz <jmarantz@google.com>
Signed-off-by: Joshua Marantz <jmarantz@google.com>
@htuch I sorted out my format issues; it was a script problem in my own setup. PTAL. |
Previously, 265MB release packages with debug symbols were published instead of 9MB release packages without them. Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Description:
Adds a benchmarking for new string utility functions added in #2368 using the google microbenchmarking library (https://github.com/google/benchmark), which was added in #2350. This serves as an example and proof of concept on how to use the benchmarking library, and as a reference point for the approximate performance of some string-parsing functions.
Risk Level: Very Low
Testing:
This is just a new speed-test, so running it on its own is sufficient to test it.
Results:
In practice I did not find this benchmark to be noisy on repeated runs.