Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

instrumentation: add timers and warnings to platform callbacks #2300

Merged
merged 6 commits into from
May 23, 2022

Conversation

goaway
Copy link
Contributor

@goaway goaway commented May 19, 2022

Description: Adds Envoy histograms (timers) to most platform-provided callbacks in the Http::Client and in PlatformBridge Filters. Beyond a hardcoded threshold (1s), a warn-level LOG_EVENT will be emitted for slow-to-execute callbacks.
Risk Level: Low
Testing: Manual & Existing Coverage

Signed-off-by: Mike Schore mike.schore@gmail.com

Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com>
@goaway goaway marked this pull request as ready for review May 20, 2022 17:48
@goaway goaway requested review from Augustyniak and jpsim May 20, 2022 17:48
@@ -23,6 +23,9 @@ namespace HttpFilters {
namespace PlatformBridge {

namespace {

constexpr auto SlowCallbackWarningThreshold = std::chrono::seconds(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to starting at 1 second here. Ideally we roll this out at scale and see that it's rarely hit, if ever, and we can reduce it accordingly. Ideally pt2 is that we find a good way to make this configurable.

@jpsim
Copy link
Contributor

jpsim commented May 20, 2022

Since this is a user-facing change, can you please add an entry to the version history doc?

const std::string& filter_name() { return filter_name_; }
const envoy_http_filter* platform_filter() const { return platform_filter_; }

private:
static PlatformBridgeFilterStats generateStats(const std::string& stat_prefix,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have the stat prefix if we only set it to the empty string?

Copy link
Contributor Author

@goaway goaway May 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good question, it's
a) because I Considered making it the filter name and think that still might be worth considering, but I held off because I'm not sure how consistently/well those are formatted for use in stats,

and
b) because this is the signature used everywhere in Envoy for this pattern.

jpsim
jpsim previously approved these changes May 20, 2022
Copy link
Contributor

@jpsim jpsim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving now so we can observe the stats over the weekend, but I'll review in more detail next week.

Signed-off-by: Mike Schore <mike.schore@gmail.com>
@goaway goaway merged commit 2afb13e into main May 23, 2022
@goaway goaway deleted the ms/callback-timers branch May 23, 2022 18:01
jpsim added a commit that referenced this pull request May 26, 2022
* origin/main: (32 commits)
  Compress xcframework release asset (#2324)
  bazel: update rules_apple (#2326)
  Merge `android_dist` with `android_dist_ci` (#2323)
  kotlin: fix flaky receive error test (#2317)
  kotlin: fix flaky grpc test (#2316)
  build(deps): bump pyjwt from 2.1.0 to 2.4.0 in /.github/actions/pr_notifier (#2314)
  test: making C++ integration test more authentic (#2315)
  reverting override to override_request_timeout_by_gateway_timeout (#2296)
  Update Envoy (#2309)
  release: use `CREDENTIALS_GITHUB_RELEASE_DEPLOY_KEY`
  Use `CREDENTIALS_GITHUB_PUSH_TOKEN`
  Push to branch before tagging the release
  Cronvoy: preparation to unittest certificate verification JNI (#2251)
  Corrected typo in documentation for Android building requirements (#2313)
  instrumentation: add timers and warnings to platform callbacks (#2300)
  Bump Lyft Support Rotation (#2310)
  Revert "android: use local addresses as opposed to prefix (#2081)" (#2307)
  Fix release GitHub workflow (#2306)
  mobile: moving the c++ integration test to use default config (#2293)
  config: cleaning up deprecated configs (#2295)
  ...

Signed-off-by: JP Simard <jp@jpsim.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants