Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

router: defer per try timeout until downstream request is done #6643

Merged
merged 6 commits into from
Apr 20, 2019

Conversation

snowp
Copy link
Contributor

@snowp snowp commented Apr 18, 2019

This defers starting the per try timeout timer until onRequestComplete
to ensure that it is not started before the global timeout. This ensures
that the per try timeout will not take into account the time spent
reading the downstream, which should be responsibility of the HCM level
timeouts.

Signed-off-by: Snow Pettersen snowp@squareup.com

Risk Level: Medium, changes to router logic
Testing: Added new UT, updated existing ones
Docs Changes: n/a
Release Notes: n/a
Fixes #6624

This defers starting the per try timeout timer until onRequestComplete
to ensure that it is not started before the global timeout. This ensures
that the per try timeout will not take into account the time spent
reading the downstream, which should be responsibility of the HCM level
timeouts.

Signed-off-by: Snow Pettersen <snowp@squareup.com>
@htuch htuch requested review from mattklein123 and removed request for mattklein123 April 18, 2019 18:38
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
@@ -434,6 +435,9 @@ class Filter : Logger::Loggable<Logger::Id::router>,
bool is_retry_ : 1;
bool include_attempt_count_ : 1;
bool attempting_internal_redirect_with_complete_stream_ : 1;
// Tracks whether we deferred a per try timeout because the downstream request
// had not been completed yet.
bool pending_per_try_timeout_ : 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's put this on the UpstreamRequest instead of on the parent, that will hold up better once we have multiple upstream requests

@@ -521,6 +521,10 @@ void Filter::onRequestComplete() {
response_timeout_ = dispatcher.createTimer([this]() -> void { onResponseTimeout(); });
response_timeout_->enableTimer(timeout_.global_timeout_);
}

if (pending_per_try_timeout_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this might be better as a for loop over upstream requests to check their pending_per_try_timeout_ flag

Signed-off-by: Snow Pettersen <snowp@squareup.com>
mpuncel
mpuncel previously approved these changes Apr 18, 2019
Copy link
Contributor

@mpuncel mpuncel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good! thanks for taking this

@snowp
Copy link
Contributor Author

snowp commented Apr 18, 2019

/retest

@repokitteh-read-only
Copy link

🔨 rebuilding ci/circleci: tsan (failed build)

🐱

Caused by: a #6643 (comment) was created by @snowp.

see: more, trace.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this looks good. This is a subtle change but I think it's probably worth adding a release note also?

/wait

@@ -355,6 +355,9 @@ class Filter : Logger::Loggable<Logger::Id::router>,
bool upstream_canary_ : 1;
bool encode_complete_ : 1;
bool encode_trailers_ : 1;
// Tracks whether we deferred a per try timeout because the downstream request
// had not been completed yet.
bool pending_per_try_timeout_ : 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would probably name this something like create_per_try_timeout_on_request_complete_ or something like that to differentiate between waiting for the per try timeout to elapse which is what this sounds like to me.

Signed-off-by: Snow Pettersen <snowp@squareup.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with small typos

/wait

@@ -11,6 +11,9 @@ Version history
* redis: added :ref:`prefix routing <envoy_api_field_config.filter.network.redis_proxy.v2.RedisProxy.prefix_routes>` to enable routing commands based on their key's prefix to different upstream.
* redis: add support for zpopmax and zpopmin commands.
* router: added ability to control retry back-off intervals via :ref:`retry policy <envoy_api_msg_route.RetryPolicy.RetryBackOff>`.
* router: per try timeouts will no longer start before the downstream request has been received
in full by the router. This ensure that the per try timeout does not account for slow
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/ensure/ensures

@@ -11,6 +11,9 @@ Version history
* redis: added :ref:`prefix routing <envoy_api_field_config.filter.network.redis_proxy.v2.RedisProxy.prefix_routes>` to enable routing commands based on their key's prefix to different upstream.
* redis: add support for zpopmax and zpopmin commands.
* router: added ability to control retry back-off intervals via :ref:`retry policy <envoy_api_msg_route.RetryPolicy.RetryBackOff>`.
* router: per try timeouts will no longer start before the downstream request has been received
in full by the router. This ensure that the per try timeout does not account for slow
downstreams and that will not not start before the global timeout.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not not (not)

Signed-off-by: Snow Pettersen <snowp@squareup.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@snowp
Copy link
Contributor Author

snowp commented Apr 20, 2019

/retest

@repokitteh-read-only
Copy link

🔨 rebuilding ci/circleci: tsan (failed build)

🐱

Caused by: a #6643 (comment) was created by @snowp.

see: more, trace.

@snowp snowp merged commit 92963f9 into envoyproxy:master Apr 20, 2019
mpuncel added a commit to mpuncel/envoy that referenced this pull request Apr 22, 2019
* master:
  thread: remove ThreadFactorySingleton (envoyproxy#6658)
  router: support offseting downstream provided grpc timeout (envoyproxy#6628)
  router: defer per try timeout until downstream request is done (envoyproxy#6643)
  update bazel readme for clang-format-8 on mac (envoyproxy#6660)
  Implement some TODOs in quic_endian_impl.h (envoyproxy#6644)
  docs: add aspell to mac dependencies to fix check format script (envoyproxy#6661)
  config: fix delta xDS's use of (un)subscribe fields, more explicit protocol spec (envoyproxy#6545)

Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

router filter's global timeout timer starts later than the per try timeout one
3 participants