-
Notifications
You must be signed in to change notification settings - Fork 5.2k
conn_pool_grid: try TCP immediately with h3 state FailedRecently #20722
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
|
failed thrift_proxy:integration_test on Windows seems unrelated. |
|
/assign @RyanTheOptimist |
|
/retest |
|
Retrying Azure Pipelines: |
RyanTheOptimist
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
source/common/http/conn_pool_grid.cc
Outdated
| auto attempt = std::make_unique<ConnectionAttemptCallbacks>(*this, current_); | ||
| LinkedList::moveIntoList(std::move(attempt), connection_attempts_); | ||
| if (!next_attempt_timer_->enabled()) { | ||
| if (next_attempt_timer_ != nullptr && !next_attempt_timer_->enabled()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that next_attempt_timer_ was initialized in the constructor and could never be null. Is this check needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. It's not needed
| EXPECT_FALSE(tracker_.isHttp3Confirmed()); | ||
|
|
||
| EXPECT_CALL(*timer_, enabled()).WillOnce(Return(false)); | ||
| EXPECT_CALL(*timer_, enableTimer(std::chrono::milliseconds(5 * 60 * 1000), nullptr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also test what happens when the timer fires? Or, maybe we tested that earlier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tested in MarkBrokenThenExpires that when timer fires, isHttp3Broken() returns false and hasHttp3FailedRecently() returns true
source/common/http/conn_pool_grid.cc
Outdated
| PoolIterator pool = pools_.begin(); | ||
| if (!shouldAttemptHttp3() || !options.can_use_http3_) { | ||
| Instance::StreamOptions overriding_options(options); | ||
| bool delay_tcp_attempt{true}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we have two initializations, one uses {} and the other (). Could they both use {} (or even = to be more idiomatic)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| EXPECT_NE(grid_->first(), nullptr); | ||
| // The 2nd pool should be TCP pool and it should have been created together with h3 pool. | ||
| EXPECT_NE(grid_->second(), nullptr); | ||
| EXPECT_EQ(2u, grid_->callbacks_.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure I understand how this test that the TCP attempt is not delayed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By checking EXPECT_NE(grid_->second(), nullptr). The 2nd pool (TCP) is created immediately with the quic pool without waiting for the alarm to fire.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks. I see.
| } | ||
| if (!delay_tcp_attempt) { | ||
| // Immediately start TCP attempt if HTTP/3 failed recently. | ||
| wrapped_callbacks_.front()->tryAnotherConnection(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we stop the failover timer at this point since there is nothing else to try?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
next_attempt_timer_ is not accessible from the grid, so I left as is. The alarm should trigger another call to tryAnotherConnection() and early returns if there is no next pool. We definitely can cancel it explicitly by adding a getter interface, but I'm wondering if it's worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, fair enough!
| EXPECT_NE(grid_->first(), nullptr); | ||
| // The 2nd pool should be TCP pool and it should have been created together with h3 pool. | ||
| EXPECT_NE(grid_->second(), nullptr); | ||
| EXPECT_EQ(2u, grid_->callbacks_.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks. I see.
| } | ||
| if (!delay_tcp_attempt) { | ||
| // Immediately start TCP attempt if HTTP/3 failed recently. | ||
| wrapped_callbacks_.front()->tryAnotherConnection(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, fair enough!
|
FYI: I've enabled auto-merge so once CI passes it should land automatically. |
|
/retest |
|
Retrying Azure Pipelines: |
…oyproxy#20722) Commit Message: Add a new state FailedRecently to the status tracker. And make the connectivity grid check this state in newStream(). If the h3 pool is in this state, do not delay TCP racing. Risk Level: low, grid only Testing: added unit tests Docs Changes: N/A Release Notes: N/A Platform Specific Features: N/A Part of envoyproxy#18715 Signed-off-by: Dan Zhang danzh@google.com Signed-off-by: Andre Vehreschild <vehre@x41-dsec.de>
…oyproxy#20722) Commit Message: Add a new state FailedRecently to the status tracker. And make the connectivity grid check this state in newStream(). If the h3 pool is in this state, do not delay TCP racing. Risk Level: low, grid only Testing: added unit tests Docs Changes: N/A Release Notes: N/A Platform Specific Features: N/A Part of envoyproxy#18715 Signed-off-by: Dan Zhang danzh@google.com
Signed-off-by: Dan Zhang danzh@google.com
Commit Message: Add a new state FailedRecently to the status tracker. And make the connectivity grid check this state in newStream(). If the h3 pool is in this state, do not delay TCP racing.
Risk Level: low, grid only
Testing: added unit tests
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A
Part of #18715