Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(transport): return Poll::ready until error is consumed #495

Conversation

tl-helge-hoff
Copy link
Contributor

When a lazy connection fails to connect the first time it returns Poll::ready and saves the error in the Reconnect service.
Subsequent calls to Reconnect::poll_ready return Poll::Pending which makes tower_balance loop forever as it will never get a ready service.
This is due to Reconnect::poll_ready being called twice by tower_balance and returning first Poll::Ready and then
Poll::Pending.

Motivation

We bumped into the infinite-looping issue while using the Channel::balance_channel together with tls,
when our tls config was incorrect and triggering an error during the handshake.

Solution

Reconnect::poll_ready will consistently return Poll::Ready until the error is consumed in Reconnect::call.
After the error is consumed, and the error is surfaced through a client call, the Reconnect service will try to connect again.
The PR also adds trace logging in the Reconnect service.

When a lazy connection fails to connect it first
returns Poll::ready from the reconnect service,
yet the subsequent call returns Poll::pending
making tower_balance loop forever.
Instead, on error we return Ready
until the error is consumed in the
call method.
Copy link
Member

@LucioFranco LucioFranco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay on this! I'd like to see that one trait bound change, then we can merge it.

tonic/src/transport/service/reconnect.rs Outdated Show resolved Hide resolved
@@ -13,6 +14,7 @@ use tracing::trace;
pub(crate) struct Reconnect<M, Target>
where
M: Service<Target>,
M::Error: Debug,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we actually want to use Into<Box<dyn Error + Send + Sync>> here and then convert it. If it impls that trait and you call into() then it will automatically impl Debug because of the Error trait.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've done what you've suggested in the last comment.

@@ -52,6 +54,7 @@ where
M::Future: Unpin,
Error: From<M::Error> + From<S::Error>,
Target: Clone,
<M as tower_service::Service<Target>>::Error: fmt::Debug,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still needs to be Into<Error> or else it might break the stack down the line.

@LucioFranco
Copy link
Member

Closing in favor of #536

@LucioFranco
Copy link
Member

Thanks @tl-helge-hoff for getting this started!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants