-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Service Bus] Port fix to use new rhea connection object for connection recovery #8580
[Service Bus] Port fix to use new rhea connection object for connection recovery #8580
Conversation
…andler (Azure#8401) * initial fix * [service-bus] surface disconnect errors to streaming receiver error handler * remove unnecessary flag from clientEntityContext.onDetached * address feedback * improve logs when calling onDetached concurrently * make logs less noisy * remove log * Apply suggestions from code review Co-Authored-By: Ramya Rao <ramya.rao.a@outlook.com> * reduce 3 conditionals to 2 that exit early Co-authored-by: Ramya Rao <ramya.rao.a@outlook.com>
* [service-bus] recreate Connection object on disconnect * update amqp-common * fix onDetached to be ran in parallel instead of serial, adds tests * update changelog * add tests for management client/link * address feedback * Update sdk/servicebus/service-bus/CHANGELOG.md Co-Authored-By: Ramya Rao <ramya.rao.a@outlook.com> Co-authored-by: Ramya Rao <ramya.rao.a@outlook.com>
…to harshan/sb/port-8447
…ru/azure-sdk-for-js into harshan/sb/port-8444
…ru/azure-sdk-for-js into harshan/sb/port-8401
…to harshan/sb/port-8447
await serviceBusClient.test.afterEach(); | ||
}); | ||
|
||
it("can receive and settle messages after a disconnect", async function(): Promise<void> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test times out. Not able to receive the second message.
Worked fine in track 1. Yet to find the root cause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Findings
Upon simulating the disconnect, the error thrown is being perceived as non-retryable by the core-amqp. Which is why the receiver never came back up for receiving the second message.
This is not the case with amqp-common, this error is treated as a generic-error and is wrapped as a MessagingError(and by default, the retryable flag is set to true).
This is only the difference b/w track 1 and track 2 and is the cause behind the test failure.
Hence, in track 2, when the error is caused by a disconnect and if the error is not translated as a messaging-error, we don't care about the retryable flag on the error and re-open anyway because we don't want to miss out on errors caused by disconnect.
…to harshan/sb/port-8447
/azp run js - servicebus - tests |
Pull request contains merge conflicts. |
…to harshan/sb/port-8401
/azp run js - servicebus - tests |
Azure Pipelines successfully started running 1 pipeline(s). |
Co-authored-by: Ramya Rao <ramya.rao.a@outlook.com>
…ByDisconnect can be undefined
/azp run js - servicebus - tests |
Azure Pipelines successfully started running 1 pipeline(s). |
if (isMessagingError(translatedError)) { | ||
shouldReopen = translatedError.retryable; | ||
} else { | ||
shouldReopen = !translatedError || causedByDisconnect || false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What case are we trying to handle with false
in the end here?
Non messaging error not caused by disconnect? Why would they be not retryable?
Maybe this should distill down to
const shouldReopen = isMessagingError(translatedError) ? translatedError.retryable : true;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good simplification...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a false
since typescript was complaining here about the "undefined" being assigned to the boolean value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What case are we trying to handle with false in the end here?
Adding || false
won't affect anything. No additional case is being handled other than fixing the typescript complaint.
Non messaging error not caused by disconnect? Why would they be not retryable?
They are not retryable because the translate()
method did not consider them retryable.
const shouldReopen = isMessagingError(translatedError) ? translatedError.retryable : true;
Are you suggesting that we should allow all non-messaging errors to be retried?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding || false won't affect anything. No additional case is being handled other than fixing the typescript complaint.
Thats not true. There are errors for which translate() will return as is and not change it to a MessagingError
. For these, you will be setting shouldReopen
to false, but in Track 1 these will be true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I said it "won't affect anything", I meant that this code is in the same state as the code from your previous suggestion(from the other PR).
There are errors for which translate() will return as is and not change it to a MessagingError. For these, you will be setting shouldReopen to false, but in Track 1 these will be true
I thought that's what we agreed upon and mentioned in the comments too.
So finally, Going with re-open if
- No error
- Messaging-error that is retryable
- Non-messaging-error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
More info - #8447 and #8401
This PR attempts to port the changes from