[BUG] Long running message bus receivers have intermittent errors #8884

Closed
dandenton opened this issue Nov 22, 2019 · 8 comments
Labels: customer-reported, Service Attention, Service Bus

@dandenton

Describe the bug
Occasionally, while running long-lived code that listens to subscriptions on Service Bus topics, we get the errors shown below delivered to the registered exceptionReceivedHandler. We have 3 services all running against the same message bus, but sometimes only 1 or 2 of them get the errors at the same time. Error 1 was received last night by 2 of the services, for different subscriptions but at the same time. All 3 errors were received when no one was using the application and nothing was going over the message bus. Sometimes we go several days without errors, and some days we get several hundred. They don't seem to affect functionality, and we're already using the RetryExponential policy. We seemed to get the errors more frequently on v4.0.0, but we still receive them after upgrading to v4.1.1.

Are these exceptions safe to ignore or are we doing something wrong?

Exception or Stack Trace
Error 1:

Microsoft.Azure.ServiceBus.ServiceBusCommunicationException: The link '2341f6c8-d9b1-4510-aa85-99b15249e8dc;270:275:276:source(address:auth.runmigration/Subscriptions/Auth):MessageReceiver14auth.runmigration/Subscriptions/Auth' is force detached. Code: RenewToken. Details: Unauthorized access. 'Listen' claim(s) are required to perform this operation. Resource: 'sb://safari-prd-eastus.servicebus.windows.net/auth.runmigration/subscriptions/auth'.. TrackingId:d706dde4a00f411d836605d4476a3cab_G0, SystemTracker:gateway7, Timestamp:2019-11-22T08:03:06
 ---> Microsoft.Azure.ServiceBus.ServiceBusException: The link '2341f6c8-d9b1-4510-aa85-99b15249e8dc;270:275:276:source(address:auth.runmigration/Subscriptions/Auth):MessageReceiver14auth.runmigration/Subscriptions/Auth' is force detached. Code: RenewToken. Details: Unauthorized access. 'Listen' claim(s) are required to perform this operation. Resource: 'sb://safari-prd-eastus.servicebus.windows.net/auth.runmigration/subscriptions/auth'.. TrackingId:d706dde4a00f411d836605d4476a3cab_G0, SystemTracker:gateway7, Timestamp:2019-11-22T08:03:06
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan serverWaitTime)
   --- End of inner exception stack trace ---
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan serverWaitTime)
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<>c__DisplayClass64_0.d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
   at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(Int32 maxMessageCount, TimeSpan operationTimeout)
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(TimeSpan operationTimeout)
   at Microsoft.Azure.ServiceBus.MessageReceivePump.b__11_0()

Error 2:

Microsoft.Azure.ServiceBus.ServiceBusException: The service was unable to process the request; please retry the operation. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101 TrackingId:2ead8826195f40108e1cf2af1e3c62af_G25, SystemTracker:gateway7, Timestamp:2019-11-22T06:56:29
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan serverWaitTime)
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<>c__DisplayClass64_0.d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
   at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(Int32 maxMessageCount, TimeSpan operationTimeout)
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(TimeSpan operationTimeout)
   at Microsoft.Azure.ServiceBus.MessageReceivePump.b__11_0()

Error 3:

Microsoft.Azure.ServiceBus.ServiceBusTimeoutException: The operation did not complete within the allocated time 00:00:59.9999140 for object tls70.
 ---> System.TimeoutException: The operation did not complete within the allocated time 00:00:59.9999140 for object tls70.
   at Microsoft.Azure.Amqp.AsyncResult.End[TAsyncResult](IAsyncResult result)
   at Microsoft.Azure.Amqp.AmqpObject.OpenAsyncResult.End(IAsyncResult result)
   at Microsoft.Azure.Amqp.AmqpObject.EndOpen(IAsyncResult result)
   at Microsoft.Azure.Amqp.Transport.TlsTransportInitiator.HandleTransportOpened(IAsyncResult result)
   at Microsoft.Azure.Amqp.Transport.TlsTransportInitiator.OnTransportOpened(IAsyncResult result)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.ServiceBus.ServiceBusConnection.CreateConnectionAsync(TimeSpan timeout)
   at Microsoft.Azure.Amqp.FaultTolerantAmqpObject`1.OnCreateAsync(TimeSpan timeout)
   at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
   at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
   at Microsoft.Azure.ServiceBus.Amqp.AmqpLinkCreator.CreateAndOpenAmqpLinkAsync()
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.CreateLinkAsync(TimeSpan timeout)
   at Microsoft.Azure.Amqp.FaultTolerantAmqpObject`1.OnCreateAsync(TimeSpan timeout)
   at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
   at Microsoft.Azure.Amqp.Singleton`1.GetOrCreateAsync(TimeSpan timeout)
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan serverWaitTime)
   --- End of inner exception stack trace ---
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan serverWaitTime)
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<>c__DisplayClass64_0.d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
   at Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(Int32 maxMessageCount, TimeSpan operationTimeout)
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.ReceiveAsync(TimeSpan operationTimeout)
   at Microsoft.Azure.ServiceBus.MessageReceivePump.b__11_0()

To Reproduce
Create a subscription to a topic, open it for receiving, and let it run for a few days. Sorry... I'd hate to get a bug report like this.
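
A minimal sketch of the kind of receiver registration involved; the entity names come from the error messages above, while the connection string, retry values, and handler bodies are placeholders rather than our production code:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;

class ReceiverSketch
{
    static ISubscriptionClient client;

    static void StartListening()
    {
        client = new SubscriptionClient(
            "<service-bus-connection-string>",   // placeholder
            "auth.runmigration",
            "Auth",
            ReceiveMode.PeekLock,
            new RetryExponential(TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(30), 5));

        client.RegisterMessageHandler(
            async (message, cancellationToken) =>
            {
                // Application-specific processing, then settle the message.
                await client.CompleteAsync(message.SystemProperties.LockToken);
            },
            new MessageHandlerOptions(ExceptionReceivedHandler)
            {
                MaxConcurrentCalls = 1,
                AutoComplete = false
            });
    }

    static Task ExceptionReceivedHandler(ExceptionReceivedEventArgs args)
    {
        // The intermittent errors described in this issue are delivered here.
        Console.Error.WriteLine(args.Exception);
        return Task.CompletedTask;
    }
}
```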

Setup (please complete the following information):
The listener is running in an Azure App Service with Always On enabled
Standard tier Service Bus
.NET Core 3.0
Microsoft.Azure.ServiceBus v4.1.1

triage-new-issues bot added the needs-triage label Nov 22, 2019
maggiepint added the customer-reported, Service Attention, and Service Bus labels Nov 23, 2019
triage-new-issues bot removed the needs-triage label Nov 23, 2019
@ghost

ghost commented Nov 23, 2019

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jfggdl


@nemakam
Contributor

nemakam commented Nov 25, 2019

@dandenton,
ServiceBusCommunicationException is thrown when there is some kind of network issue. It could be that the connection dropped and had to be re-established. That can happen for several reasons: a node in between (front end / back end) restarted, the load balancer moved containers around, or a service upgrade rolled through.

ServiceBusTimeoutException and ServiceBusCommunicationException are transient exceptions, so you can give them less weight; they will happen once in a while. If retries don't recover from them, that would definitely be concerning.

ServiceBusException: The service was unable to process the request - this looks like an error that happened on the service side, or one that wasn't classified appropriately. It doesn't look like a client-side bug, so if you need more detail, please open a support ticket through the Azure portal; that's the better place to get help for it.
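
Following that guidance, one way to tell the transient errors apart inside the exceptionReceivedHandler is to check ServiceBusException.IsTransient; a sketch, with placeholder logging calls:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;

static class ExceptionHandling
{
    // ServiceBusCommunicationException and ServiceBusTimeoutException report
    // IsTransient = true; the built-in retry policy re-establishes the link,
    // so these can be logged at a lower severity rather than alerted on.
    public static Task ExceptionReceivedHandler(ExceptionReceivedEventArgs args)
    {
        var context = args.ExceptionReceivedContext;

        if (args.Exception is ServiceBusException sbException && sbException.IsTransient)
        {
            Console.WriteLine(
                $"Transient Service Bus error on '{context.EntityPath}' during '{context.Action}': {sbException.Message}");
        }
        else
        {
            Console.Error.WriteLine(
                $"Non-transient error on '{context.EntityPath}': {args.Exception}");
        }

        return Task.CompletedTask;
    }
}
```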

@nemakam nemakam closed this as completed Nov 25, 2019

@cpunella

Hi, we are experiencing the same error when we use a generated SAS token instead of the RootManageSharedAccessKey one. Do you have any update on this?

It seems like a breaking change between the old client (.NET full framework) and the new one. We started getting this error after the migration.

Thanks,
Carmine
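
For reference, a token-provider-based client built from a dedicated SAS policy would look roughly like the sketch below; the policy name, key, and namespace are placeholders. The relevant point for the RenewToken error in Error 1 is that whatever policy backs the token must grant the Listen right on the entity, otherwise the service force detaches the link when the token is renewed:

```csharp
using Microsoft.Azure.ServiceBus;
using Microsoft.Azure.ServiceBus.Primitives;

static class SasReceiverSketch
{
    public static ISubscriptionClient Create()
    {
        // Token provider backed by a dedicated SAS policy (placeholder names).
        // The policy must grant the 'Listen' right on the topic/subscription,
        // otherwise the link is force detached with Code: RenewToken.
        var tokenProvider = TokenProvider.CreateSharedAccessSignatureTokenProvider(
            "listen-only-policy",        // placeholder SAS policy name
            "<shared-access-key>");      // placeholder key

        return new SubscriptionClient(
            "sb://<namespace>.servicebus.windows.net/",  // placeholder namespace
            "auth.runmigration",
            "Auth",
            tokenProvider,
            TransportType.Amqp,
            ReceiveMode.PeekLock,
            RetryPolicy.Default);
    }
}
```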

@nemakam
Contributor

nemakam commented May 26, 2020

@cpunella, could you share the exact error you are facing?

@nemakam
Contributor

nemakam commented May 26, 2020

@anqyan, this exception looks similar to the one in the other issue. Could you please take a look?

 ---> Microsoft.Azure.ServiceBus.ServiceBusException: The link '2341f6c8-d9b1-4510-aa85-99b15249e8dc;270:275:276:source(address:auth.runmigration/Subscriptions/Auth):MessageReceiver14auth.runmigration/Subscriptions/Auth' is force detached. Code: RenewToken. Details: Unauthorized access. 'Listen' claim(s) are required to perform this operation. Resource: 'sb://safari-prd-eastus.servicebus.windows.net/auth.runmigration/subscriptions/auth'.. TrackingId:d706dde4a00f411d836605d4476a3cab_G0, SystemTracker:gateway7, Timestamp:2019-11-22T08:03:06
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.OnReceiveAsync(Int32 maxMessageCount, TimeSpan serverWaitTime)
   --- End of inner exception stack trace ---

@anqyan

anqyan commented May 27, 2020

@nemakam Yes, I will take a look.
@cpunella Hi, is this the same issue you mentioned in #11619? If so, let's track it in an open GitHub issue, either that one or a new one. Please feel free to send related logs to anqyan@microsoft.com. Thanks.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 29, 2023