[QUERY] Receiving Unauthorized Access Error on RenewToken Periodically with Azure Service Bus Queue Listener #11619
Comments
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jfggdl.
@3437CasaVerde Please take a look at this issue.
Hi @tstepanski, is it possible to share with us the details of
@3437CasaVerde For security purposes, would I be able to PM those details to you?
@tstepanski Yes please. My work email is anqyan@microsoft.com.
@3437CasaVerde I have sent you some examples. I will relay to this GitHub issue any details regarding a diagnosis and solution shared via email, after removing any sensitive data.
Hey @tstepanski, I checked a few examples you emailed me and the logs on my end. It looks like the SAS token expired, and Service Bus got this exception while trying to re-authorize the connection. I can do a wider and deeper investigation. Meanwhile, could you please:
@3437CasaVerde Thank you for checking on that; very odd. As you likely saw, I am using a shared access policy I created with Listen/Read privileges, but I can use RootManageSharedAccessKey for the time being. I will do that and check back after a few hours to see whether any more exceptions have been caught.
@3437CasaVerde After the move to RootManageSharedAccessKey, I have not gotten any more renewal exceptions, but I am seeing ServiceBusCommunicationExceptions with an inner exception of OperationCanceledException. I don't know whether this is related. I have multiple receivers (several topics/queues), and they all seem to disconnect at roughly the same time. If it is an expiration problem, they likely all expire and then reconnect together. This is obviously problematic; I would like to know whether it is isolated to the SDK or a known condition of Azure Service Bus. In addition, using RootManageSharedAccessKey is also a security risk long term, so your investigation is still invaluable. I don't know if this helps or adds confusion.
@AlexGhiondea Thanks for that! @anqyan Any updates?
Hi all, we have the same issue, but in our case we cannot use the RootManageSharedAccessKey for security purposes! What can we do? It seems to be a breaking change between the old client (full .NET Framework) and the new one. We started getting this error after upgrading. @anqyan, can I send you the TrackingId so you can verify on your side that it is the same error as in this topic? Thanks
Hi @cpunella, yes, please send the related client-side logs to anqyan@microsoft.com. Thanks.
Hi, we are receiving the same error. I've sent the related logs by email. We are not using the
Some of our devs are experiencing the same issue. I would be very interested in any updates on this issue.
Our customer sends scheduled messages in an Azure Function app and hit a similar issue: "Unauthorized access. 'Send' claim(s) are required to perform this operation". But after rebooting the Function app, the issue was mitigated. Hello @anqyan, could you please advise? Is it an SDK issue? Thanks.
Hi, I am having the same issue, but it happens every one to two days. When it happens, it shows more than 20 entries in Application Insights (screenshot below). Another curiosity: I have seven queues in this Service Bus namespace, but only two of them are logging the issue. Edit:
Hi @wiliambuzatto @yuhaii @slandsaw @marcelvwe, how often have you/the customers encountered this exception recently?
Hi, same here. My console .exe uses:
Logs:
I see nothing in the dead-letter queue, though.
Hi all, the exception is caused by CBS (claims-based security) token re-authorization failures in Service Bus. How CBS re-authorization works in Service Bus: the client sends a CBS token to put security-related access info in the service-side cache. On the service side there is a timer that schedules re-authorization of the client's access info. Normally the client always sends the token before the timer fires, so the AMQP link between the client and Service Bus can continue working. So, in this case, there could be failures or errors on the client side (inside the Microsoft.Azure.ServiceBus SDK) that prevented the token from being sent to the service side before the timer fired. Hence the Listen claim on the AMQP link expired and was deleted by the service. To learn more about what happens in the client when the exception occurs, we need to enable tracing in the SDK so that MessagingEventSource can log useful information:
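For example, a minimal EventListener along these lines can capture the SDK's events (a sketch, assuming the event source is registered under the name "Microsoft-Azure-ServiceBus"; verify the name against the MessagingEventSource declaration in the SDK version you run):

```csharp
using System;
using System.Diagnostics.Tracing;

// Captures the internal events of the Microsoft.Azure.ServiceBus SDK.
// Assumption: MessagingEventSource is registered as "Microsoft-Azure-ServiceBus";
// check the name against the SDK version in use.
public sealed class ServiceBusTraceListener : EventListener
{
    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        if (eventSource.Name == "Microsoft-Azure-ServiceBus")
        {
            EnableEvents(eventSource, EventLevel.Verbose);
        }
    }

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        // Forward to your real logging pipeline instead of the console.
        var payload = eventData.Payload == null ? "" : string.Join(", ", eventData.Payload);
        Console.WriteLine($"{eventData.EventName}: {payload}");
    }
}
```

Instantiate the listener once at process startup, before creating any queue or subscription clients, and keep the instance alive for the lifetime of the process.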
Meanwhile please treat this exception as not productivity impacting in your services.
@anqyan I PM'd you last week with traces I collected using PerfView. Regarding your comment 'Meanwhile please treat this exception as not productivity impacting in your services': my experience is that once the error occurs, the client is unusable and my application becomes unavailable. I would treat this as a critical issue.
Hi, we are also experiencing this exception. We are using the RootManageSharedAccessKey for the connection. I see it in multiple environments, but more often in our QA environment than in our PROD environment, so maybe it happens more often when the environment is not as busy? A quick test showed me that the subscription was still active and picking up messages after receiving this error, so for us it doesn't look critical. Maybe it is the way we handle the exception?
In the last few days this error has occurred often on all three of our systems.
I've just updated to .NET Core 3.1 on the queue services we run in AKS and our function apps, and I have seen this all the time since updating to the new Service Bus library. I also use the RootManageSharedAccessKey, as we use MassTransit in most of our system, and then just use the new management client to create some subscriptions and queues for the function apps and bits that don't use MassTransit. I get three "errors" now that I didn't see before. As far as I can tell, these don't stop messages from getting processed, but they make your logs look like something is going wrong, and that doesn't fill me with confidence that there isn't a problem:
@anqyan Have you managed to see anything in the traces others have already sent you regarding this? My experience so far: I updated from .NET Core 2.2 to 3.1 and updated the associated Service Bus libraries. I also updated my function apps to v3 and then reprovisioned with our existing provisioning mechanism. I also trashed our web apps, function apps, AKS pods, and bus, as we repopulate the SAS token as part of the provisioning mechanism. This worked, as I've seen it successfully process messages and all of our smoke tests have passed, but I now see these intermittent issues. To me it feels very much like something has changed in the SDK, perhaps around token refresh or the connection, which is making it flake intermittently as we are seeing, or you are now surfacing errors that were being swallowed previously. Perhaps it's some longer-timeout scenario, as others have pointed out; my findings also support this, as some of the message handlers on our bus only run very intermittently, i.e. with gaps of days in between. Here is an example from one of my AKS pods: it successfully receives a message, then we see numerous errors in the following hours:
As of Sunday 2020-12-13, we started seeing this as well. We are using the root manage shared access key (RootManageSharedAccessKey).
@bandmask could you please send your namespace and a detailed exception stack trace to anqyan@microsoft.com?
Hi all, I have followed this issue for a while. Some of the errors reported here are the same ones we are getting with our application. I have opened another GitHub issue (#18035); consider reading the discussion there if you are having similar errors. The good news is that these errors seem to be transient errors automatically handled by the retry policies of the .NET SDK for Azure Service Bus.
Closing the issue, since not enough data was gathered from the client SDK. If you experience this again, please follow the instructions to capture SDK-level logs and reopen this issue.
We have just experienced the very same issue for the first time:
It is not clear why this happens. We are not using tokens to authorise, so we are not sure why there should be a renew-token error.
Hi @andrea-cassioli-maersk, are you using the root access key (RootManageSharedAccessKey in the Azure portal) to authenticate with the Service Bus?
@EnricoMassone we use the primary connection string from RootManageSharedAccessKey.
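Worth noting: even when authenticating with a plain connection string, the Microsoft.Azure.ServiceBus SDK internally derives SAS tokens from the SharedAccessKeyName/SharedAccessKey in that string and periodically renews them over the CBS link, which is why a renew-token error can still appear. A rough sketch of the equivalent provider (key name and key are placeholders, not values from this thread):

```csharp
using Microsoft.Azure.ServiceBus.Primitives;

static class SasTokenExample
{
    // Roughly what the SDK builds from a connection string's
    // SharedAccessKeyName/SharedAccessKey pair; the resulting SAS tokens
    // are what get renewed (and can fail to renew) over the CBS link.
    public static ITokenProvider Create() =>
        TokenProvider.CreateSharedAccessSignatureTokenProvider(
            "<shared-access-key-name>",  // placeholder
            "<shared-access-key>");      // placeholder
}
```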
That's the same for us. That error, based on my understanding, does not make sense, because by definition the root access key should have all the permission claims. We are getting several strange and unclear errors from the .NET SDK from an ASP.NET Core 2.2 application. Our errors are described in this issue. We are not currently able to reproduce them consistently.

I may have some good news. We have rewritten our code to interact with the Azure Service Bus using the new .NET SDK named Azure.Messaging.ServiceBus. We are still testing the new implementation, but it seems that the errors have disappeared, so I would suggest evaluating its adoption as a mitigation strategy.

If you still want to use the Microsoft.Azure.ServiceBus package (instead of the new one mentioned above), you can follow this guide to programmatically get all the logs raised from the Microsoft.Azure.ServiceBus code. By doing so, you can perform verbose logging of all the operations done by the library. I have used these logs to open my own issue. I haven't understood the root cause of the errors, but at least now I have plenty of exception stack traces. You can use them to open a ticket with Azure support if you like.

For our part, we hope that the new Azure.Messaging.ServiceBus SDK has finally solved our troubles. Some colleagues from other teams have suggested we give it a shot, because they have experienced several improvements in their own projects by adopting it.
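For anyone evaluating that migration, here is a minimal receive sketch with the Azure.Messaging.ServiceBus package (the connection string and queue name are placeholders, not values from this thread):

```csharp
using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

class Program
{
    static async Task Main()
    {
        // Placeholders: substitute your own connection string and queue name.
        await using var client = new ServiceBusClient("<namespace-connection-string>");
        ServiceBusProcessor processor = client.CreateProcessor("<queue-name>");

        processor.ProcessMessageAsync += async args =>
        {
            Console.WriteLine($"Received: {args.Message.Body}");
            await args.CompleteMessageAsync(args.Message);
        };

        // Failures, including authorization/renewal problems, are surfaced
        // here rather than thrown on a background thread.
        processor.ProcessErrorAsync += args =>
        {
            Console.WriteLine($"Error from {args.ErrorSource}: {args.Exception.Message}");
            return Task.CompletedTask;
        };

        await processor.StartProcessingAsync();
        Console.ReadKey();
        await processor.StopProcessingAsync();
    }
}
```

One design difference worth noting: the processor funnels errors through the ProcessErrorAsync callback instead of throwing on background threads, which makes errors like the one in this thread easier to observe and log.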
I am receiving a Microsoft.Azure.ServiceBus.ServiceBusException (message below, with sensitive information removed) periodically in my queue receiver. The SAS key has Send/Listen access, and the error seems inconsequential, as processing continues as normal. However, the message is creating a signal-to-noise problem in my dashboards (I receive 10-70 errors per day). Any ideas on why this is happening? The listener is running in an Azure App Service, but I don't think that matters. I have adjusted my retry logic to use a RetryExponential with a 1-second to 1-minute backoff and 5 retries.
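For reference, that retry configuration looks roughly like this (the connection string and queue name are placeholders):

```csharp
using System;
using Microsoft.Azure.ServiceBus;

static class ReceiverSetup
{
    // 1-second minimum backoff, 1-minute maximum backoff, 5 retries,
    // matching the configuration described above.
    public static QueueClient CreateQueueClient() =>
        new QueueClient(
            "<namespace-connection-string>",  // placeholder
            "<queue-name>",                   // placeholder
            ReceiveMode.PeekLock,
            new RetryExponential(
                minimumBackoff: TimeSpan.FromSeconds(1),
                maximumBackoff: TimeSpan.FromMinutes(1),
                maximumRetryCount: 5));
}
```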
Original StackOverflow Question
Packages
.NET Core 3.1
Microsoft.Azure.ServiceBus, Version=4.1.3.0, Culture=neutral, PublicKeyToken=7e34167dcc6d6d8c
Error Message
Source