com.microsoft.azure.CloudException: A retryable error occurred. #1924
Comments
We also create new security groups that are attached to the network associated with the LB. |
We'd need to see a repro code sample. |
There is not much to show... just a plain define()...create() call: `azure.loadBalancers().define(name).withRegion(region).withExistingResourceGroup(rgName)` |
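For reference, a complete chain in the 1.x fluent style looks roughly like the following. This is only a sketch: `createLoadBalancer` and the frontend/backend/probe/rule names are hypothetical placeholders, and exact method names can vary between SDK versions.

```java
import com.microsoft.azure.management.Azure;
import com.microsoft.azure.management.network.LoadBalancer;
import com.microsoft.azure.management.network.PublicIPAddress;
import com.microsoft.azure.management.network.TransportProtocol;
import com.microsoft.azure.management.resources.fluentcore.arm.Region;

// Sketch of a full define()...create() chain (1.x fluent API).
// "frontend", "backendPool", and "httpProbe" are placeholder names.
static LoadBalancer createLoadBalancer(Azure azure, String name, Region region,
                                       String rgName, PublicIPAddress publicIP) {
    return azure.loadBalancers().define(name)
        .withRegion(region)
        .withExistingResourceGroup(rgName)
        .defineLoadBalancingRule("httpRule")
            .withProtocol(TransportProtocol.TCP)
            .fromFrontend("frontend")
            .fromFrontendPort(80)
            .toBackend("backendPool")
            .withProbe("httpProbe")
            .attach()
        .definePublicFrontend("frontend")
            .withExistingPublicIPAddress(publicIP)
            .attach()
        .defineHttpProbe("httpProbe")
            .withRequestPath("/")
            .attach()
        .create();
}
```

The whole fluent definition is collected client-side and, as noted later in the thread, should go out as essentially a single create call to the network resource provider. |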
It happens intermittently, and the question is really more about what can cause this kind of exception? |
Yep, there is nothing unique about the sample code shown here; it looks much like our test cases and samples. I haven't seen this error in this specific context, but the best guess right now is that there may be some transient issue in Azure. In general, SDK issues don't manifest themselves as retryable errors from the service, but more commonly as very consistent NPEs or the like, especially in cases like this, where the shown code sample doesn't do anything concurrently and should result in just a single, simple REST call to Azure networking. Another possibility is that your app is getting throttled because of too many requests going into Azure within a short time window. Just a guess: it wouldn't be related to this API specifically, but maybe there is a bunch of other things your app is doing concurrently that could trigger throttling on the service side? |
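If throttling or a transient backend fault is the cause, wrapping the call in a bounded retry with backoff is a common mitigation. Below is a minimal sketch; `withRetries`, the attempt count, and the delays are arbitrary choices rather than SDK defaults, and one could additionally inspect `e.body().code()` to retry only genuinely transient errors.

```java
import com.microsoft.azure.CloudException;
import java.util.concurrent.Callable;

// Sketch: retry a management call a few times with exponential backoff
// when the service throws a CloudException (e.g. a retryable error).
static <T> T withRetries(Callable<T> call) throws Exception {
    long delayMs = 2_000;                  // arbitrary initial delay
    for (int attempt = 1; ; attempt++) {
        try {
            return call.call();
        } catch (CloudException e) {
            if (attempt >= 5) throw e;     // give up after 5 attempts
            Thread.sleep(delayMs);         // back off before retrying
            delayMs *= 2;
        }
    }
}
```

Usage against the hypothetical helper above would be, for example, `withRetries(() -> createLoadBalancer(azure, name, region, rgName, publicIP))`. |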
Whatever the reason may be, Microsoft Azure needs to take a very close look at their architecture and fundamentally address it. We're an ISV providing products on the Azure environment, and there needs to be a lot of explanation to customers about the reason for all these failures.
Another, unrelated issue is the different responses from the get() and list() APIs: the IDs of objects are not the same (!). One is mixed-case, the other lower-cased. This applies to many APIs, namely VMs, Route Tables, Subnets, VNETs, etc. |
Yes, resource ID casings in Azure REST APIs may vary for the same resource, depending on how they are fetched. Strictly speaking, resource IDs are case-insensitive from the point of view of the service (so comparisons should be done in case-insensitive ways), but ideally the casing should be consistent and preserved. This is a known issue. We've been advised not to try to work around it on the SDK side, but to rely on the Azure backend addressing it eventually -- it's being looked at.
By the way, for such intermittent issues with backend services rather than the SDK, which looks like the most likely case here, I'd recommend submitting them to Azure Support, since they can investigate a specific failure in a specific subscription based on service logs and get a deeper understanding of the root cause. |
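Concretely, until the backend normalizes casing, comparing IDs case-insensitively avoids the pitfall. A small illustration (the IDs are made up):

```java
public class ResourceIdCompare {
    public static void main(String[] args) {
        // Two IDs for the same resource, as they might come back from get() vs. list().
        String idFromGet  = "/subscriptions/0000/resourceGroups/MyRG/providers/Microsoft.Network/virtualNetworks/MyVNet";
        String idFromList = "/subscriptions/0000/resourcegroups/myrg/providers/microsoft.network/virtualnetworks/myvnet";

        System.out.println(idFromGet.equals(idFromList));            // false: casing differs
        System.out.println(idFromGet.equalsIgnoreCase(idFromList));  // true: same resource
    }
}
```

String.CASE_INSENSITIVE_ORDER is likewise handy when keying maps or sets by resource ID. |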
Not ideally, but literally: it would be good to at least document this.
Other side comments on the design:
1) It is uncommon to use user-defined names in IDs; it suggests the names are immutable.
2) It is uncommon for the unique object identifier to be case-insensitive.
I think Microsoft Azure should probably reconsider and use something like UUIDs, which are the most common choice in the industry. That would kill two birds with one stone. |
@martinsawicki, was this fixed by #1971? Can we close this issue as well? |
Closing out several-year-old issues in the repo. |
I observed a different flavor of this issue today. The load balancer was already deployed through the portal, and this error showed up because the backend pool was still being deployed. I had to wait for it to finish before I could create the health probe without hitting this problem. |
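A workaround along those lines is to poll the load balancer's provisioning state before attempting the update. A sketch, assuming the 1.x fluent API surface (`inner().provisioningState()`, `refresh()`, `update()...apply()`); the helper name, probe name, and timeout are made up, and exact signatures may differ by SDK version:

```java
import com.microsoft.azure.management.Azure;
import com.microsoft.azure.management.network.LoadBalancer;

// Sketch: poll until the LB reports "Succeeded" before adding a probe.
// Helper and probe names are hypothetical; the timeout is arbitrary.
static void addProbeWhenReady(Azure azure, String rgName, String lbName)
        throws InterruptedException {
    LoadBalancer lb = azure.loadBalancers().getByResourceGroup(rgName, lbName);
    for (int i = 0; i < 30 && !"Succeeded".equalsIgnoreCase(
            String.valueOf(lb.inner().provisioningState())); i++) {
        Thread.sleep(10_000);   // poll every 10s, up to ~5 minutes
        lb = lb.refresh();      // re-read current state from the service
    }
    lb.update()
        .defineHttpProbe("healthProbe")
            .withRequestPath("/health")
            .attach()
        .apply();
}
```

The polling interval and cap are arbitrary; newer SDK versions may also surface this state as a typed enum rather than a string. |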
This happens when creating a load balancer. A second call works just fine. It seems to be some sort of concurrency, timing, or resource-readiness constraint. Do any of the arguments to loadBalancers().define() have a state requirement? We're creating the resource group right before LB creation.
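Since the resource group is created immediately before the LB, one thing worth trying (a sketch, not a confirmed fix) is to let the SDK create the group inside the same definition chain, so it sequences the dependency itself:

```java
// Sketch: have the fluent SDK create the resource group as part of the
// load balancer definition rather than as a separate preceding call.
LoadBalancer lb = azure.loadBalancers().define(name)
    .withRegion(region)
    .withNewResourceGroup(rgName)   // SDK creates the group, then the LB
    // ... frontend/backend/probe/rule definitions as before ...
    .create();
```

There is also an overload taking a `Creatable<ResourceGroup>` if the group needs tags or other settings. |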