[BUG] ExpiredAuthenticationToken happened when deleting resource groups #33112
Comments
@weidongxu-microsoft could you please follow up with @waynewang1989 |
@XiaofeiCao for investigation. It might be related to or caused by azure-identity. azure-identity should have refreshed the token 5 to 10 minutes before the expiration time. |
@waynewang1989 Which version of |
@XiaofeiCao, Please check: |
Hi @waynewang1989, I tried expiring the token, but the current SDK successfully refreshed it. To help navigate where the bug could be, I went through the auth flow in our SDK and summarized a very simple version of it:
flowchart LR
A[HttpPipeline] --> B{Local token cache exists?}
B -- Yes --> C{Should refresh cache?}
C -- Yes --> D[Get token and cache it]
C -- No ----> E
D --> E[Execute request using token]
E --> F[Get response]
B -- No ----> D
And in
// compares local time to the time of expiry minus REFRESH_OFFSET (5 min)
OffsetDateTime.now()
.isAfter(accessToken.getExpiresAt().minus(REFRESH_OFFSET))
There may be a chance that your local time deviates from the server's, e.g. your local time is 5:00 PM while the server's is 5:24 PM, and the token expiry time is 5:10 PM. We can start by eliminating this possibility. Could you also log the time the exception is thrown? You can use the following code:
AzureResourceManager azureResourceManager = AzureResourceManager
.configure()
.withLogOptions(new HttpLogOptions()
.setLogLevel(HttpLogDetailLevel.BODY_AND_HEADERS)
.setResponseLogger((logger, loggingOptions) -> {
final HttpResponse response = loggingOptions.getHttpResponse();
String contentLengthString = response.getHeaderValue("Content-Length");
String bodySize = (CoreUtils.isNullOrEmpty(contentLengthString))
? "unknown-length body"
: contentLengthString + "-byte body";
StringBuilder responseLogMessage = new StringBuilder();
responseLogMessage
// log the time of the response
.append("[")
.append(OffsetDateTime.now().format(DateTimeFormatter.RFC_1123_DATE_TIME))
.append("]")
.append("<-- ")
.append(response.getStatusCode())
.append(" ")
.append(response.getRequest().getUrl())
.append(" (")
.append(loggingOptions.getResponseDuration().toMillis())
.append(" ms, ")
.append(bodySize)
.append(")")
.append(System.lineSeparator());
HttpResponse bufferedResponse = response.buffer();
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
WritableByteChannel bodyContentChannel = Channels.newChannel(outputStream);
return bufferedResponse.getBody()
.flatMap(byteBuffer -> {
try {
bodyContentChannel.write(byteBuffer.duplicate());
return Mono.just(byteBuffer);
} catch (IOException ex) {
return Mono.error(ex);
}
})
.doFinally(ignored -> {
responseLogMessage.append(", Response body:")
.append(System.lineSeparator())
.append(outputStream.toString(StandardCharsets.UTF_8))
.append(System.lineSeparator())
.append(" <-- END HTTP");
logger.info(responseLogMessage.toString());
}).then(Mono.just(bufferedResponse));
})
)
.authenticate(credential, profile)
.withDefaultSubscription(); |
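The clock-skew scenario described above can be checked in isolation. The following is a minimal sketch, not the SDK's actual implementation: the class name `TokenRefreshSketch` and the helper `isExpired` are hypothetical, and only the comparison itself and the 5-minute `REFRESH_OFFSET` come from the comment above. The timestamps reuse the ones from the reported error message, with an assumed local clock that runs 24 minutes behind.

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

// Hypothetical stand-alone sketch of the refresh check; not SDK code.
final class TokenRefreshSketch {
    // Offset taken from the comment above (5 min).
    static final Duration REFRESH_OFFSET = Duration.ofMinutes(5);

    // Client-side decision: refresh once "now" passes expiry minus the offset.
    static boolean shouldRefresh(OffsetDateTime now, OffsetDateTime expiresAt) {
        return now.isAfter(expiresAt.minus(REFRESH_OFFSET));
    }

    // Server-side view (assumed): the token is rejected once "now" passes expiry.
    static boolean isExpired(OffsetDateTime now, OffsetDateTime expiresAt) {
        return now.isAfter(expiresAt);
    }

    public static void main(String[] args) {
        OffsetDateTime expiresAt = OffsetDateTime.of(2023, 1, 18, 17, 10, 42, 0, ZoneOffset.UTC);
        // Assumed skewed client clock: 5:00 PM while the server reads 5:24:09 PM.
        OffsetDateTime skewedLocal = OffsetDateTime.of(2023, 1, 18, 17, 0, 0, 0, ZoneOffset.UTC);
        OffsetDateTime server = OffsetDateTime.of(2023, 1, 18, 17, 24, 9, 0, ZoneOffset.UTC);

        System.out.println("client refreshes: " + shouldRefresh(skewedLocal, expiresAt));
        System.out.println("server rejects:   " + isExpired(server, expiresAt));
    }
}
```

With a skewed clock the client keeps using the cached token while the server already considers it expired, which is why ruling out time drift comes first.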
Thanks, Xiaofei, This method could generate too many logs which are too heavy for the logging system. All our services are deployed in Azure AKS clusters and I checked that the time is synced. I don't think there exists any difference between the Azure compute service and the k8s pods. CMIIAW, I think from the Azure SDK client side,
One question: |
Thanks @waynewang1989 for the confirmation and proposal, they make total sense to me. We may not want to rush to the retry logic before we know the root cause of the issue. There is even a chance that the service backend returns an expired token. As to your question, I haven't found a way to refresh the token myself... Will get back to you once I do. |
Thanks, @XiaofeiCao Looking forward to hearing from you! |
Hi @waynewang1989, @wangwenbj, You could configure the SDK to retry on
// configure AzureResourceManager
AzureResourceManager
.configure()
.withLogOptions(new HttpLogOptions().setLogLevel(HttpLogDetailLevel.BODY_AND_HEADERS))
.withHttpClient(httpClient)
// retry policy
.withRetryPolicy(new RetryPolicy(new ExponentialBackoff() {
@Override
public boolean shouldRetry(HttpResponse httpResponse) {
boolean isExpiredToken = isExpiredAuthenticationToken(httpResponse);
if (isExpiredToken) {
// Do some log here
LOGGER.error("Token expired. \nMessage: {}.\nCurrent UTC time: {}",
httpResponse.getBodyAsString().block(),
OffsetDateTime.now().atZoneSameInstant(ZoneOffset.UTC)
// month/day order, matching the timestamps in the service's error message
.format(DateTimeFormatter.ofPattern("M/d/yyyy h:mm:ss a")));
} else if (httpResponse.getHeaderValue(WWW_AUTHENTICATE) != null) {
// in case a Conditional Access policy change, log it:
// https://learn.microsoft.com/en-us/azure/active-directory/conditional-access/concept-continuous-access-evaluation
LOGGER.warning("Conditional Access state changed, header: {}", httpResponse.getHeaderValue(WWW_AUTHENTICATE));
}
return super.shouldRetry(httpResponse)
|| isExpiredToken;
}
}))
// check
private boolean isExpiredAuthenticationToken(HttpResponse httpResponse) {
return
// 401
httpResponse.getStatusCode() == HttpURLConnection.HTTP_UNAUTHORIZED
// no ARM Challenge present
&& httpResponse.getHeaderValue("WWW-Authenticate") == null
// contains error code "ExpiredAuthenticationToken"
&& httpResponse.getBodyAsBinaryData() != null
&& httpResponse.getBodyAsBinaryData().toString().contains("ExpiredAuthenticationToken");
}
Mock test with retry: https://github.com/XiaofeiCao/ioexception_repro/blob/main/src/test/java/com/azure/resourcemanager/repro/ioexception/test/expiredtoken/LongRunningOperationTokenExpiredTests.java#L96
Mock test to reproduce with 100 parallelism but failed: |
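The detection rule used in the retry policy above can be exercised without the SDK. This is a sketch under stated assumptions: the class `ExpiredTokenCheck` and its plain-typed signature (status code, `WWW-Authenticate` header value, body text) are hypothetical stand-ins for `HttpResponse`, and only the three conditions come from the snippet above.

```java
import java.net.HttpURLConnection;

// Hypothetical SDK-free restatement of the check above, for unit testing.
final class ExpiredTokenCheck {
    static boolean isExpiredAuthenticationToken(int statusCode, String wwwAuthenticate, String body) {
        return statusCode == HttpURLConnection.HTTP_UNAUTHORIZED   // 401
            && wwwAuthenticate == null                              // no ARM challenge present
            && body != null
            && body.contains("ExpiredAuthenticationToken");         // error code in the body
    }

    public static void main(String[] args) {
        String body = "{\"error\":{\"code\":\"ExpiredAuthenticationToken\"}}";
        // 401 without a challenge and with the error code: retry candidate.
        System.out.println(isExpiredAuthenticationToken(401, null, body));
        // 401 with a challenge header: treated as a Conditional Access case instead.
        System.out.println(isExpiredAuthenticationToken(401, "Bearer error=\"insufficient_claims\"", body));
        // Any other status is left to the default retry strategy.
        System.out.println(isExpiredAuthenticationToken(500, null, body));
    }
}
```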
Hi wen, Let's wait for more logs from the retry above. The thing is, if the service decides to return an expired token to us, a manual or automatic refresh from the SDK is meaningless. |
Thank you for your feedback. This has been routed to the support team for assistance. |
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @armleads-azure. |
@armleads-azure Could you please look into this once you get a chance ? Thanks in advance. |
@navba-MSFT ARM doesn't generate auth tokens, AAD does. If the assumption here is that an expired JWT token is being provided to a user then AAD/identity is the right service contact. I don't know how they are represented in github. |
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @adamedx. |
Adding the @adamedx team to look into this. |
Describe the bug
"code":"ExpiredAuthenticationToken","message":"The access token expiry UTC time '1/18/2023 5:10:42 PM' is earlier than current UTC time '1/18/2023 5:24:09 PM'."}}": The access token expiry UTC time '1/18/2023 5:10:42 PM' is earlier than current UTC time '1/18/2023 5:24:09 PM'
Exception or Stack Trace
The access token expiry UTC time '1/18/2023 5:10:42 PM' is earlier than current UTC time '1/18/2023 5:24:09 PM'."}}": The access token expiry UTC time '1/18/2023 5:10:42 PM' is earlier than current UTC time '1/18/2023 5:24:09 PM
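To quantify how stale the token was, the two timestamps in the message can be parsed and subtracted. This is a minimal sketch: the class `ExpiryGap` and the method `expiredFor` are hypothetical, and the `M/d/yyyy h:mm:ss a` pattern is an assumption inferred from the message text above rather than a documented format.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: how long had the token been expired when the request arrived?
final class ExpiryGap {
    // Matches quoted timestamps such as '1/18/2023 5:10:42 PM'.
    private static final Pattern QUOTED_TIME =
        Pattern.compile("'(\\d{1,2}/\\d{1,2}/\\d{4} \\d{1,2}:\\d{2}:\\d{2} [AP]M)'");
    // Assumed format, inferred from the error message.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("M/d/yyyy h:mm:ss a", Locale.ENGLISH);

    static Duration expiredFor(String message) {
        Matcher matcher = QUOTED_TIME.matcher(message);
        List<LocalDateTime> times = new ArrayList<>();
        while (times.size() < 2 && matcher.find()) {
            times.add(LocalDateTime.parse(matcher.group(1), FORMAT));
        }
        if (times.size() < 2) {
            throw new IllegalArgumentException("expected two quoted timestamps");
        }
        // First timestamp is the token expiry, second is the service's current time.
        return Duration.between(times.get(0), times.get(1));
    }

    public static void main(String[] args) {
        String message = "The access token expiry UTC time '1/18/2023 5:10:42 PM' "
            + "is earlier than current UTC time '1/18/2023 5:24:09 PM'.";
        Duration gap = expiredFor(message);
        System.out.println(gap.toMinutes() + " min " + (gap.getSeconds() % 60) + " s");
    }
}
```

For the message above the gap is 13 minutes 27 seconds, well beyond the 5-minute refresh offset discussed in the comments.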
To Reproduce
Deleting a resource group triggered this issue once.
Code Snippet
azureResourceManager.resourceGroups().deleteByName(name)
Expected behavior
No access token error should occur.
Information Checklist
Kindly make sure that you have added all the following information above and check off the required fields; otherwise we will treat the issue as an incomplete report.