New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDFS-13603: do not propagate ExecutionException and add maxRetries li… #6774
base: trunk
Are you sure you want to change the base?
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
…mit to NameNode edek cache warmup
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned will changing the contract of warmUpEncryptedKeys
from
exception if any key isn't initialized
to
exception only if all keys can't be initialized
You can still like the number of retries, without changing the above.
@@ -269,12 +269,23 @@ public ValueQueue(final int numValues, final float lowWaterMark, long expiry, | |||
* Initializes the Value Queues for the provided keys by calling the | |||
* fill Method with "numInitValues" values | |||
* @param keyNames Array of key Names | |||
* @throws ExecutionException executionException. | |||
* @throws IOException if no successful initialization for any key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The wording here is confusing. One way to read this is if any key. fails to initialize, then an except will be thrown. But IIUC an exception will be thrown if all keys fail to initialize.
@@ -537,12 +537,12 @@ static boolean isInAnEZ(final FSDirectory fsd, final INodesInPath iip) | |||
* then launch up a separate thread to warm them up. | |||
*/ | |||
static void warmUpEdekCache(final ExecutorService executor, | |||
final FSDirectory fsd, final int delay, final int interval) { | |||
final FSDirectory fsd, final int delay, final int interval, final int maxRetries) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you edit a comment in the function documentation to indicate that the warm up is best effort.
success = true; | ||
break; | ||
} catch (IOException ioe) { | ||
lastSeenIOE = ioe; | ||
if (sinceLastLog >= logCoolDown) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sinceLastLog
is no longer really used now. You can just print the failure since the retry count is limited.
|
||
verify(kpMock, times(maxRetries)).warmUpEncryptedKeys(any()); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test cache in which one or some of the keys are successfully warmed up, while others aren't.
} | ||
|
||
if (keyNames.length > 0 && successfulInitializations == 0) { | ||
throw new IOException("Failed to initialize any queue for the provided keys.", lastException); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems you've made warm up a best error operation. If so, there should be no need to through an exception here. Just logging a warning should be enough.
} | ||
sinceLastLog += retryInterval; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not get updated since initial settings. Shall we add it back or remove its usages completely as Simba said?
Description of PR
JIRA = HDFS-13603
The ekek cache warm up thread should not fail the whole warmup of other keys if an invalid key is encountered.
We have observed infinite retries to KMS if one of Encryption Key is not available.
Change it to
How was this patch tested?
Added unit test TestFSDirEncryptionZoneOp for retry behavior
Related unit tests
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files? NA