New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files #7702
Conversation
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. Bot commandsThe @flinkbot bot supports the following commands:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tisonkun @wangyang0918 @tillrohrmann could you guys take a look please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added preliminary comments about config names. Could you please rebase this on top of master and add a proper commit title/message?
@@ -225,6 +226,22 @@ | |||
.noDefaultValue() | |||
.withDescription("Specify YARN node label for the YARN application."); | |||
|
|||
public static final ConfigOption<Boolean> REQUIRE_LOCALIZE_KEYTAB = | |||
key("yarn.security.kerberos.require.localize.keytab") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be called yarn.security.kerberos.ship-client-keytab
or yarn.security.kerberos.ship-local-keytab
" resource bucket with the specified relative path defined in 'yarn.security.kerberos.keytab.path'."); | ||
|
||
public static final ConfigOption<String> LOCALIZED_KEYTAB_PATH = | ||
key("yarn.security.kerberos.keytab.path") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be more in line with existing keytab options this could be simplified to yarn.security.kerberos.keytab
f238728
to
6849601
Compare
6849601
to
0974d56
Compare
@aljoscha thanks for the suggestions, the config key changes makes much more sense. I have adjusted them and rebased the commits. Please kindly take another look :-) |
public static final String KEYTAB_PATH = "_KEYTAB_PATH"; | ||
public static final String REMOTE_KEYTAB_PATH = "_REMOTE_KEYTAB_PATH"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it actually necessary to store these keys in the environment, can't we directly use the Flink Configuration
to retrieve these settings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good catch (to also response to other comments related) --> technically speaking there should only be one ConfigKey in the JM/TM perspective: the "REMOTE_KEYTAB_PATH", or previously known as "KEYTAB_PATH" - as you mentioned, it doesn't matter whether it is originated from a remote ship location or pre-installed.
I would revert the changes I've done here.
fs, | ||
appId, | ||
new Path(keytab), | ||
localResources, | ||
homeDir, | ||
"", | ||
fileReplication); | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't we doing duplicate work here? If we don't have a local keytab and we don't ship it, won't the configuration be already in place and the cluster entrypoint can use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Utils.KEYTAB_FILE_NAME, | ||
boolean requireLocalizedKeytab = flinkConfiguration.getBoolean(YarnConfigOptions.SHIP_LOCAL_KEYTAB); | ||
localizedKeytabPath = flinkConfiguration.getString(YarnConfigOptions.LOCALIZED_KEYTAB_PATH); | ||
if (requireLocalizedKeytab) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be moved to the outer if, i.e. keytab != null && requireLocalizedKeytab
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@@ -941,10 +950,13 @@ private ApplicationReport startAppMaster( | |||
// https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md#identity-on-an-insecure-cluster-hadoop_user_name | |||
appMasterEnv.put(YarnConfigKeys.ENV_HADOOP_USER_NAME, UserGroupInformation.getCurrentUser().getUserName()); | |||
|
|||
if (remotePathKeytab != null) { | |||
appMasterEnv.put(YarnConfigKeys.KEYTAB_PATH, remotePathKeytab.toString()); | |||
if (localizedKeytabPath != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we wouldn't have to do this if we simply used the Flink configuration for getting the config values. I think we don't need the environment variables at all and can thus simplify this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, reverting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a minor question here. It seems user have to config KERBEROS_LOGIN_KEYTAB
. Does it make sense? If the keytab is pre-installed, I think there is no need to define KERBEROS_LOGIN_KEYTAB
anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are not operating under the premise that only one Keytab file is pre-installed. multiple keytab files could be pre-installed and thus user would need to choose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems user could choose it by LOCALIZED_KEYTAB_PATH
. Am I understand it correctly? If SHIP_LOCAL_KEYTAB
is false, then we do not care about the value of KERBEROS_LOGIN_KEYTAB
but we still need to set it?
if (f.exists()) { // keytab file exist in working directory. | ||
keytabPath = f.getAbsolutePath(); | ||
} else { // fall back to default keytab file | ||
f = new File(workingDirectory, Utils.DEFAULT_KEYTAB_FILE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't we duplicating the default logic here? If nothing is set the default setting in the Flink Configuration
should already be DEFAULT_KEYTAB_FILE
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. good catch. this piece of code was there before we use the YarnConfigOption approach. I would refine this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review @aljoscha and good catch on the redundant info shipped to Flink cluster. I would revise the PR soon.
if (f.exists()) { // keytab file exist in working directory. | ||
keytabPath = f.getAbsolutePath(); | ||
} else { // fall back to default keytab file | ||
f = new File(workingDirectory, Utils.DEFAULT_KEYTAB_FILE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. good catch. this piece of code was there before we use the YarnConfigOption approach. I would refine this.
public static final String KEYTAB_PATH = "_KEYTAB_PATH"; | ||
public static final String REMOTE_KEYTAB_PATH = "_REMOTE_KEYTAB_PATH"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good catch (to also response to other comments related) --> technically speaking there should only be one ConfigKey in the JM/TM perspective: the "REMOTE_KEYTAB_PATH", or previously known as "KEYTAB_PATH" - as you mentioned, it doesn't matter whether it is originated from a remote ship location or pre-installed.
I would revert the changes I've done here.
@@ -941,10 +950,13 @@ private ApplicationReport startAppMaster( | |||
// https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md#identity-on-an-insecure-cluster-hadoop_user_name | |||
appMasterEnv.put(YarnConfigKeys.ENV_HADOOP_USER_NAME, UserGroupInformation.getCurrentUser().getUserName()); | |||
|
|||
if (remotePathKeytab != null) { | |||
appMasterEnv.put(YarnConfigKeys.KEYTAB_PATH, remotePathKeytab.toString()); | |||
if (localizedKeytabPath != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, reverting
Utils.KEYTAB_FILE_NAME, | ||
boolean requireLocalizedKeytab = flinkConfiguration.getBoolean(YarnConfigOptions.SHIP_LOCAL_KEYTAB); | ||
localizedKeytabPath = flinkConfiguration.getString(YarnConfigOptions.LOCALIZED_KEYTAB_PATH); | ||
if (requireLocalizedKeytab) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
fs, | ||
appId, | ||
new Path(keytab), | ||
localResources, | ||
homeDir, | ||
"", | ||
fileReplication); | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
…ytab files This is to change how keytab files are discovered, * remove hard-coded keytab filenames * extended YARNSessionFIFOITCase to accommodate different types of security setting test cases regenerate configuration documentations
@walterddr I have two more suggestions on top of your PR: https://github.com/aljoscha/flink/tree/pr-7702-yarn-keytab. Could you please take a look there. |
public static String resolveKeytabPath(String workingDir, String keytabPath) { | ||
String keytab = null; | ||
if (keytabPath == null) { // keytab not exist | ||
LOG.info("keytab path isn't configured!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we shouldn't log here, because now all users will always get this log message, even if they don't do anything with Kerberos, that might be confusing.
@flinkbot run travis |
@flinkbot run azure |
@flinkbot run azure |
@walterddr Ah, so we were trying to read the actual config key and not the path stored in the environment under that key, right? |
good find! 👌 |
yes. surprise it didn't triggered any issue originally in the ITCase. I guess that's because there's really no changes in the file when trying to |
I merged this. Thanks a lot for the good collaboration on this. 😃 |
What is the purpose of the change
There're 2 ways of utilizing Kerberos keytab files:
Previously Flink only support method #1. This PR introduces two new configuration keys in the YARN configurations to support method #2.
Brief change log
yarn.security.kerberos.keytab
indicates where the keytab file should be on YARN container.yarn.security.kerberos.ship-local-keytab
: If set to true, Flink will upload the client keytab used in its own section to YARN local resource bucket. if set to false, Flink will assume the path configured inyarn.security.kerberos.keytab
already exists and with proper permission.Verifying this change
This change is already covered by existing tests in
flink-yarn-test
component.This change also added tests
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation