[FLINK-11088][Security][YARN] Allow YARN to discover pre-installed keytab files #7702

Closed
wants to merge 7 commits into from

walterddr
Contributor

@walterddr walterddr commented Feb 14, 2019

What is the purpose of the change

There are two ways to use Kerberos keytab files:

  1. The Flink client uploads Kerberos keytab files through the YARN local resource bucket.
  2. Flink YARN containers load pre-installed Kerberos keytab files directly from the local file system.

Previously, Flink only supported method #1. This PR introduces two new configuration keys in the YARN configuration to support method #2.

Brief change log

  • Added two new keys with default values:
    • yarn.security.kerberos.keytab indicates where the keytab file should be on the YARN container.
    • yarn.security.kerberos.ship-local-keytab: if set to true, Flink will upload the client keytab used in its own section to the YARN local resource bucket. If set to false, Flink will assume the path configured in yarn.security.kerberos.keytab already exists with proper permissions.
  • Changed YarnClusterDescriptor to work with the two options above.
  • Changed the YarnTaskExecutorRunner to load keytab configurations differently according to the two options above.
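
The interaction of the two keys can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `KeytabMode` class and its `decide` method are hypothetical, a plain `Map` stands in for the Flink configuration, and the default of `true` for `ship-local-keytab` is an assumption made for the sketch.

```java
import java.util.Map;

/** Hypothetical sketch of how the two keys described above could interact. */
final class KeytabMode {
    static final String SHIP_KEY = "yarn.security.kerberos.ship-local-keytab";
    static final String PATH_KEY = "yarn.security.kerberos.keytab";

    /** Describes what the client would do for a given configuration. */
    static String decide(Map<String, String> conf, String clientKeytab) {
        if (clientKeytab == null) {
            return "no keytab configured";
        }
        // Assumed default for this sketch: ship the client keytab unless told otherwise.
        boolean ship = Boolean.parseBoolean(conf.getOrDefault(SHIP_KEY, "true"));
        String containerPath = conf.get(PATH_KEY);
        if (ship) {
            // Method #1: upload the client keytab through the YARN local resource bucket.
            return "upload " + clientKeytab + " as local resource " + containerPath;
        }
        // Method #2: trust that the keytab is already present on the container.
        return "expect pre-installed keytab at " + containerPath;
    }
}
```

With `ship-local-keytab` set to `false`, the sketch never touches the client keytab and only reports the container path, which mirrors method #2 above.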

Verifying this change

This change is already covered by existing tests in flink-yarn-test component.

This change also added tests:

  • Added additional config-parsing tests in YarnTaskExecutorRunnerTest.
  • Modified YARNSessionFIFOITCase and YARNSessionFIFOSecuredITCase to allow dynamic properties loading during test sections.
  • Added a specific section for the pre-installed YARN Kerberos keytab file.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: yes
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? documentation regenerated

@flinkbot
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

  • ❌ 1. The [description] looks good.
  • ❌ 2. There is [consensus] that the contribution should go into Flink.
  • ❔ 3. Needs [attention] from.
  • ❌ 4. The change fits into the overall [architecture].
  • ❌ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot approve description to approve the 1st aspect (similarly, it also supports the consensus, architecture and quality keywords)
  • @flinkbot approve all to approve all aspects
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval

@flinkbot
Collaborator

flinkbot commented Jan 8, 2020

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

Contributor Author

@walterddr walterddr left a comment

@tisonkun @wangyang0918 @tillrohrmann could you guys take a look please?

@aljoscha aljoscha self-requested a review February 10, 2020 16:38
@aljoscha aljoscha self-assigned this Feb 10, 2020
Contributor

@aljoscha aljoscha left a comment

I added preliminary comments about config names. Could you please rebase this on top of master and add a proper commit title/message?

@@ -225,6 +226,22 @@
.noDefaultValue()
.withDescription("Specify YARN node label for the YARN application.");

public static final ConfigOption<Boolean> REQUIRE_LOCALIZE_KEYTAB =
key("yarn.security.kerberos.require.localize.keytab")
Contributor

I think this should be called yarn.security.kerberos.ship-client-keytab or yarn.security.kerberos.ship-local-keytab

" resource bucket with the specified relative path defined in 'yarn.security.kerberos.keytab.path'.");

public static final ConfigOption<String> LOCALIZED_KEYTAB_PATH =
key("yarn.security.kerberos.keytab.path")
Contributor

To be more in line with existing keytab options this could be simplified to yarn.security.kerberos.keytab

@walterddr
Contributor Author

@aljoscha thanks for the suggestions, the config key changes make much more sense. I have adjusted them and rebased the commits. Please kindly take another look :-)

public static final String KEYTAB_PATH = "_KEYTAB_PATH";
public static final String REMOTE_KEYTAB_PATH = "_REMOTE_KEYTAB_PATH";
Contributor

Is it actually necessary to store these keys in the environment? Can't we directly use the Flink Configuration to retrieve these settings?

Contributor Author

@walterddr walterddr Mar 11, 2020

This is a good catch (and it also answers the other related comments): technically speaking, there should be only one config key from the JM/TM perspective, the "REMOTE_KEYTAB_PATH", previously known as "KEYTAB_PATH". As you mentioned, it doesn't matter whether it originates from a remote ship location or is pre-installed.

I will revert the changes I made here.

fs,
appId,
new Path(keytab),
localResources,
homeDir,
"",
fileReplication);
} else {
Contributor

Aren't we doing duplicate work here? If we don't have a local keytab and we don't ship it, won't the configuration be already in place and the cluster entrypoint can use it?

Contributor Author

+1

Utils.KEYTAB_FILE_NAME,
boolean requireLocalizedKeytab = flinkConfiguration.getBoolean(YarnConfigOptions.SHIP_LOCAL_KEYTAB);
localizedKeytabPath = flinkConfiguration.getString(YarnConfigOptions.LOCALIZED_KEYTAB_PATH);
if (requireLocalizedKeytab) {
Contributor

I think this can be moved to the outer if, i.e. keytab != null && requireLocalizedKeytab

Contributor Author

+1
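
The suggested simplification can be sketched as a single combined check. The `KeytabShipping` class and `keytabToShip` method below are hypothetical names used only for illustration; the real logic lives in the cluster descriptor and is not reproduced here.

```java
import java.util.Optional;

/** Hypothetical helper illustrating the merged condition suggested above. */
final class KeytabShipping {
    /**
     * Returns the client keytab to upload, or empty when nothing should be shipped
     * (no keytab configured, or the keytab is expected to be pre-installed).
     */
    static Optional<String> keytabToShip(String keytab, boolean shipLocalKeytab) {
        // Single combined check instead of a nested if inside "keytab != null".
        if (keytab != null && shipLocalKeytab) {
            return Optional.of(keytab);
        }
        return Optional.empty();
    }
}
```

Folding both conditions into one branch also removes the need for a separate else path when the keytab is not shipped.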

@@ -941,10 +950,13 @@ private ApplicationReport startAppMaster(
// https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md#identity-on-an-insecure-cluster-hadoop_user_name
appMasterEnv.put(YarnConfigKeys.ENV_HADOOP_USER_NAME, UserGroupInformation.getCurrentUser().getUserName());

if (remotePathKeytab != null) {
appMasterEnv.put(YarnConfigKeys.KEYTAB_PATH, remotePathKeytab.toString());
if (localizedKeytabPath != null) {
Contributor

I think we wouldn't have to do this if we simply used the Flink configuration for getting the config values. I think we don't need the environment variables at all and can thus simplify this.

Contributor Author

+1, reverting

Contributor

Just a minor question here. It seems users have to configure KERBEROS_LOGIN_KEYTAB. Does that make sense? If the keytab is pre-installed, I think there is no need to define KERBEROS_LOGIN_KEYTAB anymore.

Contributor Author

We are not operating under the premise that only one keytab file is pre-installed. Multiple keytab files could be pre-installed, so the user would need to choose.

Contributor

It seems the user could choose it via LOCALIZED_KEYTAB_PATH. Am I understanding it correctly? If SHIP_LOCAL_KEYTAB is false, we do not care about the value of KERBEROS_LOGIN_KEYTAB, but we still need to set it?

if (f.exists()) { // keytab file exist in working directory.
keytabPath = f.getAbsolutePath();
} else { // fall back to default keytab file
f = new File(workingDirectory, Utils.DEFAULT_KEYTAB_FILE);
Contributor

Aren't we duplicating the default logic here? If nothing is set the default setting in the Flink Configuration should already be DEFAULT_KEYTAB_FILE.

Contributor Author

Yes, good catch. This piece of code was there before we used the YarnConfigOption approach. I will refine this.

Contributor Author

@walterddr walterddr left a comment

Thanks for the review @aljoscha, and good catch on the redundant info shipped to the Flink cluster. I will revise the PR soon.

Rong Rong and others added 3 commits March 11, 2020 15:52
…ytab files

This changes how keytab files are discovered:
    * remove hard-coded keytab filenames
    * extend YARNSessionFIFOITCase to accommodate different types of security setting test cases
    * regenerate configuration documentation
@aljoscha
Contributor

@walterddr I have two more suggestions on top of your PR: https://github.com/aljoscha/flink/tree/pr-7702-yarn-keytab. Could you please take a look there.

public static String resolveKeytabPath(String workingDir, String keytabPath) {
String keytab = null;
if (keytabPath == null) { // keytab not exist
LOG.info("keytab path isn't configured!");
Contributor

I think we shouldn't log here. All users will now always get this log message, even if they don't do anything with Kerberos, which might be confusing.
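
A resolver that stays silent when no keytab is configured could look roughly like the sketch below. Only the method signature and the null check appear in the excerpt above; the `KeytabResolver` class name and the lookup order (the path as given first, then relative to the working directory) are assumptions of this sketch.

```java
import java.io.File;

/** Hypothetical sketch of keytab path resolution that stays silent when nothing is configured. */
final class KeytabResolver {
    /**
     * Returns the absolute path of the keytab, or null if none is configured or found.
     * Lookup order (an assumption of this sketch): the path as given, then relative to workingDir.
     */
    static String resolveKeytabPath(String workingDir, String keytabPath) {
        if (keytabPath == null) {
            // No Kerberos keytab configured: return quietly instead of logging.
            return null;
        }
        File asGiven = new File(keytabPath);
        if (asGiven.exists()) {
            // Pre-installed keytab, referenced directly.
            return asGiven.getAbsolutePath();
        }
        File inWorkingDir = new File(workingDir, keytabPath);
        if (inWorkingDir.exists()) {
            // Keytab shipped into the container working directory.
            return inWorkingDir.getAbsolutePath();
        }
        return null;
    }
}
```

Users who never configure Kerberos hit only the first branch, so they would see no keytab-related output at all.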

@walterddr
Contributor Author

@flinkbot run travis

@walterddr
Contributor Author

@flinkbot run azure

@walterddr
Contributor Author

@flinkbot run azure

@aljoscha
Contributor

@walterddr Ah, so we were trying to read the actual config key and not the path stored in the environment under that key, right?
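
The mix-up described here, using the name of an environment key where the path stored under that key was intended, can be illustrated with a small sketch. The `EnvLookup` class and both methods are hypothetical; only the key name `_KEYTAB_PATH` is taken from the diff earlier in this conversation.

```java
import java.util.Map;

/** Hypothetical illustration of the key-versus-value mix-up discussed above. */
final class EnvLookup {
    static final String KEYTAB_PATH = "_KEYTAB_PATH"; // name of the environment entry

    /** Buggy variant: returns the key name itself as if it were a path. */
    static String keytabPathBuggy(Map<String, String> env) {
        return KEYTAB_PATH;
    }

    /** Intended variant: returns the path stored in the environment under that key. */
    static String keytabPath(Map<String, String> env) {
        return env.get(KEYTAB_PATH);
    }
}
```

Both variants compile and return a `String`, which is why this kind of mistake is easy to miss without a test that checks the resolved path.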

@aljoscha
Contributor

good find! 👌

@walterddr
Contributor Author

walterddr commented Mar 13, 2020

Yes. I'm surprised it didn't trigger any issue originally in the ITCase. I guess that's because there are really no changes in the file when trying to setupLocalResource with miniYarn on the exact same local file system.

@aljoscha
Contributor

I merged this. Thanks a lot for the good collaboration on this. 😃

@aljoscha aljoscha closed this Mar 13, 2020
5 participants