[FLINK-12869] Add yarn acls capability to flink containers #8760

Closed
wants to merge 3 commits into from

ashangit
Contributor

What is the purpose of the change

Expose YARN's application ACL mechanism on Flink containers so that users other than the one running the job can be granted specific rights: viewing logs through the ResourceManager or job history server, and killing the application.

Brief change log

  • Add two parameters (yarn.view.acls and yarn.admin.acls) and set ApplicationACLs for all containers
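
As a rough illustration of the change log entry above, the sketch below maps the two settings onto YARN's two application access types. It is not the PR's code: the class name, the plain `Map<String, String>` standing in for Flink's `Configuration`, the local enum standing in for Hadoop's `ApplicationAccessType`, and the empty-string fallback for unset options are all assumptions made to keep the example self-contained.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

public class YarnAclMappingSketch {

    /** Local stand-in (assumption) for Hadoop's ApplicationAccessType enum. */
    public enum ApplicationAccessType { VIEW_APP, MODIFY_APP }

    /**
     * Translates the two settings named in the change log into a
     * per-access-type ACL map. The empty-string fallback is an assumption,
     * not necessarily the default used by the PR.
     */
    public static Map<ApplicationAccessType, String> toApplicationAcls(Map<String, String> flinkConfig) {
        Map<ApplicationAccessType, String> acls = new EnumMap<>(ApplicationAccessType.class);
        acls.put(ApplicationAccessType.VIEW_APP, flinkConfig.getOrDefault("yarn.view.acls", ""));
        acls.put(ApplicationAccessType.MODIFY_APP, flinkConfig.getOrDefault("yarn.admin.acls", ""));
        return acls;
    }

    public static void main(String[] args) {
        // Hypothetical user names, for illustration only.
        Map<String, String> config = new HashMap<>();
        config.put("yarn.view.acls", "alice,bob");
        config.put("yarn.admin.acls", "alice");
        System.out.println(toApplicationAcls(config));
    }
}
```

In the real change, a map of this shape would be handed to `ContainerLaunchContext.setApplicationACLs(...)` for each container.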

Verifying this change

This change added tests and can be verified as follows:

  • Manually verified the change by running a WordCount job on a YARN cluster with ACLs enabled: first confirmed that, without yarn.admin.acls, a user who did not launch the app cannot kill it; then relaunched the app with yarn.admin.acls granting rights to another user and checked that this user can kill the app. The same can be done for yarn.view.acls by checking that the user can now access the application overview from the ResourceManager (<resourcemanager_url>/cluster/app/<app_id>)

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: yes, it affects ACLs on YARN (currently not managed, so no rights can be granted to other users)
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? docs

@flinkbot
Collaborator

flinkbot commented Jun 17, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit f0cc6b6 (Wed Dec 04 14:57:15 UTC 2019)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Collaborator

flinkbot commented Sep 18, 2019

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build

Member

@GJL GJL left a comment

Thank you for your contribution to Apache Flink @ashangit! Sorry that the review took so long.

Currently, the change is untested. Do you think it makes sense to add a new test to YARNSessionFIFOSecuredITCase?

@GJL
Member

GJL commented Sep 25, 2019

@ashangit Do you still want to work on this?

@ashangit
Contributor Author

@GJL I will look at how I can add some tests this week or next.

@GJL
Member

GJL commented Sep 26, 2019

@ashangit Alright, thanks for following up.

@GJL
Member

GJL commented Oct 2, 2019

Thanks for updating the PR. I'll have another look

@GJL
Member

GJL commented Oct 4, 2019

@flinkbot run travis

Member

@GJL GJL left a comment

I have left some minor comments. Please let me know what you think.

for (int nmId = 0; nmId < NUM_NODEMANAGERS; nmId++) {
NodeManager nm = yarnCluster.getNodeManager(nmId);
nm.getNMContext().getContainers().forEach((k, v) -> {
containers.put(k, v.getLaunchContext().getApplicationACLs());

I think this function should go into YARNSessionFIFOSecuredITCase. It's not likely that it will be used elsewhere.

Moreover, you are using a functional style (forEach) with side effects. However, I think when programming in a functional style, one should just return the result. For example:

	private static Map<ContainerId, Map<ApplicationAccessType, String>> getRunningContainersAcls() {
		return nodeManagersStream()
			.flatMap(toContainersStream())
			.collect(Collectors.toMap(
				Map.Entry::getKey,
				entry -> getApplicationACLs(entry.getValue())));
	}

	private static Stream<NodeManager> nodeManagersStream() {
		return IntStream
			.range(0, NUM_NODEMANAGERS)
			.mapToObj(i -> yarnCluster.getNodeManager(i));
	}

	private static Function<NodeManager, Stream<Map.Entry<ContainerId, Container>>> toContainersStream() {
		return nodeManager -> nodeManager.getNMContext().getContainers().entrySet().stream();
	}

	private static Map<ApplicationAccessType, String> getApplicationACLs(final Container container) {
		return container.getLaunchContext().getApplicationACLs();
	}

@@ -263,6 +265,13 @@ private static LocalResource registerLocalResource(FileSystem fs, Path remoteRsr
return localResource;
}

public static void setAclsFor(ContainerLaunchContext amContainer, org.apache.flink.configuration.Configuration flinkConfig) {
amContainer.setApplicationACLs(new HashMap<ApplicationAccessType, String>(){{

Because double brace initialization has some caveats, I wouldn't use it.
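
One of those caveats can be shown in a few lines: double brace initialization creates an anonymous subclass of `HashMap`, so the resulting object is not a plain `HashMap` (and, when written inside an instance method, it also keeps a reference to the enclosing object). The sketch below is illustrative only; the class and method names are hypothetical and string keys stand in for `ApplicationAccessType`.

```java
import java.util.HashMap;
import java.util.Map;

public class DoubleBraceSketch {

    /** Double brace initialization: anonymous HashMap subclass plus an instance initializer block. */
    public static Map<String, String> withDoubleBrace(String viewAcl) {
        return new HashMap<String, String>() {{
            put("VIEW_APP", viewAcl);
        }};
    }

    /** Plain construction: same contents, but the runtime class is exactly HashMap. */
    public static Map<String, String> withPlainMap(String viewAcl) {
        Map<String, String> acls = new HashMap<>();
        acls.put("VIEW_APP", viewAcl);
        return acls;
    }

    public static void main(String[] args) {
        // The two maps are equal by content...
        System.out.println(withDoubleBrace("alice").equals(withPlainMap("alice")));
        // ...but only the plain one has HashMap as its runtime class.
        System.out.println(withDoubleBrace("alice").getClass() == HashMap.class);
        System.out.println(withPlainMap("alice").getClass() == HashMap.class);
    }
}
```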

@@ -263,6 +265,13 @@ private static LocalResource registerLocalResource(FileSystem fs, Path remoteRsr
return localResource;
}

public static void setAclsFor(ContainerLaunchContext amContainer, org.apache.flink.configuration.Configuration flinkConfig) {
amContainer.setApplicationACLs(new HashMap<ApplicationAccessType, String>(){{

I would feel more comfortable if we only set ApplicationAccessType.VIEW_APP and ApplicationAccessType.MODIFY_APP, if the user actually configured application ACLs. Also, can we make the default values of the new config options null? What do you think?
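
A minimal sketch of what this suggestion could look like, assuming the option values arrive as nullable strings (null meaning "not configured"). The class name, method name, and local enum are hypothetical stand-ins so the example compiles without Hadoop or Flink on the classpath; it is not the code that was eventually written.

```java
import java.util.EnumMap;
import java.util.Map;

public class ConditionalAclSketch {

    /** Local stand-in (assumption) for Hadoop's ApplicationAccessType enum. */
    public enum ApplicationAccessType { VIEW_APP, MODIFY_APP }

    /**
     * Builds an ACL map containing only the access types the user configured.
     * A null argument models a config option whose default value is null.
     */
    public static Map<ApplicationAccessType, String> configuredAcls(String viewAcls, String modifyAcls) {
        Map<ApplicationAccessType, String> acls = new EnumMap<>(ApplicationAccessType.class);
        if (viewAcls != null) {
            acls.put(ApplicationAccessType.VIEW_APP, viewAcls);
        }
        if (modifyAcls != null) {
            acls.put(ApplicationAccessType.MODIFY_APP, modifyAcls);
        }
        return acls;
    }

    public static void main(String[] args) {
        Map<ApplicationAccessType, String> acls = configuredAcls("alice", null);
        // A caller would touch the launch context only when something was configured,
        // e.g. if (!acls.isEmpty()) { amContainer.setApplicationACLs(acls); }
        System.out.println(acls.isEmpty() ? "leave YARN defaults untouched" : "set " + acls);
    }
}
```

With this shape, a job submitted without either option leaves the container launch context exactly as it was before the change.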

/**
* Users and groups to give MODIFY access.
*/
public static final ConfigOption<String> APPLICATION_ADMIN_ACLS =

I was thinking about renaming this to APPLICATION_MODIFY_ACLS so that it is aligned with Hadoop. What do you think?

@@ -578,6 +579,17 @@ public static int getRunningContainers() {
return count;
}

public static HashMap<ContainerId, Map<ApplicationAccessType, String>> getRunningContainersAcls() {

It's better to declare Map<...> as return type.

@GJL
Member

GJL commented Oct 25, 2019

@ashangit Can I get feedback on the comments I have left? I can finish the remaining work, if you do not have time to work on it.

@GJL
Member

GJL commented Nov 4, 2019

Closing due to inactivity. Feel free to reopen if you find time to work on this. As written above, I can also finalize it.

4 participants