Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-31162][yarn] Use currUsr.getCredentials.getTokens instead of currUsr.getTokens #21985

Merged
merged 1 commit into from
Feb 23, 2023

Conversation

venkata91
Copy link
Contributor

@venkata91 venkata91 commented Feb 21, 2023

... to avoid including private tokens in AM container context

What is the purpose of the change

Avoid setting private tokens to AM container context when kerberos delegation token fetch is disabled and DTs are managed.

Brief change log

Currently while setting user credentials to the AM container context, ugi.getTokens() is used which also returns the private tokens along with UGI tokens. But it should not be passed to the AM container context. This causes the launch of YARN RM app to fail in some cases for example when Consistent Reads from HDFS Observer NameNode feature is enabled.

Instead, change it to ugi.getCredentials().getAllTokens() to only get user credentials tokens. Spark uses similar way of setting the user credentials to AM container context as well.

Verifying this change

Tested this changed internally in our environment as this requires a managed way of Delegation token fetch. Also tested enabling kerberos delegation token fetch feature to make sure it doesn't regress.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

…ntials to avoid including private tokens in AM container context
@flinkbot
Copy link
Collaborator

flinkbot commented Feb 21, 2023

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@venkata91
Copy link
Contributor Author

Please take a look. cc @gaborgsomogyi @MartijnVisser @becketqin

@mridulm
Copy link

mridulm commented Feb 21, 2023

Re title: You mean 'Use currUsr.getCredentials().getAllTokens() instead of currUsr.getTokens() to avoid including private tokens' ?

Copy link

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks good to me. Not sure how testable this is ...

@venkata91 venkata91 changed the title [FLINK-31162][yarn] Use currUsr.getTokens instead of currUsr.getCrede… [FLINK-31162][yarn] Use currUsr.getCredentials.getTokens instead of currUsr.getTokens Feb 21, 2023
@venkata91
Copy link
Contributor Author

Re title: You mean 'Use currUsr.getCredentials().getAllTokens() instead of currUsr.getTokens() to avoid including private tokens' ?

Oh. Yes. Thanks for pointing out.

@venkata91
Copy link
Contributor Author

Change looks good to me. Not sure how testable this is ...

Yeah not sure how we can add either unit tests or integration tests.

@MartijnVisser
Copy link
Contributor

@venkata91 This PR should first be targeted towards master before creating backports.

@venkata91
Copy link
Contributor Author

venkata91 commented Feb 21, 2023

@venkata91 This PR should first be targeted towards master before creating backports.

@MartijnVisser The issue here is different and is only specific to 1.16 and the code in master has changed quite a lot.
Since we also discussed about https://issues.apache.org/jira/browse/FLINK-31109, I am working on that as well and would raise a PR with the fix targeted towards master.

@MartijnVisser
Copy link
Contributor

The issue here is different and is only specific to 1.16

So this issue only occurs in 1.16 and not in 1.17 and later? Then we're good yes :)

Copy link
Contributor

@gaborgsomogyi gaborgsomogyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for fixing it.

@venkata91
Copy link
Contributor Author

The issue here is different and is only specific to 1.16

So this issue only occurs in 1.16 and not in 1.17 and later? Then we're good yes :)

There is some what a bigger issue in 1.17 which I am looking at as part of https://issues.apache.org/jira/browse/FLINK-31109 and address this smaller issue as well as part of it.

Copy link
Contributor

@MartijnVisser MartijnVisser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@MartijnVisser MartijnVisser merged commit d5c0944 into apache:release-1.16 Feb 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants