Cannot use Kerberos when LIGHTER_YARN_KERBEROS environment variables not provided #39

Closed
EmilK322 opened this issue Apr 5, 2022 · 4 comments

Comments

@EmilK322
Contributor

EmilK322 commented Apr 5, 2022

When LIGHTER_YARN_KERBEROS_PRINCIPAL and LIGHTER_YARN_KERBEROS_KEYTAB are set,
Lighter passes the spark.kerberos.principal and spark.kerberos.keytab configs to Spark, which is expected.
These environment variables allow running Spark with Kerberos, but only as the single user whose credentials were provided when Lighter started.

I tried running Lighter without those environment variables and providing spark.kerberos.principal and spark.kerberos.keytab
in the HTTP batch request instead, but the request fails with a message saying that only TOKEN or KERBEROS can be used, not SIMPLE.
The error message: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
The only difference I can see in the code is that when the env vars are provided, Lighter sets hadoop.kerberos.keytab.login.autorenewal.enabled to true.
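
For illustration, a batch request that carries the Kerberos settings in its Spark conf map might look roughly like the sketch below. It only builds the JSON body and sends nothing. The field names (name, file, conf), the file path, the principal, and the keytab location are all assumptions made for the example, not values taken from this issue.

```python
import json


def build_batch_payload(principal: str, keytab: str) -> dict:
    """Build a hypothetical batch request body with per-request Kerberos settings."""
    return {
        # "name", "file" and "conf" are assumed field names for the batch API.
        "name": "kerberos-example",
        "file": "/path/to/job.py",  # placeholder path
        "conf": {
            "spark.kerberos.principal": principal,
            "spark.kerberos.keytab": keytab,
        },
    }


# Placeholder principal and keytab, for illustration only.
payload = build_batch_payload("alice@EXAMPLE.COM", "/path/to/alice.keytab")
print(json.dumps(payload, indent=2))
```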

I haven't tried changing this yet, but my proposal is:
Allow submitting applications as different users, with credentials provided in the request body, when the Kerberos env vars are not set.
When they are set, the credentials from the env vars take precedence over the credentials in the request body.
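
The precedence rule in this proposal could be sketched as follows. This is a minimal illustration, not Lighter's code: the function name resolve_kerberos_conf and the plain-dict inputs are hypothetical, and only the environment variable and Spark property names come from this issue.

```python
PRINCIPAL_KEY = "spark.kerberos.principal"
KEYTAB_KEY = "spark.kerberos.keytab"


def resolve_kerberos_conf(env: dict, request_conf: dict) -> dict:
    """Pick Kerberos settings: env vars win when both are set, else use the request."""
    env_principal = env.get("LIGHTER_YARN_KERBEROS_PRINCIPAL")
    env_keytab = env.get("LIGHTER_YARN_KERBEROS_KEYTAB")
    if env_principal and env_keytab:
        # Credentials configured at startup take precedence over the request body.
        return {PRINCIPAL_KEY: env_principal, KEYTAB_KEY: env_keytab}
    # Otherwise fall back to whatever Kerberos settings the request carries.
    return {
        key: value
        for key, value in request_conf.items()
        if key in (PRINCIPAL_KEY, KEYTAB_KEY)
    }
```

With no Kerberos env vars set, each request would run under the principal it names; with them set, every request would run under the startup principal.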

This can be a great feature provided by Lighter.

@EmilK322 EmilK322 changed the title from "Cannot use Kerberos when LIGHTER_YARN_ENABLED_KERBEROS environment variables not provided" to "Cannot use Kerberos when LIGHTER_YARN_KERBEROS environment variables not provided" Apr 5, 2022
@pdambrauskas
Collaborator

Submit properties from the API request currently override those added by the Lighter execution backend. Now that you mention it, I think it should work the other way around: users should not be able to override spark.yarn.tags and similar properties, since we use those for job status tracking and other job management tasks. We should change the order of these lines: https://github.com/exacaster/lighter/blob/master/server/src/main/java/com/exacaster/lighter/spark/SparkApp.java#L39...L40.

In that case, the Kerberos env variables will take precedence over the ones provided in the request.
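
The effect of swapping the merge order can be shown with a small sketch. This is not the Java code from SparkApp.java, only the general "last writer wins" idea; the function name merge_submit_conf and the sample property values are hypothetical.

```python
def merge_submit_conf(first: dict, second: dict) -> dict:
    """Merge two property maps; on a key clash the value from `second` wins."""
    merged = dict(first)
    merged.update(second)
    return merged


# Sample values are made up for illustration.
request_conf = {"spark.yarn.tags": "user-supplied", "spark.executor.memory": "2g"}
backend_conf = {"spark.yarn.tags": "lighter-tracking-tag"}

# Backend first, request last: the request can override backend properties.
request_wins = merge_submit_conf(backend_conf, request_conf)

# Request first, backend last: backend properties can no longer be overridden.
backend_wins = merge_submit_conf(request_conf, backend_conf)

print(request_wins["spark.yarn.tags"])
print(backend_wins["spark.yarn.tags"])
```

Properties that only the request sets, such as spark.executor.memory here, survive in both orders; only clashing keys are affected.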

However, the Kerberos configs are used not only for Spark submits, but also for job management through YarnClient. Currently we use the same credentials both when creating the YarnClient and when submitting the Spark application.

We should probably make it possible to use different credentials for Yarn job management and for Spark application properties, or add a configuration property that lets the user choose whether the provided Kerberos configuration should also be used for Spark applications.

@Minutis
Member

Minutis commented Apr 6, 2022

@EmilK322 Thank you for raising this issue.

Lighter itself needs service-level authentication in order to track jobs. These settings should be set during Lighter startup. This is why @EmilK322's proposal is not optimal.

At the moment Lighter has no option to provide different auth properties for jobs. Maybe we should add a separate set of properties holding static job authentication credentials (loaded during Lighter startup). These properties could be:

  • overridable with payload properties: this would allow submitting different jobs with different authentication properties.
  • optional: if the job auth properties are not set, the Lighter service authentication will be used (unless overridden in the payload).
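
The lookup order described in these two points could be sketched like this. It is a rough illustration under assumptions: the function name resolve_job_auth and the plain-dict arguments are hypothetical, and no such job auth properties exist in Lighter at the time of this discussion.

```python
from typing import Optional

KERBEROS_KEYS = ("spark.kerberos.principal", "spark.kerberos.keytab")


def resolve_job_auth(
    service_auth: dict, job_auth: Optional[dict], payload_conf: dict
) -> dict:
    """Resolve job credentials: payload first, then job auth, then service auth."""
    # Job auth properties are optional; fall back to the service credentials.
    resolved = dict(job_auth) if job_auth else dict(service_auth)
    # Anything set in the request payload overrides the static defaults.
    for key in KERBEROS_KEYS:
        if key in payload_conf:
            resolved[key] = payload_conf[key]
    return resolved
```

The service credentials themselves stay untouched in this sketch, so job tracking would keep using them regardless of what a payload specifies.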

@pdambrauskas @EmilK322 what do you think about this approach?

@pdambrauskas
Collaborator

Sounds ok to me.

@EmilK322
Contributor Author

EmilK322 commented Apr 7, 2022

Sounds good to me.


3 participants